Binary library packages using C++ modules

0. Introduction

As I wrote previously about C++ modules, I have come to the conclusion that the Standard Committee has likely ignored today's practice of software development when designing C++ modules, just as it did before with “export templates”: C++ cannot exist without an environment that turns language statements into executable code, and the only such environment in use today is clumsy and prone to ABI compatibility problems.

As you know, the first C++ compiler was created as a translator to C, whose output was then compiled with a C compiler, as that was the easiest approach. Later compilers translated C++ directly, but they still produced the same kind of output as a C compiler: an object file containing variables and functions. Distribution of libraries was done the same way: whatever could be mapped to a variable or a function was put into the library, and everything else was provided by header files.

It wasn't the best system even for the C language, but in the case of C++ it caused many more problems. Some things had to be emulated, some generated in special ways, and new linker features were even required, such as support for “vague linkage” (weak and COMDAT symbols) for inline functions and templates, which plain C symbols don't need. But the most important problem appeared when shared libraries were introduced: suddenly language entities were split into a “static” and a “dynamic” part, a distinction that not only isn't defined in the language, but that no development tools are even capable of checking for the resulting ABI incompatibilities (it's possible only through special tools that maintain access to multiple versions of the source code guarding the compatibility line).

God knows that C++ needs a new modularization system, and urgently. But this requires creating at least some new standard for linkage, libraries, and distribution, and – what is most problematic for C++ – a compiler-independent (though it may well be platform-dependent) format for the interface.

1. The present situation

Software distribution and dependencies have been handled with the only modularization system available, that is, the one created for the C language, which relies on the following rules:

  1. There is a library file that contains two types of entities, mapped to a named symbol:
    • A variable (saved space of particular size, possibly with some initial contents)
    • A function (executable code fragment)
  2. As the interface part, a header file is distributed containing language-defined entities, including the symbols that should be mapped to entities found in the library
  3. The compiling part uses language entities provided by the header file, which may leave named references to other symbols to be provided “elsewhere” (that is, by a library file)
  4. Once the individual parts are compiled, the linker creates the target by providing entities from the library to the object files that refer to them by name.

There are also several different forms in which the target can be provided:

  1. An archive of object files with assigned symbolic names, which may also contain references to symbols not provided in this archive (aka a “static library”)
  2. A linked, self-contained binary with entities assigned to symbols, which can be used in two forms:
    • as an executable file, if it provides the main execution entry
    • a shared library otherwise

All of these forms can contain linkage entities – variables or functions – which may contain references to other variables or functions that have to be found in other libraries used for linkage. The linkage can be done as part of the build process, in order to resolve entities available at that time, or by the OS-provided tools when running the executable file.
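
To make this concrete, here is a minimal illustration of that model (the names are hypothetical; the snippet is only a sketch of how the pieces map to symbols):

// mylib.h – the distributed "interface part"
extern int mylib_counter;       // to be resolved to a variable in the library
int mylib_add(int a, int b);    // to be resolved to a function in the library

// app.cpp – the library user, compiled against the header only
#include "mylib.h"
int main() {
    mylib_counter = 1;          // leaves an unresolved reference to the symbol for mylib_counter
    return mylib_add(2, 3);     // leaves an unresolved reference to the symbol for mylib_add
}
// The linker later matches these references against the symbols provided
// by libmylib.a (static library) or libmylib.so (shared library).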

2. Problems before C++

As I said, this is not the best modularization system for the C language either – it just wasn't seen as a problem in the C world. There were two reasons for that: first, the things provided in header files were minor and had little influence on the overall build process; second, C programmers are – as my experience shows – used to simply living with problems, or patching around them with adhesive tape, instead of solving them.

The C language has been around since 1978, many of its problems were identified already at the beginning of the 1980s, and lots of them have still not been solved (see the Wikipedia page for C23 to see what kind of problems they solved “after all these years”, and note that there is still no function that returns true when two strings are equal). The linkage and modularization system is no different. C++ has brought resolutions for many of these problems (though its new features brought new problems of their own), and it keeps improving. I hope that making you aware of how many problems haven't been solved for such a long time – and that they still apply to C++ simply because it uses the same modularization system – can convince you how urgently a new and improved modularization system is needed for C++. Even the C language might benefit from it.

For starters, then: there exist entity types in the C language that can be used in the sources and influence the result of compiling, but cannot be provided in any way in the library files:

  1. Preprocessor macros
  2. Global constants (as compile-time constants – for example, one that defines the size of a global array)
  3. Structures
  4. Inline functions

Let's say that the preprocessor is kind of another story. A macro defined as #define ESIZE 64 and one defined as #define END } are treated the same way by the language; both are allowed and both can be placed in a header file and this way shared between the compiled source files. C++ providing global compile-time constants and inline functions in the first place gave an opportunity to take these out of the preprocessor and hence increase the chance of providing them somewhere other than header files – but this hasn't happened so far. And since inline functions were added to C as well – although I'm not sure it changed anything beyond the inlining hint, given C's linkage rules for inline functions – the same could be done there without the use of the preprocessor.

As libraries contain only variables and functions, they are never complete. That by itself wasn't a problem until shared libraries came to life. For static linkage there's no difference which parts are in the libraries and which have to be completed from information provided in the header – in the end everything lands in one place. With shared libraries things are different: entities provided in the header became the “static part”, and those in the library the “dynamic part”. The difference is that:

  1. The “static part” consists of things that the compiler consumes when compiling a single object file; whatever they provide is “encoded inside” (“hardcoded”, I would say). The header file only ensures they are all encoded the same way (although the compiler is incapable of checking whether the entities tied to particular symbols were built using the same header file – not to mention being compiled with a different environment of macros, compiler options, etc.).
  2. The “dynamic part” consists of things contained in the libraries – variables and functions – that are linked to the requiring executable file when it is executed. This happens at a completely different time than building the requiring target file, and it is not guaranteed to always access the same library file (which is also an advantage, known as “separate upgrade”).

A short example: if you have a function defined in the source file, and the header provides only its signature, it's dynamic. The same function provided in the header file with the inline modifier is static. A class definition provided in the public header of the library is completely static, and every method defined inside it is static – unless you only declare the method and define it separately in the source file, in which case it's dynamic. Do you have any language marker, any modifier, to mark these things? Of course not. The language and the compiler are not even aware of them. But they do exist and every programmer must be aware of them. That's why the “pimpl” pattern is so popular – not only because it speeds up compiling during development, but also because you can freely change whatever you want in the “implementation” class: add or delete fields, change them, add or modify methods, whatever. Only one thing you can't change: the existing methods of the “interface class”, which is the only class visible in the public header file. As the implementation class isn't provided in the public header file, it belongs to the dynamic part of the library. Of course, the pimpl pattern then causes problems with the basic feature of OO programming: derivation. But without it, classes are prone to ABI compatibility problems.
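
For illustration, a minimal sketch of the pimpl pattern under these constraints (class and file names are hypothetical):

// widget.h – the public header: only the "interface class" is visible
#include <memory>

class Widget {
public:
    Widget();
    ~Widget();
    void draw();                 // the set of existing methods is what you must not change
private:
    struct Impl;                 // defined only in the source file – the "dynamic part"
    std::unique_ptr<Impl> impl_;
};

// widget.cpp – everything here may change freely between library versions
struct Widget::Impl { int x = 0, y = 0; };    // add or remove fields at will
Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;                  // defined here, where Impl is complete
void Widget::draw() { /* uses impl_->x, impl_->y */ }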

Imagine that you have a library that provides a structure in its header file. Your application uses this header and this structure, as the library provides a function that fills in the fields of the structure, and you call this function in your application. The library is shared and it links to your application at execution time. Now you get a new version of the library and install it in your system. You run your application and it crashes. Why? Because the developer of the library added two extra fields to the structure in the new version. The function that fills it in refers to these fields, but your object is the old, smaller version of the structure from before the upgrade, so the fields this function tries to fill in refer to memory already outside of the object (note that what really happens here is a memory override, and a crash is only one of the possible outcomes).
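
A sketch of that scenario, with hypothetical names (the second definition is shown as a comment, since it lives only inside the upgraded library, not in the application's build):

// libdev.h as the application saw it when it was built (version 1):
struct DeviceInfo { int id; int flags; };          // typically 8 bytes
void dev_fill_info(DeviceInfo* info);              // implemented in the shared library

// libdev.h inside the upgraded shared library (version 2):
//   struct DeviceInfo { int id; int flags; int major; int minor; };   // now 16 bytes

// Application code, still compiled against version 1:
void report() {
    DeviceInfo info;          // the caller allocates 8 bytes
    dev_fill_info(&info);     // the new library writes 16 bytes into it – memory override
}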

How easy do you think it is to cause such a problem? Extremely easy. It doesn't happen very often only because today's developers are well aware of it. There are also tools that can detect problems like this – just as there are lint tools, code style tools, coverage tools, and so on. But there's nothing defined at the language level and nothing the compiler can warn you about.

But the fact that such problems are not trivial to avoid is only one side of the story. The other is: imagine that this structure were somehow added to the library. Its fields would be known – even if not their exact types, then at least their sizes and hence their offsets, as well as the size of the structure. And when fields are referred to, there wouldn't just be assembly instructions with an appropriate immediate-mode offset to reach a particular field; instead – just as with variables and functions – there would be a reference to an appropriately named symbol in the library. This way, when your application allocates an object of this structure, it allocates the size specified by the library and uses the field offsets specified by the library, through constant values provided in the library information. It doesn't mean that all ABI compatibility problems disappear, but at least they become easier to track. There are still rules, but they are much lighter and easier to follow: in this case all you are not allowed to do is remove fields or change the type of existing fields, but you can still add new ones or reorder existing ones. And the worst thing that can happen if you violate these rules is a runtime linkage error, because your application refers to a nonexistent field. That doesn't compare to the memory override that the same situation causes today.

3. Adjustment to C++

There are things that the standard committee doesn't give a damn about, and so they had to be solved with adhesive tape by the compiler vendors. It starts with a very simple thing: the first compiler vendor provides its own way of doing things, and then the other compiler vendors, volens nolens, adjust to the existing situation – and that's how the “standard” comes to life. Fortunately, there's no need for a fully portable standard here – things must be compatible only within a single operating system. But still, some standard must be defined, or otherwise libraries compiled by one compiler cannot be used by applications compiled with a different compiler.

In the early days – I remember that in the 1990s – there were two different compilers on Windows, one provided by Microsoft and the other by Borland (that company is a different story in itself, but not important here); there was also Watcom, and at some point MinGW joined in, each one having a different scheme for mangling C++ entities. At some point all of them likely aligned to Microsoft's definition, as without it it was not possible to distribute software dependencies; early library vendors were also providing binary packages in alternative “Microsoft” and “Borland” versions. On Linux things looked much better from the beginning because it had only one compiler available: GCC (“GNU C Compiler” initially, renamed to “GNU Compiler Collection” after support for C++ and Fortran was added), and this one has been defining the standard there. I'm not sure how it was with other compiler vendors on other POSIX systems (Sun's C++ compiler, for example), but as gcc has been available on all POSIX systems, it has de facto dictated the standard there.

What is even funnier, this standard kept changing, as changes were needed for the evolving language, making libraries compiled against the earlier standard incompatible – and so on. Things look much better now, but I have still never heard of any standards committee standardizing it. A pity, but that's our reality. The standard for this has been defined by:

  • the gcc development team, in the case of Linux (clang, for example, followed its definition)
  • Microsoft, in the case of Windows
  • Apple, in the case of Mac (I'm not sure which compiler they were using before clang – it might be that with clang they simply adjusted to the standard provided by GCC, I didn't check)

What were these C++-specific things? For example:

  1. The way to encode a function-like or global-variable C++ entity identifier as a linker symbol name (“mangling”)
  2. The way to encode the “physical” part of a class, that is, the “class characteristic object” (containing, at least in the current C++ standard, the vtable and the RTTI data).
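
As an illustration of both points, here is what the two dominant mangling schemes do with the same declaration (the namespace, function and class names are hypothetical; the forms follow the Itanium C++ ABI used by gcc/clang and, roughly, the MSVC scheme):

namespace draw { int length(const char* s); }
// gcc/clang (Itanium C++ ABI) emit the symbol:   _ZN4draw6lengthEPKc
// MSVC emits something like:                     ?length@draw@@YAHPEBD@Z
// For a polymorphic class draw::Shape, the "class characteristic object" gets its own
// symbols in the Itanium ABI: _ZTVN4draw5ShapeE (vtable) and _ZTIN4draw5ShapeE (RTTI).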

4. Linker features required

Let's start with the most general feature: the linker should support immediate constants.

I'm not a linker expert, and I think an appropriate feature may already exist, but it's not used by compilers. The idea is to have a value assigned to a symbol such that the value itself is substituted wherever the symbol is referenced, rather than the address of an entity (as is done for functions and variables, even in their read-only flavors). Such a value should also be usable and interpretable by the linker itself, if it defines, for example, the size of some other data.

What is needed here is that when an object is referred to, its address is substituted by the linker, but when you reach out to one of its fields of a primitive type (integer or pointer), an offset is used, typically with an address that the compiler encodes directly in the assembly code (“hardcoded”). With this feature, that offset, instead of being directly encoded, would also go through symbol substitution – except that this time the linker would replace it not with the address of an entity, but with the value written in the symbol definition.

A similar, though simpler, case is the size of a structure. Whenever you use the sizeof (MyStruct) expression, it should refer to an appropriate symbol in the object file whose constant contains that size. The constant is then substituted directly in every place where the symbol is referenced. It might even have an appropriate naming convention – as long as linkers support appropriate symbol types – so that it is not just a “constant”, but a constant that is the size of a structure, with matching name mangling.

This can then be extended to compile-time constants. You theoretically compile such things in, but still, there's no need to physically read them from memory if they can be provided as immediate values.

A bigger problem is that constants in the language can be used in far more ways than address-based entities. Offsets in structures, their sizes, or constants used in some evaluation are a piece of cake. Worse are cases such as a constant that defines the size of an array which is a global variable. Here the linker would have to not only support constants, but also use constants when resolving the variable: the linker must “link itself”, referring to the size of the variable, which is itself provided by a constant found at a given symbol.

An even bigger problem is with inline functions and templates. But then, very little of that is to be solved by the linker. From the linker we need only these things:

  1. Symbolic entity types, which define the convention a name uses and how it should be understood
  2. Support for immediate constants, where the linker fills in the contents of the symbol, not its address
  3. Support for global variables whose size is defined by a symbol
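
None of this exists as a linker feature today, but the idea can be approximated with what we already have: the library can export sizes and offsets as ordinary data symbols, and the user code can refer to them instead of hardcoding the layout. The sketch below (with hypothetical names) shows the spirit of it; the difference is that the proposed feature would substitute these values as immediate constants at link time instead of reading them from memory at run time:

// Inside the library (e.g. libshape.cpp), built against the real definition of MyShape:
//   extern "C" const unsigned long myshape_size   = sizeof(MyShape);
//   extern "C" const unsigned long myshape_off_id = offsetof(MyShape, id);

// A library user that never hardcodes the layout:
#include <cstdlib>
#include <cstring>

extern "C" const unsigned long myshape_size;      // resolved by the linker
extern "C" const unsigned long myshape_off_id;

void* make_shape() { return std::calloc(1, myshape_size); }

int read_id(const void* shape) {
    int id;
    std::memcpy(&id, static_cast<const char*>(shape) + myshape_off_id, sizeof id);
    return id;
}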

5. C++ entities that could use new linker features

This doesn't mean that a C++ compiler would have to make use of these new linker features in the new modularization. Whether such a linker exists depends on the system, and its presence merely determines whether particular language features – such as sharable classes or templates – can be provided. Many of them could perhaps be provided in the future. Such as:

  1. External classes. This is a class whose size and field layout are defined in the library. As such, it comes with fewer restrictions regarding ABI compatibility.
  2. Compile-time external constants. These do exist today, but with this feature the compiler could generate a constant symbol that resolves to an immediate value.

This can also be extended to functions and their signatures. A function itself is referred to in the linker by its address only, but the size and layout of the stack frame for its arguments depend on the signature. This could also be encoded with appropriate constants, so that the call remains compatible even if, for example, someone replaces int32_t with int64_t in one of the arguments. This doesn't free the library developer from all compatibility requirements, but it at least gives them more flexibility.

6. C++ interface form files

And here is the thing absolutely required for modules: a standardized interface file format that is portable at least within a single operating system. That is, it may contain some variants specific to the system, but the format should be interchangeable within a single operating system and platform – simply as portable as binary packages can be. It's not easy to define, as it would have to be a mixture of C++-language-specific things and platform-specific things.

The first and biggest problem is the preprocessor. Ideally we would get rid of it, but that's not easy. Preprocessor macros are still used in header files, and translating projects into a form that doesn't use them won't be simple. They are used for different purposes, and not all of these could be allowed in this form. You might want, for example:

  1. To reshape a constant per the application’s need.
  2. To enable or disable a library feature – the reason being that an enabled feature turns on a dependency on some external library, or that it requires some specific support from the operating system that is not always available.
  3. To turn on or reshape some development-specific thing, for which the library support isn’t needed, or is provided anyway as it’s considered cheap.

These are uses I know of from various projects, but this list definitely isn't complete; it's just to give you an idea. The list would best be completed by reviewing various projects to see how they use the preprocessor, and especially the compiler's -D option. A particularly interesting case is when a -D option is used in the compile command for the library user's code (not the library itself!) and influences how that library's header file is interpreted (see the sketch after this list). Every such case had better be eliminated or replaced somehow, for example:

  1. Any specific value should be contained in the library-specific configuration so that it can be specific to the installation.
  2. If there is a feature that can be enabled or disabled for the whole library, then this should apply to both the library itself and its interface. The final library form may also contain a manifest file that enumerates the features, with the expectation that a compatible library must have a feature list identical to that of the requiring library user, or otherwise linkage is not possible. This should also be included in the format, because the feature list should be attached to the library. Something like this is done today by providing versioning for functions in the library, but it relies on linker tricks; this time it would have to be an official feature of the linker, coordinated with the features in the interface file.
  3. This kind of configuration could be provided in the runtime configuration, but if there is any reason to hardcode it in the target, it could in the long run also use the immediate-constant entity feature of the linker. There should then be some method to transform a compiler option into a constant – likely something like a symbol that is provided with some defined value and can be replaced by a compiler option.
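
To make the problematic case concrete, here is a hedged example of a header whose interpretation depends on a -D option passed when compiling the library user's code (macro and type names are hypothetical):

// mylib.h – shipped with the binary library
struct Connection {
#ifdef MYLIB_WITH_TLS        // decided by whoever compiles the *user* code
    void* tls_context;       // extra field changes the size and all following offsets
#endif
    int socket_fd;
};
// If the library binary was built with -DMYLIB_WITH_TLS and the application without it
// (or the other way around), the two sides disagree on the layout of Connection –
// and neither the compiler nor the linker will notice.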

Anyway, the influence of preprocessor macros on the form of the interface must be completely eliminated. The definition of C++ modules should be motivating enough here – I'm not sure to what degree it is – but for a module-instrumented C++ source file the following restrictions should apply:

  • The use of any preprocessor directive is restricted to the implementation only.
  • Module interfaces can provide a preprocessor macro, but it must be syntactically complete. That is, it can resolve to a single instruction, a single declaration or a single value, but never to a fragment of one.
  • The use of #include in module-instrumented source files is allowed only in the Global Module Fragment. Symbols provided by it can only be used in the implementation. Re-exporting any part of it, including when such a symbol is part of a definition provided explicitly in the module, requires an explicit export declaration.
  • Preprocessor directives may be used for conditionally including some part of a declaration, but then this information must be added to the feature list (so that a binary package with this part included is distinct from one with it excluded, both in the interface and in the implementation).

With these restrictions in place, the module interface extracted from the source file should be free of any preprocessor dependencies – that is, the -D compiler option doesn't influence anything in the interface.
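
A minimal sketch of a module interface unit that respects these restrictions, using standard C++20 syntax (module and entity names are hypothetical):

// drawing.cppm – module interface unit
module;                      // global module fragment: the only place where #include appears
#include <cstdint>
export module drawing;       // from here on, the interface – independent of any -D options

export struct Color {
    std::uint8_t r, g, b;    // names from <cstdint> are usable here, but not re-exported
};

export int brightness(Color c);   // declared in the interface, defined in the implementation unit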

Next, every language entity defined in this interface must have its form appropriately encoded: the name, the signature, encoded built-in types, encoded references to library-defined or user-defined types. This is something that must be agreed upon by all compiler vendors, because it is the kind of intermediate form that the compiler produces after it parses and interprets the contents of a header file. Every compiler does this, and each one generates some internal database from it. The point here is to have an interchangeable format for this database, so that it can be loaded quickly by the compiler without the need to preprocess and parse the header files with all their indirect includes.

I know that the compiler doesn't actually distinguish between a header file and a source file, but there is already the known practice of so-called “precompiled headers”, which are a “half-compiled” form of a header. This is a good place to start, but it can't be the final solution – among other reasons because, for example, even on Linux, gcc and clang use different formats, different methods of creating them, and different filename extensions, since precompiled headers were never intended for anything beyond the current build process. This time we need something that is produced identically by every compiler on the platform, so that it is interchangeable and can be distributed together with libraries instead of header files.

For example, such things as structures, classes (given that we don't provide method bodies), or even class templates should look the same everywhere, regardless of the platform and compiler vendor – these are the language-specific entities. On the other hand, the form should be such that every compiler can quickly turn it into its internal database, ready to build on when compiling the source code.

A slightly different story is inline functions. Inline functions could be provided with an out-of-line version (I think in the new modularization system this should be required; today it's optional), but the inline version must be completely contained in the interface (which is why inline functions should not be allowed to be recursive, nor should multiple inline functions be mutually recursive). For inline functions it must be anticipated that at the place of their call they will be “dissolved” into the calling environment. Therefore it's unlikely that they can be compiled into target assembly code up front. Perhaps some fragments can, as long as they can be framed independently, but normally the compiler should optimize the code after the inline function has been dissolved (for example, there's usually no argument passing on the stack, argument values are taken directly from the code where they were evaluated, the returned value is written directly to the variable it is assigned to, etc.). That's why some smart binary encoding would have to be created into which the function body can be translated, with the compiler still translating it into its assembly code. This binary code need not be portable in general, only within a particular operating system – so it can be specific to the target system and machine platform without any problem. But it is still unlikely to simply be the same assembly code that the compiler would produce for normal functions – not only because of the simplifications specific to the place of use (the alleged call), but also because the same would likely have to be done for function templates, whose details after compilation into normal code depend on the type parameters.
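
To show what “dissolving into the calling environment” means, a trivial example (the function is hypothetical; the comment describes what compilers typically do, not a guaranteed transformation):

inline int clamp0(int v) { return v < 0 ? 0 : v; }

int use(int x) {
    return clamp0(x + 1);    // typically compiled as if written:
                             //   int t = x + 1; return t < 0 ? 0 : t;
                             // no call, no stack frame, no argument passing on the stack
}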

7. Binding together

Since I have already mentioned binary code for inline functions, you might ask: would it be possible to have a smart compiler that translates it into assembly code on the fly, called by the linker? I'm not sure how complicated this is, but if it's possible, you could simply have a C++-specific dynamic linker, which in the long run could free us from all the problems known as “ABI compatibility” – or at least most of them, while shifting the remaining ones into the category of “linker errors” (which is a large improvement over undefined behavior).

If you have a function template, for example, its final form needs to be a function containing assembly code to execute, but some specifics need to be resolved depending on the sizes and other properties of the types used as template parameters. Templates, however – in contrast to preprocessor macros – have no syntax-level dependencies; all their properties are at the C++ level and can therefore be encoded this way. If you have a specific inline function to be expanded, you use the general method, as for all functions. If it's a type size, or a specific built-in operation, this should be defined in the binary code standard for inline functions. All this must be simple enough that it can be done quickly by a subroutine called from the dynamic linker.

But that's not all. What we currently have is that libraries are opened when an application needs them, loaded into memory (text only), and then there are per-application instances. The latter must be created anyway, but as for the “text” part, we don't really need the whole library in memory – only the text entities that are in use.

This, however, means that the library would no longer be a separate namespace, as it is today. The current approach potentially allows a symbol of the same name to be provided by different libraries, while the potential conflict is neither properly prevented nor easy to detect. This time you might want to require a particular module contained in a library package, require its specific version, and possibly a variant (as a set of features, mentioned above). Those should be able to coexist, as long as the system can identify the symbols provided by a particular variant of the library. This way, you may have a specific symbol provided by a specific library package, and it is known up front which library package provides this symbol (because it is encoded in the interface files that the target requested), and that symbol is loaded into memory if not yet present.

Once we have this, the system can simply use single C++ “module form files” that provide both interface and implementation in one file, while the unit of caching in the system is still the single linker entity required by the application. As long as dependency tracking between entities is possible, the system needs to load only those entities that are in use. The system also records when an entity was last loaded, and when the next application requires access to the same entity, it will reuse the already cached entity, or load it anew if the library file is newer. Either way, what is loaded and cached in memory are single entities, not whole libraries.

This way we don't even need library files anymore – that is, the *.a and *.so files. Every module is simply compiled into a module form file, which consists of the interface and the implementation provided as a single file. When compiling a library user, a request for a given module is resolved by opening the module form file in order to reach its interface, and exactly the same module file is then used for linkage – both static and shared.

8. Considerations

Not that I have made any feasibility study of this. It is only a pure concept, which could also be implemented partially. The most important thing for modules, as I mentioned in the previous article, is that the transition to modules can be done smoothly, according to the procedure I have shown. This means that there must initially exist a method to provide “pure” module interface files alongside traditional library files, as well as single module form files containing the interface and implementation in one, and just as well a way to generate a header file out of the binary interface file (yes, including “decoding” the inline functions).

I admit, I have just sketched the concept and haven't even checked whether all these things are possible, but I believe they just require a bit of work and general development – which includes giving up a bit of market competition – and most of the work would have to be done by the compiler vendors. But I do believe it's worth it.


Format string considered not exactly that harmless

Introduction

I have written this article after getting unexpectedly involved with the formatting feature added to C++20, earlier available as the {fmt} library, which has looked from the very beginning like an unfortunate attempt to revive C's printf function in a new light. Note that as a software developer I am interested mainly in how a particular solution improves productivity, and I evaluate solutions exclusively in that regard.

Software development consists of three types of activities:

  1. Writing the code the first time
  2. Reading the code someone else has written
  3. Fixing the code someone else has written

Having something that is “shorter” to write helps only with the efficiency of the first one, which happens exactly once in the lifetime of the software, while the other two happen and repeat many times. That's simple mathematics, and anything else is lying to yourself. There is also a well-known term in software development for solutions that help with the first activity but make life hard in the others: “write-only software”.

Productivity improves, then, even if you have a solution that seems harder for the first activity, but clears your way forever for the next two. An example is the choice between longer and shorter names for local variables, especially those that live for the whole length of a long function. If they are short, they are easy to write, but it's hell for whoever will be fixing the code. Conversely, if they are long, it's tougher for the writer (although good dev tools help to expand them), but the code is clearer for those who will be fixing it.

I decided to make this introduction first because every solution based on formatting specifiers in a string is always explained by its enthusiasts with “it is short” and therefore “easy to write”. So yes, this refers exactly to activity 1 – without mentioning the truth about the rest.

Some historical background

How exactly did it happen that the C language got that printf function?

Because someone wanted this to be a function, not a language built-in feature (which is a good idea in general – Pascal did it as a language feature and it resulted in an inconsistent and only marginally useful facility), and once that was decided, they could only do it within the limits of what the language allowed. Wherever those limits bite, some workaround must be found. That's more or less it.

Because, what exactly did the other languages have at that time?

Let's make a short example: we have data for a rectangle: left, top, right, bottom. We have variables with exactly these names that carry the data. Now you want to display the origin point as cartesian coordinates (so x=left, y=bottom) and the sizes (width=right-left, height=top-bottom), and it should look like: origin=x,y dimensions=width x height. We also have a color in RGB format and we display it as color=#RRGGBB (hex). In Fortran this could look like:

write(*, *) 'origin=', left, ',', bottom, ' dimensions=', (right-left), ' x ', (top-bottom)
write(*, '(Z6)') 'color=#', color

As VisualBasic is a bit too evolved – even though it still has many of the original concepts – it got too complicated for a historical explanation, so let's use the Commodore 64 BASIC version (the one on the Atari XE was similar). Note here that the semicolon differed from the comma in that the comma added a space between the printed parameters, while the semicolon glued them together, and a trailing semicolon also prevented the EOL from being added.

PRINT "origin="; left; ","; bottom; " dimensions="; (right-left); " x "; (top-bottom)
PRINT "color=#"; HEX$(color)

And the same in Pascal:

Writeln('origin=', left, ',', bottom, ' dimensions=', (right-left), ' x ', (top-bottom));
Writeln('color=#', HexStr(color, 6));

(Let’s not forget also that formatting a floating-point value in Pascal is done by using value:width:precision expression inside the Writeln arguments).

OK, but now let's turn to something more modern: scripting languages. All of them have a feature called an “interpolated string”, that is, you specify everything you need inside the string itself. So let's try it in Tcl:

puts "origin=$left,$bottom dimensions=[expr {$right-$left}] x [expr {$top-$bottom}]"
puts "color=#[format %06X $color]"

Of course, not every language has interpolated strings, so let me also show you how it would look without them. The puts instruction in Tcl takes only one string to print (Tcl is famous for the rule that “everything is a string”, so that's not a big deal for it anyway), but there's also a concat command to glue values together, so this can be used instead. So, here is how it would look if there were no string interpolation, and let's also imagine that concat is the only way to render a value as a string:

puts [concat "origin=" $left "," $bottom " dimensions=" [expr {$right-$left}] " x " [expr {$top-$bottom}]]
puts [concat "color=#" [format %06X $color]]

Of course, let's not forget that the format command in Tcl was modeled after C's printf function, but here we are using just a simple format specifier for a single value. This isn't the only way to do it in Tcl because we have both possibilities – either this way, or the same way as with C's printf, by simply using puts [format "%d %d %d" $a $b $c], which looks similar to printf("%d %d %d\n", a, b, c);.

OK, I may also mention Python, for which all the variations are available, but you can still simply do:

print("origin=", left, ",", bottom, " dimensions=", (right-left), " x ", (top-bottom), sep="")
print("color=#", format(color, '06X'), sep="")

Not that this is the only way available – printf-like string formatting and interpolated strings (f-strings in Python) are available as well.

So, what do all these things have in common?

The way to format the printable string is to glue all the values together, while the format configuration details are specified directly at the value being formatted.

Suffice it to say that when the new file stream system was being developed for C++, the first thing done was to return to the good old method of formatting that predates the C language, given that C++ already had the capabilities to provide it in “userspace” (not as a language built-in feature):

cout << "origin=" << left << "," << bottom << " dimensions=" << (right-left) << " x " << (top-bottom) << endl;
cout << "color=#" << hex << setfill('0') << setw(6) << color << endl;

Did people like it?

Well, some did, but not all. This C++ version attempted to be free of the format string, but one annoying trait of it is that the formatting flags modify the stream state. This is not only annoying for the programmer, but also causes performance problems.
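
For the record, this is what the stream-state problem looks like in practice (variable names are made up for the example):

#include <iomanip>
#include <iostream>

int main() {
    int color = 0x00FF7F, count = 42;
    std::cout << "color=#" << std::hex << std::setw(6) << std::setfill('0') << color << "\n";
    std::cout << "count=" << count << "\n";     // prints "count=2a": hex and the fill character
                                                // are still set on the stream (only setw resets)
    std::cout << std::dec << std::setfill(' '); // the caller has to clean up manually
}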

In the meantime, many derivative works have been developed in other languages based on the original idea of the format string – slightly improved, without the most important problems of C's printf version, but still a format string. I personally find that obnoxious.

Sinking in C

Now, why did C introduce this kind of formatting? There's one and only one reason: the language's limitations. Of course, it could just as well have been added as a language feature, as had been done in many other languages before, but there were good reasons not to do it that way – even if it meant resorting to the concept-mixing, weird style of a format string.

The following limitations of the C language have influenced this solution:

  • Inability to handle dynamic strings. This actually persists to this day, and no one seems willing to solve it. It means you can't simply have a format function that takes a string and some kind of value, because someone would have to manage the lifecycle of the dynamic string produced this way
  • Inability to describe the types of the various values passed to a function. When passed, every type representable as a value is passed in some standard way – usually aligned to 16 or 32 bits, with 64-bit integers and doubles as two 32-bit slices, etc. – and the structure of the passed arguments must be known up front by the function that extracts them.

In effect, this would already have been hard to implement (the call below actually mixes static strings and something that could only be a dynamically allocated string, whose lifetime must be controlled), and it already looks clumsy:

print("Height: ", format(d, "0.4f"), " index: ", format(i, "02d"), "\n", NULL);

so it was decided it would be easier to do it this way:

printf("Height: %0.4f index: %02d\n", d, i);

Variadic functions in C are something that was carried over from “K&R” C to ANSI C after the latter introduced function signatures. Signatures introduced a “fixed convention”: which values of which types are passed is exactly what the function declares, and type conversions are done if needed. “K&R” C had only a “forced convention”: whatever parameters were passed in a call were accepted as-is, and the stack frame was constructed based on the actual types of the passed values. In a variadic function the convention is fixed, but only up to the last explicit parameter; all the remaining parameters use the “forced convention”.

The “forced convention” is the only way to pass parameters of varying kinds to the same function. The problem is, the called function just gets a stack frame with parameters, but knows nothing about its size, the sizes of the individual parameters, or how to interpret them. Therefore there are exactly three known conventions for using variadic functions in C:

  1. Everything you pass to the function must be a pointer to the same kind of type, and you always pass NULL as the last parameter. This is the model of the execl* family of functions. A similar method is often used for declaring arrays, so that you can then pass the array as a single pointer without a size. A subvariant of this method is to provide values with tags: the last fixed argument is the first tag value, after which a specific value is expected, and then either the next such sequence or a termination tag. Theoretically this could even be given syntactic sugar that avoids the termination value by using a preprocessor macro, but that requires variadic macros, which appeared only with C99 and C++11.
  2. The last fixed argument provides complete information about every argument passed after it. Their number and characteristics (the size to grab and the way to interpret it) must match exactly the types of the values being passed. This is the model of all “formatting” functions, be they of the printf or the strftime kind.
  3. The function is variadic, but it actually expects exactly one parameter after the last fixed argument. The fixed arguments determine what type of value should have been passed.

What's interesting is that a string in quotes is the only sensible way to provide the type information for case 2, because… it is the only kind of array that automatically gets a single 0 value appended at its end, that is, the NUL character. The NUL-terminated string itself is another artifact of the C language's limitations, also dating back to “K&R” times; it exists only because it was the only way to pass a string as a pointer to an array without also passing its size explicitly (for arrays of integers, a 0 value is as good as any other). There was no other merit in creating it, because a NUL-terminated string is more error-prone and less performant than the combination of a pointer to an array and an explicit size (for example, unlike memcpy, the strcpy function can't make use of the processor's memory-copying acceleration; it must simply copy byte by byte, which is especially a problem on 4-byte-aligned machines).

As a slight digression, it is worth noting that many languages much younger than C – scripting languages, admittedly – have repeated the same mistake by not requiring functions to have fixed signatures: Perl and JavaScript, for instance. It seems less of a problem because you can't get a crash or memory override in such languages, but the increased opportunity for programmer mistakes is still there. Somehow other languages, like Tcl or Python, managed to have fixed function signatures.

So, printf uses approach 2: the format string contains tags marked with the % character, and for every such entry an appropriate number of parameters is consumed from the call, usually one (there are special cases where arguments are ignored or two arguments are required, but that's a technical detail). Based on this, the user just has to take care that every percent-sign entry corresponds to the appropriate argument in the call, and that the type specifier in the entry corresponds to the type of that argument, so that the function grabs the appropriate number of bytes from the call frame and interprets them the correct way.
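
A minimal sketch of what that looks like from the callee's side – a toy printf-like function (hypothetical, handling only %d) that has nothing but the format string to go by:

#include <cstdarg>
#include <cstdio>

void tiny_print(const char* fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    for (const char* p = fmt; *p; ++p) {
        if (*p == '%' && *(p + 1) == 'd') {
            ++p;
            std::printf("%d", va_arg(ap, int));   // grabs sizeof(int) bytes and trusts it was an int
        } else {
            std::putchar(*p);
        }
    }
    va_end(ap);
}

// tiny_print("%d %d\n", 1, 2);   // fine
// tiny_print("%d %d\n", 1);      // undefined behavior: va_arg reads whatever is on the call frame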

Just after first writing this article, it came to my mind that they could actually have used a different approach: convention 1, while requiring that the last fixed argument be a string declaring the first format. The formatting tag would always be at the end, and if it is not present, there are no further arguments. It would look like this:

print("Height: %0.4f", d, " index: %02d", i, "\n");

I think that could also have been an acceptable solution, but likely no one thought it a good idea, and of course with an unchecked method of passing parameters it could be even more error-prone. It would definitely suffer from the same problem as printf: you can't print a string with printf(get_str()); because you risk having % characters in it, so you should use printf("%s", get_str()); (and passing an explicit literal requires % to be written doubled). Here it would be even worse: you would need print("%s", get_str(), "");. Forgetting the last empty string would cost you a crash, because only a final untagged string terminates the argument list. Alternatively the list could be terminated by NULL, but then we are talking not about something as rarely used as the execl function, but about one of the most frequently used functions of all.

And here is the whole explanation. And no, it wasn't any kind of “new approach”, or “comfortable enough”, or “better readability”, or whatever other bullshit you can invent here. The true reason for inventing such a sorry solution was just one: to meet the limitations of the C language halfway.

Note that it also mixes up two completely unrelated solutions: the format specification of a single value and “string stencil filling”, that is, replacing tags in a tagged string with specified values (I'll elaborate on that later). The latter is only there because it's more comfortable than any elaborate value formatting with default settings. In C++ you can write cout << "Height: " << d << " index: " << i << endl; and happily use the default format, while if C had resorted to a format function as shown above, you couldn't just pass d to it; you would have to preformat it by calling format(d, "e"). Even with some default-settings support and some ability to recognize the type (C11 added such a feature), the best you could imagine is calling the same function with an empty format argument.

Therefore the people who developed this solution are excused. They worked with what they had at the time, and – what is extremely important to take into account – in the 1970s and a bit later, high-level compiled languages were a novelty (Fortran was merely a wrapper over assembly language, while anything truly high-level, like Smalltalk, was interpreted, at best using a virtual machine). Likewise, the number of software solutions and their level of complexity were negligible – at least in comparison to what we have today. There was no experience – including bad experience – with particular solutions, and people simply tried to use what was available and had to get used to solutions that had no better alternative. Also excused, therefore, are the people who had to work with this for such a long time and found it good enough, even though better alternatives already exist.

But preferring this solution over language-provided flags and gluing values together in the right order reminds me of people from some deep provinces in some forgotten lands who still eat meals made of dog meat (I didn't try it myself, but my grandpa, who survived WW2, told me that it stinks), even though they could just as well eat pork or beef. The same thing: people who did it in the far past did it because they had no other choice, except starving. But even they would never prefer it over pork, beef, or lamb!

But then, am I not simply trying to convince you about my personal preferences?

No. String formatting is objectively bad – and here I’m going to explain, why.

Short introduction to Esoteric Languages

So, you probably heard about esoteric programming languages – like Intercal, Befunge, Brainfuck, Unlambda, Smetana or Malbolge. Ah, I almost forgot – there is one more, which is even today still used to create software: Perl.

There are also esoteric algorithms. For example, there are esoteric sort algorithms, such as dropsort. It beats all other algorithms because it has linear complexity. How does it work? Well, it copies the elements to the output, skipping those that are not in order. Who said that all elements must be preserved?

The general idea of esoteric languages is to implement language features that make programming harder – the absolute opposite of readable, error-avoiding, comfortable and useful. Intercal is a good example because it accumulates not just one, but multiple stupid ideas, among which there are some interesting ones:

  • The PLEASE instruction was required for a certain share of the instructions, otherwise the program was ill-formed as too impolite – but not for too many, because then it might be ill-formed as too fawning.
  • Instead of parentheses, Intercal uses single and double quotes. For example, (x-y*(a-(f-e))) in Intercal would be written as 'x-y*"a-'f-e'"'.
  • And a later addition, COME FROM, which is an inverted version of GO TO (note that in older languages lines had explicit numbers, as the only way to edit the source was to enter and record a single line at a time – and GO TO referred to that line number). It is placed at the jump destination and specifies the line from which the jump should be made: when that line finishes executing, the next statement is not executed; instead the jump is made, because that line is mentioned at the destination with COME FROM.

A specific category of esoteric languages are parodies of the Turing Machine: Brainfuck, Doublefuck, or Befunge. Brainfuck contains only simple instructions for moving along a tape and increasing or decreasing the integers on it. Befunge is, in effect, a two-dimensional version of the same idea, where the execution cursor moves through a matrix of instructions: you have additional instructions to change direction between horizontal and vertical, and whatever instructions the cursor meets on the matrix are executed. In order to make a loop you simply have to arrange your instructions so that the execution cursor runs along a rectangular path. That's the method of relying on the text layout of the source code driven to absurdity.

Why do such things exist? Why do people work so hard to create language specifications and implement compilers for them, which from a business point of view seems completely wasteful? Because they serve as a warning and a physical proof that some ideas are just stupid and counter-productive – and without an implementation you don't have the proof. Beyond playing around with ideas and intellectual amusement, they do have a value, which is to openly declare (by sneering) which language features are bad because they contradict readability and usefulness, are as error-prone as possible, and simply make programming harder.

Some didn’t get the joke

Not that they failed to inspire – unfortunately – the creators of other languages. I have already mentioned Perl as one of the languages that often prefers crazy, nice-looking statements over consistent rules and usefulness, especially for people who also use other languages. For example, you can do this in Perl:

if (is_higher($n))
{
    slip($x);
}

but this won’t work:

if (is_higher($n))
    slip($x);

Not that this is impossible without braces. You simply have to specify it this way:

    slip($x)
if (is_higher($n));

I wouldn’t even guess which function call is executed first.

It also has things like unless, which can be used instead of if with not, a similar until as a complement to while, and many alternative ways to do the very same thing – all of them equally useful, and all of which you have to know if you try to interpret someone else's code.

It even once inspired me to create another esoteric language (although I didn't have enough time and desire to make something of it and I abandoned the project) – I called it Legal – in which every instruction was a single sentence starting with an uppercase letter and ending with a dot. Instructions were grouped into paragraphs, with single instructions, especially conditional ones, as sections. Because if you can write an instruction slip($x) if (is_higher($n)), then you can go even further and say something like Is the value greater than 5? If not, execute the statement mentioned in section 5., can't you?

There are many bad language decisions that were later repeated in other languages as well. For example, someone didn't get the joke of Befunge – or even more, of Whitespace (in which the only characters you can use in the source code are the space, the tabulator and the end of line, and which otherwise builds on the same idea as Brainfuck) – when creating the make tool, whose configuration script, the Makefile, has to distinguish between tabulator and space: depending on which one is the very first character of a line, what follows is interpreted either as a script statement or as a shell command to execute. In my early programming days I sometimes used Dos Navigator (a clone of it still exists in some incarnation) to edit files, and I once spent a long time hunting an error after it had replaced all tabs with spaces in the makefile.

That was condemned a long time ago (likely that's why some are trying to switch to ninja; my agmake is also free of that problem), but it still didn't stop the creators of Python from making the language's correctness rely on source code formatting. Some people even praise this as something that “makes you keep the program source decently formatted” (playing dumb, of course, because the problem is not that you are forced to keep the program formatted, but that the meaning and syntactic correctness of the program depend on it).

That's also not the only stupid thing adopted in Python. There was a well-known problem with multiple inheritance in C++ when a conflict develops: you have two identical method signatures in two different (and unrelated) base classes of a common derived class. The most sensible solution (although not so simple to introduce into the language without a deprecation period) would be to disable this possibility and keep such a method visible only through its own base class (so that it doesn't automatically get exposed in the derived interface) – that could also be a desired solution for C++. But no, Python not only allows this, it even lets you specify in the class definition which parts of which base class should be inherited and how. Only to then recommend in the official documentation that multiple inheritance can lead to lots of problems and should absolutely be avoided. Congratulations.

All of the above are proofs that even if some solution has already been condemned as counter-productive in software development, these traits still have fans who can't live without them. When I looked for a tutorial on how to print values in Python (which also covered the methods I showed above), the author of that tutorial said that the “format method of string” is actually the “best solution” (???). Yes, that's exactly what he said. Doing print("a={} b={}".format(a, b)) is, according to him, better than print(f"a={a} b={b}"). I wouldn't have believed it if I hadn't seen it with my own eyes. And it was so badly inspiring that someone even created a library called {fmt}, which does the same thing for C++, and that has since been adopted into the C++20 standard and is already available in most compilers.

I wouldn’t even mind that people are so fixated on this string formatting in their own software (for open source projects I don’t really care that someone decided to use a counter-productive technique), except for the fact that in C++20 and {fmt} the format configuration structure isn’t even a part of the public API, so there’s no way to extend it, nor to reuse the good old C++ iostream manipulators to configure the output. Apparently someone really doesn’t understand why this kind of string formatting is counter-productive – especially since C++ doesn’t suffer from the C language limitations and there’s no reason to follow these solutions.

Those who know the history and still chose to repeat it

Many people even misunderstand what the real reason is that printf is so counter-productive.

No, it’s not the matter of the explicit type specifier, which can be solved by automatic type recognition. In today’s C it’s a minor problem, since compilers check and report wrong specifications, also for other functions following this scheme. The biggest problems fall under a factor generally called “productivity”. Let’s try this example, written already in the {fmt} style without type specifiers, so that this minor problem is taken care of:

print_to(output, "KXDrawTool: configured: color=#{:02X}{:02X}{:02X} thickness={} depth={}\n",
                    settings.color.r, settings.color.g, settings.color.b,
                    settings.thickness, settings.depth);

Let’s say it’s not a problem if the format string contains up to 4 tags, well dispersed throughout the format string (good for you if you have an IDE that helps you here with highlighting). But the more parameters you have, the farther they get from the place where you can find their format specifier. That’s exactly why I mentioned the COME FROM from Intercal.

A similar problem exists in general in C and C++ (for the latter being even more of a burden): functions and methods usually need to be declared twice – once in the header and once in the implementation – and they have to be kept in sync manually, often not even copied 1-to-1 (due to namespace shortcuts, default parameters etc.). Modules in C++ were designed to help here, but I can’t see them being quickly adopted (which is another story I described elsewhere).

So, take a look at this and tell me: what exactly should you do to identify which of the function call parameters is placed where in the formatted line? There’s no way to do it other than counting the format tags one by one and then applying this count to the variadic arguments. And why did I mention the method declarations in C++? Because here you also have two separate things to be kept in sync: the format entry in the string and the value. Want to add one in the middle? You need to find which value precedes the place you introduced it and then find – by counting – the exact same place in the variadic arguments. The compiler will check the type specification, but it won’t check whether you mistook x for y, both being int.

Of course, in order to keep things a little less frustrating you can also indent every argument so that it starts in the position of the { character corresponding with this argument. And here is the real deal:

    print_to(output, "KXDrawTool: configured: "
            "color=#{:02X}{:02X}{:02X} thickness={} depth={}\n",
                    settings.color.r,         // |        |
                          settings.color.g,    //|        |
                                settings.color.b,  //     |
                                                 settings.thickness,
                                                          settings.picture_depth);

And now you can see how this thing could be improved: follow the Chinese, mate!

The expressions for the values to be printed should be written top to bottom instead of left to right!

Otherwise, as you can see, even this doesn’t quite suffice, because the more the argument list grows, the more lines separate the format tag and its argument. Ok, some column highlighting could help, but this is annoying for some (for me it is), and it’s a poor way to half-solve a language problem with text-formatting support. It can even grow into a real horror from some “C-based performance programmer”, who once placed something like this in the code:

sprintf(output,"%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X,%02X", arr[0], arr[1], arr[2], arr[3], arr[4], arr[5], arr[6], arr[7], arr[8], arr[9], arr[10], arr[11], arr[12], arr[13], arr[14], arr[15], arr[16], arr[17], arr[18], arr[19], arr[20], arr[21], arr[22], arr[23], arr[24], arr[25], arr[26], arr[27], arr[28], arr[29], arr[30], arr[31], arr[32], arr[33], arr[34], arr[35], arr[36], arr[37], arr[38], arr[39]);

My boss, who saw this at that time, just asked “and this is certainly in order, right?”.

Yeah, I can already hear the screaming eagles lecturing me that “it is cleaner if I have a format string with just tagged locations for the values”!

First, it’s not “cleaner” – at best it is equally clean for simple examples using the default format, like this:

print(output, "Rectangle : ", left, " ", top, " ", right, " ", bottom, "\n");

Making a pre-formatted single line doesn’t make it any cleaner, while it already introduces the eye-jumping problem (though still a minor one in this case):

print(output, "Rectangle: {} {} {} {}\n", left, top, right, bottom);

Of course, the cleanest would be an interpolated string, but that’s not simple to add to the C++ syntax (the expressions in the interpolations must be parsed by the compiler anyway, so such a feature would have to be added at the language level):

print(output, fs_"Rectangle: {left} {top} {right} {bottom}\n");

Second, if I wanted to have a “string template” to be filled with values replacing tags, then this is the example of the right solution:

print(output, "Rectangle %left %top %right %bottom\n",
               "top", top,
               "bottom", bottom,
               "left", left,
               "right", right);

This is a feature completely separate from formatting, and there’s no reason to mix the two in a single format string (this was done in C only because it was easiest to implement, not because it was in any way useful). Moreover, if you need formatting, it should in this case be specified near the expression to be printed, not near the tag in the string:

print(output, "Rectangle %top %left %right %bottom\n",
               "top", setw(6), top,
               "bottom", setw(6), bottom,
               "left", setw(6), left,
               "right", setw(6), right);

I’ll expand on this idea below.

What the programmers actually need

You know what this operator<< for std::ostream actually is? It’s only a workaround for the lack of variadic templates, which were only added in C++11. Had those existed earlier, it would have been very simple to define the “print” function known from BASIC, Python, and many other languages:

#include <utility>   // std::forward

template <class Stream>
inline Stream& print_to(Stream& sout) { return sout; }

template <class Stream, class Arg1, class... Args>
inline Stream& print_to(Stream& sout, Arg1&& arg1, Args&&... args)
{
    // In today's C++ this is ostream::operator<<(), which has no named alias like write_fmt()
    sout << std::forward<Arg1>(arg1);
    return print_to(sout, std::forward<Args>(args)...);
}

Having this function, you can very easily write the above this way:

    print_to(output, "KXDrawTool: configured: color=#", setw(2), setfill('0'), hex, uppercase,
             settings.color.r, settings.color.g, settings.color.b, dec, setw(0),
			 " thickness=", settings.thickness,
             " depth=", settings.picture_depth, "\n");

Although I personally don’t like runtime state changes and it would be much better to do it this way:

    print_to(output, "KXDrawTool: configured: color=#",
             fmt(settings.color.r, setw(2), setfill('0'), hex, uppercase),
             fmt(settings.color.g, setw(2), setfill('0'), hex, uppercase),
             fmt(settings.color.b, setw(2), setfill('0'), hex, uppercase),
			 " thickness=", settings.thickness,
             " depth=", settings.picture_depth, "\n");

You can find here how to define the above fmt. Note though that it has poor performance, of course, especially in comparison to the C++20 format library (and the same {fmt} library available for earlier standards), but it’s just an idea that can be improved further. The poor performance comes mainly from iostream, especially from saving and restoring the format state, which this function does.
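
Since that link may not be at hand, here is a minimal sketch of one possible way to define such a fmt helper (only an illustration, not necessarily the implementation linked above): it stores the value and the manipulators, and its operator<< saves the stream state, applies the manipulators, prints the value and restores the state – which is exactly where the poor performance comes from.

#include <ios>
#include <ostream>
#include <tuple>
#include <utility>

// Holds a value together with the manipulators to apply when printing it.
template <class Val, class... Manips>
struct fmt_pack
{
    const Val& value;
    std::tuple<Manips...> manips;
};

template <class Val, class... Manips>
fmt_pack<Val, Manips...> fmt(const Val& value, Manips... manips)
{
    return { value, std::make_tuple(manips...) };
}

template <class Val, class... Manips>
std::ostream& operator<<(std::ostream& out, const fmt_pack<Val, Manips...>& f)
{
    std::ios saved(nullptr);
    saved.copyfmt(out);                 // save the current formatting state
    std::apply([&out](const auto&... m) { (out << ... << m); }, f.manips);
    out << f.value;
    out.copyfmt(saved);                 // restore it, so the manipulators don't leak out
    return out;
}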

Note that if you find the formatting specifiers better when they are given as a format string, you can still do it like this (not defined in the above fmt, but it’s a possible feature) – that’s very much like the format() function in Python:

    print_to(output, "KXDrawTool: configured: color=#",
             fmt(settings.color.r, "02X"),
             fmt(settings.color.g, "02X"),
             fmt(settings.color.b, "02X"),
			 " thickness=", settings.thickness,
             " depth=", settings.picture_depth, "\n");

Now you can see that:

  • every expression is visible exactly at the place where its value is printed
  • every expression has exactly one instance of itself in this instruction
  • the format specifier is attached directly to the expression it touches upon

But then, if you think that it would be “more readable” if you use format string tags, it could be still done this way:

    print_to(output, "KXDrawTool: configured: color=#%RR%GG%BB thickness=%thicks depth=%depth\n",
             arg("RR", fmt(settings.color.r, "02X")),
             arg("GG", fmt(settings.color.g, "02X")),
             arg("BB", fmt(settings.color.b, "02X")),
             arg("thicks", settings.thickness),
             arg("depth", settings.picture_depth));

Still, the main differences to the formatting library are that:

  • The tag-replacement feature is separate from format specification – it’s just simple tag replacement, nothing more
  • Formatting specifiers are still in use, but they are tied to the value being printed and hence become part of the tag’s argument declaration, not of the format string. You can still use markers in the tag names that suggest how the values are intended to be formatted, but this is only information for the reviewer, not a specification for the language.

You may even resort to a crazy idea like this:

    print_to(output, "KXDrawTool: configured: color=#",
             settings.color.r %fmt("02X"),
             settings.color.g %fmt("02X"),
             settings.color.b %fmt("02X"),
			 " thickness=", settings.thickness,
             " depth=", settings.picture_depth, "\n");

What, then, do I think would be the ideal solution?

My personal preference is the interpolated string. The above solution is not the same, but close enough. What exact problems would I have to solve with format specifiers expressed as language primitives (instead of a specifier string)? Mainly, I would like them to be shorter to write. But this can be solved by having something like this before the printing call:

    auto H02 = make_fmt(setw(2), setfill('0'), hex, uppercase);
    print_to(output, "KXDrawTool: configured: color=#",
             H02(settings.color.r), H02(settings.color.g), H02(settings.color.b),
             " thickness=", settings.thickness,
             " depth=", settings.picture_depth, "\n");

Or, if that is more comfortable:

    auto H02 = make_fmt(setw(2), setfill('0'), hex, uppercase);
    print_to(output, "KXDrawTool: configured: color=#",
             settings.color.r %H02, settings.color.g %H02, settings.color.b %H02,
             " thickness=", settings.thickness,
             " depth=", settings.picture_depth, "\n");

This additionally allows you to have a predefined format for a particular kind of data across the whole application, which can be changed in one central place when needed. Those are about the only problems I can imagine having to solve with this kind of formatting.
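
For completeness, make_fmt could be just a thin wrapper over the fmt helper sketched earlier (again, only an illustration of the idea):

// Captures a set of manipulators once and returns a callable that binds them to a value.
template <class... Manips>
auto make_fmt(Manips... manips)
{
    return [=](const auto& value) { return fmt(value, manips...); };
}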

Yes, I can already hear people screaming that, obviously, if you have two values of type double and you want to print one zero-filled to 4 digits with precision 8, and the other with precision 6 only, with this system you have to do:

print_to(output, "value: ", fmt(value, setfill('0'), setw(4), setprecision(8)),
         " confidence: ", fmt(confidence, setprecision(6)), endl);

or at best with string-specified formatters:

print_to(output, "value: ", fmt(value, "04.8f"),
         " confidence: ", fmt(confidence, ".8"), endl);

which would be much shorter to write even with printf:

fprintf(output, "value: %04.8f confidence: %0.6f\n", value, confidence);

and the same with {fmt} or std::format:

print("value: {:04.8} confidence: {:0.6}\n", value, confidence);

I can admit one thing – this is shorter. I cannot agree, though, that it is in any way better for productivity, as I explained in the beginning.

On the other hand, how often did you actually use explicit format parameters, other than when they are simply always required (as in sprintf), or when you needed some library- or application-wide consistent formatting, for which you could just as well use a preconfigured format? I can imagine software that prints lots of formatted floating-point values and strictly aligned hexadecimal values with exactly 8 digits. But even then, isn’t it simpler to create a small function, say “HEX8”, which formats the given value this way (and whose calls the compiler will check), than to write "%08X" every single time? Not to mention a case that once happened to me: I had an integer value of a certain logical type that always had to be formatted the same way, I decided to format it as a 4-digit hexadecimal value, and then the project manager told me that it was actually required to be printed as a 4-digit decimal…
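
Such a helper is trivial to write; a possible sketch (using plain iostream, just to show the idea):

#include <iomanip>
#include <sstream>
#include <string>

// Formats a value as an 8-digit, zero-filled, uppercase hexadecimal string.
// The call itself is checked by the compiler, unlike a "%08X" string repeated everywhere.
inline std::string HEX8(unsigned long value)
{
    std::ostringstream out;
    out << std::setw(8) << std::setfill('0') << std::hex << std::uppercase << value;
    return out.str();
}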

So, even if I thought it was better for some reason to use these string-based format specifiers, the best I can imagine for this is:

format_to(output, "value: ", "04.8"_F, value, " confidence: ", "0.6"_F, confidence);

Here I used _F as a UDL applied to a string to turn it into a formatter, as "04.8"_F, value is shorter to write than fmt(value, "04.8") or value %fmt("04.8").
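
Such a UDL is easy to declare; a sketch of the formatter-producing part only (the pairing with the following argument would have to be handled by print_to itself, which is not shown here):

#include <cstddef>
#include <string>

// A value of this type would tell print_to: "apply this format spec to the next argument".
struct format_spec { std::string spec; };

inline format_spec operator"" _F(const char* s, std::size_t n)
{
    return format_spec{ std::string(s, n) };
}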

Ah, of course, additionally it’s nicer to have one line with short expression symbols, without elaborate expressions inside the call. That’s what the tagging feature of the formatting library was supposed to solve, right? You think it’s nice to keep the tags this way, while using some more elaborate expressions for the actual values:

format_to(output, "value: {value:04.8} confidence: {confidence:0.6}" ... );

If so, you can still use intermediate variables – just above, not below, the general line specifier, but that changes nothing. This can be significant if you use some complicated expression pattern, but if so – see below.

So many features!

So, I was so negligent with the printf function that I only recently learned that it features argument positioning. That is, you can specify the argument’s number in every format specifier, instead of relying on the argument order. This way, instead of counting the position of the percent sign up to ten and then doing the same in the argument list, you just have to read the number 10 in the specifier and then… again count to ten in the argument list. A perfect solution. And that’s only one of many things provided by the fmt library’s {N:} format.

You can also specify tag names, and then tag every argument passed after the format string so that it is known which argument a format specifier applies to. Nice, but how does it differ from declaring intermediate variables before the printing instruction and then using them directly in the sequence of printable pieces? Of course, it can be useful to separate the format pattern from the arguments – see below.

In the {fmt} library you can additionally nest the specification so that you can also provide the width or precision as a runtime value – by writing "{:{}}", value, width. Nice, but then the form looks like this:

int valp = 8, valw = 4, confp = 6;
print_to(output, "value: {:0{}.{}} confidence: {:.{}}\n", value, valw, valp, confidence, confp);

PFFF REALLY? And you can really quickly figure out here, which witches watch which watch? Then with the iostream manipulator style + helper fmt function I’d have this:

int valp = 8, valw = 4, confp = 6;
print_to(output, "value: ", fmt(value, setfill('0'), setw(valw), setprecision(valp)),
         " confidence: ", fmt(confidence, setprecision(confp)), endl);

When I call setw(N) or setprecision(N) I can also pass a runtime value, and these functions can be overloaded as constexpr if need be. All it takes is to write fmt(value, setw(width)). And it doesn’t require a format string that looks like an expression in Intercal. This is evidently nothing but piling up sorry workarounds to allow what comes naturally with function-like formatting tags.

Given that this was added to the C++20 standard, I really wonder what kind of people sit in the C++ standard committee today. I really think I am among the youngest of a certain kind of C++ programmer, even though I am barely 50: those who learned C++ first, had their first contact with printing to the console through iostream and cout, and only later learned that “in that old, outdated C language there was some ugly printf, but no sane person uses it anymore”. That was my experience in the ’90s, and the last thing I expected in 2020 was some epigone reviving the dead printf horse in new tatters. Even more astonishing is that no one really perceives how counter-productive, in general, the approach of pre-specifying a format string to be filled with a subsequent, or preselected, list of arguments is. Actually the only thing the creators of the {fmt} library can be proud of is that they achieved excellent performance by having the format string interpreted at compile time. Nice, but somehow that reminds me of C programmers who declare that they will never write software in C++.

Note that I never said that a formatting string with tags is useless in general. It is useful, but only as a feature on its own, not mixed with formatted printing – and even if formatted printing is also used, it belongs in a different place. This is what I once wrote myself when I had to print a simple JSON document (I didn’t know about the {fmt} library at that time):

    out << TemplateSubst(
            R"({ "level" : %level, "id" : "%name", )"
                R"("first" : {"time" : "+%basetime", "offset" : %baseoffset}, )"
                R"("last" : {"time" : "+%lasttime", "offset" : %lastoffset}, )"
                R"("details" : "%details", )"
                R"("repeated" : %repeated })",

            "level", fmt(Rep::level(firsterror.value)),
            "name", Rep::name(firsterror.value),
            "basetime", fmt(rel_basetime),
            "baseoffset", fmt(ers.baseoffset),
            "lasttime", fmt(rel_lasttime),
            "lastoffset", fmt(ers.lastoffset),
            "details", ers.details,
            "repeated", fmt(ers.repeated));

The very same thing could be done using the fmt::arg facility together with the formatter:

   fmt::format(
            R"({{ "level" : {level}, "id" : "{name}", )"
                R"("first" : {{"time" : "+{basetime}", "offset" : {baseoffset}}}, )"
                R"("last" : {{"time" : "+{lasttime}", "offset" : {lastoffset}}}, )"
                R"("details" : "{details}", )"
                R"("repeated" : {repeated} }})",
            fmt::arg("level", Rep::level(firsterror.value)),
            fmt::arg("name", Rep::name(firsterror.value)),
            fmt::arg("basetime", rel_basetime),
            fmt::arg("baseoffset", ers.baseoffset),
            fmt::arg("lasttime", rel_lasttime),
            fmt::arg("lastoffset", ers.lastoffset),
            fmt::arg("details", ers.details),
            fmt::arg("repeated", ers.repeated));


That could be tempting, especially because fmt::format also gives me the possibility of compile-time interpretation of the format string and a nice speedup this way. But from the interface and visual presentation perspective the {fmt} version still has disadvantages for me because:

  • It requires an additional explicit fmt::arg specification just because, in the general case, direct values can also appear there (the fmt in the TemplateSubst call is needed only because that function is written in a simple way and handles only strings)
  • If I need a format specifier for a particular value, I must put it in the format string, not in the argument tag declaration
  • The use of {} in this formatting is very unfortunate for JSON output: it collides with the braces used by JSON, requires them to be escaped by doubling, and makes the whole format string messier than the version with %tag specifiers

You may say that this last problem is a JSON-specific unfortunate case, but the truth is that JSON is currently one of the most popular data interchange formats, and various other formats with a hierarchical structure also use braces (of the popular formats only YAML would be a notable exception). I also know that % can be problematic for other formats, but that can easily be solved by making this character customizable (also as a begin–end pair), which is easy if you have a facility whose only purpose is to fill the tagged text with replacement values. The pursuit of a single multi-purpose solution, where it would have been at least as easy to provide the pieces separately, has created limitations.

Just to sum up

So, again: as for me, the general case that interests me in C++ as a formatting facility is specifying the values to be concatenated one after another. An interpolated string would be ideal, but in C++ you can only count on so much. The format string facility can provide some interesting features that are occasionally useful, but these are rare corner cases, while the use of a format string in most cases is counter-productive for software development.

Therefore I value solutions where you have multiple options at hand, not just the one that happens to be the worst choice for the majority of cases. Adding support for formatting tags and for formatting functions on intermediate values to the C++20 format library could solve this problem.


C++ modules – harmless considerations

Since I have allowed myself to criticize C++20 modules, maybe this time I’ll try to draft an idea of how modules could be added to C++ in a form that would be usable and desirable for real C++ developers.

Change of plans!

Ok, I first tried to write this article using the bottom-up method, starting from C-style modules and explaining how to transition to the C++20 modules model, but I think top-down will be more interesting.

Before I start though, let me state the overall goal of this whole project. It is not only to define what the final form of the source files should be and what the build system should look like, but also how to get there from what we currently have. Therefore we need, of course, to define the build system both for source files already written as modules (“module-instrumented” source files) and for the currently used source files with header files (“module-uninstrumented” ones).

First of all, let’s consider what exactly we want to have as modules in C++ and why we need them.

Today, in the software development world, C++ software is written to produce two kinds of targets:

  • applications, that is, executable files that can be run in the system environment
  • libraries, that is, reusable software packs that can be used by applications or libraries, or extended by other libraries

I think this has been largely ignored by the C++ standard, just like explicitly naming “header files” and “implementation files”. Although for most of the time it was not a problem that the C++ standard ignored them, in the case of modules this leads at best to a “half-design”: something that would then be “possibly implemented” by compilers, provided any of them can make a “full design” out of it, obviously each in a way incompatible with the others. Therefore let’s first try to look at the complete solution – something that, when ready, could be used in a daily software development job.

Software is rarely written as an application in a single file, so we have multiple files, each one compiled separately into an intermediate file, and then they all get linked together to produce the target. We need some sensible nomenclature, so I’ll try to reuse names that are already in use:

  1. Target: an application or a library (the matter of shared libraries will be put aside for now).
  2. Project: a set of definitions to compile single source files and link them together in order to produce a target. Note that a project may produce multiple output files, but in a single project there can be only one target; if you want to have multiple targets in your build system, then you have multiple projects.

Note also that I’m not trying to redesign the C++20 modules from scratch – I’m trying to reuse as much as possible, preferably even staying strictly in line with the C++20 module definition, at best adding some small improvement ideas. So let’s consider an example:

Let’s say, I have a small project of an application that contains three files: m.cc, t.cc and u.cc. The m.cc file contains the main() function. How then does this function, or any other function inside m.cc call any function from t.cc or u.cc? For that we need the interface of these files.

But not through header files – they are the biggest PITA here. We want to simply mark the actual entities as accessible outside this file, that’s all. Modules give us this opportunity. Therefore we have these instructions in m.cc:

import .t;
import .u;

For now, ignore this dot before the t and u names – just pretend they aren’t there. This will be explained later.

Alright, but then what are these t and u here? They are the source file names without extension, sure, but binding them strictly to the filename doesn’t serve the language well. Therefore we also want to be able to declare the name explicitly, to be used the same way throughout the whole project. Still, a compiler must have some idea of what to search for when it sees this name.

This name then designates a module form file.

There are several types of module form files, so we have:

  • A module interface form file is the result of compiling the interface part only; that is, the compiler only grabs the signatures of entities marked for export and makes a database out of them, which can then be used by other files.
  • A module template form file is a special form of a module interface form file that contains alternative entities, or fragments of them, whose form depends on the expansion of a preprocessor macro. These files are the form that can be installed directly among system files, just like header files today; although if there’s no need for any macro-selectable alternatives inside, a module interface form file can be used instead. If there is such a need, the module interface form file used to build a dependent project will be created as an intermediate build file based on the module template form file.
  • A module form file is the result of compiling a single source file (which should contain a complete module or a complete partition). To create such a file you compile the source file, already having the compiled module interface form file for this very module (if one is needed), as well as all dependent module interface form files. The interface file for this module will be included in the resulting module form file, and the derivation and dependency information about the included interfaces of other modules will also be recorded in this file’s manifest.

So, the names used in the import instructions designate a module interface form file.

The module form files are the intermediate files produced directly out of the source files. Note though that we need to be able to compile a particular file without having the dependent module form files yet; with a cyclic dependency between t and u it would otherwise be impossible – that’s why we need to produce the module interface form files first. Let’s say then that t and u use entities defined in each other (so they form a cyclic dependency) and finally m uses entities from t and u. Hence we need to compile their interfaces first, so let’s imagine a set of required command lines for that:

c++ -mi t.cc  #> produces: t.cmi
c++ -mi u.cc  #> produces: u.cmi
c++ -mc m.cc  #> produces m.cm; requires t.cmi and u.cmi
c++ -mc t.cc  #> produces t.cm; requires u.cmi
c++ -mc u.cc  #> produces u.cm; requires t.cmi
c++ -ma m.cm -o myprog  #> produces executable myprog

Alright, but wait: why do we use only m.cm (module form file for m.cc) to produce the application? What about the t and u modules?

It’s simple. They are dependent modules, as declared by the import instruction. And yes, this instruction not only requires loading the interface of the imported module, it also establishes a dependency on that module’s form file. Hence all the information about the other files to be “linked together” with m.cm (speaking the language of the current build definitions) is already in there. And yes, this way you don’t have any “separate header file to #include and object file to link against”. You have imported a module – this makes your module dependent on that module, both in the interface part and in the linkage part. And yes, every unit in the project is a module (note, of course, that there still must remain a possibility to link *.o files, because libraries for older C++ standards and for the C language should remain usable).

For a traditional approach using the make tool we’d have then:

%.cmi: %.cc
	c++ -mi $<

%.cm: %.cc
	c++ -mc $<

# Now according to dependencies (generated by 'c++ -mMM'):
m.cm: m.cc t.cmi u.cmi
t.cm: t.cc u.cmi
u.cm: u.cc t.cmi

myprog: m.cm t.cm u.cm
	c++ -ma m.cm -o myprog

Likely the rule for myprog could be generated as well – we don’t need the list of modules to be linked together, we just need the name of the main module and all the dependencies will already be there; but make must know about them in order to update things the right way.

Ok, just to explain what these options are: assume a somewhat hypothetical C++ compiler that can compile in modules. The following options are used for this c++ command (this is the complete list of options; some of them will be used in later examples – this includes compiling “uninstrumented” source files, that is, those that do not use the module and export syntax):

  • -mc: create a module form file out of the source file. This is similar to gcc’s -c option that creates an object file.
  • -mi: create a module interface form file. There’s no sensible equivalent in gcc – maybe compiling a header file, which goes without specific options – but this option is to be used either for header files directly, if the old source layout is in use, or for an export-instrumented source file. The closest equivalent of the module interface file today is the precompiled header file.
  • -ma: create an executable file out of the main module. This is similar to using gcc without build-specific options when passing source or object files. The command line might not need any option at all, but it is used here to make things clear.
  • -mheader: specify the header file associated with the given source file when compiling an uninstrumented source file. In that case this option is obligatory, together with -mname.
  • -mname: specify the module name for the currently compiled module. For uninstrumented sources this is obligatory, as there’s no other way to specify the name. For instrumented module sources it is optional and overrides the module name.
  • -mdepend: specify the dependent module interface files. This is only needed when compiling an uninstrumented source file that specifies its source dependencies through header files instead of module names (no matter whether by #include or by import), or when compiling a module interface file from a single header file. This is necessary to record the module dependency information in the compiled module form file.
  • -mroot: specify the directory that is the toplevel for the source files given by path. It defaults to the current directory, but this option allows specifying the base directory relative to which the source paths should be resolved. This is necessary when you compile instrumented sources and the default module name should follow the file path pattern.

Now, why am I starting with this? Because we need to have a build system working with C++ units as modules (instead of *.o files) first before we even start thinking about the modules in the C++ syntax.

Ok, then, what exactly is the *.cm file and how much does it have to do with *.o file?

So, likely in today’s systems the *.cm file will have to be an archive file that contains the *.o file inside, and additionally the *.cmi file, and a kind of manifest file with dependencies and other information.

Now, the compiler can easily extract the dependency information (the -mMM option) by reading the file’s import instructions. The module form files required for linkage are then listed in the manifest inside the *.cm file.

Build system details: fixed directories and naming

It’s not as simple as it was before with the *.o files, which you could give whatever names you wanted. This time it’s serious: the module name is bound to the module form filename, and therefore the directory layout of the project matters too. This is because, most likely, when you have source files placed in a directory tree, the modules will have to carry directory-path-based names. Of course, you can configure your build to have intermediate steps: compile all files in a single directory, bind them together, and then bind those intermediate files with other intermediate files created the same way. Or you can build everything in one directory and just have the source files distributed throughout a specific directory tree. That’s why we need to decide something here – and that’s why this will already require a specific source-to-module binding. Unlike with *.o files, the name of the *.cm file matters because this is the name being used in the C++ source files in the import instruction.

We need then two directories to be defined – bearing in mind that we may also have shadow builds:

  • The project toplevel directory. This is the directory from which the source path starts for the sake of module naming. Defaults to CWD, can be overridden by -mroot.
  • The build target directory. This is the place where target files will be created as a result of compiling. Defaults to CWD, maybe some option could be used to override it (note that for *.o files you could simply override the output name by the -o option, and this could also include the path; with the module form file names the problem is that you might be able to override the path, but not the filename).

Therefore the most sensible rule for the default module name (that is, the one used when you have a module default; declaration inside) is the pathname as passed to the command line, relative either to the CWD or to the declared toplevel directory (the -mroot option in the examples). Note that this toplevel directory might need to be always explicitly defined in case of shadow builds. So, this command line (assuming we have simply module default; declared inside):

c++ -mc ../common/utils/array.cc -mroot ..

will produce a file named common.utils.array.cm, and the module will be importable under this name. Be careful here: the exact same file may resolve to a different module form file if compiled with a different toplevel path. How the path to the source file is specified doesn’t matter, though – if it is an absolute path, it will be reshaped relative to the toplevel directory, unless it is not reachable from the toplevel directory (without going uplevel), in which case the command reports an error.
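
For example, the same source file compiled with two different toplevel directories (hypothetical paths, just to show the naming rule):

c++ -mc ../common/utils/array.cc -mroot ..          #> module name: common.utils.array (common.utils.array.cm)
c++ -mc ../common/utils/array.cc -mroot ../common   #> module name: utils.array (utils.array.cm)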

The other way around: you may have an explicitly named module, so you have this declaration, for example:

module common.utils.array;

There are two good ways to translate this into a module form filename: it could be common.utils.array.cm or common/utils/array.cm. There are various reasons to prefer the first or the second possibility, or even a mix. The compiler should allow either, or even a mixed form like common/utils.array.cm. Searching for these alternatives shouldn’t be a problem.

Note though that if you name your module explicitly this way, as common.utils.array, it will always have this name and will always be identifiable by it, no matter where your source file is located relative to the command’s toplevel directory (note that you can also compile a file with an explicit module name even if its path contains uplevel components).

The use of partitions

If you have seen the design of the C++20 modules, you have also seen the idea of partitions. But there is one C++20 feature that I don’t think is worth keeping: splitting a single module into multiple files without partitions. That’s rather a wrong approach – partitions are the right method to cover that need, among others.

A module partition is defined in a single source file and contains definitions that belong to the module. Partitions cannot be nested, and a partition can only be imported from within its master module. For example, we can have a module common.utils.array (the master module) and separate parts defined in partitions named common.utils.array:static and common.utils.array:dynamic (note that a single source file must contain either a complete module or a complete partition).

The merit of partitions is to allow splitting a single module into multiple source files when splitting into smaller modules is impossible – for example because you have a big class with a lot of methods, or a big class hierarchy out of which only one class should be exposed as the interface.

Every module that contains partitions should also have a “master source”, which declares the name of the module and should import all its partitions. The import declaration for a partition doesn’t have exactly the same meaning as importing foreign modules.

For other modules, import simply declares a dependency on that module. Partitions, however, are not dependencies of the master module – it’s the other way around. Importing a partition only defines a binding: a declaration that this partition is an integral part of the module. Partitions can also be declared export import, when they provide their part of the interface. The interface for a partitioned module is a little bit more complicated.

The declarations provided in the master module are the base declarations for everything else in this module and all its partitions. Therefore the first thing to do is to compile the “default interface” of the master module source. This interface is imported by default by every partition (that is, an import of the master module is assumed in every partition). Additionally, partitions may depend on one another by using import :another_partition;. Therefore compiling partition files also requires compiling the interfaces of all partitions, and when there’s an export import in one of them, also in the right order with regard to the dependencies. Then, when all interfaces are done, an interface integrator call binds all the parts together.

As you can see, processing the full partition name turns the : partition separator into a - separator in the filename. So, if you have a partition source file that declares module common.utils.array:static;, the compile command will produce the common.utils.array-static.cm form file. Note that if a source file contains a partition, the module name (with the partition) must this time be defined explicitly (name deduction isn’t possible for partitions). The interface filename for the main part of the interface of a partitioned module then gets the -default suffix. So, summing up, with this setup you have the following source files (a sketch of their opening lines follows the list):

  • common/utils/array.cc – the master module source
  • common/utils/array-static.cc – the :static partition
  • common/utils/array-dynamic.cc – the :dynamic partition
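
The three files could start like this (only a sketch; the exact spellings follow the syntax discussed in this article, and whether a partition is imported as export import depends on whether it contributes to the interface):

// common/utils/array.cc – the master module source
export module common.utils.array;
export import :static;     // this partition contributes to the module's interface
import :dynamic;           // this partition is implementation-only

// common/utils/array-static.cc
export module common.utils.array:static;
// ... exported declarations of the static part ...

// common/utils/array-dynamic.cc
module common.utils.array:dynamic;
// ... internal definitions ...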

Now we compile the interface in the following order:

  • common/utils/array.cc -> compile the default interface to common.utils.array-default.cmi (this happens whenever the compiler detects at least one partition import inside a module with no partition name)
  • common/utils/array-static.cc -> compile to common.utils.array-static.cmi, dependent on the one above
  • common/utils/array-dynamic.cc -> compile to common.utils.array-dynamic.cmi, like above
  • Then call the integrator for all the *.cmi files above to produce common.utils.array.cmi (possible command lines are sketched below)
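
Using the hypothetical compiler options introduced earlier, this could look as follows (the -mI “integrator” option is merely a placeholder I made up for this sketch; it was not defined above):

c++ -mi common/utils/array.cc           #> produces common.utils.array-default.cmi
c++ -mi common/utils/array-static.cc    #> produces common.utils.array-static.cmi (needs the -default interface)
c++ -mi common/utils/array-dynamic.cc   #> produces common.utils.array-dynamic.cmi (likewise)
c++ -mI common.utils.array-default.cmi common.utils.array-static.cmi common.utils.array-dynamic.cmi \
    -o common.utils.array.cmi           #> the integrator binds the partial interfaces together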

Compiling the implementation then simply requires the integrated interface and all dependent interfaces, as usual. Compiling and integrating all partial interfaces first is required for a case when there are interface dependencies.

Local modules

That’s one of many things that likely were forgotten by the designers of C++20 modules.

As you know, with the #include directive you could pass the argument as <file.h> to be searched globally or as "file.h" to be searched locally (though with a fallback to the global search). That distinction doesn’t always make sense, but at least for visibility it is often required – at minimum to mark that a particular header file belongs to the current project and not to some external library. Maybe not every project makes this distinction correctly, and some don’t make it at all – but it is still needed, and the lack of this feature may also make programmers stick to includes.

The sensible solution for it would be to use the leading dot in the name. If you use it, it’s local, otherwise it’s global. There are theoretically also other solutions possible, like marking only global ones with double dot (to look like something similar to uplevel), but then you’d have to use it also as import ..std.iostream, which doesn’t look good.

Note though an important difference to #include "file.h" – there, the local path is relative to the source file being compiled, which means you can use any path your system allows, even with multiple uplevel components. In the module system this isn’t possible – that’s why the toplevel directory is needed on the command line.

So, if you do import .common.utils.array; then it means that the form file should be searched in the build target directory as common.utils.array.cm file (plus combinations with path-separator). If you do import gnome.gtk.widgets; – it will then search the form file in the installation directory for library interfaces, either something like /usr/include and other standard paths, or specified with something similar to -I option.

Whether the module form files for a local module import declaration should also be searched for in the global path (as is the case for #include "file.h"), or exclusively in the local project path, remains to be decided and should be well thought through, but I personally see no reason for it. If you declare the local path, you want the module to be part of your project and not to be provided by some external library. Conversely, if you want it to be provided by an external library, you will at best want it to come from one of the “external directories”, to which your private project directories can simply be added. That’s the most sensible naming control practice.

Taking this opportunity, let me remind you that many developers use the #include "file.h" syntax even though file.h is not actually found in the same directory as the including file, so the compiler falls back to the global search, including the directories specified with the -I option (this is actually quite a popular technique, especially when you keep header files in a directory separate from the implementation files). This makes it no different from #include <file.h>, but programmers like to use this syntax just to make it explicit that what is meant is a file from the local project, merely located in some other directory – people wanted a shortcut for #include "../../include/file.h". This is actually a bad practice (I could tell you a lot about untangling long lists of include directory specifications when an #include "common.h" directive picked up the wrong one out of about 7 completely different common.h files in the same project), but programmers often need this distinction, even if it is only visual. That’s why the import .localmod; syntax, referring to a module form file that should exist in the build directory where the compiler is compiling your file, as distinct from an external library (including the C++ standard library), will be strongly desired.

With the local modules feature there’s one more concern: inside a project you may have various directories containing smaller “groups” of files, and then some bigger parts that rely on them. You could make these parts separate libraries (some projects do it this way), but if they aren’t required to be replaceable at runtime or by a separate installation upgrade, it’s not worth the effort. It also wouldn’t be a good idea to force a whole project to be compiled in a single directory. For a compile command line we then need one directory where it stores the currently compiled modules (which is by default one of the local module directories), but you should be able to specify additional local module directories. I know, so far compilers had only the -I option for header files and it only extended the list of global paths, but I think – especially since module names have limitations – there should be separate options to extend the local module path as well as the global module path. Still the same idea: local modules are those in the current project, global modules are outside of it. This is important because the lack of such a distinction has so far been forcing projects to split into separate libraries even within the frame of one application.

There’s also one more merit in distinguishing local and global modules: not every module is meant to be used publicly by the library user. A library that is to be used by other projects through importing modules will want to expose only a few distinct modules as public, and only those should be visible outside the library as importable. Only within a single project should all modules be accessible for importing. Of course, there’s always the question: could this be “hacked” if needed – that is, can I use private modules from a library because my project needs them? No problem with that – all you have to do is build the dependent project in a way that lets you reach its modules as local ones, and then add the directory where they are accessible to the list of private module directories – this way you will still be able to import them, of course, as local modules.

The general module syntax

The module syntax in general isn’t going to change much towards what is defined in C++20, although there are some distinct changes. Optional parts are surrounded by ?question marks? like this.

This is the general structure:

module; // optional: Global Module Fragment
...
... (only preprocessor directives directly used)
... (definitions provided here stored on the local files' database)
...
?export? module ModuleName; // starts import section
... (import section)
...
... (contents - up to the end of file)

Explanations for this fragment:

The module; declaration, if GMF is present, must be the very first declaration in the file, possibly except comments. If GMF is not used, then module <name>; must be the first declaration, otherwise it ends the GMF section and starts the import section.

The import section starts after the named module declaration and ends with the first declaration that isn’t import. Since that point, no other import declarations are allowed for the rest of the file.

The named module declaration may have the form module default;, in which case the name of the module is defined by the compiler (the compiler may also reject this request). The named module declaration must carry the export keyword if the module is going to export an interface.

The import declaration can have the following forms:

import <filename>; // parse the "filename" like in include and attach declarations
import "filename"; // like above, but first search in the same directory
import ModuleName; // import the interface declarations for the source file
export import ModuleName; // import the interface declarations to the interface of this source file

The import instruction with a filename argument may be used without having a module section in the same file – it may appear at any place in the global scope, also between other toplevel declarations – though the module declaration, if present, is still required to be at the beginning of the source file. More practically, the import instruction with a filename argument may also be contained in a header file. It does roughly the same thing as #include, except that it works in a separate environment; that is, declarations provided in that file will be accessible to the source file that declares this import, also recursively.

The import instruction with a module name can be placed exclusively in the import section. If the export keyword is present, it creates an interface dependency; otherwise only the implementation depends on the imported interface. Note that export import must be used when exported declarations form parts of the interface that depend on parts of the imported interface (there are strict rules about which declarations are part of the interface in which case).

In a module declared as export you can mark various entities export and this way make them part of the interface. There are several rules for how this is done (a short sketch follows the list):

  1. The namespace declaration is level-transparent, that is, it only defines the namespace for the contained declarations, while namespaces themselves cannot be exported. Unnamed namespaces also cannot contain exported declarations. Note that declarations in unnamed namespaces (just like static ones) are not visible outside the same file, even for other partitions of the same module.
  2. Type names declared using typedef or forward-declared (with e.g. struct MyType;) become part of the interface if they are used anywhere in the interface. Otherwise they can be used if they are explicitly marked export. This also concerns complete types provided by interfaces of other modules that are not export-imported. If these types cannot be used as incomplete in the interface, an error is reported.
  3. Normal functions (with no special storage class, or extern by default), if marked export, have only their signature declared as part of the interface (not the body). Functions declared inline become part of the interface as a whole (together with the body).
  4. Global variables that are exported will be visible with an incomplete declaration of their type. If they are initialized with an initialization expression, this expression will not be part of the interface. The type of such a variable must be explicitly provided in the module that wants to use it, but it need not be imported if the global variable is not used.
  5. Complete types (struct/class/enum) can be marked export and this way they become part of the interface as a whole. Additionally, two new features can be considered:
    • Normally a method defined inside the class is treated as inline. If you declare a method extern, then it behaves just as if it were defined outside the class (that is, the body is not part of the interface, even if it is written inside the class). You can still define the method outside the class (the old way) with the same effect.
    • If you declare the whole class extern, it forms something like the “pimpl” pattern. Every method in such a class is extern (that is, only its declaration is part of the interface), and all fields and derived classes must be private. Friends are still allowed, but even friended classes and functions have no access to the fields. Special fields must be marked extern to be present in a public or protected section or accessible to friends; those are reached through special accessors that do not rely on the structure offset and can survive changes in the class. In extern classes all exported names are part of the interface, but not the class layout, and therefore you cannot, for example, do sizeof on such a class.
  6. Preprocessor macros can be exported into the interface. This is done through the #export directive, used either as a replacement for #define or with the syntax #export (name1, name2...) to declare export for existing symbols.
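
A short sketch of how a module interface source could use these rules (hypothetical names; extern class and #export follow the syntax proposed above, so this is an illustration of the proposal, not valid C++20):

export module common.utils.registry;

export void sort_entries(int* data, int count);   // rule 3: only the signature enters the interface

export inline int clamp(int v, int lo, int hi)    // rule 3: inline – exported together with the body
{
    return v < lo ? lo : (v > hi ? hi : v);
}

export int default_capacity = 4096;               // rule 4: the initializer is not part of the interface

export extern class Registry                      // rule 5: an extern class – a "pimpl"-like interface
{
public:
    void add(const char* name);                   // implicitly extern: the body is not in the interface
private:
    struct Entry* entries;                        // fields must stay private; the layout is hidden
};

#export MAX_ENTRIES 100                           // rule 6: an exported preprocessor macro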

Free ordering and open classes

Just before we pass on to the smooth transition rules, let’s also show the additional opportunities given to us by modules defined in one single file, without separate header files, and importing (not including) other modules.

The module system provides a unique opportunity to free yourself from the order of definitions. It has always been annoying in C++ that you have to provide a function signature (a repetition of the same declaration, which must be kept in sync with the definition) before the code that calls the function, just to be able to place the function itself after that code – or that you have to reorder function definitions just to avoid doing so.

Imagine then that you don’t have any #include in your source file, so your source file is plain: it reaches the other files’ interfaces through import and contains only its own entity definitions. Imagine also that the compiler may work in multiple passes, where one pass only reads the signatures of the functions – or, for a class, its contents, derived classes, fields, and method headers, but not method bodies (nor the initialization expressions for global variables and in-class fields). That’s actually what the compiler has to do anyway when compiling in the interface mode.

If the compiler can do it this way, there’s no need to preserve the order of declarations. That is, you don’t have to extract functions’ signatures into function declarations just to paste it before a function that calls it. Go simply:

module default;
import std.threads;
import std.functional;
import .mainapp;

int main()
{
     start(mainapp::startup);
     return mainapp::loop();
}

void start(std::function<void()> fn)
{
    ::mainthr = std::thread(fn);
}

std::thread mainthr;

This way you are free from any ordering. This should be the rule after the module <name>; declaration; let’s say you can add some extra things before module (in the global module fragment) and there the order matters. But after the module declaration, all signatures are read separately from the bodies, so you don’t need separate function declarations at all, not even forward declarations. Even if you have a global variable, or a constant, its name and type will be read and remembered in the first pass, and only the second pass will read the initialization expressions. All that is required of the compiler is that, from the syntax alone, it must be able to distinguish an “alleged” (because not yet defined) type name from an object name – and the variable declaration should suffice for this distinction (except for the infamous case of the X Y( Z() ); declaration, but that can be taken care of separately).

Note of course that not every declaration can be left unordered. That’s because the C++ syntax relies on the main qualification of a symbol, whether it is:

  • an unknown symbol (doesn’t exist in the database yet)
  • a type
  • an object
  • a template

So, declaring a variable of a type whose declaration will only be provided later is theoretically possible, but you first need to give the compiler a hint that the particular name designates a type – for example by using class/struct/typename before the type name.
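
A minimal sketch of what this could look like under the proposed free-ordering rules (not valid in today’s C++, where the definition would have to come first; the names are invented):

module default;

// 'struct' tells the first pass that 'Settings' names a type, even though
// its definition appears only later in this file.
struct Settings global_settings;

struct Settings
{
    int verbosity = 0;
};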

This may go even further: open classes. That is, you can declare additional methods for a class without declaring them inside the class beforehand, as long as they are defined in the same module as the class. The class would probably need some special explicit modifier for this to be enabled. This feature would have limitations though:

  • You can’t provide an “open method” for a virtual method or virtual method override.
  • You cannot define a method with a name that would eclipse a name from the base class. That might create confusion, because in C++ an eclipsing method also eclipses all overloads of that name.
  • The access specifier (public/private/protected) must be specified with such a method.

There shouldn’t be any problem with the interface – even if you define a public open method in a partition (although this may go against readability). When the interface file is generated by the compiler, it should take into account the whole class, including open methods, and the interface viewer should show them as if they were declared inside the class, although of course only with their signatures.
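
A sketch of how open classes could look; the “open” modifier and the placement of the access specifier are assumptions of this proposal, not existing syntax:

module shapes;

export open class Circle      // hypothetical 'open' modifier enables open methods
{
public:
    explicit Circle(double r) : radius_(r) {}
private:
    double radius_;
};

// Added later in the same module, without being declared inside the class;
// the access specifier is given together with the method, as required above.
public double Circle::area() const
{
    return 3.14159265358979 * radius_ * radius_;
}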

Smooth transition

And now we have the most important part: we need a method for a smooth transition from the current “C modules” style into the C++ module style. That’s why the first thing to do was to define sensible build-system rules for C++ using modules. In this system, every source file compiles into a module or a module partition, no matter whether the specific module syntax has been used in the source files or not. Sources in the current form, with separate header files, were referred to earlier as “uninstrumented” sources.

Why is this smooth transition needed and important?

Ok, that’s kinda pathetic, but I feel that I need to explain several things to the people who were designing the modules for C++20.

We are today in a situation where hardly anything in C++ is created from scratch – even if you skip system and standard libraries. Code is widely reused, every software company has lots of legacy C++ code, and switching to another language is not easy and not always possible.

The C++20 modules design, as defined by the ISO standard, offers no possibility at all for a smooth transition (or, at best, leaves this up to the compiler designers, who seem to have no idea how to do it either). You would have to translate every file in the project to the new form; nothing can be done partially or kept shareable with older standards – once switched, there’s no going back. And that’s the problem. Because, given what I said in the previous paragraph, the possibility of a smooth transition is crucial to the adoption of C++ modules. Either this is provided, or C++20 modules will at best be used by some maniacs, and will never reach the software business.

The very first thing that would have to be done is to define the new build system. That’s why I started with the build rules using modules already in the C++ syntax; now they should be adapted for existing files of C++ projects that have no C++20 module syntax in them.

This is a huge task for the high-level build systems – such as Autotools, CMake, or my own project, Silvercat. These systems should let you just add a single option to your project definition, take the source files as they were before, and then generate appropriate command lines to produce the intermediate files in the module style.

The first thing needed is to make this new build system usable also for old sources. Recall the instructions from our first example:

c++ -mi t.cc  #> produces: t.cmi
c++ -mi u.cc  #> produces: u.cmi
c++ -mc m.cc  #> produces m.cm; requires t.cmi and u.cmi
c++ -mc t.cc  #> produces t.cm; requires u.cmi
c++ -mc u.cc  #> produces u.cm; requires t.cmi
c++ -ma m.cm -o myprog  #> produces executable myprog

These were for instrumented sources. Let’s say we now have uninstrumented sources provided as implementation and header files:

t.cc t.h
u.cc u.h
m.cc

The following instructions should be used to compile them:

c++ -mi -mheader t.h -mname t
c++ -mi -mheader u.h -mname u
c++ -mc m.cc -mname m -mdepend t -mdepend u
c++ -mc t.cc -mheader t.h -mname t -mdepend u
c++ -mc u.cc -mheader u.h -mname u -mdepend t
c++ -ma m.cm -o myprog

Note that the last instruction hasn’t changed. This is because it works on module form files already, and those are produced just the same as from instrumented source files.

Of course this doesn’t look very good, especially since you need to mention the module name and also the dependencies. Unfortunately the compiler won’t be able to autodetect dependencies on modules when the interface is imported as a file, no matter whether you use #include or import for it. Only when you use import with the module name can this dependency be autodetected, and only then will the -mdepend option be superfluous.

Note also that there should be a possibility to supply multiple dependencies, so the syntax allowed here should be like that of the -Wl option: either multiple arguments separated by commas, or one argument at a time, with multiple single-argument instances of the option allowed (the latter is used in the above example). For makefiles the second form is actually more convenient, because replacing a space with a comma, although possible, is kinda convoluted.
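
For illustration, the m.cc line from above could then take either of these two equivalent forms (still using the hypothetical options assumed throughout this text):

c++ -mc m.cc -mname m -mdepend t,u           #> comma-separated form
c++ -mc m.cc -mname m -mdepend t -mdepend u  #> one option per dependency (as used above)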

Mixed sources

This build mode for uninstrumented sources will actually have to be available forever. Not only because this form is required during the transition: in lots of projects source code is shared between projects, one of which must stay with an older standard, while we’d simultaneously like to make use of it in our project written in the C++ module style.

Hence you should be able to tolerate “traditional” C++ files – having their own interface in a header file and using #include for other header files, which must stay this way – while still fitting them into the build system with modules.

Let’s then discuss the smooth transition step by step.

Step 0: Source shaping

You should simply think of single implementation files as single modules; in several cases you might resolve to having one module in multiple files (partitions). So, in the “C modules” system, the source files may be shaped only as the following cases:

  • A single implementation file (to become a single module) and its header file.
  • Multiple implementation files having one common header file.
  • A single header file without an implementation file (“header only”).
  • A single implementation file without any interface (usually the file with main() function).

If your project contains any differently shaped sources, they must be reworked to fit into these explicit layouts. This is the first thing to do: in order to switch your project to modules, you have to start “thinking modules”. So you have to have an idea of how to assign particular files to modules, and define this assignment.

Once this is ready, you can switch to the new building rules.

Step 1: New building rules

I wanted to have some example project – it’s hard to find anything that sits in the middle between completely simple and unimaginably complicated, so I finally settled on the https://github.com/JanSimek/voice-over-lan project that I found on GitHub.

In this project we have quite a simple structure and a build defined in qmake. When you take a look at this project, you can see that it has a structure that is very easy to translate – the main.cpp file with the main() function, and then all other files split into an implementation and a header file. So, for a Makefile we’d have the following rules:

QTPKG := QtCore QtGui QtWidgets QtMultimedia

QTHEADERS := `pkg-config --cflags $(QTPKG)`
QTLIBS := `pkg-config --libs $(QTPKG)`

# Compile header files into module interface files. Module names
# are taken explicitly from the file names. Note that we don't
# compile interface for main.cpp - nothing is dependent on it.
voiceio.cmi: voiceio.h 
voicesocket.cmi: voicesocket.h
buffer.cmi: buffer.h
messenger.cmi: messenger.h buffer.cmi voiceio.cmi voicesocket.cmi
%.cmi: %.h
    c++ -mi $< -mname $(basename $<) $(QTHEADERS) $(addprefix -mdepend ,$(filter %.cmi,$^))

main.cm: main.cpp messenger.cmi  #<<< 'messenger.cmi' as identified by header 'messenger.h'
    c++ -mc main.cpp -mname main -mdepend messenger.cmi $(QTHEADERS)

# General scheme for modules without dependencies and matching a simple name scheme
%.cm: %.cpp %.h
    c++ -mc $< -mheader $(basename $<).h -mname $(basename $<) $(QTHEADERS)
# This takes ^source    ^its header file        ^module name to be given to this file

messenger.cm: messenger.cpp buffer.cmi voiceio.cmi voicesocket.cmi
   c++ -mc messenger.cpp -mheader messenger.h -mname messenger $(addprefix -mdepend ,$(filter-out $<,$^))  $(QTHEADERS)
# You need -mdepend option because messenger.cpp still uses #include with a header file

# Application executable
QtIntercom: main.cm voiceio.cm voicesocket.cm buffer.cm messenger.cm
   c++ -ma -o QtIntercom $^ $(QTLIBS)


Note one important thing here: these rules cannot be generated in any way. The definition of the modules in this form is ephemeral – they exist only in the build file, and the source files know nothing about them.

And that’s it for now. This build system should work just as well as the traditional one. The original project had its high-level build definition in qmake, but it was easy to prepare the right structure with make, while the right header/libs options can be obtained from pkg-config. What is important is that here you have the whole structure of modules defined without using module declarations explicitly in the code. This is the first step, and the build definition may actually stay the same all the time, until you finally get rid of all the header files. It doesn’t hurt that, for example, -mdepend is still in use – it will simply become superfluous in the future – although if you, for example, turn messenger.cpp and messenger.h into a single messenger.cpp file, you’d have to change this build rule as well. High-level build systems would have to add some special feature to provide in-build module declarations from the source and header file for every module, to be disabled one by one as each module is converted to a single source.

Ok, but let’s explain how this still matches the whole theory of modules. After all, in the module definitions we had the export keyword to mark things visible outside, and we had the export import variant of import to mark things that are re-exported – but now we have all these things without any declarations. How does this work in this intermediate form?

That’s where the -mheader option comes in. Everything that is in the file you pass as -mheader in the compile command is declared as the interface of the module whose name is given in the -mname option. This way, everything that this header file declares is exported. This obviously limits your control over which parts get exported, but then the visibility of the public parts is still the same as it was in the C-modules style.

Of course, such a header file can just as well include another header file – but then this cannot really be tracked. A dependency like this has to be expressed as a dependency on a module interface form file, which exists only as a definition in the build system. If you add a local module dependency by including a local header file, you have to manually update the dependency on the module interface form file. Fortunately, this will be way less troublesome once you switch to real modules.

Step 2: get rid of #include

We now need to replace every occurrence of #include with import. As per the definition of C++20 modules, import can also take an argument identical to the one for #include, with the same meaning. Also, an #include directive with a local path resolves the path relative to the source file being compiled, so you can freely arrange the top-level directories in the build definition as you need – only when you move a source file to another location would the include path need fixing.

Note also that you can freely mix #include and import, but #include should not be allowed after the module <name/default>; declaration (that’s why you have the global module fragment, started with the module; statement). That means you should fix any cases where you #include something in the middle of an implementation file – for example when some set of functions was moved to another file, but compiled together with the implementation file (I often saw this as a *.inl file). This would have to be solved somehow, although the partition feature should be a good enough replacement for such cases. In effect, you should simply translate every #include into import, and leave #include (in the GMF) only if you absolutely have to.

Here we have then the beginning of the voiceio.cpp file:

#include "voiceio.h"

#include <QAudio>
#include <QAudioInput>
#include <QAudioOutput>

#include <QDebug>

VoiceIO::VoiceIO(QObject *parent) : QObject(parent)
{    
    QAudioFormat format;
...

Now we simply change #include to import and nothing more for now.

module default;

import "voiceio.h";

import <QAudio>;
import <QAudioInput>;
import <QAudioOutput>;

import <QDebug>;

VoiceIO::VoiceIO(QObject *parent) : QObject(parent)
{    
    QAudioFormat format;

Why is this step important? Because the project may unconsciously rely on so-called “implicit inclusions”: the compiler doesn’t complain about an undefined, but used, entity in a particular header file only because it was included from another header file earlier in the same implementation file. The import instruction, however, doesn’t work this way. Every imported file is compiled separately in the “default environment” (that is, with a particular set of preprocessor macros defined) and the resulting database is incorporated into the file that declares the import. That is then a good opportunity to get rid of these bad practices. Note that the macroguards in the header files will not be honored, as a header file will be processed without any macros defined – but the macroguards will not be necessary anyway, since import also makes sure that the same file imported multiple times resolves to the very first import. This is also an opportunity to get rid of yet another bad practice – including the same file by specifying different paths, even from other header files. There is no problem if you refer to one file using different relative paths (because the path will always be recorded as absolute, with symbolic links resolved), but there could be a problem if two files with the same name should actually resolve to different paths.

So, this step should be quite easy to do; after that, recompile the whole project and fix any errors that arise from mess-ups with includes.

Note that this time you need the module declaration at the beginning of the file. It is required before any import instruction can be added.

Just a side note – I know that the early implementation of C++20 modules in gcc requires you to “precompile” particular files explicitly and make the resulting file (*.gcm) available, otherwise even a simple import <iostream>; won’t work. That’s the wrong approach. The compiler should do exactly the same thing as it did with #include, and the only difference should be that the file is compiled in separation. If any “compiled version of the header” is required, the compiler should take care of it on its own, as it did earlier with the “precompiled headers” feature. If that’s not possible, it should simply grab the raw text file and process it as before, just in the separated environment.

Note also that some of the includes will likely have to stay in this form; if the Qt library, which is used here, isn’t provided with modules (and that’s a separate large topic to be taken up later), then the only way to use it is through its original header files, so a line like import <QAudio>; is likely here to stay for a long time, if not forever. Not to mention C libraries – hardly anyone would want to provide a special C++ module wrapper for them.

Step 3: Translate the module imports

Now that everything still compiles correctly with import using the header file, let’s change the imports to the module name – using the name that is given to the module in the command line (at least for now). Reminder: with the current build definition you already compile the project as modules, which means module form files are produced, and if so, they can also be imported as modules.

Note: you can’t yet remove “importing your own interface” in the implementation files. We are only changing imports that refer to the interfaces of other modules.

In our project we have a somewhat unusual situation: a header file includes other header files. We can’t always be sure whether the intention was to “use” the interface or to “reexport” it – but by including a file you always do the latter, so let’s resolve to that; it can always be fixed later. Here’s then the beginning of the messenger.h file:

#ifndef MESSENGER_H
#define MESSENGER_H // keep the macroguard until getting rid of H completely.

import <QUdpSocket>;

import <QAudio>;
import <QAudioInput>;
import <QAudioOutput>;

import <QDebug>;

export import .buffer;
export import .voiceio;
export import .voicesocket;

class Messenger : public QObject
...

Note that even though this file has no module declaration, it is still a header file to be imported into an implementation file, which already has one. Header files may also be imported by other header files, but even then, at the top level they are imported from an implementation file, and there the module declaration must be provided.

Note also that we have imported some local names as modules even though they don’t use the module syntax yet. That’s also not a problem. The module names are given in the compile command using the -mname option, and the compiler should then reach the module interface file through the local names in the import instruction.

Note that there still shouldn’t be a problem with replacing the import declaration with module names only partially, or only in several files. Switching the sources themselves into the module form will be much more work, so it is even recommended to change the import form into the module name only for those modules that you plan to switch to single-source in the first round.

Step 4: Switch “uninstrumented” module sources to single-source form (first for the least dependent or independent modules).

Ok, as the messenger module in this project seems short enough to serve as an example, let me paste here the complete source file that would result from transforming messenger.cpp into the single (“instrumented”) source.

module messenger;

import <QUdpSocket>;

import <QAudio>;
import <QAudioInput>;
import <QAudioOutput>;

import <QDebug>;

import .buffer;
import .voiceio;
import .voicesocket;

export class Messenger : public QObject
{
    Q_OBJECT
public:
    explicit Messenger(QString address, QObject *parent = 0);

signals:

public slots:

private:
    QUdpSocket _udp;

    // Voice
    VoiceIO* vio;
    VoiceSocket* vso;
    Buffer* buf1;
    Buffer* buf2;
};

Messenger::Messenger(QString address, QObject *parent) : QObject(parent)
{
    quint16 buffersize = 0;
    buf1 = new Buffer(buffersize);
    buf2 = new Buffer(buffersize);

    vso = new VoiceSocket();
    vio = new VoiceIO();

    if(address != "")
    {
        vso->connectToHost(QHostAddress(address), 30011); // QHostAddress::LocalHost
        qDebug() << "Connecting to " << address;
    }
    else
    {
        qDebug() << "No peer address specified. Will only play sound from others";
    }

    vso->setEnabled(true);
    vso->startListen();

    // Voice in
    connect(vio,  SIGNAL(readVoice(QByteArray)), buf1, SLOT(input(QByteArray)));
    connect(buf1, SIGNAL(output(QByteArray)),    vso,  SLOT(writeData(QByteArray)));

    // Voice out
    connect(vso,  SIGNAL(readData(QByteArray)),  buf2, SLOT(input(QByteArray)));
    connect(buf2, SIGNAL(output(QByteArray)), vio, SLOT(writeVoice(QByteArray)));
}

Note one important thing here. It’s just a pasted header and implementation file; the definition of the method is still separate from the class, instead of being placed inside it. That shouldn’t be a problem – you should be able to use either form, or even use open classes, as I described earlier.

OTOH we should be able to define even long methods inside the class. That would require a change in C++, which could apply only when a source is defined as a module: the inline keyword, and the rule that a method defined inside the class is implicitly inline, could be dropped. The compiler should compile every function body to be callable, and decide on its own whether to call a function or expand it inline. The handling of inlining should be configurable in the compiler, but the user should also be given an opportunity to decide about it themselves.

There’s one more thing to focus on here. In the earlier header file, messenger.h, we had the other interfaces pulled in with e.g. export import .buffer, but now we have just import .buffer. This means that the interface file for the messenger module will not reexport the interfaces of buffer, voiceio and voicesocket. Yes – but that’s actually not necessary. Let’s now show the form of the main.cpp file translated into the module form:

module main;

import <QCoreApplication>;
import <QCommandLineParser>;

import .messenger;

export int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    a.setApplicationName("voice-over-lan");
    a.setApplicationVersion("0.1");

    QCommandLineParser parser;
    parser.setApplicationDescription("Voice over LAN");
    parser.addHelpOption();
    parser.addVersionOption();
    parser.addPositionalArgument("address", QCoreApplication::translate("main", "Address of the counterpart"));

    parser.process(a);

    QString address("");
    if(!parser.positionalArguments().isEmpty())
    {
        address = parser.positionalArguments().at(0);
    }

    Messenger msgr(address);

    return a.exec();
}

So, was that export import really necessary? Looks like not. The only part of the messenger module used in the main module is the Messenger class, so the main module will need the interface of the messenger module, and likewise its linkage, but all the other things that this class uses need not be visible. It would be a different matter if, for example, this class defined a method in the public interface that uses a type defined in one of those dependent modules, and using the interface of this class required the full definition of that type (if it uses just a pointer or reference to that type, which requires only an incomplete type, it is still not necessary). But that’s not the case here. Even if the Messenger class hypothetically had a public method taking, for example, a const VoiceSocket& parameter, this still wouldn’t make it necessary for the messenger module to do export import .voicesocket. The module that uses the Messenger class may not even need to call this method. If it does, and must somehow create an object of the VoiceSocket class in order to call it, then that module must – independently of the messenger module – do import .voicesocket on its own. On the other hand, if there’s an inline method defined in the class that uses the details of a given type, then this class must indeed reexport the module that defines this type.
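
To condense that reasoning into a sketch (using the hypothetical module syntax assumed throughout this article):

module messenger;

import .voicesocket;    // used internally only – no re-export needed

export class Messenger : public QObject
{
public:
    // A reference parameter needs only an incomplete VoiceSocket on the
    // user's side; a user that wants to construct one and call this method
    // does import .voicesocket on its own.
    void attach(const VoiceSocket& socket);

    // An inline method that used VoiceSocket's internals, however, would
    // force this module to use 'export import .voicesocket' instead.
};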

As for the Qt library itself, there could be another concern – Qt additionally has a special generator (moc) that generates extra files implementing the things marked in the class by Q_OBJECT and the signals markers. That shouldn’t be a big problem in this case – previously this required a header file so that the class definition could be extracted from it; this time it will be the whole module source file. The generated source will likely have to be made a module partition, with the name declared in the command line, until the moc generator gets an option to create the source as a module partition.

Problems and the environment

There’s one thing you likely won’t be able to change in C++ development: the so-called “environment”.

It’s simply the set of macro definitions that are known before the C++ source file is first parsed by the preprocessor. Some of them are defined by the compiler internally. Others can be provided in the command line (e.g. with -DENABLE_LOGGING you provide a macro named ENABLE_LOGGING with the value 1). Additionally, the current form of the preprocessor allows them to be defined explicitly before the #include directive, so that the macro is intentionally visible inside. There’s the infamous example of the inttypes.h header file that requires the user to do #define __STDC_FORMAT_MACROS before including it so that a particular part is defined; fortunately the C++ standard didn’t adopt this method, but you can guess that this technique is still widely used in commercial projects, although mostly via the command line (that is, a header file contains declarations conditional on a macro that is provided in the command line). It can’t simply be said “sorry, not supported, and no replacement is available”, because then such a project will simply stay with pure #include.
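
A tiny made-up example of such an environment-dependent header (the file and function names are just for illustration):

// logging.h (hypothetical)
#ifdef ENABLE_LOGGING
void log_message(const char* text);       // real function, provided by the library
#else
inline void log_message(const char*) {}   // stub used when logging is disabled
#endif

Depending on whether -DENABLE_LOGGING was given on the command line, the very same header declares a different interface.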

A different environment used for compiling a particular source file may mean a different form of the intermediate file created out of it, as the preprocessor may expand the raw C++ file differently. That can even affect the C++ syntax itself. And macros are everywhere, including the C++ standard library, which also largely relies on the underlying C headers from system libraries.

This simply means that the module interface form files (*.cmi in the examples) cannot be a complete replacement for header files – they can replace header files only within the current build environment. It can be even worse – header files in their current form may also be parametrized by changing the environment before inclusion, and the same header file configured with a different environment (as included in different implementation files) – either through definitions before the #include or through command-line macros – may result in a different “preprocessed text” of that file in different translation units of the same build. That can also make the precompiled headers feature misbehave.

For this reason, if the module system is to go anywhere beyond the current project (that is, if it is to be used in library distribution), we need some form that could look like the header file before preprocessing. Note that the normal workflow for getting a header file into the code (say you have an inline function defined in the header) is:

Header source file | preprocess -> preprocessed file | compile -> compiled db | link -> exec

How to place the module form files here? Currently the only sensible method is:

Header source file | preprocess -> preprocessed file | precompile -> precompiled db | link

And this precompiled form can be made available as the interface. The problem is, it’s preprocessed already, while what we need is something that could be called “a module template form file” (MTFF). Here “template” means that it is something to be instantiated through the preprocessor macros, so that it can be used this way:

Header source file | templatize -> MTFF | compile -> compiled templated db | preprocess ->…

How to achieve it? Well, it’s not that important how to create these files, but how to use them – how to incorporate them into the whole application–library system. If a compiler has a problem with preparing something that is a “precompiled, but not preprocessed” header file, it can simply fall back to providing the original header file for it – including by generating a header file (by assembling it from the exported definitions in a module source file) when you already have a single-source form. Then this header file will simply be included raw in the module template form file.

How can I imagine such a thing existing – a form that has passed compilation, but needs to be run through the preprocessor to resolve into the final linkable form? It’s not impossible; at least I can imagine it as running first through a preprocessor, but not the usual one. A special one is needed, which will also parse the C++ entities and will not just “resolve” macros in place, but instead generate alternatives. How?

Preprocessor macros are used in two ways – in source code intended to be C++, as a macro creating a replacement, and in conditionals, to enable or disable particular sections. So, if there is a macro to be resolved, it will be resolved in place as usual (note that an undefined macro will simply not be replaced and will look like a function call – nothing new). But conditionals using macros would be resolved into multiple definitions, exposed to be selected later by a macro. If a macro definition is always provided, but may have a different form depending on other macros, then everywhere this macro is used, alternatives should be generated. The code to be compiled must then be provided as possible alternatives, to be resolved by supplying the right values of the macros that were used in the conditionals. Frankly – I don’t know if this is possible to do; it’s a fresh concept with no feasibility study, but that’s more or less how it should look.
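
A rough, conceptual illustration of the idea (no such tool exists today; the macro name is invented):

// A header fragment parametrized by the environment:
#ifdef USE_WIDE_API
void write_text(const wchar_t* text);
#else
void write_text(const char* text);
#endif

// Conceptually, the module template form file would keep BOTH declarations,
// each tagged with its condition; the value of USE_WIDE_API supplied by the
// importing translation unit then selects one of them at import time rather
// than at preprocessing time.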

And such an interface file should be able to be created so that it can be attached to the library.

Object files and libraries

And here is one of the biggest problems, partly tied to templates and the infamous “export template” feature from C++03.

Object files, as they were defined long ago for the C language, cannot hold any form of templates. Nor can they hold classes, structures, function signatures, or even constants, unless those are kept as variables, possibly using whatever the particular platform offers to mark them as constant.

It’s not a problem at all if you compile everything statically, even if you make a library. You can define a completely new format that also allows keeping entities like classes, templates and compile-time constants, and have a module form file serve simultaneously as an object file to be linked and as an interface file. Also, you don’t have to worry about upgrades, or even variants – when you have a newer version of a library, your dependent application has to be recompiled anyway in order to take advantage of the upgrade.

The biggest problem here will be with shared libraries. It’s simply not possible to move all C++ features into the world of shared libraries. At least in the very beginning you’d have to agree that only several types of entities are allowed to be “shareable”. The others can only be depended on statically, which means that if your application uses a shared library, it can only use the next versions of that shared library from a predefined library compatibility line and the correct variant, and of course not all features may undergo a shared upgrade. For example, all publicly used structures must stay the same throughout the whole compatibility line; you can at best modify the code inside the functions that operate on them. In a structure, you can at best change the name of a field, if it was declared as not being in use (a stub of sorts), but it must remain at the same position and with the same type. This problem is known as “ABI compatibility” and it’s a big problem in development today, about which the C++ standard doesn’t care. Moreover, the known “pimpl” pattern (the class that is the real API has only one field, a pointer to some internally defined class; all methods are defined in the implementation file, and only those see the true fields of the class) was one of the methods not only to avoid recompilation when changing the structure of the class, but also to make the class flexible for any internal changes. In a “pimpl class” all you must preserve throughout the whole compatibility line are the existing methods. In a plain class, you are only allowed to add new methods; everything else, especially the fields and their order, must be preserved.
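
To make it concrete, here is a minimal made-up example of a change that silently breaks the ABI of a shared library:

// Version 1.0 of the library, against which the application was compiled:
struct Options { int timeout; };

// Version 1.1, shipped as a drop-in upgrade of the shared library:
struct Options { int retries; int timeout; };   // a new field added at the front

// The already compiled application still reads 'timeout' at offset 0, which
// now holds 'retries' – and no standard tool in the toolchain reports this.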

Lots has been written and said about ABI compatibility. And the problem isn’t that these compatibility rules aren’t enforced or guaranteed – the problem is that the current software management tools give no possibility of taking care of ABI compatibility at all. I have at least seen something in macOS, but this problem is actually much bigger in C++ than in C. The C++ modules feature is a unique opportunity to take care of it. If you have a module form file from the old version of the library, there should exist a method to do an automatic ABI compatibility check – that is, something that simulates the old compiled application being used together with the upgraded version of the library. That should be doable, as long as the module form file contains enough information to perform the check. That’s why module form files are superior to object files: they contain information that would otherwise be wiped out in the library (and it doesn’t matter whether static or dynamic, because both are based on object files).

Modules in the service of libraries

Modules might still be useless if we can’t find a way to make our project use an external library by importing modules and take full advantage of them. But to allow this, we can’t distribute libraries exactly the same way as before, that is, with header files plus *.a and *.so files that are then referred to with the -l option.

Let’s first imagine that we’d like to distribute a static library – this is the simpler and more general case for a development package.

We first need to distinguish the importing of modules within one project from importing modules from a foreign library – within the source project of the library. We can have interfaces for modules inside the project, but these are kinda “private interfaces” – which maybe sounds ridiculous, but what is meant is that these are interfaces and modules used within the project, not exposed to the library user; the exposed part is the “public interface”.

And don’t forget the final form – we want to have everything defined directly in the source implementation file and only mark particular functions or types as being part of the public interface, while others might be exposed as interfaces for other modules within the project. Only the public parts will go to the public interface and therefore will be stored in the module form file of the public interface.

There are many problems to solve here and several solutions we could use. Note that there is no sensible way to carve a public part out of an existing module. A module is compiled into a module form file and is visible under the name of the module. The public interface must at least have a different name, or selected modules must be public as a whole.

The simplest way to solve it – although not very nice for programmers (because it does the same thing as the good ole headers) – is to create a separate module that constitutes the public interface. You know which modules make up the public interface, and only the interface form files of the public interface (full form files aren’t needed, because the bodies will be in the library) are copied into the library package, together with the library.

Another possibility is to create a special name with a partition-like suffix, except that it is public. You can then mark any toplevel-capable entity (meaning classes, structures or functions, but not, for example, methods) as export public, and this way it will be added to the public interface. Compiling the module interface will then produce two *.cmi files, including one with a -public suffix (as public is a keyword, you can’t make it a partition name). This module will be loaded by default if you use the module’s name with import, even though it has this suffix.
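
A sketch of this marking (the export public syntax and the -public suffix are, again, only the proposal described here):

module codec;

// Visible to other modules of this project, but not part of the library's
// public interface – goes only into codec.cmi.
export int internal_frame_size();

// Part of the public interface – additionally lands in codec-public.cmi.
export public class Encoder
{
public:
    explicit Encoder(int bitrate);
    int encode(const char* in, char* out, int maxlen);
};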

Don’t forget, however, what was already said about the current form of header files and the module template form files: since it is currently possible to use preprocessor directives and, depending on an externally defined macro, provide one or another version of a function signature (for example), or decide whether to provide it at all, the conditional must also be kinda “repeated” in the interface, allowing the user to decide about these things in their own application. This could be avoided if module form files were only created out of modules in your project and landed in your build directory – but this is a file to be distributed.

And no, this can’t be avoided. Simply because it’s currently possible to decide about such things, this possibility must be provided, otherwise many projects won’t be able to move to modules. On the other hand, it’s not that complicated – if the compiler is unable to provide the module template form file in an already compiled form, it can provide it as a hidden header (that is, the whole header file is put inside the module interface form file as an ingredient). If the header isn’t available (you already have the single-source form), the compiler should simply generate it.

The resulting (static) library then should provide:

  • The static library file. This should be something different from the current *.a files (because those are supposed to contain only *.o files); there must be some marker of a different format, and although such files should still be able to contain *.o files, they will likely also need to contain some detailed code (possibly even in source form) to be “re-imported” into the application’s source files on request of the interface through import. It might be something like a *.ca file, which would still be just an *.a file, but with different contents. Linkage against such a library would be specified differently – the directory could be specified more or less as today with the -L option, but the name of the library shouldn’t have to be specified; this information should be provided in the module interface form files.
  • The module interface files. You can have as many of these files as you want, and you can make the interface of the library as fine-grained as you wish; all of them should just be provided with the library as a package.

The package can just as well contain multiple libraries, and there’s no need for any external binding of module interface files to a single library. Each module interface file should simply carry information about the library that it interfaces.

Now: when the compiler imports an interface file that contains, for example, a function template, then this template will have to be physically provided in the library, not in the interface file (the interface file doesn’t provide any bodies). If it can’t be provided in a compiled form at all (and note that it would still have to be a kind of module template form), then it should be provided as source.

Linkage against such libraries – provided they are still built the traditional way, that is, as a set of *.o files plus some, possibly precompiled, headers – should be done by using them where necessary. Note that in the single-source, module-aware build system we only have an import instruction to load the interface (which may include a whole header file inside the module package); then you compile the main module of your target, which will link the appropriate *.o files when necessary – which ones is already specified in your main module.

Your module-based library will likely also need something like a main module. Your application can then import everything that the library offers by importing, for example, gnome.gtk.all. The module with this name doesn’t have to contain any definitions – it just imports all modules that this library provides. As for how this can be physically provided, there are two potential possibilities:

  1. Distribute the library as a package of modules. Every module has its name and contents, for both importing and linkage – linkage will be done automatically anyway. The whole package might be unpacked into a particular directory and provide a pkg-config entry that instructs the compiler where to find the module path. The package will contain module form files that serve both as the import source and for linkage. Note though that this way it would be impossible to use shared libraries.
  2. Distribute the library in the library form and – in place of traditional headers – the module template form files (which, in the simplest form, could be raw header files that also contain the static definitions, that is, the non-shareable ones; see below). This is not strictly necessary for static libraries, but it’s the only way shared libraries can work, and it’s doable for static libraries as well.

Whichever method is chosen, it doesn’t really matter for the library user. All you still have to do is make the compiler aware of the directory containing the module form files, and you use them in your code by importing.
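
To picture it, a hypothetical umbrella module and its use might look like this (the module names are invented):

// The library's umbrella module: no definitions of its own, it only re-exports.
module gnome.gtk.all;

export import gnome.gtk.widgets;
export import gnome.gtk.windows;
export import gnome.gtk.signals;

// On the application side a single import suffices; linkage follows from it.
module default;

import gnome.gtk.all;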

As the first method is quite obvious, let’s try the second method.

This will likely require synthesizing the header files and creating some precompiled form of them (provided the compiler can do it) in the preprocessor-templated form. If you are lucky and don’t need them “templated”, it suffices to simply create the module interface form files. Note that for entities that are “not shareable” things have to be kept “header only”, which in this case means either true headers or the “preprocessor-templated precompiled” form.

These form files will serve as the interface and simultaneously as the provider of the “static only” entities, while the “potentially shareable” entities will be provided as before. The library name will be provided in the manifest inside the module interface form file, so it’s enough that one source file imports this module through this module form file and it becomes dependent on the whole library.

Motivation for the shared libraries

Before we discuss how this new build system would fit in with shared libraries, let’s check why we need them at all. There are actually three reasons why we need shared libraries, that is, three advantages they have over static libraries:

  1. The TEXT sharing
  2. The file sharing
  3. Separated upgrade

The TEXT is one of the memory sections of a program. The program, when started, gets three main memory ranges to use:

  • the stack, used for subroutine calls
  • the DATA section – a dynamically extendable region that is initially allocated for global variables and can be extended on request by dynamic allocation.
  • the TEXT section – memory that is read-only (an attempt to modify it crashes the program) and contains the machine code of the procedures to execute. Sometimes it’s also used to store constant data.

So the idea of TEXT sharing is that if the same set of procedures is used by multiple applications, there can be only one instance of them in memory, while multiple applications call them. With static linkage, every such procedure has one instance per application in memory – at least no one has so far developed a way to identify identical procedures across statically linked applications. Not only that – if the procedures are provided in a shared library, they are guaranteed to be the same across the whole system, while with static linkage, after upgrading one application with a new library version, you may end up with multiple versions of the same function even though it doesn’t make any sense.

A similar situation holds for file sharing – if you have the same library linked statically into multiple applications, this one library is duplicated inside each of these applications, even if the linker optimizes the process by not attaching functions that the application doesn’t call.

And the separated upgrade: if you upgrade a static library, nothing changes until an application that uses it is itself recompiled and reinstalled. If you upgrade a shared library, all applications using it take advantage of it, while remaining installed as they were before.

With the new rules for C++ shared libraries we may have to give up some of these advantages, but we can still keep the others. For example, TEXT sharing can only happen if you have one solid piece of procedure to load into memory, and it has exactly this form everywhere. This may not be possible with a function template, which expands to a different body in every instance (although we may still have the compiler easily detect whether multiple instances can indeed be linked to the same compiled version). But even if you can’t do TEXT sharing, you may still take advantage of the shared file and the separated upgrade. That’s something for the future, of course, but the build system and the project compiling system can already be prepared to handle these cases, and the system will align to it when ready.

Shared libraries with C++ modules

It was quite easy to define C++ in such a way that all inline functions can be expanded inline – or, if compiled, it’s the compiler’s problem how to implement them (it might make each solid implementation static and multiplied in the code). Hence it was possible to “share” function templates only as inline functions, or – since C++11 – to share not the templates themselves, but only their particular instantiations. Likely, until “dynamic library resolution methods for templates” are created, sharing should be available only for solid definitions, and hence only those parts are concerned when upgrading a shared library. In a single-source exported C++ definition there must then be a way to mark particular parts as capable of being put into the shared library. Some clear marker would be desirable to explicitly define whether a particular entity goes into the shared or the static part. It was easy in C++ so far: externally visible parts were static when defined in a header (as a header should not contain any parts that would be visible in the *.o file, unless “adapted” by the implementation file), and shared when defined in an implementation file. But with a single-file definition for everything, you might have an exported solid function defined right next to an exported function template – the first of which can easily be put into the shared part of the library, while the latter, with the currently available C++ implementations, cannot.

My first idea for this would be to make it obligatory to mark every exported function that is not shareable as inline. Exported classes (and other type definitions) would need a static modifier. In other words, if you define a class in a single source file, this class is local (not visible outside the module). If you want to make it visible, you have to mark it as static export. Plain export (without static) might be available, but only if the compiler supports it, and whether it does may also depend on the platform, the system, and simply the compiler’s capabilities (none of today’s compilers and platforms is capable of it). A class that is marked export (but not static), or a function template that is marked export (but not inline), would have to be resolved at runtime in a special way. For example, if you use fields of a class, the code would contain dynamically resolvable markers with field names, and an advanced C++ runtime linker would resolve these named field references into the appropriate field offsets by reading the information from the shared library. Such solutions do not exist today, at least not for C++, and we’ll likely not see them soon, but this is the only way these things can work when a class is shareable through a shared library. Until then, all exported entities that can’t be resolved during shared linkage must carry a special marker defining that they are not shared-linkage-resolvable.

As for the export static class statement, the static can be tolerated with just an extra warning (no software developer would be surprised that a class can’t be shared), which could also be turned off in the compiler options. But export inline would clearly differ from plain export when applied to a function: in the simplest implementation it would put the body into the (extra) header or into the object file, respectively. A function template might simply be required to be export inline, while if you want to make use of shareable template instantiations, you should define the template without exporting it and then explicitly export (without inline) the single instantiations, which in this case must be declared anyway.
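
Putting these proposed markers together in a single sketch – everything here is the proposal described above, not existing C++; the names are invented:

module audio;

// A solid function: can land in the shared part of the library and be
// upgraded together with it.
export int sample_rate();

// Not shareable: must be marked inline, so its body goes into the interface.
export inline int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// A type is not shareable today either, hence the extra 'static'.
export static class Mixer
{
public:
    void add_channel(int id);
};

// A function template can only be exported as 'export inline'; to share
// particular instantiations instead, define the template without export and
// export the chosen instantiations explicitly (without inline).
export template <class T> inline T twice(T v) { return v + v; }
template <class T> T sum(T a, T b) { return a + b; }
export template int sum<int>(int a, int b);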

This is extremely important for library versioning, upgrades, and keeping up ABI compatibility. If you have a shareable class whose details are resolved through runtime linkage, the only thing required for the ability to upgrade the library is that all methods defined so far stay as they are, as well as all fields with their current names and types – but fields may be reordered and new fields can be added (maybe even a feature could be added to rename fields while keeping the old name as an alias, through using for example). Also, if you have an application that calls a library function to fill a structure, the new library can modify the structure by adding new fields and everything still works; the application will use the new structure without even knowing that it has some extra fields. In the current implementation, where structures can only be shared statically, this is impossible – a structure with added fields will not be ABI-compatible with the old structure definition, so the application has to be recompiled, or otherwise the call to the function results in undefined behavior by reaching out to fields that lie outside the old version of the structure.

But that’s only for the future. No one will create such a thing soon, so shared libraries must stay what they are – collections of solid entities only, that is, functions (including method implementations). They also contain global variables, but only because the functions stored in one *.o file may use variables from another *.o file, so these are shared between calls. Therefore only those solid entities can be upgraded with an upgrade of the library, while all static parts must remain either completely unchanged or get at best a “compatible upgrade” – for example, a class may only have new methods added (including virtual methods – because adding a virtual method actually modifies the runtime part), but not new fields. There should also be a mechanism that takes the compiled version of the earliest backward-compatible library and checks, against the upgraded version, whether the static entities differ only in ways that do not change the module’s ABI. And, obviously, exported static entities must be marked. The compiler may display only warnings when this isn’t used, as it’s just to make the programmer aware of it.

Having this, we simply have the library defined much as before: the compiler-linker creates the shared library *.so file plus the module template form files (*.cmt) or module interface form files (*.cmi) containing the library’s interface. For static libraries two forms are still possible: either the above cmi/cmt files as interfaces plus an *.a file containing the solid part, or simply cmt/cm files alone in the package, which the compiler for the library client will use both as the interface and as the linkage source. The latter method would be available only for static linkage for now, and only for applications that use C++20 and are themselves compiled as modules. This may also follow Java’s method with its *.jar files, so you could similarly have *.car files that are really zip files containing module form files (although if compressed, a *.caz extension might fit better :D).

But it is still desirable to have a library that is itself written in C++20, yet available to applications that must use an earlier standard, or that even interface to C applications. Let’s then talk about smooth transition here as well.

Modules in the service of public library

And here is the biggest problem, and something likely not even taken up by the designers of C++20 modules.

We want to have a library that is distributed two-fold:

  • traditionally, like before, with a development package containing header files and optionally a static library, and a runtime package with a shared library
  • some better way, although only available to the newer standard

Actually, for a long time the distribution will have to use both. In order to finally abandon the C++ preprocessor and its key role in interfacing and library distribution, you need to keep the library useful for the older standards, and make it available for the newer one, so that at some point the old way can be abandoned. But the old way will never be abandoned until the new way is widely adopted.

As this has to work just as well for libraries that are already written in the single-source module style, the header files to be distributed with the development package of the library will have to be generated. You no longer have any header files, and the older standard can only use header files to access the library’s interface. Obviously, the compiler must know the oldest standard it should target for the generated header file and which features not to use. To get a header file for an old-standard library distribution you will obviously have to have a separate module that collects everything required for interfacing and marks the exported parts appropriately with an export extern "C" block, which allows generating a header that uses extern "C" for C++ and just plain function declarations for C.
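
A sketch of such an interface-collecting module (the module and function names are invented):

module mylib.capi;

import .engine;

export extern "C"
{
    int mylib_init(void);
    int mylib_process(const char* input, char* output, int maxlen);
}

// From a module like this, the compiler could generate a traditional header,
// wrapping the declarations in extern "C" for C++ and leaving plain
// declarations for C.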

Alternatively, of course, if you want to make an interface for the C language and prefer to write the header file yourself, you can simply keep your header file as is, and provide an implementation file that defines the functions declared in the header; it will still be compiled as a module, using the “intermediate” way of compiling an uninstrumented file as a module.

Variants and versioning

Ok, let’s think about this for a while: why would your application require a specific version and variant of the library?

Simply: because your application requires particular features that are provided only in a certain variant (if the library exists in variants) and are available only from a given version. All of these things must obviously be defined by the library provider; but then, whether you distribute the application without the required shared library, or even distribute it in sources to be compiled on the target system, it should be known in advance whether the user’s system provides the library in a correct variant and version. Obviously you distribute an application (or a dependent library) compiled in a form matching the appropriate variant and version line of the required library. With sources you have more flexibility: you may be variant-independent, or require a limited variant selection. In short: if you have a source distribution, you need API backward compatibility; if binary – you need ABI backward compatibility. And changing even one detail in the variant selection creates a different ABI. API compatibility defines the ability to even compile your dependent target with this library. ABI compatibility makes a particular binary distribution of the library usable with the current binary distribution of the dependent target.

These things make it hard to distribute a shared library and to use it from applications in the system. But some of the problems are more avoidable than others. Variants should be used only when absolutely necessary, and if possible, several details should be selected at runtime so that multiple variants are already covered by a single library – because the version line is defined only within the frame of one “static variant”.

Now, versions can only be upgraded within a single variant. A variant is defined by a set of parameters – in theory you may end up with many parameters and infinitely many combinations; in practice most variants align with particular operating systems (so you cannot have more than one of them in a system), and very few refer to specific dependent libraries that implement a particular feature in different ways (though usually only one of them is in use in a particular system). Therefore you will rarely have to worry about variants; the problems will mostly be with versions and the version line.

What is the version line? It starts with a certain version of the library, stays within the same variant (a single variant is a single line, so forking off a new variant starts a new base version), and preserves backward compatibility (on currently known platforms this means ABI compatibility). A new version of the library may be installed by replacing an older version of the same line, and all targets compiled against the earlier version will work just as well with the new one.

But that’s rather a song for the future and things that need consideration from the operating systems and their packaging and installation systems. For now we need to focus on how to write a library in C++ using the single-source modularization and fit the resulting library into the frames of the current (well, of course, prepared mainly for C libraries) operating systems.

Creating libraries in a distributable package

So, we need to have these things compiled as before: in the build system we still have module form files produced out of the source files, but in the end this should produce the library – possibly both static and dynamic – in two possible forms:

  1. For the older standards: the library file and, separately, header files. Header files should be generated out of the declarations marked export. Entities marked export static will be put whole into the header (it’s hard to imagine remaining accessible to the old standard through header files while having structures dynamically exported, so when generating the old-style package, non-static export of such entities should simply be unsupported), and similarly functions (including methods) marked export inline should be put whole into the header file, while those marked only export get just the function signature there. Macrodefinitions should also be put there if they are exported.
  2. For C++20 module users: the library file is created just like in 1, but instead of header files you get shorthand module form files (template or interface files, whichever fits better) for every exported module. There might be some specific marking of public modules, but this is only to make things easier for the build definition (so that you can simply walk through all module form files and take into the package only those marked public – it might be a good idea to declare a module as public module, in which case the module name gets a partition-like -public suffix). If any macrodefinition dependencies are detected, the compiler (with the -mi flag) creates a module template form file (*.cmt); otherwise it creates a module interface form file (*.cmi). Of course, module template form files must be “instantiated” through the macrodefinitions, and they then produce the corresponding module interface form files in the build directory.

In a system using GNU installation conventions, the library files will be installed in the same place as usual. Module form files need some new place to be installed; neither /usr/lib nor /usr/include is appropriate, especially since we’d like to have a single directory for C++ modules that would contain interface and template form files, and possibly in the future also the full form files. For the sake of further explanations, let’s say it will be /usr/modules.

So, for the interim period we’d need the library package to contain both the usual header files installed in /usr/include and the module form files in /usr/modules, as well as the libraries in the appropriate place. The module form files of the interface kind (that is, not full) should also contain information about the name of the library they interface.

So, as a raw example, let’s say we have a library like… ok, I really tried to avoid it, but for the sake of the article I’ll go the way simplest for me: libsrt. This library provides an interface for the C language, although it’s written in C++. It has one main header file, srt.h, and several secondary headers for special purposes. The interface implementation file is srt_c_api.cpp, and this one would become the public module. All of it gets compiled into the libsrt.a library file (and similarly the shared one), and the installation contains the headers srt/srt.h, srt/access_control.h and srt/logging_api.h. There would then have to be three public modules defined in this library.

In the stage of “old C++ sources, new compiling rules”, it would compile as the old sources, and the header files would be put into the installation directory as usual. The static library will be made by extracting all *.o files embedded in the *.cm files and binding them into an archive; for a shared library they will be linked together into a shared object. The module form files produced out of the public modules will then be stored in /usr/modules/srt as srt.cmi, access_control.cmi and logging_api.cmi. The two latter can simply be produced out of uninstrumented header files, while for srt.cmi the command has to define the module name srt.srt and pass srt_c_api.cpp as the implementation file and srt.h as the header file.

Now let’s say we have an application consisting of two source files: application.cpp (the main module) and utils.cpp. A traditionally created program will then include the header files at the beginning of application.cpp:

#include <srt/srt.h>
#include <srt/access_control.h>

and the source files of the application will be compiled as (with the library flags as pkg-config would provide them):

c++ -c application.cpp
c++ -c utils.cpp
c++ -o application application.o utils.o -lsrt -lcrypto -lssl

while the new C++20 application (provided that it is already module-instrumented) will do:

module default;

import srt.srt;
import srt.access_control;

and the source files of the application will be compiled as:

c++ -mi utils.cpp
c++ -mc application.cpp
c++ -mc utils.cpp
c++ -o application -ma application.cm

Now, you might ask: is the library specification simply skipped in case of “with module” compilation? No, it is only superfluous. Your application is still free to mix importing by modules and importing by header files, whichever method a particular library provides, although by using the module specification with import you automatically get the library dependency information, so it need not be specified for linkage.

This works for both static and shared libraries, and the resulting application, if it uses shared libraries, has exactly the same shared library dependencies. This method should also be applicable at every step of the gradual module adoption, as shown in the beginning.

The future: installing a module package

Of course, there is potentially another method of installing a C++ library built with modules: install it directly as module form files. There are, however, several problems to solve here:

  1. Public module interface files should be marked somehow so that accessing a module from the current project, accessing a public module from an external library (possibly installed in a custom directory) and accessing a private module from an external library are distinguished – and the latter disallowed. Making a traditional library with extra module interface form files satisfies this condition already, while for a pure module package it would still have to be solved somehow.
  2. The module should be shared-loadable. Therefore a C++ application must have an appropriate format, where various markers will be filled with the appropriate “meaning” – and it’s not only about filling in the call of a function defined in a separate file, but also things like instantiating a template or finding the appropriate offset of a structure’s field. Yes, that should be done by a C++ dynamic linker at the moment the application is being run.

These are problems to be solved in the file format on a particular platform, not much to fix in the C++ language – although the earlier proposed public module statement would be useful here as well. It might be that – as suggested already – separate options should be available to specify directories with modules used in the current project and separate ones for directories with modules from external libraries. This allows developers to do tricks here – but on the other hand, a separate syntax for local and global modules has also been proposed: with the local module syntax you can reach any module, while the global syntax will only load public modules.

Simple naïve implementation

It’s not that in order to implement such a system you’d already have to deal with the compiler-loadable database. If you have it, all the better, but a simple implementation can be based on the simplest things. What the compiler must be capable of is, of course, header file generation. We then have the following rules:

  1. If you have a class definition (or any other type) without a modifier, it is ignored (considered a local type definition, visible only in the implementation file).
  2. If you have a type definition with the export modifier, this type definition will be read whole and placed in the header file. If it is a structure definition and has any methods defined inside, they will be grabbed as well. Only functions defined outside the class will remain as mere signatures.
  3. If you have an exported function, only the signature is provided to the header file. If there is an inline modifier, you take the whole definition into the header file.
  4. Templates behave similarly to classes and functions; the important thing is that for a non-inline function or method the only allowed sharing is through extern template, that is, only explicit instantiations can be exported.

The compiler should provide options to generate the header file from the source, and to configure its name and the macroguard name, the default being INC_MGEN_FILENAME_H.
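
To make these rules concrete, here is a minimal sketch; file and entity names are made up, and the export placement follows this article’s hypothetical scheme rather than the C++20 syntax:

// f.cc – single-source module
export struct Point
{
    double x = 0, y = 0;
    double norm() const;                     // defined out of class below
};

export inline double dot(const Point& a, const Point& b)
{
    return a.x * b.x + a.y * b.y;
}

export double distance(const Point& a, const Point& b);

// not exported: stays only in the object file
static double clamp01(double v) { return v < 0 ? 0 : v > 1 ? 1 : v; }

The generated header (with the default macroguard, which here would presumably come out as INC_MGEN_F_H) would then contain:

#ifndef INC_MGEN_F_H
#define INC_MGEN_F_H

struct Point
{
    double x = 0, y = 0;
    double norm() const;                     // out-of-class definition: signature only
};

inline double dot(const Point& a, const Point& b)
{
    return a.x * b.x + a.y * b.y;            // export inline: body copied whole
}

double distance(const Point& a, const Point& b);   // export only: signature

#endif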

The naïve implementation of the interface compiling may do the following:

  1. Generate the header file from the implementation file, as described above.
  2. Create the archive file that contains the following files:
    • The generated header file
    • The manifest file

The manifest file should have a 4-byte header, as special-format files in POSIX usually do, which defines its format. The first two characters should be #!, so let’s follow them with ^C and then NL, after which the rest is written in key: value form. So this manifest file will contain:

#!^C
module: F
interface-type: header
header-filename: hdr.h

This will then be written to a file named MODULE.MF, and the F.cmi file will be the AR archive containing:

hdr.h
MODULE.MF

Now, when you are compiling a module source file – m.cc – that declares import .F inside, the compiler should expect to find the F.cmi file, so it finds MODULE.MF in it and reads the configuration. There it finds that the interface method is a header file, so it takes its filename, extracts it from the archive and interprets it in isolation – the same way it would for an import "filename" statement.

Now, when you compile the implementation file itself, f.cc, the interface module may already exist, but not necessarily. There should always be an expected location where the module form file is to be stored (separate from the directories where other modules of the same project are searched, of which there could be several), and the module interface form file should be searched for in this directory first. If it is found, it is read and interpreted; otherwise the compilation is simply performed from scratch.

Note also that if the implementation file contains any export declarations, header file generation must happen even if the module interface form file isn’t found. This is the case of an “independent” module file – a module source that does not import any other modules from the project, so it can be compiled first, without any other modules’ interfaces. As other modules may in turn depend on it – and that’s usually the intention if you export anything – the *.cm file will effectively be used as an interface. Compilation of such a file therefore involves everything needed for an interface, and then the rest of what is required for a complete module form file.

Anyway, the final module form file should contain the following:

f.cm:
MODULE.MF
hdr.h
f.o

And of course the manifest file will contain this time:

#!^C
module: F
interface-type: header
header-filename: hdr.h
implementation-type: object
object-filename: f.o

This module is independent; if a module has dependencies, this list should also be added. For example, when a module X depends on F and L, it has an additional line in the manifest file:

dependencies: F L

So, let’s return to our first example. We had the t.cc, u.cc and m.cc source files. Their contents are known, so let’s now show what the MODULE.MF file looks like in each of the resulting module form files. Note that this time we do have dependencies, which have been taken from the import declarations. In practice, only first-hand import declarations need to be recorded, as more isn’t necessary – further dependencies will be read recursively from the dependent modules.

File: t.cm:

#!^C
module: t
interface-type: header
header-filename: t.h
implementation-type: object
object-filename: t.o
dependencies: .u

File: u.cm:

#!^C
module: u
interface-type: header
header-filename: u.h
implementation-type: object
object-filename: u.o
dependencies: .t

and now m.cm – note that it doesn’t export anything, so there’s no header generated and no interface declared:

#!^C
module: m
interface-type: none
implementation-type: object
object-filename: m.o
dependencies: .t .u

So, these are the files produced by the hypothetical c++ -mc <source.cc> command.

Now there’s the c++ -ma m.cm command that is expected to create an executable (default output filename is, of course, a.out ;). This command does the following:

  1. Read the manifest file from the given module form file. This reads the implementation type (which is the *.o file) and the filename that should be present in this archive file.
  2. Additionally it reads the dependencies and finds modules for these names; the module form files identified this way are located and their manifests read. This time we have the .t and .u modules, so we look for local module form files named t.cm and u.cm (here we can’t tolerate having only an interface file, although that would be acceptable for public modules of libraries) and read their manifests. This leads us to providing the u.o and t.o object files together with our m.o file, so these files are extracted and the linker is called:
gcc m.o t.o u.o -lstdc++ -o a.out # whatever

The same could be done for libraries, although in that case you may simply have an interface file, distributed together with the library file. These interface files are also simply archives containing the header files plus the manifest.

Once this implementation works, you can aim for more. For example, the interface compilation may actually compile the implementation file completely, just with unresolved details. Function bodies, say, cannot be compiled without the dependent interface, but they can still be tokenized and stored as a database to be filled with details once the interface of the dependent module is available. The *.cmi file would then contain the completely compiled file – something the compiler can pick up next (without even looking at the source file) and simply continue compiling the dependent entities. These parts could be kept in a separate file so that the interface file can additionally be processed into a “slim” interface file for publishing as a library interface.

Final notes

I had some more ideas on how to take this system further, but they were rather focused on supporting old practices, mainly bad practices widely used in C development and shared by many C++ projects. One important part was the ability to deliver alternative implementations – a dependency on a library that can exist in various flavors, with different APIs but created for the same purpose, including a variant of the main library with no support for features that require that dependency. Such a thing will still be a problem, and it would be nice to have a solution for it without the preprocessor.

What has been proposed above should, however, allow the C++20 module system to serve real C++ development and help C++ projects do their job better.


C++20 modules Konsidered Yoozless

[DISCLAIMER: this text is still under development and likely will be fixed, but I wanted to collect some more opinions from the public first].

It happened once already. The C++ standard committee went so far in completing the idea of templates and their linkage between compilation units that they had to add the export keyword. This way they created a feature that, in exactly the form defined in the standard, is implemented by literally no existing C++ compiler in the world. The only one that tried – Comeau – made it more or less work, while requiring conditions that weren’t mentioned in the standard, such as that the original source files defining these templates be accessible during compilation of the template’s user.

In the next standard, C++11, they removed that keyword (which wasn’t a problem since no codebase could be harmed by it), while adding a possibility that partially existed already in other compilers as an extension: extern templates, which can be linked against across compilation units, the only condition being that they are explicitly instantiated, so the linkage actually concerns only the selected instantiation, not the template itself. The practice of the software development world, as always, triumphed over the sage-wannabes from the standard committee.
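
For illustration, a minimal sketch of that C++11 mechanism (names made up): the template body never enters the header, and users can link only against the instantiations that were made explicitly.

// square.h
template <typename T> T square(T x);          // declaration only: no body shared as source
extern template int square<int>(int);         // promise: this instantiation exists in some unit

// square.cpp
template <typename T> T square(T x) { return x * x; }
template int square<int>(int);                // explicit instantiation: emitted as an ordinary
                                              // function symbol the linker can resolve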

There is one important lesson to be learned from the failure of exported templates: the C++ language cannot live without the “build platform”, something that can be called “C modules”, and this build platform has its own rules – for example, that the only types of entities that can be bound together are variables and functions, and if a language entity can’t be aligned to either of these, the only way to make it accessible in another unit is by sharing the source. This means that the only way to share templates is either through header files, as source, or as the function that results from an instantiation, which must happen while compiling both the usage unit and the definition unit.

Not even 10 years have passed, and the committee has again created a feature aimed at replacing one of the ugliest parts of the C++ language – a feature which, again, not a soul will use. Just this time it’s way more ambitious: modules.

Those ugly includes

The current situation with C++ and its way of importing the interface of other parts of the code is well known and despised. We have this ugly #include directive that simply pastes the referenced file into the text; more precisely, the compiler gets at its input one big text file that consists of all the text pasted from the files requested by #include directives, recursively. On top of that, the text consists of things included (or not) according to resolved conditional directives, as well as resolved macrodefinitions. Actually the preprocessor is a kind of separate language – a language-in-language – different from C++ itself, while having its syntax partially mixed with C++ due to the use of macros.

From the very beginning Bjarne Stroustrup considered the preprocessor the worst part of C++ and urgently wanted something to replace it. C++ was first announced in 1982, the first standard was committed by ISO in 1998, and during the work on C++0x (which later became C++11, as the 2009 date was rather unrealistic) they had already learned some problems the hard way. Still, no sensible replacement for the preprocessor had been found. Ten years later they finally developed some proposal (I call it a “proposal”, even though it’s defined in the committed standard, because it’s a language feature aimed at replacing one of the core features of the language while the “old way” is still available – so it’s a matter of whether any developers will want to use it). Before evaluating it, let’s take a closer look at the whole “build platform” used by C and C++, the so-called “C modules”, which consists of the preprocessor (as the main tool for sharing parts of the code) and the C linker.

Ugly, but necessary

So, among the most important things we have this #include directive, whose main intention is to inject the interface of a piece of code. For that, we need to have this interface defined somehow, and C++ still uses these “C modules” for it. Using the C++20 nomenclature, you have the “exported” part in the form of a header file, and there’s also an implementation that is “not exported”, in the object file. It’s up to you what you export, but exporting or not exporting particular parts has consequences. To take the simplest things: you can have a structure declared as incomplete, which means you can still use a pointer or reference to it, as well as declare it as the return type of a function (as long as you ignore the returned value), but you can’t refer to the fields of this structure, nor can you create or copy values of that type. In the case of a C++ class it also means that you have to provide a complete class definition in the header file if you want to be able to call its methods. But this in turn means that you have to declare all the fields in this file as well. You can put them in the private section, of course, but they are still “exported” this way. You can also use the so-called “pimple” pattern, that is, use a separate – unexported – structure for the fields and keep a pointer to it in the main class, but this is an ugly trick (you work around a language problem with extra code), and it’s not without consequences (you need a dynamic allocation for this structure if you don’t want to export its size, and you can’t implement any methods inline). The bottom line is that this method of exporting the interface forces you to shape your class in a specific way and split its code into the *.h and *.cpp file, or even three files (exported header, internal header and implementation, if using pimple). What is interesting in the case of pimple is that the only reason for using it is that modifying any “real” class contents (that is, anything except the “exported” part) won’t result in recompiling all its users.
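
A minimal sketch of that pattern, with made-up names, to make the trade-offs concrete:

// widget.h – the "exported" part: no real fields visible, so their layout may change freely
#include <memory>

class Widget
{
public:
    Widget();
    ~Widget();
    void draw();                     // cannot be inline: it needs the hidden fields
private:
    struct Impl;                     // incomplete here
    std::unique_ptr<Impl> impl;      // one dynamic allocation per object, just to hide the layout
};

// widget.cpp – the "not exported" part
struct Widget::Impl { int x = 0, y = 0; };

Widget::Widget() : impl(new Impl) {}
Widget::~Widget() = default;         // must live where Impl is complete
void Widget::draw() { /* uses impl->x, impl->y */ }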

Could there be any alternative way to do this? Well, of course. If you had an exported part of the class, you could access its methods but not its fields; you might not be able to get the size of the class if the fields are not exported (this way not even being accessible to friends). Why would this be valuable? Because this way you can change the non-exported parts as you please, without any risk of breaking compatibility of dependent code. But the “C modules” system makes it impossible – you can have exactly ONE definition of a class, and it is either exported whole or hidden whole, and the only way to export methods selectively is either through the “pimple” or a full-virtual interface.

Another topic is the use of preprocessor macros. You can call them – or rather apply them – in the form of a function call, with a name that looks like a function, or as a single symbol; whatever is inside the definition replaces the application in the resulting text. It must only be lexically complete. So a macro can be used to inject fragmentary syntax that only becomes complete after macro resolution – it can’t glue, for example, + and = into a += operator, but it can paste incomplete operator expressions, or leave a parenthesis or brace open, to be closed by another macro application. A completely separate story is that this makes today’s C++ impossible to parse with a single parser, at least those currently on the market (I personally started a parser project that might potentially do it, but only in a “feedback style”, that is, a language element, when resolved, feeds the resolved text back into the parser). A nasty problem with macros is that they use their own separate naming system, partially conflicting with C++ namespaces (a symbol that looks like a namespace-resolved name or a method call could be a macro that resolves to another name), and once used in a header file they are exported everywhere.
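
A contrived sketch of what “lexically complete, yet syntactically open” means in practice:

#include <iostream>

// Each macro body is a complete token sequence, but syntactically open:
// the brace opened by BEGIN_CHECK is only closed by END_CHECK after expansion.
#define BEGIN_CHECK(cond) if (cond) {
#define END_CHECK         }

void run(bool ok)
{
    BEGIN_CHECK(ok)
        std::cout << "working\n";
    END_CHECK
}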

On the other hand, macrodefinitions are used for controlling conditional content injection. Some symbols may already be provided by the compiler (internally or through command-line options), but this is also a way for a user to parametrize an included file: you simply #define some symbols you know are used inside the file before the #include, or pass them through a compiler option, and these symbols will be recognized inside and the appropriate alternatives selected. Not that this is used so often, but it’s one of the things that is possible and sometimes useful. At least it’s something used today, even if in crazy ways, and it will have to stay as is, potentially forever. Whether any alternative or replacement appears remains to be seen, but it would first have to be widely adopted, otherwise the old way will never disappear.

Yet another story is the linkage system. One of the problems is that it is built into the operating system. And, somewhat paradoxically, it wasn’t the language that was made for the system – it was the system that adjusted itself to the language in use at the time, and that language was C, ANSI C in particular. The C language defines symbols of only two types – functions and variables – and only those are exported from an object file and externally visible. This then carries over to shared libraries, which are defined a bit differently in every system, but in the end it all aligns with how the C language defines it: you have a shared library file, which is resolved when you execute the program; the dynamic linker runs and symbols from the library are linked into the application. What symbols? Those supported by the C language: as I said, functions and variables (note: not even preprocessor macros). The C++ language, nolens volens, had to adjust to that system if it wanted to exist on these platforms – so global variables, objects and constants all had to be exported as variables, while functions and class methods, possibly inside namespaces, had to be exported as functions with very long “mangled” names.
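
For illustration, this is roughly what that adjustment looks like; the mangled names in the comments follow the Itanium C++ ABI used by GCC and Clang, while other compilers mangle differently:

namespace net
{
    int send(int fd, const char* buf);     // exported as the function symbol _ZN3net4sendEiPKc
    extern int timeout;                    // exported as the variable symbol _ZN3net7timeoutE
}

extern "C" int net_send(int fd, const char* buf);   // exported as plain "net_send",
                                                    // exactly as a C compiler would do it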

I think you can see now, why things are “ugly” here:

  1. You can’t export a class, and its layout is unknown not only at runtime, but also across linkage units. Among other things, this makes an excellent source of ABI incompatibility when the library version changes. Imagine, for example, a plain sizeof applied to a class that in the next version gets an extra field (a minimal sketch follows this list). Would it be a problem to have a linkage point in the source file that uses this sizeof, so that the real size of the class is filled in there, as extracted during dynamic linkage against the shared library? Theoretically it wouldn’t, as long as something like “immediate value linkage” were foreseen (today at best it can be aligned to a variable), but not only does this not exist in today’s linkage systems, there is also no way to put this information into the library. The sizeof expression is static in C, and so it must be in C++. The catch is that for the C language, which contained only simple structures with data fields inside, it might not be so important to have the structure layout resolvable at link time. For C++, where “structures” are armed with virtual methods, derivation and nesting, it unfortunately is desired.
  2. Templates can in practice only be fully defined in header files. Since C++11 there is a possibility of template linkage, but the linkage – just like the system linker for shared libraries – can only refer to a function, which means that your function template (or your class template, for all the methods it defines) must be instantiated in the same file where it is defined, using the exact parameter set that the template’s user is going to use. In some extreme cases this also forces an implementation “in steps” – a template-based API that the user will use, whose implementation in the header does some costly parameter translation and calls a “runtime” function defined in a style acceptable to the linker (including the shared linker). As you know, in C++98 they tried to fix this problem by providing a non-implementable rule for the compiler in the form of the export keyword, which had to be withdrawn later. It seems the standard committee didn’t understand that no matter how complete they want to make the language, they still must reconcile its rules with the rules of the system linker on which the language has to be implemented.
  3. The code must be manually split into the exported and internal part, kept in separate files. Theoretically you might try to use the implementation file as a header, marking the implementation part hidden behind a conditional macro, but this can only be used within your own project, not exposed as a library API – in the latter case the header file must be the only thing provided in source form (actually, a better idea would be to use yet another preprocessing tool that generates the header and implementation parts from a common file).
  4. You also need things defined in the language only to control the linkage export. For example, you can have a constant that is externally linked, but it takes the form of an exported variable (as exporting works only for variables and functions). Therefore in C++ you have static const, which is resolved at compile time (but internal), or extern const, which is a “declaration” if not initialized, or a “definition” if initialized – not very intuitive if you compare it with how the same thing is settled for normal variables and functions, that is, things that C can operate on: there a declaration without any marker defines a variable, and you use extern to mark importing. A much more intuitive implementation would be to simply define a function or a variable (or a constant or a class) and use some “export” marker to make it exported, otherwise it stays internal. From this you can clearly see that a function declaration (unless it exists only to free yourself from the order of function definitions in a file), or a variable marked extern, is merely a technical detail that you have to spell out explicitly only because the modularity system in C++ is defective.
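
Here is the minimal sketch promised in point 1 (library and type names are made up): a sizeof baked into the user’s object code becomes a silent ABI break when the library changes.

// libwidget, version 1 – exported header
struct Widget { int id; };                   // sizeof(Widget) == 4 on a typical ABI

// application, compiled against version 1
#include <cstdlib>                           // std::malloc, std::size_t
Widget* make_pool(std::size_t n)
{
    // sizeof(Widget) is resolved at compile time; the constant 4 is baked into the
    // application's object code – there is no "linkage node" that could update it
    return static_cast<Widget*>(std::malloc(n * sizeof(Widget)));
}

// libwidget, version 2 adds a field:
//     struct Widget { int id; int flags; };    // sizeof becomes 8
// The already-built application keeps allocating 4 bytes per element when run against
// the new shared library: no linker or loader error, just silent memory corruption.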

C++ simply doesn’t have a modularity system on its own that would be designed specifically for its needs. And God knows it needs it urgently.

The Black Sun

For those who might not recognize the term – I mean the term used in the Star Wars universe, where it designated a crime syndicate. The name was based on the saying “better a black sun than no sun at all”, and it was connected with establishing a paramilitary organization that was to restore order in an anarchic society – and this paramilitary organization then turned into a crime syndicate.

So, this is more or less the situation with modules in C++20. And here are several reasons why this actually is a “black sun” that no one in the software development industry needs.

1. No real world usage

Let’s start with the toughest pill to swallow for the C++ standard committee:

All the examples found on the internet that explain modules are in the style of FooBar::DoSomething(). That is, so far I haven’t found even one example of a C++ library created in today’s “C modules” style and then the same library written using C++20 modules – not to mention any intermediate form that could be used both in the “traditional style” and through import instructions with proper module names.

Ok, the fact that I haven’t found anything doesn’t prove anything by itself, so I admit I found some articles that show the same thing done the “old way” and then with C++20 modules (like this one, for example). But they are at best extremely simple examples, and even as such they already show the worst part of it – you need to translate “very standard” header files into “not standard at all” *.ixx files (Microsoft flavor), once and for all. This is effectively a *.h-into-*.ixx translation plus some syntax changes – the whole rest of the project remains completely intact. Should I really say openly how useless that system is for any C++ developer?

Nothing “real-world” that would show how pieces bound together in the old-style “C modules” can now be done more easily with C++20 modules. In some of them I even found an example where the “module primary interface” file contained more or less the same things as the old “C++ header”, just without macroguards.

I tried myself to write some simple library to be used as a module; maybe I didn’t try hard enough, but the module rules seem quite complicated, and it’s not as if you just merge the header and implementation file into one and expect it to work. It doesn’t even look that way in the examples. The hardest part is when you’d like to make an old library, written for real use in an earlier standard, usable for C++20 application developers – or worse, when you’d like to only define some extra header-like files with module declarations for files that already exist. Simply: there are no sensible transition paths between the current “C modules” system and the C++20 module system. Oh, there is one – you can simply replace the #include directive with the import instruction, leaving the argument intact. Really? This will simply mean that for the next 10 years people will be using the import instruction and continuing with the old-school header files – if, that is, they are patient enough to add the compilation step from the header file into the module cache files, whatever it is the compiler uses. And that’s about all as far as the usefulness of this feature goes.

Frankly, this is the biggest defeat of C++20 modules: the only code that can be written with them is completely new code, written for C++20 and inaccessible to anything else, notably previous C++ standards. You know how this will end? Very simply: if old code has to be abandoned in order to take advantage of C++20 modules, the new code will be written in Go, Rust, D, maybe even some scripting language like Lua or Python, or… C++, of course, using the good old syntax working with “C modules”.

2. No consistent definition of the file content and compiler behavior

There are three compilers currently supporting C++20 modules: Clang, GCC and Microsoft Visual C++. That’s way better than it was with the old export for templates, which was Comeau only, but there are still things not exactly resolved by the standard committee, which means the compilers had to invent their own rules. Even Clang and GCC, which traditionally kept their options similar to one another to allow interchangeability, have completely different sets of options required for this, and even different sets of rules. For example, in GCC, if you want to use import <iostream>, you have to… compile iostream as a module, and it will be stored in a local cache. Interestingly, that’s a GCC-specific solution, and every compiler does it in a slightly different way, but all of them do it with some kind of caching – because the standard doesn’t define what the intermediate files should look like. It’s understandable that the intermediate parts have to be identifiable by the name used in the language, and that this should map easily to a filename – but still, in my Silvercat project I could decide that from the hell_proc.cc file the compiler should make the intermediate file imfiles/frog-hell_proc.ag.o, as linking is then known to use the object file that hell_proc.cc compiles to; since this is all inside the project, it doesn’t matter. It is understandable that the filename should be derivable from the name used in the import instruction – something in the style of Java’s CLASSPATH variable that lists possible starting locations, with dots translated into slashes as path separators, which wouldn’t actually be anything different from the compiler’s -I and -L options. But at least this should be something decidable during the build.
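
For the record, the GCC flavour of this, at the time of writing, looks roughly like the sketch below; the -fmodules-ts mechanism and the gcm.cache directory are GCC-specific details that may well change.

// main.cc
// Before this file compiles, the header unit has to be built into GCC's local module
// cache (gcm.cache/), with something along the lines of:
//   g++ -std=c++20 -fmodules-ts -x c++-system-header iostream
//   g++ -std=c++20 -fmodules-ts -c main.cc
import <iostream>;

int main()
{
    std::cout << "hello from a header unit\n";
}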

You might want to ask why it is necessary to “precompile the system header” – why can’t this file be stored somewhere among the compiler’s installation files, right where the iostream header is? Well, there’s one simple answer: because preprocessor. What exact form this module takes after compiling may depend on compiler options and possibly on the definition of some macros, whose combinations may select different parts of the file. And the “compiled module” is a form that is then taken by the compiler and compiled together with your source files, with all preprocessing already done.

That leads us to the next problem:

3. Modules are compile-side tricks for the build system, not source-side helpers for the developers.

There’s no way to pass any “parameters” to the modules – or rather, you cannot “parametrize” what you import. Let’s take a simple example of an existing solution:

#define __STDC_FORMAT_MACROS
#include <inttypes.h>

You can of course use import <inttypes.h>, but the #define above it won’t change anything – macros defined outside do not reach the imported file (macros can be imported and exported with C++20 header units, but only when they are defined inside them).

You can argue whether this particular case makes sense, but there are more uses for it. For example, a library that uses encryption and wants to support various encryption libraries, each with an API similar in functionality but still different. This can be a problem solved internally inside the library, but you would still like to control things by specifying several parameters – exactly the feature for which the -D option exists in compilers.
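
A sketch of the kind of parametrization meant here; the USE_* macros and the choice of backends are made up for the example:

// crypto_backend.h of the hypothetical library: the variant is selected at build time
// with a compiler option such as -DUSE_OPENSSL or -DUSE_MBEDTLS
#if defined(USE_OPENSSL)
  #include <openssl/evp.h>
  using crypto_ctx = EVP_CIPHER_CTX*;
#elif defined(USE_MBEDTLS)
  #include <mbedtls/cipher.h>
  using crypto_ctx = mbedtls_cipher_context_t*;
#else
  #error "Select a crypto backend with -DUSE_OPENSSL or -DUSE_MBEDTLS"
#endif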

When I asked about this on Stack Overflow, people were greatly surprised that I was even asking. Obviously, modules shall not be parametrized: you simply ask for a module to be imported and that’s all. And the module – the thing you are asking to be imported – is not a source that still needs compiling, but something already compiled, with all possible parameters already resolved. Whatever is to be conditionally added or blocked should already be in its target form in the “importable module unit”.

Effectively then, you might say, you could accept this with the explanation that for every combination of compiler options and external macros you should have a different form of the compiled module. Yes, but even if you do this, you can’t distinguish between versions of a compiled module. You can’t even decide its name (say you’d like to have multiple variants of it in the same build), because the name is what identifies the module to the outside.

In order to properly understand modules in C++20, you need to distinguish two things: the module source files (the definition, export tables, partitioning, etc.) and the module as an “importable unit” – because these two, despite both being named a “module”, are completely different things. So, remember what exactly this “thing you import” is: a temporary file that exists only among your intermediate build files, with a predefined name you are not allowed to change (which is even worse than regular “object files”). The best way to think about it is as a “generated source file” – and not even for C++, but for a different language that can only be reliably interfaced with C++.

This module importable unit not only cannot be parametrized – it also cannot be versioned. This is simply a build intermediate file. If you imagine a typical program in C++ consisting of source files with includes, which are first compiled into “object” files, and from which the linker finally produces a program or a library – the “module importable unit” is something on the level of the object file (although it’s closer to some generated file in some language from which only the object file will be produced). It’s far from both the sources and the final program or library.

Effectively the only possible form of distribution for code written with C++20 modules is… the source files. This doesn’t sound any better than “header-only libraries”, except that this time the sources can be very large. But libraries with large sources are distributed today as a concise header file plus a library in binary form. Unfortunately, no such intermediate form as binary modules can be used for libraries – because compiled modules are both source- and platform-dependent.

4. C++20 using modules is a completely different language

A program written in C++20 using modules is a “hard handoff”. You can use libraries written for older standards through the import instruction with a header file, which is a rather cosmetic change with just one small advantage – macros defined in previously imported headers will not be visible in the next ones, which prevents nasty, sometimes undetectable preprocessor name clashes. But that’s about all. C++20 modules can be used inside your program to perhaps organize your sources better, but heck, somehow even large projects coped very well with the traditional “C modules” and didn’t have problems with it. What exactly, then, can you use modules for?

One of the things that you can already figure out is that you can have a whole class written in one file and you don’t have to worry about creating a specific header file. If there are things you don’t want to be visible outside – you don’t export them.
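
In standard C++20 syntax this looks roughly like the sketch below (module and class names are made up; the file extension varies by compiler):

// widgets.cppm – a single module interface unit, no separate header needed
module;
#include <string>
#include <utility>

export module widgets;

class IdGenerator                       // not exported: invisible to importers
{
public:
    int next() { return ++last; }
private:
    int last = 0;
};

export class Widget
{
public:
    explicit Widget(std::string n) : name_(std::move(n)), id_(gen.next()) {}
    const std::string& name() const { return name_; }
    int id() const { return id_; }
private:
    static inline IdGenerator gen;      // implementation detail, not reachable by name outside
    std::string name_;
    int id_;
};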

The question then isn’t how you can use C++20 modules instead of “C modules”, but rather how you can organize your sources. And the biggest problem is that, in practice, C++20 modules cannot be used across projects in any way other than as source files.

You might have thought that at least with C++20 modules you could write your library in the more modern style and compile it into a library which will then be used by an application – still requiring C++20, but we have already accepted that limitation. Or maybe you could even write it in C++20, but expose the API in the style of some older standard so that other developers can also benefit?

Not a chance. All you can gain from C++20 modules is the organization of your program. It doesn’t matter that you can distribute your sources in the style of modules, define a class in a single file, have dependencies handled easily… it’s all only within the frame of your project. And for the library you’d like to distribute, the only thing you can use is the compiled form and… a header file. The good old header file, containing a class, maybe without method bodies, but definitely with fields. Maybe you can generate it out of the module source file, if there’s a tool to do that. And no, you can’t even make a library out of C++20 modules even if it requires at least C++20 to use.

5. You need to change the way you think about sharing parts between project units, and there’s nothing even remotely resembling this in C++ so far.

That’s it. What is a “module” that you import? Any similarity to a header file? Something that earlier was a combination of a header file and implementation file?

What is this “importable unit, identifiable by name” from the perspective of the logic? Because the logic of the “C modules” used so far, represented by a header file, is simply the logic of what is being defined. The implementation part – even a compiled object file – only provides the longer parts that complete what is defined in the header file; it’s actually the header file that declares things. This is the “power” of the module: it can be conditionally controlled by the language environment (preprocessor macros), and it decides which parts you can use and which not (even if they would otherwise refer to parts contained in the compiled object file). This is the “thing you import”. It provides you with the things you may use, as well as information about what is currently accessible in the platform or project configuration, to which you may want to adjust your application. This is exactly the point of true portability.

Unique like C++

I understand that C++ must do things its own way and not blindly follow all other languages, but experience with various other module systems should at least be inspiring. In the case of C++20 modules it looks like it was done in such a unique way that likely nobody else does it like this. Yet there are practices common among module systems that are there for a reason, like:

  1. Exported module source files must have names equal to the name of the exported symbol.
  2. The module pathname is represented in the filesystem by directories.
  3. The “importable unit” of a module is accessible by the name as mentioned in the importing directive, including pathname directories.
  4. Multiple versions of the same module shall be possible. Importing shall allow specifying version constraints.
  5. Modules should allow for specifying parameters that are provided explicitly when importing.
  6. For systems that cannot provide runtime linkers for C++, the specific header generator for a compiled file should exist for the sake of distribution.
  7. For systems that can provide runtime linkers for C++, the C++ libraries should be able to be distributed in the form of specific C++ shared libraries that can be linked against a C++ specific library in runtime.

Not all of them come from a particular module system, and some of them don’t exist anywhere, but such requirements result from the nature of C++ and the practice of using it over the years.

What the customer actually needed

I don’t know why, but when looking at C++20 modules I feel like that mythical “customer” of a software company who doesn’t know exactly what kind of software they want, while the software company goes ahead anyway and produces software that can be used as a hedgehog instead of a brush. Just this time I – a software developer – am the customer, and the software developers are the members of the C++ standard committee.

No, guys. This definitely isn’t what we, developers, want. We want a system that will satisfy the following needs:

  1. Allows creating an intermediate “compiled template” form, which can still be instantiated using compiler options and parameters from the user. Simply put, if you don’t, we will fall back to modules and their interfaces split into pieces as small as possible, with the pieces then glued together (a series of import instructions) by an old-school header file that picks up parts based on user parameters passed through macros – compiler macros or command-line macros.
  2. The version string is standardized and can be compared; it’s encoded in the compiled module and can be verified. If you don’t, then again, we’ll fall back to a version string written in a global variable and a version symbol in the header file (see the sketch after this list).
  3. The modules have names consistent with the directories and files where they are stored.
  4. It should be possible to resolve ABI consistency across versions, e.g.:
    • An exported structure/class must have template entries for the class size and field offsets
    • It should be possible to freely add new, default arguments to a function or method without recompiling the user’s source; same for new overrides of virtual methods
  5. If possible on a particular platform, a reliable runtime-linkable library form should be designed
  6. An “intermediate” form should also be possible, which can be compiled both as a module-only library and as a library for the old “C modules” layout.
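
The fallback mentioned in point 2 is what libraries do today anyway; a sketch with made-up names:

// mylib/version.h – compile-time version, visible only through the header
#define MYLIB_VERSION_MAJOR 2
#define MYLIB_VERSION_MINOR 1

// mylib/version.cpp – runtime version, exported as an ordinary variable symbol
extern const char mylib_version[];
const char mylib_version[] = "2.1.0";

// application – crude, string-based check of what it was compiled against
// versus what it actually got linked against
#include <cstring>
bool mylib_version_mismatch()
{
    return std::strncmp(mylib_version, "2.", 2) != 0;
}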

This is more or less what developers need in order to replace the “C modules” system based on bare text file inclusion.

There’s also quite another story. The problem with the current “C modules” system is that the build system it requires builds single files with separate commands and then links them together, and every file must have its own header file if another file is to depend on it. The expected module system for C++ should resemble this, except that:

  1. We don’t want to need header files at all. It should be possible to somehow extract the actual interface from the compiled, distributable module files.
  2. The unit of export for a C++ module, besides the functions and global variables known from “C modules”, should also be a constant or a class, as well as templates. In this sense, a static field in a class should be part of that class, not a variable marked with an external symbol.

Not your stinkun’ problem!

Well, there’s another very important thing about “C modules” as the currently used modularization system in C++: it’s everywhere.

Meaning, this system decides about the following things in the whole C++ software development:

  1. Splitting the whole program into single pieces
  2. Binding these pieces together by exposing appropriate “nodes” in every unit
  3. Selecting the smallest unit of change that would require recompiling dependent fragments
  4. Partially also “ABI compatibility” between single units (meaning, which parts of the “user code” must be recompiled after changes in the “service code”).
  5. Distribution of “service code” for use by other applications, with the most common code provided precompiled in the target platform’s binary form

So, the first and foremost problem of C++20 modules is that they aim to replace points 1 and 2, maybe 3 if the compiler supports it well. The rest isn’t even touched. I didn’t even mention things solved exclusively within header files – platform-dependent parts and conditionals; this should actually be mentioned too, because header files are also part of the whole system called “C modules”. Another thing not mentioned precisely is that distribution also covers situations like linking older applications against newer libraries, and the possible ABI incompatibilities coming out of it.

The important thing about the “C modules” system is that it’s not only a language tool – it interfaces the language to the local development system, the build system, distribution, and the operating system. Of these four, C++20 modules cover the needs of only the first two.

What’s the deal here? Well, the deal is that the “C modules” system is maybe ugly, maybe dirty, hard to use, error-prone, doesn’t serve the C++ language well, and is ABI-compatibility-unfriendly – but it covers everything that software development needs for its work. If you want to make a sensible replacement for it, the deal is simple: provide a replacement for everything, or go f*k yourself.

And if you want to make a replacement for “C modules”, you have to start with what the “C modules” system actually is, point out the services it currently provides, design possible coverage for all of them in a different system, cover the needs, design the templating, the replacement methods, the resolution of binding to another unit – and maybe define the possible features of such a system better.

For example: the fact that the system linker can only link variables and functions is simply a consequence of the linking system having been designed for ANSI C – and it hasn’t changed a bit since. Yes, ANSI C, otherwise known as C89. Newer C language standards have introduced features that might even demand upgrades of the linking system, but this didn’t happen (which means that standards like C11 must also perform a kind of emulation in some situations).

I can understand that the current library distribution system in today’s operating systems may be hard to change, but there are several ways to deal with it:

  1. Best solution: provide a resolution system separate from the operating system’s shared library system. Have libraries compiled into a single file that contains all the information needed to fill in the “node points” in the application and to perform the binding of all of them – in this case you’d need more types of nodes than just variables and functions: the possibility to fill in immediate values, resistance to reordering fields in a class (because the actual offset of a field would be information provided by the library), etc. The only problem here is that the idea of a shared library wasn’t just that it is distributed in binary form and allows multiple dependent applications to be distributed without the library compiled in, but also that a library used by multiple applications simultaneously occupies, at least for the “text” part, a single piece of memory rather than one per application instance. This problem would still have to be mitigated somehow.
  2. Intermediate solution: define which language features can be “shared” and which must be “static”, and provide a combined half-shared, half-static solution for libraries. There would have to be strictly defined parts of the language and language system which, when changed, imply changes in the dynamic part or in the static part; if you then have changes in the static part, you introduce a backward ABI incompatibility. Of course, this should also be able to identify particular “single pieces” of code, so that adding new static pieces that didn’t previously exist also doesn’t break backward ABI compatibility. If you had such a system, you could provide a C++ module in the form of a shared library (which is what the operating system requires) plus an additional C++ module interface file, which would contain instructions on how to fill in the gaps and linkage nodes in the application template so that it stays consistent with the “shared binary” part, distributed as a shared library.

But before we start, we need to understand how “C modules” work, what exactly could be improved here – even if it would have to be done using userspace emulation – and what solution would preserve the needed advantages as much as possible. Let’s start by realizing what features the “C modules” currently have (and, more importantly, which they don’t).

Back to the past

So, as you know, the “C modules” consist of two main things: a header file and the object file, while the latter can also be adjusted into the form of an archive or a shared object (both known as a “library”). Note that both the binary format of an executable file and the object file format are parts of the operating system, because the process of dynamic linkage is also undertaken by the operating system. Running it in userspace is rather tricky, because a binary file must somehow be trusted and therefore privileges matter here; controlling this is usually a matter of the security of the whole system, so “generating a binary file on the fly” by some userspace program requires this program to be extra trusted, which is hard. Therefore we can only do this one way – emulate as much as possible on today’s systems and wait for some operating system to provide a dynamic linking system that can be improved appropriately. Note that this means that more or fewer language features can be qualified as “dynamically linkable”, and which ones qualify will depend on the operating system. At best it can be “full fledged”, where all linkage features – and therefore all language features – can be used, or it can be limited to the things supported by today’s system linkers.

That said, I wouldn’t even say that this linkage system is adjusted exclusively to the C language. It’s worse – it’s “limited C language support”, because not only does it not translate a constant symbol into an immediate value, but the same applies, for example, to structures and the value of a field offset. The “C modules” offer linkage only for variables and functions, and everything else must be emulated. That’s why, even though the C language features constants, nobody relies on them for linkage: an exported constant is actually implemented as a variable, for which the language merely warns you if you try to modify it – not as an immediate value.
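
To illustrate the point, here is a minimal sketch (the names are made up, and three files are condensed into one listing) of how a constant exported through the C linkage model ends up as a symbol rather than an immediate value in the client’s code:

extern const int max_slots;          // lib.h - the interface only promises a symbol

extern const int max_slots = 64;     // lib.c - the value lives inside the library

int reserve_twice()                  // client.c - compiled against lib.h alone
{
    // The client compiler cannot emit "128" as an immediate here: the value is
    // unknown at compile time, so the code loads it through the linked symbol.
    return max_slots * 2;
}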

So, if we agree to continue with the “emulation” method, we should also agree to generate somewhat “artificial” header files, just as we have artificial symbols. Let’s take a simple example:

import std.core;
import std.algorithm;

export class stringview
{
    const std::string& source;

public:
    size_t b = 0, e = 0;

    stringview(const std::string& src): source(src) {}

    std::string copy();

    std::string operator+(const std::string& flw);
};

// Exported automatically from exported class
std::string stringview::copy()
{
    return source.substr(b, e);
}

std::string stringview::operator+(const std::string& flw)
{
    std::string output;
    output.reserve(e - b + flw.size());
    std::copy(source.begin() + b, source.begin() + e, std::back_inserter(output));
    std::copy(flw.begin(), flw.end(), std::back_inserter(output));
    return output;
}

// Explicit export because it's an external function
export inline std::string operator+(const std::string& fi, const stringview& flw)
{
    std::string output;
    output.reserve(flw.e - flw.b + fi.size());
    std::copy(fi.begin(), fi.end(), std::back_inserter(output));
    std::copy(flw.source.begin() + flw.b, flw.source.begin() + flw.e, std::back_inserter(output));
    return output;
}

Don’t worry about this code, it doesn’t make much sense by itself. What is important is how it gets compiled: we keep the class as a structure, so its layout isn’t recorded anywhere in the library file, and for it we get the following header file (note that the mangling method is fictional, made up just for the needs of this article):

struct __EM__10stringview
{
    void* __EM__filler1; // represents a hidden private field
    size_t b;
    size_t e;
};

int __N_copy_M10stringview();
int __N___pl_M10stringviewS_S();
int __N___pl_FS10stringview_S();

You may ask, what exactly is this thing for? It exists only because of the needs of the C module system. These __N_* names are the names of the functions that will be used in the object file after this file is compiled. Calls to particular methods will first have to be translated into the required C form and only then linked. That is, for example, this code:

stringview sv(input);
sv.b = input.find('0');
std::string o = "o:" + sv + "ms";

will have to be translated into something of this kind:

struct __EM__10stringview sv;
*(const S**)&sv.__EM__filler1 = &input;
sv.b = __N__find_Sc(input, '0');
S __t1 = "o:";
S __t2 = __N___pl_FS10stringview_S(__t1, sv);
S __t3 = "ms";
S o = __N___pl_MSS(__t2, __t3);

What’s important here is that there’s a call to the __N___pl_FS10stringview_S function, which will be resolved by finding exactly this symbol in the object file generated from that first source file. This is more or less how it works now – although note that what I showed you is the desired form of the class definition, from which the tooling is expected to generate both the binary file that can undergo linkage and the – this time artificial – header file necessary for use by other modules. The last translation of the user code shows how it would have to be transformed to prepare it for linkage. The use of sv.b is explicit and doesn’t undergo linkage, and the initialization is resolved because it’s inline. Only the use of operator+, actually that free function, turns into a function call whose name identifies this function in the object file.

Here you can clearly see which parts are resolved exclusively statically and which can be resolved as shared. You can see, for example, that the resolution of field offsets can only be done statically – but it can still be done by the C++ compiler. It’s a different matter when the structure is the type of a parameter passed to a function – then the layout is fixed and bound to the binary form. For class templates, everything should be resolved statically, unless you can create a universal form that can be compiled into a solid binary, parametrized only by something extracted from the template parameters and their dependencies. This, however, should not be done automatically: in C++ the programmer should have control over what is common and what is “bloated”, where size can be sacrificed for speed and where the other way around.
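
A minimal sketch of why field offsets are a “static” resolution (the struct names are made up, mirroring the artificial header above): the offset of b is burned into every client at compile time, so a layout change in a newer library version breaks already-compiled clients even though no linker symbol changes:

#include <cstddef>

struct stringview_v1 { void* filler; std::size_t b, e; };                 // old layout
struct stringview_v2 { void* filler; void* filler2; std::size_t b, e; };  // one field added

static_assert(offsetof(stringview_v1, b) != offsetof(stringview_v2, b),
              "the change is invisible to the linker, but not to compiled clients");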

And that’s exactly where it should have started. If we want to develop a module system for C++, we need to start by defining how it will coexist with what we already have. We need to inject it into a system where you have an object file and a header file, provide a method to generate them both, and a way to translate the other linkable objects into a form “digestible” by the linker.

The developer and the system

Binding with the system is the toughest part, but the desired module system must first of all come with an appropriate build system, one focused on recompiling only the absolute minimum affected by changes since the last build.

You know that theoretically you might be able to write the whole program in one big source file – all classes, all functions, everything, except maybe the standard library. But this doesn’t make much sense, for several reasons: when you change just one function, you would like only this function to be recompiled, not everything – or at least the smallest part that contains it. If you didn’t change the function’s signature, then the functions that call it shouldn’t be recompiled either. This problem is currently solved by declaring such a function, callable from other files, in a header file: if you don’t modify the function’s signature, you don’t modify the header file, so all files dependent on the header remain “fresh”.

This simply means that the current “C modules” system is highly dependent on the existence of header files and on dependencies being defined for object files by having the exported part declared in a header. You want to replace it? Then you have to preserve the whole dependency system, not only between the definitions in the C++ language, but also between files as the units of ABI dependency control.

So, when you start from the premise that you have the class definition and all its method definitions in one file, you must be able to generate a header file out of it – and moreover, this header file must have a defined form of “significant parts”, so that when exactly the same form is generated as before, the regenerated header is not considered refreshed. This means, for example, that you need some form of interface file where all methods are put into separate, alphabetically sorted parts, and the object files using it should be able to identify which parts of this interface they depend on, and be considered stale only if those parts were refreshed – not just because the whole header file was.

Simply put – if you want a module system for C++, you should forget the system of object files and header files; a new format is needed, adjusted to the needs of C++, and real object files will have to be generated only from the “dynamic part” of the code, the part that could later be put into a shared library. This intermediate format would sit between the source files and the object files, as it may of course still happen that a subproject, compiled together with the main project, is written in an earlier standard and must remain so.

The build can then no longer rely on object files as they are now. It is also unclear whether the compiler can keep using files as the unit of freshness. Maybe a better solution would be some kind of database, updated partially, where every update is stamped with its time so that the freshness of particular parts can be verified again. And then maybe the source itself would have to be kept in the database, at least during active development – something Bjarne Stroustrup himself mentioned a long time ago.

So, in summary, we need to reconsider the following:

  1. Whether a file, as a project unit, is a good enough indication of a single program part prone to changes and refreshing. And if not, what to use instead.
  2. How to map a source file onto project units. Currently we have a mapping between a single source implementation file (a *.cpp file) and an object file, and then object files can be linked together. This can be controlled by the compiler, but all you can really decide is the filename per source implementation file, or whether to combine multiple source files into one object file.
  3. What exactly we should be able to do with the project units, and what kinds of entities they may export.
  4. What the “binary form” of a project unit should be, and if not an object file as before, then what. As linkage in “C modules” can only refer to a variable or a function, the object file is a kind of “template”, which is “instantiated” by replacing each symbolic reference to a variable or function with the physical reference to it.
  5. How the interface part of the unit should be defined and what features it should provide compared to the current “C module” interfaces, that is, bare header files.

All options open

Look: I’m not trying to show you how I’d develop a better module system for C++, just some principles you should keep in mind when developing one. If you want to base this system on the existing framework of the “C modules”, the first thing is to know how to emulate them and what limitations this will impose on the feature. Then, how to design the whole system so that minimal changes can be tracked, and so that it remains backward compatible with the “C modules” system and can interface with it. This is the first step.

Then you can think about a different linkage system, one able to include structure layouts, constants, and class details, resolve them at least at compile time, and then possibly resolve some of the details at runtime.

The definition of the modules, export rules, interface management and splitting into parts isn’t all bad. It’s a good start. But for this feature to be useful, there’s still a lot of work to do.

So, a slight reminder for the standard committee: you have already committed experimental stuff not fit for software development use, twice. Let’s hope that this time the module definition will not have to be withdrawn from the standard, as happened with the export template feature. But this thing is currently at best a “partial feature” to be experimented with – which is a good thing to have, and even good to have standardized, but with an annotation that it is standardized only for future research, not for software development use. And as such it should, at best, be put into a separate standard document, not into the language standard.


Yoda conditions very harmful seems to be

There is a popular “coding style” rule, used mainly in C and C++, but also in other languages that were based on C and inherited the possibility of confusing the = and == operators. It relies on simply inverting the arguments of the operator. It’s known under a slang name: Yoda conditions.

Here is why you should absolutely avoid it – and before you try to educate anyone about how good this idea is, you had better read the whole thing below.

1. The pavement of the good intentions

In the C language, the = operator was chosen for assignment. It was a simplification of the statement known from other languages with a let instruction, which written as let X = Y made sense as “let X become equal to Y” – but then someone thought that X = Y should be good enough. Then someone thought that an assignment could also leave a result, the value being assigned, so that it could be used in an expression like A = X = Y. Therefore it had to be distinguished from the “comparison for equality” expression, so for the latter the == (double equals) symbol was used, as in this situation both X = Y and X == Y have a result, and they mean different things.

This has led to confusion, as one might mistakenly write (a = b) instead of (a == b), which may lead to an undetectable error. The same was true for the “not equal” operator !=, because a similar problem arises if you invert the order of ! and = and write something like a =! b – still an assignment, just with a NOT operation applied to b first.

Note, BTW, that Pascal and Ada took a different approach: the assignment operator is := – so a single = could stay as the comparison operator. AFAIR they also don’t allow an assignment to be part of a compound expression.

Someone then came up with a solution: since most comparisons look like VARIABLE == PATTERN, and the latter is constant, let’s invert this and always put the constant value on the left side of the operator, that is, PATTERN == VARIABLE. When you do this, a mistakenly used single = will always be detected by the compiler as an attempt to write into a constant and reported as an error.
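
A minimal sketch of what the rule was meant to catch (p and get_value() are made up for the example):

int get_value() { return 42; }   // stand-in for some real computation

void example()
{
    int p = get_value();
    if (p == 0) { }       // the intended comparison
    if (p = 0)  { }       // typo: still compiles (a warning at best), always false
    if (0 == p) { }       // Yoda order: the same typo...
    // if (0 = p) { }     // ...would now refuse to compile: you can't assign to a literal
}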

Unfortunately, the inventor of this solution probably never predicted how his idea would turn into an utter nightmare.

2. The practice of identification

The natural way of writing conditional expressions is always in this order:

  • TESTED: the tested expression, whose value’s properties we don’t know
  • OP: the operator
  • PATTERN: the pattern value, which we know and use to test the TESTED expression

This is simply because that is how you speak in your natural language. If you want to state a condition about some object, you always first mention the object itself, then the condition to be tested. It doesn’t even matter whether your language is SVO, SOV (even partially or conditionally) or even VSO – the order “first S, then O” is preserved in the vast majority of languages in the world. The only exception is when you want to highlight the object, which happens only in the exceptional situation where the context is that it follows some already suspected condition on that object.

Let’s then take some example expressions:

a == b
p == 0
x.find(c) == -1
x.find(c) == b
produced() == consumed()

Here we want to apply – seriously – the “assignment prevention” rule, that is, we want a non-assignable expression on the left side of the == operator, and nothing more. This means we may want to invert the order of the arguments of the == operator, but only if after the inversion there’s a non-assignable expression on the left and an assignable expression on the right. In all other cases, we want the natural order described above to be preserved. So, the correct form after this reordering is:

a == b // inversion is useless - both are variables
0 == p // ok, variable on the left replaced with constant
x.find(c) == -1 // expression on the left isn't assignable anyway!
x.find(c) == b // idem, and there's a variable on the right!
produced() == consumed() // both are values!

Unfortunately, every time you look at code written using this rule, it looks like this:

b == a // useless
0 == p // correct
-1 == x.find(c) // useless, the expression on the right isn't assignable either
b == x.find(c) // WRONG! variable has been SHIFTED to left!
consumed() == produced() // useless

But then… what’s the common trait of all these expressions in this second version?

Well, the inversion has been done: the natural order TESTED OP PATTERN has been inverted into PATTERN OP TESTED. That isn’t how the rule reads, not even approximately – it’s not even similar to the original rule. But this is how the rule is “executed”.

But why?

Well, the answer is: because this is how the human brain works.

3. Morons or automata

Are programmers in their majority such sorry morons that they cannot understand a simple rule?

While I was working at Motorola (today: Nokia Networks) in Krakow, I liked this rule at the beginning, and in code peer reviews I tried to insist on it very seriously. I reported problems with misuse of the rule and tried to insist that it be followed to the letter, which means that:

  • if there was an opportunity to apply the rule and it wasn’t applied, I reported it (including the cases where applying it would invert the natural order)
  • if an inversion didn’t actually put a constant in place of a variable, I reported it as unnecessary

There was never a problem when the fix required inverting the natural order – in such cases nobody argued. The problem was when my finding applied to an expression that had the natural order inverted already and the fix had to revert it. If the inversion had actually “impaired” the expression, there was nothing to argue about, because the error was evident. The arguing happened only when the inversion was merely useless. In those cases all I could do was insist that the natural order is better and more readable for an average reader and that the inversion gains us nothing. They disagreed just because “this looks better” or because they “got familiar with this order because of the assignment prevention rule”. How?

Worse than that: they were inverting not only in the “useless” cases involving the == or != operators, but also in cases where inversion shouldn’t even be in question at all:

0 <= x // RELATIONAL OPERATOR, no assignment-misuse problem!
compare(0, X) // no assignment symbol even in use!

This shows the real scale of the problem. Inverted expressions, which are hard to follow for people who think in the standard order, are used everywhere such expressions occur – not just those involving the == and != operators. As a result, you get maybe 15% of cases that genuinely prevent assignment misuse, an additional 5% that actually introduce the risk of assignment misuse where it didn’t exist before, and the remaining 80% that are simply useless – of which only 10 percentage points even involve the == and != operators.

The question, then, is why they can’t simply focus exclusively on expressions where there really is a variable on the left and a constant on the right, and leave all the others alone. Their answer is simple: because this rule made them get used to it.

Focus on this: THEY GOT USED TO IT.

To what? That the variable should not be on the left? No. To inverting the order TESTED OP PATTERN and making it PATTERN OP TESTED. Never mind that the assignment-misuse-prevention rule says nothing of the sort. This “order inversion” is remembered as the correct procedure because for the human brain it is easier to understand, remember, and apply in practice, and at first glance it seems a good enough approximation that can be thought of as functionally equivalent.

Simplification is probably not the only reason. Another one I can see is that changing the order is perceived by programmers as a kind of challenge (you need to “invert your thinking”, which is a hardship you have to push through), and once you have “mastered” it you feel as if you have learned a new skill. And since you can see a profit in using this skill – it automatically turns expressions like a == 0 into 0 == a – you get a kind of satisfaction from exercising it, “challenging yourself” to think in inverted expressions every single time you get the opportunity.

4. Bug in your brain

The biggest problem, however, is that most of these condition inversions are done by programmers who have memorized the inversion so well that they have started thinking in the inverted order themselves. The reason they write these expressions inverted automatically is that they no longer think in the normal order.

You can argue, of course, about what the normal order is. The problem is that TESTED OP PATTERN is the order used normally everywhere, not only in programming, but also in daily speech. People who do this inversion do it only for programming, and whenever they have to write a conditional expression, they think they should invert the order so that the PATTERN always precedes the TESTED.

So first, one might argue about how big a price is paid for adhering to the inversion rule (I personally think it is big): cooperation between “normal” and “inverting” people in one team is hard and noticeably extends working time, since the inversion rule is never explicitly stated or formally required in a team’s rules – at best programmers simply do it on their own. But other factors matter here too: the rule gets applied to all conditional expressions while only a few cases really prevent any problem, and it doesn’t even ensure 100% reliability – in short, the wrong or useless transformation happens in the majority of cases.

It wouldn’t even be a big problem if the assignment-misuse-prevention rule were really applied literally, so that only the affected expressions were inverted, while all the others, where inversion is useless, remained as they were. They would at least be a minority in the code and therefore bearable. But if ALL expressions are inverted, the whole code becomes harder to analyze for those who don’t have this “bug in their brain” – and the other way around: for those with the bug, “normal” code written without the inversion is harder to analyze. Effectively, programmers who invert and those who don’t cannot work together in one team, or their combined efficiency suffers.

5. All for naught!

Well, guys. The claim that misuse of the = operator may lead to undetected errors might have been true a long time ago, but not with today’s compilers, as long as you turn warnings on and treat them seriously. There are easy ways to prevent the assignment operator from being used by mistake where a comparison was meant. Three basic rules.

5.1. Turn on warnings and treat them seriously

Remember: warnings are meant to save you trouble. Good compilers have warnings designed well enough that you can always replace an expression that triggers a warning with an alternative, more explicit expression – explicit enough that the compiler no longer has to warn. Example for gcc:

if (a = nextValue(c)) // Warning: assignment in conditional
if ( (a = nextValue(c)) ) // ok, double parentheses are explicit

The rule is then: always compile with warnings turned on – a well-chosen set of warning options may be appropriate for a particular compiler – and whatever is reported should be treated seriously and the code changed so that it is no longer reported.

This warning, however, isn’t reported for a class-defined operator= in C++ – but read on.

For interpreted languages that are not compiled before deployment – such as JavaScript – this rule becomes “always use lint tools and treat the reported problems seriously”.

5.2. Make a code standard rule to avoid assignments in conditions

Every expression like this:

if ( (a = nextValue(c)) )

can also be written as:

a = nextValue(c);
if (a)

So don’t try to be a hipster. Write the code decently and you won’t cause problems. Even expressions that use the while loop, like this:

while (int x = get(id, value))
{

can just as well be written as:

for (;;)
{
    int x = get(id, value);
    if (!x)
        break;

And no, this isn’t any more dangerous than the first form. What is dangerous is a loop that has no breaking condition, or a loop that doesn’t test all the conditions it should break on. It doesn’t matter in which place you test that condition. Better yet, simply avoid putting any side-effecting conditions into the while conditional expression at all.

Avoiding intentional assignments in conditions doesn’t make a mistaken = impossible – mistakes can always happen. But if you stick to the rule that an assignment expression used as part of another expression is never a normal or tolerable thing, you are far more likely to catch it during a code review or when analyzing the code.

5.3. In C++ classes, make operator = return void

You have read in every C++ course book that your class’s assignment operator shall return a reference to the assigned object?

Then that’s wrong. Class-defined operator = shall always return void. And I mean always.

There is some residual usefulness in returning the reference, namely enabling expressions like

a = b = c

But please – you can just as well split this into separate statements, and the gain, namely that the following expression becomes a compile error, is more important:

if (a = b)

If a belongs to a class that defines operator= returning void, the type of this expression is simply void, and therefore not convertible to bool. And this generates a compile error.
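
A minimal sketch of such a class (the Token name is made up):

class Token
{
    int id = 0;
public:
    // Deliberately returns void: assignment can no longer be part of an expression.
    void operator=(const Token& other) { id = other.id; }
    bool operator==(const Token& other) const { return id == other.id; }
};

void example(Token a, Token b, Token c)
{
    if (a == b) { }      // fine
    // if (a = b) { }    // error: void is not convertible to bool
    // a = b = c;        // error: chained assignment no longer compiles
    a = b;               // plain assignment still works
}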

It’s generally a good rule to avoid expressions that produce side effects; by using them you simply increase the risk of an undetectable error.

6. Conclusions

Forget it. Simply. And recall the above experience with assignment-misuse prevention before considering it next time. And when someone insists, you can show them this.


To block or not to block – that is the harmfool kwestion

In any kind of API designed for data transmission – especially over the network, but really for any “stream access” device (in contrast to a random-access one, such as a local file) – the data you want to read may not be available at the moment, or the device you want to write to may be temporarily unable to accept your write request. There are two ways to handle this situation:

  • The function that attempts the operation stalls and continues only when the situation is resolved (the transmission subsystem can accept the write request, or the data finally become available). This is the blocking mode.
  • The function returns immediately – either with success, if the operation could be done, or with an error carrying a special code (in POSIX systems simply EAGAIN; on Windows it’s… somewhat more complicated) if it would have to wait. This is the non-blocking mode.

In POSIX systems the default mode is always the blocking mode, and the non-blocking mode should be requested by

int flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);

You might find older code where the second call passes O_NONBLOCK (O_NDELAY or FNDELAY on some systems) alone, without reading the current flags first. In old systems it was simply the only flag that could be changed and the others were ignored. This is not true in today’s systems, so don’t do it.
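
Once the descriptor is non-blocking, a read has three distinct outcomes to tell apart: data (or end of stream), “not ready yet”, and a real error. A minimal sketch (the helper name is made up):

#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Returns bytes read (0 means end of stream), -1 when nothing is available
// yet (EAGAIN/EWOULDBLOCK), or -2 on a genuine error.
ssize_t read_some(int fd, char* buf, std::size_t len)
{
    ssize_t n = ::read(fd, buf, len);
    if (n >= 0)
        return n;                                   // data, or EOF if 0
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;                                  // not ready - try again later
    return -2;                                      // a real error
}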

This ain’t good, and the other ain’t no better

Both the blocking and the non-blocking mode have their own traits:

Blocking

++ When the operation cannot proceed, your application doesn’t have to worry about avoiding busy-waiting: it simply does nothing and is stalled (put to sleep by the system) until the operation can continue.

— When the function is blocked because the operation cannot be performed and you’d like to do something else in the meantime, you can’t – and there are only a limited number of ways to make the function stop blocking (usually an abnormal interruption, which doesn’t come without a price).

Conclusion: you cannot attempt two different operations that may block and simply carry out whichever becomes possible first. If you use the blocking mode, the best way to do this is to run the two operations on separate threads. The operation should also provide some means by which, even though it blocks its whole thread, it can be painlessly unblocked by an appropriate action from another thread.
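
A minimal sketch of that approach, assuming two plain file descriptors and a made-up handle_input() routine – one thread per device, each running a simple blocking loop:

#include <thread>
#include <unistd.h>

void handle_input(const char* data, ssize_t size) { /* stand-in for real processing */ }

void device_loop(int fd)
{
    char buf[4096];
    for (;;)
    {
        ssize_t n = ::read(fd, buf, sizeof buf);   // blocks until data arrive
        if (n <= 0)
            break;                                 // EOF or error - leave the loop
        handle_input(buf, n);
    }
}

void run(int fd1, int fd2)
{
    std::thread t1(device_loop, fd1);              // one thread per device
    std::thread t2(device_loop, fd2);
    t1.join();
    t2.join();
}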

Non-blocking

++ The function never blocks, so when the operation can’t be completed at the moment, you can spend that time doing something more productive.

— When you have nothing else to do at the moment and the operation still cannot be completed, you have to wait for the right moment somehow.

Conclusion: you can attempt two different operations that might not be ready, and the one that actually is ready will be performed. The problem arises when neither is ready at the moment – then you have to do some waiting, which can be one of:

  • Busy waiting: no operation can be done, so iterate over them again. This loop should include sleeping for some time, because otherwise your CPU usage will skyrocket unproductively. How long to sleep is a balance between two factors: sleep longer and you increase the chance of sleeping through the moment an operation became ready, adding unnecessary delay; sleep shorter and your application burns more CPU unproductively.
  • Bar waiting: you rely on some system facility where you subscribe for a readiness signal covering multiple operations, and the system lets you know when one of them is ready to perform (a minimal poll()-based sketch follows below). This is what can be called “non-blocking, but blocking”, because the system blocks this call just as it blocks a single operation in the blocking mode. The difference is that it blocks on multiple operations at a time, and the first one to become ready wins. In theory it doesn’t even matter whether you then call the single operation in blocking or non-blocking mode, because the operation should never block anyway (you should not call it unless you know it won’t block); the non-blocking mode is there exactly as a fallback in case you made such a mistake.
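
Here is that sketch, assuming two plain descriptors and POSIX poll() as the bar (serve() is a made-up handler):

#include <poll.h>
#include <unistd.h>

void serve(int fd)                    // stand-in for the per-device handler
{
    char buf[4096];
    (void)::read(fd, buf, sizeof buf);   // known to be ready, so this won't block
}

void event_loop(int fd1, int fd2)
{
    pollfd bar[2] = { { fd1, POLLIN, 0 }, { fd2, POLLIN, 0 } };
    for (;;)
    {
        if (::poll(bar, 2, -1) < 0)   // -1: no timeout, sleep until something is ready
            break;
        for (pollfd& p : bar)
            if (p.revents & POLLIN)
                serve(p.fd);          // only the ready ones are served
    }
}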

What the professionals do…

Many times I have heard things like “blocking mode is for amateurs and non-blocking mode is for professionals”. The problem is that it’s easy to call yourself a “professional”, and much harder to know the right use of both modes in a professional application.

As you can see from the description above, it’s quite simple why both the blocking and the non-blocking mode are usually provided: the blocking mode suits a fine-grained threaded architecture, and the non-blocking mode suits a fiber-based architecture.

If we set aside the physical presence of multiple cores or processors and simplify the explanation to a single-core architecture, both things are actually much the same: a simulation of doing multiple things at a time by doing them one after another. Only the implementation details differ – how particular operations are scheduled and how the “context switching” happens:

  • Threads: the context switching happens arbitrarily, but with a blocking operation we can at least be sure that the currently blocked operation will never be selected for execution. It may happen that all threads block at once, which makes the whole application stall (sleep). To make that possible for arbitrary user-defined readiness conditions, we have condition variables at our disposal.
    • ++ You can define your operations as intended for a single undisturbed execution, and an operation may take as long as needed – if one operation should finish earlier even though it was started later, arbitrary context switching makes that possible (although a delayed handler for one operation on a device may delay the handler for the next operation on the same device)
    • — Due to the arbitrary nature of threads, an operation can also be paused and switched away while in progress, so to prevent that from leaving the context in an unstable state, a mutex has to be used to protect shared resources that may undergo state changes.
  • Fibers: the context is never switched in the middle of an operation, only between two operations. How the next operation is selected for execution is another matter, and it depends on whether the particular implementation lets you test an operation for readiness before executing it. There are three levels of ability here:
    • Battle-testing: you don’t know whether the operation is ready; you just try to execute it and ignore the non-readiness error.
    • Prerequisite-testing: you perform some very simple and quick check of whether a particular operation is ready, and only if it succeeds do you perform the operation.
    • Bar-waiting: you have a general-purpose waiting function that stalls until any of the registered operations is ready to perform. When it exits, you know exactly which operations are ready, and you perform only those.
    • ++ You don’t have to worry about simultaneous access to shared resources – every operation runs to completion from beginning to end, and you have full control over the state of whatever resources the fibers share.
    • — Operations performed as fibers have very strict requirements and limitations: they must finish as quickly as possible, and if there’s even the slightest chance that the data processing done by a fibered operation may take a long or unpredictably long time, you probably still have to use threads communicating through a queue, so that the fibered operation just passes the data on to the thread and returns quickly.
    • — When the bar-waiting option is not available, your only choice is busy-waiting, with the potential of either wasting CPU time or reacting more slowly than needed.

Actually, you can say that the blocking mode is more universal and lets handlers be defined in a very easy way (just as a single-device-handling procedure) – although in order to handle multiple devices at a time, you must use threads.

With the non-blocking mode the programming is a bit more complicated, but as long as you keep the handlers quick and predictable, you can achieve much better performance than with the blocking mode, because context switching between fibers happens at predictable places, so there is no overhead from mutex locking. However, if you don’t have any system bar for multi-device waiting, the performance gain will be lost on busy-waiting.

The architecture matters

Which method to use is a matter of your application’s architecture – or, more practically, of how much effort you want to spend writing the application and how much performance you want to squeeze out of it. What generally matters here is just this:

  • How to perform the operation immediately when it becomes ready
    • In blocking mode, the operation resumes immediately when ready
    • In non-blocking mode, you need some method of testing for readiness
  • How not to waste CPU time while none of the operations is ready
    • In blocking mode, the operation stalls when not ready
    • In non-blocking mode, you need a system bar to stall the thread running the fibers
  • How to perform operations on multiple devices at a time, each of which may become ready under arbitrary conditions
    • In blocking mode, you should use one thread per device
    • In non-blocking mode, you get this out of the box

When, for example, there is no shared data between the facilities bound to two different devices, or the amount of such data is “negligible” (worst case, you can use lock-free algorithms for simultaneous access and possibly queues for communicating with the threads), the threaded – and therefore blocking – mode is the best option. You will rarely waste time on mutex locking (I mean the cost of the mutex lock itself, not the time lost when a thread stalls because it cannot acquire the mutex), and moreover, you don’t need a waiting bar, so you can also use stalling methods that cannot be integrated with any system waiting bar (such as condition variables, which can’t be combined with select). The blocking mode is also the best option for simple applications, especially those that operate on just one device and don’t need to update any data during the stall time. It allows the handlers to be written in a very simple way, because each procedure focuses exclusively on one device.

Be careful with this “simplicity”, though, because even a simple “copying application” that transmits data from one device to another already involves a fair amount of data sharing and multiple devices at a time. In such a situation you should rather consider the non-blocking mode and use “fibers”. They can be combined with threads, because an unexpectedly long handler procedure sometimes needs to be shipped off to another thread so as not to block the overall queue. With fibers, however, you know that any switching happens exclusively between the handlers.

However, the non-blocking mode might not be a good idea, even with lots of data sharing, if you don’t have a universal event bar at your disposal – some procedure that stalls and continues only upon a readiness signal from one of the operations, where every operation that might normally block can instead deliver a signal registered in that very bar. This can be tricky, especially if the potential operations come from different libraries, with device definitions that cannot be collected into one bar. For example, you can easily collect every device representable by a system file descriptor into a call to the select or poll function – you’ll get the exit and the readiness signals in the bit flags. But what if a device can’t be represented by a file descriptor and uses some custom definition based on condition variables, which cannot be collected into such a system bar at all? When your thread is waiting on a condition variable, it can’t simultaneously wait on a select/poll call, and vice versa.

However, if you are writing such an application yourself and you don’t use any alien device definition that relies on condition variables for barring, you can make your life easier by using a solution that provides a common bar for system devices and custom devices alike. For example, the Gtk+ library (through GLib) comes with a loop dispatcher based on file descriptors – so your event loop may exit upon either readiness of a system device or a Gtk event (AFAIK Qt can use the GLib event loop precisely so that both libraries can cooperate). There are also solutions like kqueue/kevent, available on BSD systems and their derivatives (including Mac), and as a Linux-only solution you have eventfd. There are also some open-source projects around this – search for the kqueue keyword.
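
As a Linux-only illustration of the idea, here is a minimal sketch (helper names made up) of how eventfd lets a custom in-process event sit in the same poll() bar as an ordinary descriptor – something a condition variable cannot do:

#include <sys/eventfd.h>
#include <poll.h>
#include <unistd.h>
#include <cstdint>

int  make_event()           { return ::eventfd(0, EFD_NONBLOCK); }
void signal_event(int efd)  { std::uint64_t one = 1; (void)::write(efd, &one, sizeof one); }
void clear_event(int efd)   { std::uint64_t v; (void)::read(efd, &v, sizeof v); }

void wait_for_any(int device_fd, int efd)
{
    pollfd bar[2] = { { device_fd, POLLIN, 0 }, { efd, POLLIN, 0 } };
    if (::poll(bar, 2, -1) > 0)
    {
        if (bar[0].revents & POLLIN) { /* the real device is ready */ }
        if (bar[1].revents & POLLIN) { clear_event(efd); /* the custom event fired */ }
    }
}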

Maybe at some point something along these lines will be added to the C++ standard library, but currently there’s still a long way to go. A proposal exists to add a kevent-like system to the Linux API, but it’s a long way from being standardized in any form. Even though none of this is portable yet, it’s worth trying to employ in an event-based application, at least for performance reasons.


Defensive vs. Paranoid Programming

0. Introduction

I have encountered many different flavors of programming style in my experience. I have already written about things like ridiculous checks for NULL, as well as how “underdocumented” many libraries are. Among the positively perceived styles of programming there is one important one: Defensive Programming.

I have also already written about how checking for errors all the time, and “doing something” in response, results in code that is never really tested – and when a situation finally arises where it is used, it fails to do anything that would help solve the problem.

Being defensive involves many different aspects and analyses. Poor-minded programmers use a highly simplified vision of it, and the result is something the next source-code reader (especially a more experienced one) perceives as “paranoid programming”. Poor-minded, because the code looks as if it were prepared for every least kind of error that might occur during operation, while forgetting what it is actually trying to prevent: some “predictable” runtime situation, or some completely unpredictable one – that is, a bug. Well, it would be really nice to write code that prevents a bug, but… how can I put it… 🙂

1. A gun in kid’s hands

The underlying layers of software – the kernel, the operating system, the libraries – are not only something that saves you from writing overly complicated low-level code; they are also something that should provide basic reliability (they save you from providing it yourself).

If you want to program at the lowest level, dealing directly with the hardware, there are still some domains where you can do it – but you’re rarely allowed to in today’s software. In practice it happens when you are developing the operating system itself, the kernel, or drivers, but not when you have to deliver a business solution. It’s not that there is some law behind this. It is, to all appearances, a matter of trust. You may say that you don’t trust the underlying software layers to work without bugs – but then why should the customer trust you? In particular, when you replace an existing lower layer with your own code: do you really think the customer will trust you more than somebody whose library is used by multiple software projects all over the world?

If that is so, then why do you worry about whether the library is written properly and whether your software will behave correctly in case of a possible problem in the library? Do you think your own software will be free of bugs? Most likely, if you replaced some library code with your hand-written code, you would at best make the same number of bugs – usually more.

2. What is really defensive

When some part of the software you’re using – it can even be some other part of the software you’ve written – may result in an error, the first and foremost thing to check is at which level a possible problem may occur. The mere fact that some function returns a special error value or throws an exception (the latter especially if you write in Java) doesn’t yet mean that this is something you should take care of. Some procedures return an error value because in some contexts they may be used at a very low level, or the API is simply designed so that the function can be implemented in a very unreliable environment as well as in a completely reliable one. So, if the error that would occur in your particular case is something that the underlying software layer should never report, treat the case where it did as something that happened outside of normal conditions.

A short example in POSIX: the close() function, which closes a file descriptor, may return -1 in case of error. But that can happen essentially in one case: when the descriptor refers to a device open for writing, it uses buffering, and there is still some buffered data waiting to be written to the device; it is scheduled to be written when the buffer fills up after a later write, on demand when you flush (fsync() for a raw descriptor, fflush() for a stdio stream), or – as in our case – before disconnecting from the device when you call close(). When writing that buffered data to the device fails, close() returns -1. But of course, it still closes the descriptor! This -1 does not mean that “closing failed” – closing cannot fail! It means that if you anticipate a possible failure, you’d better flush the buffers yourself (with fsync()) and handle the error while you can still do something about it.

So, the condition for close() to possibly report a failure is that the file is open for writing, it uses buffering (as it usually does), and there is still some data in the buffer waiting to be written. Which means, simply, that if you call fsync() first (and it succeeded), then the subsequent close() will always return 0 – there is no way for it to do otherwise. Likewise when the file is not open for writing at all (only for reading, for example) – in that case close() always returns 0.
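
A minimal sketch of that discipline, assuming a plain POSIX descriptor (the function name is made up):

#include <unistd.h>

// Force the data out with fsync() and handle the failure while the descriptor
// is still usable; after a successful fsync(), close() is not expected to
// report a write problem anymore.
bool finish_writing(int fd)
{
    if (::fsync(fd) == -1)
    {
        // Real error handling belongs here: retry, report, keep the data elsewhere...
        ::close(fd);               // still close it - the descriptor is gone either way
        return false;
    }
    return ::close(fd) == 0;
}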

Of course, if you read the man page for close(), you’ll see that it’s a common error to ignore its return value – yet, somehow, close() in /usr/include/unistd.h is not declared “__wur” (shorthand for gcc’s __attribute__((warn_unused_result))), as, for example, read() is. That’s because close() may return an error only in specific situations.

So, there can be two completely different reasons why close() would return -1. The first is that it performed the final buffer flush and the flush failed. But if you make sure that this cannot be the reason, then -1 from close() means something completely different: the operating system has simply gone out of its mind. There are some theoretical possibilities for this to happen – for example, the file was reconfigured to be open for writing even though you didn’t request it. I say “theoretical” because I can imagine it happening. The catch is that the system’s basic reliability guarantees say it never will.

Another example: you’re opening a file. Wait – not opening yet. First you check the file using the stat function. You confirm that the file exists and that you have the rights to open it, so opening it should succeed. Then you open the file, right after making sure of all that. If the opening function reports an error, it can now be one of two things:

  • the file has been deleted (or changed rights) in between
  • there was some very serious system error

If you want to distinguish these cases, you can call stat again. If the second call to stat shows that the file has indeed been deleted (or its rights changed), you can treat it as simply not existing. Otherwise, you have a case of a system error.
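
A minimal sketch of that check (the names are made up; for brevity it only distinguishes “gone” from “system error”, without re-checking permissions):

#include <sys/stat.h>
#include <fcntl.h>

enum class open_failure { none, file_gone, system_error };

open_failure try_open(const char* path, int& fd)
{
    struct stat st;
    if (::stat(path, &st) == -1)
        return open_failure::file_gone;     // it wasn't there to begin with

    fd = ::open(path, O_RDONLY);
    if (fd != -1)
        return open_failure::none;          // the normal path

    if (::stat(path, &st) == -1)
        return open_failure::file_gone;     // deleted in between - a plain race
    return open_failure::system_error;      // stat says it's fine, open says no
}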

That’s not all. This file may mean various things. It might be, for example, a file of the kind of /etc/passwd or /lib/ld-linux.so.2 on Linux – a file that must exist, or else there is serious system damage.

The point is that the inability to open a file can mean different things in each particular case, so there cannot be any universal advice about what to do when a file can’t be opened. On one hand, you can determine that this condition, combined with some other call, is the only thing that gives you the full information. On the other hand – why should you care? If the file was vital for your program to continue, you have to crash. If not, just note quickly that the file can’t be opened and continue without it. You may want to log this, you may want to display it to the user – but whether to do so should also be decided explicitly, based on what it really means for the program. None of these things has to be done “always”. Even if you determined that this is a serious system error – especially if it is only one possibility – displaying it to the user is the wrong thing to do. For a user it doesn’t matter whether your program crashed because file /x/y/z did not exist, or because it was killed by a SIGSEGV signal, or anything else you’d like to say.

Being really defensive means, first, realizing what level you are at. Once you realize that, you can decide whether the runtime condition that failed is just one of the alternative ways the runtime can behave, or something that should never occur. In the second case you should crash. But there is also a third case.

3. The Unpredicted

There’s always one more flavor of error handling: checking for a condition that is explicitly declared to never occur.

For example: there is a function that returns a pointer to an object. The objects are collected in an array, and there is always an object at position 0. For values greater than 0 that exist in the array, the object at that position is returned; otherwise the object at position 0 is returned.

This function, then, has it in its definition that it never returns NULL. And it really never does, according to that definition. What’s the point, then, of checking whether the object returned by this function is NULL?
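
A minimal sketch of the function described above (the names are made up):

#include <cstddef>
#include <vector>

struct Object { /* ... */ };

std::vector<Object> objects(1);     // position 0 always exists

// Documented contract: never returns NULL.
Object* get_object(std::size_t pos)
{
    if (pos < objects.size())
        return &objects[pos];
    return &objects[0];             // the documented fallback
}

// A caller writing "if (get_object(i) == nullptr)" isn't handling a possible
// result - it is "handling" a state in which the program is already out of control.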

The first explanation is more or less: you never know. But what does that mean? Does it mean that this is an expected situation, one of the possible results of the function that you are checking for? Or rather that you expect the situation that is explicitly declared to never occur? Of course, it’s the latter. And if so, there can be only one reason for it: the program has gone out of control. So, effectively, what you are trying to do is keep under control the situation of the program getting out of control.

Do you see the absurdity of the situation now? And, to be clear: no, it does not put the program back under control. If, in any context and under any basic conditions, you predict that the program can go out of control, then it could have happened to virtually any part of the program. In short, virtually anything may happen. I say “virtually” because the only things that can really happen are problems at the machine level, coming from the compromised reliability of some software layer. Anyway, even if we shrink this “virtually anything” down to what is actually possible, it is still a situation where the program has gone out of control. If you then consider that any of the things that should never happen has happened, the number of degrees of freedom in any modern application is so great that such a silly null-check decreases the probability of a crash by… less than 1%. Of a probability that was already very small, precisely because the situation is explicitly qualified as “should never happen”.

Some people say that even this may help in an extreme situation. Oh, really? Then why don’t you quit your job and gamble instead? Why waste your time on something that decreases the probability of a bug by less than 1%, instead of defining the full state conditions and planning your tests around reduced degrees of freedom and full cross-path coverage? Why don’t you simply focus on the bugs that are far more likely to occur?

4. The game and the worth of the candle

If you think you have a situation where there is some realistic probability of an unexpected result, and you need a tool to catch it at the development stage, use asserts. This is exactly the tool intended for that purpose. But remember: assertions are meant to work only when you run your application for testing, in your incubator, in your environment, compiled with special debugging instrumentation. They are not meant to work in a production environment – not because they may crash your program, but because they may harm its performance (unless you plan to re-check the same condition manually anyway).
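
A minimal sketch, reusing the hypothetical get_object() from the earlier example – the check exists in development builds only; compiling with -DNDEBUG removes it entirely:

#include <cassert>
#include <cstddef>

struct Object;
Object* get_object(std::size_t pos);    // the never-NULL accessor sketched earlier

Object* get_object_checked(std::size_t pos)
{
    Object* obj = get_object(pos);
    // Catches a broken contract during testing; costs nothing in a release build.
    assert(obj != nullptr && "get_object() broke its own contract");
    return obj;
}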

But… well, there can be a situation where there is more than a tiny probability of an error in a crucial resource! If so, you should crash.

Come on, this condition check doesn’t cost me much attention, doesn’t cost much work, it’s just one simple instruction – and even if it gives me only a negligible advantage, it’s still better to have it than not to…

Really? So, if this is only a simple instruction to write, then you probably don’t test this code, do you? Maybe the condition check is a simple instruction to write. But that instruction introduces an additional situation that can potentially arise in your code. It’s not just a condition. It’s the condition, plus all the actions taken when it is satisfied, treated globally, together with all their aftermath. This is not just a side path of the program. It is usually a serious matter, and if you want the program to handle this situation and recover from the problem, it gets really complicated – and as such it must be very thoroughly tested: the program must be run several times in a controlled environment where you can make this problem appear to happen (at least from the program’s point of view), to see how it recovers and whether it does so properly. Because, certainly, you’d like to prevent a crash of the whole application. You don’t code your error handling just to have it cause another error, do you?

So, ask yourself first: will your program really be able to recover from this error, and are you able to test whether it really does? If your answer is anything other than “yes, certainly” (even “most probably yes”), doing anything other than crashing is a waste of time and resources. Now ask yourself additionally: the whole cost of preparing the recovery is paid only to prevent a problem whose probability is comparable to that of your customer being killed by a bomb explosion while using your software – is the game worth the candle?

And again, a reminder: it’s not that “bugs of too small a probability should not be prevented”. It’s that “bugs cannot be prevented” – not by a runtime condition check. If something could have happened only through a very, very unlucky bug, you can’t do much about it. I say “can’t do much” because you can lie to yourself that you can, and you may do something – but there is still only a very small probability that your action will repair it. With much greater probability, that action will cause more problems than the one you caught (which is why it’s usually better to crash).

5. Unforeseen and unpredictable differ in level

Defensive programming is a method of programming that protects against problems that may occur in the particular ways the software is used. It has nothing to do with problems caused by the unreliability of an underlying software layer (well, actually we should just say “underlying layer”, because it’s not only about software – hardware can also cause problems, and if not the hardware, then the power supply and so on). Defensive programming is simply keeping the program from going out of control, while taking it for granted that things declared as “should never happen” really never happen; at most, things that users “should never do” are done all the time. Defensive programming is simply “doing your homework” and writing your software decently, with the widest possible range over which this rule applies.

Defensive programming means operating within the declared range of how a function should work – not guarding against the situation where the function did something not described in its documentation, or where the system fails to honor the rules it guarantees.

I’m pointing this out because many developers think they are doing Defensive Programming, or just making their programs safe, while they are actually doing Paranoid Programming. It doesn’t make the program any better or any safer. It doesn’t give the program fewer bugs or make it harder to destroy in case of a bug. In software production, Paranoid Programming has really just two possible practical results:

  • no observable value added at all – just time wasted writing, and possibly testing (if you’re lucky!), useless code; sometimes also performance wasted on useless checks
  • the program becomes harder to test properly and therefore to rid of bugs; sometimes this even increases the vulnerability of the software

I’ll try to describe this second one in more detail.

6. Preventing self-destruction prevents self-destruction

Do you know what the purpose was of having a “self-destruction” button in a military device? I admit I know it more from sci-fi than from real life, but it still had a purpose: protect the information and the resources, so that the enemy cannot use them to their advantage (and our disadvantage).

It's more than obvious that no one is going to use a self-destruct button in daily use. That would be the last day, of course. It is to be used in an ultimate situation, in which all possible protection, recovery, defense, and whatever else can be undertaken, fails. And there is a serious threat that the enemy will reach our resources and use them, which will make them stronger, or make us more vulnerable. Even though we will be in serious trouble when we destroy our ship, the whole organization may be in much tougher trouble if the enemy gets access to our resources. In case of software it's not even a matter of security: a program running in an environment whose resources contain false data can do completely crazy things and can potentially destroy the other, valid data.

That's why preventing a crash is a kind of “software pacifism”. Preventing a crash achieves only that: preventing a crash. If someone blocks the self-destruct button from ever firing, they do not “prevent the ship from getting destroyed” (the ship can still be destroyed in many other ways) – they only prevent an intended self-destruction, in the exceptional case when we actually want it.

It's maybe not the main purpose of checking runtime conditions, but it always happens when somebody is doing “Paranoid Programming”. The biggest problem is that when some function is called which returns a fail-prone value (not even “which fails” – it's enough that it returns a pointer), the first thing they think about is error checking. I don't say “error prevention” – it's only error checking. When they detect that the returned pointer value is NULL, they at best print something in the log, and return NULL. What, our function doesn't return a pointer? Ah, it's an integer? Good, let's return -1. -1 is still a valid return value? Then let's return, hmm… maybe INT_MAX? I haven't even mentioned yet the question “isn't it so, by any chance, that the called function already guarantees that it never returns NULL?”.
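Just to illustrate what this looks like (a made-up sketch – the Config type and the function name are invented for this example, it's not any particular codebase):

#include <iostream>

struct Config { int timeoutSeconds; };   // hypothetical type, just for the example

// "Shut up" error handling: the check silences the symptom, and the
// sentinel value -1 travels on as if it were a perfectly normal timeout.
int readTimeout( const Config* cfg )
{
    if ( cfg == 0 )
    {
        std::clog << "readTimeout: cfg is NULL" << std::endl;
        return -1;               // nobody upstream is prepared for this value
    }
    return cfg->timeoutSeconds;  // the optimistic path
}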

The real problem is not that this is such a “shut up” method of error handling. The real problem is that if the error handling for this function has only been INTRODUCED this way, the rest of the software using this API is probably not yet prepared to handle this kind of error. Often the return value is checked in a different way, maybe some kind of error is not caught, or maybe the return value is simply ignored. It happens very often.

6. The program writes you

The “metaproblem”, let's call it so, is that people often do not “design” their programs. I'm not talking about “doing a complete design before writing”. I'm talking about designing at all: just before you write a single function, spend some time thinking about what resources you'll be using, for what reasons the accessors of these resources may fail, and what a particular failure means for you.

I just have a feeling that some people do not “write their programs”. They are rather “discovering them”. They just have the intent of the function in their head, then they write what they think should happen in the optimistic scenario. Whenever they call a function that has some notion of possible failure, they automatically put in a condition check, effectively discovering this part of the function along the way. This function isn't being written – it's being discovered. At various points some failure condition urgently pops up, and they just hit it with a condition check, as if they were swatting a fly.

I don't always write functions this way – but when I do, I do not do any error handling! I write the function with the optimistic scenario only, that's all. Later, when I confirm that the function in this form makes sense and works as it should (by testing the positive scenario only), I add appropriate and well-thought-out error handling. This is what I do when I have completely no idea how something should work and I must encode it to see it before my eyes and help my imagination. And this is not always the case. Usually I know which functions I should use to get access to the other resources, and what their failure conditions are; having that, I can at least plan the API so that appropriate error handling paths are provided for before the function is written. In particular, I know which kinds of failure should be propagated and which handled otherwise.

And, probably, some programmers additionally think that they are good coders when they write complete code “in one shot”. That is, they write a function that from the very beginning contains everything it needs: the business logic, the error handling, possibly also logging (but no, of course not the comments!). And once it's written, it works completely correctly and needs no fixes in the future. Well, if there were people who could do that, they wouldn't even have a chance to show off in internet forums, because they would quickly be snapped up by some software company to write bugless code for it, before the others could get to them.

So, another rule – making the function logically complete is much more important than any error handling, and error handling is something you can always add later. This way, when you do add it, you are focused on error handling and nothing else, so you can think it through and you are more certain of not missing any part that requires error handling – at least more certain than if you wrote the error handling on the very first pass.

Before you write any error handling, you must first (!) – yes, first, that is, before you add any error handling instruction for any of the functions you call inside – make sure that your function is prepared to PROPAGATE the error, or that it's a kind of error that means an ALTERNATIVE PATH for the function itself. I can only tell you that the second one is a very rare case – usually an “I can't do what you request me to do” response from a called function means that your function is also unable to perform the task it has been written for. And that means this function must report it “accordingly”. Propagation also means that any other part of the code that calls this function must be prepared for the fact that it may report failure, and must handle that failure properly. I guess you write your functions before you use them, or at least you plan the API of a function completely before you start using it.
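A minimal sketch of what I mean by propagation (the names are invented for the example): the low-level failure is translated into a failure of this function's own task, and the caller is, in turn, expected to be prepared for it.

#include <fstream>
#include <stdexcept>
#include <string>

// If the file cannot be opened, this function cannot do what it was
// written for, so it reports that "accordingly" - and every caller
// must be prepared for this exception.
std::string loadUserProfile( const std::string& path )
{
    std::ifstream in( path.c_str() );
    if ( !in )
        throw std::runtime_error( "cannot load user profile: " + path );

    std::string contents, line;
    while ( std::getline( in, line ) )
        contents += line + "\n";
    return contents;
}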

Only this is defensive programming.

7. This function may, but need not, end up with a failure

And, the most important thing: never, never ever do things like providing a function to someone else (including distributing a library) and saying in the documentation something like “this function returns XYZ in case of success and -1 in case of failure” (or NULL, or it throws an exception in case of failure). If you are considering using some open-source library which has something like that in the documentation, then try to find some better documentation for it (it happens that such documentation exists!), and if it cannot be found, then – seriously – don't use the library and find an alternative. If there's really no alternative and you must use this library – have somebody (or do it yourself) reverse-engineer the source code and find out the exact conditions under which failures may occur, then prepare good documentation out of the results of this research. If you don't keep an eye on it, you won't be able to know what it really means for you that the function has failed (and this is key information for planning the error handling). Instead you'll handle every erroneous value the same way, by abandoning the current path completely. This is the straight way to paranoid programming.
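For comparison, a sketch of the kind of documentation I mean (every name here is invented, it's only an illustration of the style): the exact failure conditions are spelled out, so the caller can plan what each of them means for them.

struct Address;      // hypothetical types, declared only to make the
struct Credentials;  // declaration below self-contained
struct Connection;

// Connects to the device at the given address.
//
// Failure conditions:
//  - throws DeviceNotFound if no device answers at 'address'
//    (retrying later may succeed);
//  - throws AccessDenied if the device rejects the credentials
//    (retrying with the same credentials will not help).
Connection connectTo( const Address& address, const Credentials& auth );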

If you are using exceptions to handle problems, you are actually in a better situation – because you're certain that either your caller will catch the exception and secure the program against the bad behavior of your function, or the program will crash. At least your function will not be responsible for making a mess in the program. The only problem is that, occasionally, your function might have been called just to check whether something works or not. That's why exceptions should be used only in the following cases:

  • when it concerns a condition that the function explicitly states must be satisfied before calling, and the caller really could have ensured it
  • when the function could not continue and provide the expected results, and exception reporting in this case was explicitly requested (by a flag or by an alternative version of the function)

If exceptions are used in any other way, and especially when the language uses checked exceptions (Java, Vala), exception handling is practically nothing but a different (and usually awkward) way of handling error-meaning return values. They are even easier to ignore. It's very simple to catch all exceptions, provide an empty handler block at the end, and… continue the program, just as if all the results required for this function to continue were already available.
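In C++ the same pattern looks like this (a deliberately bad sketch; processOrder and order are invented names):

try
{
    processOrder( order );   // may throw for a dozen different reasons
}
catch ( ... )
{
    // "handled" - and the program continues as if the order had been processed
}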

8. Catch me! If you can…

It's very good when assertions are used in the code and can then be used to trace cases when something's wrong. It's very bad, though, when you don't even start to think about what might be the cause of the error happening. The fact that you can catch some erroneous situation doesn't mean that you have to.

Let’s take such an example:

Object* ptr = new Object;
assert( ptr != 0 );  // pointless: a throwing operator new never returns a null pointer

Of course, let's assume that overloading operator new is not allowed in this code. What exactly are you checking in this particular assert() call? Are you checking whether you made a mistake in the program? How can this pointer ever be null, if the definition of operator new explicitly states that it never returns one?

What? You may happen to run into a non-compliant compiler? Ok, I guess you are really serious, so if you are uncertain about the compiler's compliance, create an additional small test program containing various instructions designed to test the compiler's standard compliance. This program will be compiled and run before the compilation of the whole project even starts. Everything you have doubts about can be checked beforehand and confirmed to work. Once it's confirmed, you don't need any checks for it in your production code!
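For example, a minimal sketch of such a pre-build check (a hypothetical stand-alone test, not part of any real build system), verifying that a failed allocation throws std::bad_alloc instead of returning a null pointer:

#include <cstddef>
#include <iostream>
#include <new>

int main()
{
    std::size_t huge = static_cast<std::size_t>( -1 ) / 2;  // cannot possibly be allocated
    try
    {
        char* p = new char[huge];
        if ( p == 0 )
        {
            std::cout << "NON-COMPLIANT: new returned a null pointer" << std::endl;
            return 1;
        }
        delete [] p;
    }
    catch ( const std::bad_alloc& )
    {
        std::cout << "OK: new throws std::bad_alloc on failure" << std::endl;
    }
    return 0;
}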

Of course, the compiler may turn out to be non-compliant, and you'd still like to use it. In this case you can provide some specific workarounds that solve some of the non-compliance problems, BUT, again: it is extremely unlikely that you'll be doing that. First, because it's hard to find a non-compliant and still in-use compiler that returns NULL from new. Second, because even if you find one, most likely you won't want to use it (you'd rather use some C++-to-C translator, or some other language that is easily translated to C, and use the C compiler for this platform – it would be easier than dealing with the various non-compliances of such a C++ compiler).

This is not the only case – there's a variety of possibilities here. Let's take some of the things that are provided librarywise – like boost::shared_ptr. Ok, in C++11 it's standard, but I just had a good example for it:

shared_ptr<Object> o( new Object );
assert( o.use_count() == 1 );  // pointless: a freshly constructed shared_ptr always has use_count() == 1

Think about it for a while. It's not just the fact that it's stupid – that's actually obvious, and it's not the point. The point is that you are not testing your code in this case. You are testing the code of shared_ptr. Believe me, if there were any opportunity for this problem to happen, the guys from the Boost team would have caught it before you ever had a chance to see this code. On the other hand, if there were any error like that in your 3rd party library, you couldn't do much about it anyway!

What's the actual reason for doing assertions like this? That is, why do people do it, even though they may even know that these things are unlikely to happen? The only thing that comes to my mind: they would like to ease their conscience that they “did their best to prevent errors”. Or maybe not even that. They just “found something that is very easy to check”. Because this is easier, much easier, than doing good design, good path analysis, good unit and component tests. Because the latter is boring, and adding such checks is so easy and eases the conscience so much!

Paranoid programming is effectively not a result of wrongly understood rules of defensive programming. It is exactly like all the other bad practices in programming: a result of the lack of thinking and the lack of good work organization. Which makes programming not unlike any other domain of work.


Why Java is not a high level language

… and there was never any attempt to make it one.

0. Introduction

Depending on the ranking, Java is either the most popular programming language in the world, or one of the most popular. No matter how trustworthy these rankings are, it's undeniable that Java means big use and big business. And it has gained popularity very rapidly, considering how old it is, and especially how big its performance problems were from the very beginning.

It's funny, however, that this popularity is attributed to various traits that Java… well, doesn't have. It's said that it's simpler, that it's a higher level language, that it's a true object-oriented language, and that it's more efficient for the software business (“time to market”). Actually all those explanations are bullshit, except the last one – but the last one merely forwards the question instead of answering it.

As far as “simplicity” is concerned, you can mention, of course, that the clean split into builtin value types (int, float) and librarywise defined class types makes the language simpler. The default garbage-collected handling of objects and references also makes it simpler. But if you take a closer look at Java 8 and compare it to C++11, you quickly come to the conclusion that you can already forget statements like “Java is simple”.

You have to realize that a “high level language” is a language that uses high level constructs reflecting the logical constructs in the human mind. The function-based nature of the C language, the class-based nature of Java, and the string-based nature of the Tcl language are all the same as the bytecode-command-based nature of assembly languages: it's simply a low level. A low level language isn't necessarily an “assembly language” or a “system language”. It's a language based on one strict, simple “nature” that is used to implement everything.

So, when it comes to a language meant as “high level”, what we really mean is that it should represent some basic logical concepts of data types (such as number and string) and of the execution environment (threads). Not necessarily unlimited integers, but at least the ability to easily create a data type that limits the use to a required range (in contrast to having integers for everything, with the ability to use values outside the given range). One of the best known models of such a language is Ada. When it comes to a model of an “object oriented” language, the most famous one is Smalltalk.
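Just to show what I mean by that (a minimal C++ sketch, the class is invented for the example): a type that only admits values from its declared range, something Ada gives you in a single declaration.

#include <stdexcept>

// An integer restricted to the range 1..12 (say, a month number).
// Out-of-range values are rejected at the boundary, not discovered later.
class Month
{
    int value;
public:
    explicit Month( int v ) : value( v )
    {
        if ( v < 1 || v > 12 )
            throw std::out_of_range( "Month must be in 1..12" );
    }
    int get() const { return value; }
};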

The problem with Java isn't that it's not “like these languages”. Java might have been a high level language and a true object-oriented language while still having C++-derived syntax and possibly other traits borrowed from C++. The only reason why this didn't happen was that the intent of Java's creators was to make it a better C++. And I didn't say that they succeeded – at best they have created a “better C”. And here are the explanations why.

1. True object oriented language

I'm ambivalent about whether saying that Smalltalk is the only true object oriented language is a truism or an exaggeration. Nevertheless it can serve as a very good model of an object oriented language and a source of information about what a true object oriented language should have: it should rely on the object itself to perform a task and expect that the object does it its own way (that's why it's called a “method”).

In order to simplify adding common “methods”, the term “class” has been introduced, and it's used by the majority of OO languages – not by all of them, though, which means that it's not an obligatory part of an OO language. The most important thing is just this: rely on objects. Hence the name. Of course, those OO languages that rely on classes have evolved lots of rules about using classes and managing software changes around them – however, the central part of OO isn't the class, but the object, however defined.

So, a true object oriented language is the language that:

  • Relies on objects (not on classes)
  • Doesn’t use any other entities than referring to objects

What does it mean to “rely on objects”? For example, no matter what the meaning of a particular thing in the program is, all operations are defined per object, including whether the object can do them or not. So, for example, in every place where some data is required, there can be an integer, or a string, or a file stream, or a book, or a procedure block (say, a lambda function), or even a “nil” – the only limitation is that not every operation will accept it. But that's not because of the object's type (there's no such thing; in some rare cases it may be its class), but rather because the operation that a particular procedure would do to the object is not supported by it (“duck typing”). For example, in Smalltalk when you do a+b and assign the result to c, the + method is called for the object designated by ‘a’ with one argument, ‘b’. It really doesn't matter what they designate; it's only required that the operation a+b can be done on them, in particular, that + can be done to ‘a’ with ‘b’ (yes, Smalltalk supports defining operators as methods of objects, should that be of any surprise).

The “class” in such a language is only yet another object, used as a delegate that provides the methods to be invoked when the object receives a particular “message”. But the fact that an object is of some class, or of a class derived from some class, should rarely be a checked condition. Maybe in some cases it makes sense to check whether an object understands some set of messages (a “protocol”). But usually you should just call the particular method on it and respond to the exception if it doesn't understand it.

So, first, the “class” is only a helper to prepare the object in the correct form – and not every language uses them. Self (a Smalltalk dialect) is one of the exceptions, but for today's readers JavaScript would be a better example. Yes, JavaScript is an object oriented language (although not fully object-oriented, as it uses integers and strings as plain values), and it doesn't use classes. To make an object have a method, just assign a function to one of its fields (there's just a small helper in the form of the ‘new’ operator, which simply calls a given function with the newly created object visible through the ‘this’ symbol). But the method call syntax does execute a function referred to in the object's data – the object through whose field the method was called will be seen inside as ‘this’.

Second, in a true object-oriented language everything should be an object, with no “exceptions”. In Smalltalk everything is an object – an integer, a string, even a method, even a class, and even “nil”, which is considered to be “not an object” (say, a universal marker for a case when an object cannot be put in a given place – it has its own unique class). If not, we may still talk about having a “true object-oriented flavor”, but not about a true object-oriented language.

Third, as a consequence of relying on what the object can do in response to a method call, the only typing in such a language should be dynamic typing. If you want to make use of anything that relies on objects, at least for this part of the program you should forget static types. So, whatever relies on static types is not object-oriented at all.

There exist various OO systems which can at least be considered “true object-oriented”, even if this concerns only a part of the language. There's for example Objective-C, where the whole object system is a kind of “alien feature” applied as a patch to the C language, and there's just one static type of reference-to-object, named “id”. A similar feature exists in Vala and C# – the “dynamic” keyword. You can use a variable of such a type, assign an object to it, and call a method – the call will be resolved at runtime. It's not required that the method be known before the call instruction is compiled.

So, in Java there are entities that are not objects, it uses static types also for classes, and there's no possibility to call a method on an object if the static type of the reference does not define it (not even as an alternative, as in Vala and C#). Theoretically you should be able to do it using reflection (by searching through its methods), but there's no direct language syntax dedicated to that (and to some extent some C++ libraries also feature reflection). So, the object system in Java isn't “true object oriented” – it's C++-like.

The creators of the standard libraries in Java were likely completely unaware of this. The majority of APIs in all Java libraries strongly rely on “OO features”, which in this language means being based on classes. Java has this OO feature as a “central feature”, something the whole API relies on. Such a thing makes sense in Smalltalk, or even Objective-C – but in Java, APIs like this are exactly as clumsy as in C++ with its weak OO features (MFC is one of the most dire examples of this mistake). From the OO design point of view this is the most stupid language design decision ever made – but it has nothing to do with the business point of view.

The fact that a method can only be called when there's a certain definition provided for it has important consequences. For example, in Java you can keep an object in a variable of type Object. But you can't call a method named indexOf on it, can you. Of course not. The only way to achieve it is to first cast this value to a reference of type String (say, that's what you meant), and only then can you call this method. And that's because there is no method named indexOf defined for Object.

This causes trouble, for example, when a framework gives you access to some stored object of some base class, while it may actually be an object of a class derived from it. Even though you know that this is an object of your class, you can call your new methods only after you cast it to your class. This pattern comes straight from the Smalltalk way of working, but in C++ and Java it results in a clumsy API.

This fact also strongly influences the hierarchical structure of the design and the method naming. For example, if you want to call a method – which will then be overridden by the user – in C++ (and Java), you have to have some class that defines it and call the method through a pointer to that class. Then, your class must be derived from that class, because that's the only way the method call can effectively be redirected to your implementation. None of this is true for Smalltalk. In Smalltalk you just get the object and call the method, as there's no such thing as a “pointer to some type” in Smalltalk – there's just a variable that designates an object.

But, on the other hand, you cannot name your method just “open”, which – depending on the context – may be expected to open a file, a window, a garage gate, or whatever else. In C++ if you want to open a window, you get the window, which is known to be at least of a class derived from the Window class, so you know that this method can only be an override of Window::open. Any File::open or Gate::open may exist simultaneously and none of them has anything to do with the others. In Smalltalk, if you have a method named “open”, and the code would like to call “open” on an object, then any possible version of “open” (although the number of arguments matters – here it's zero) is accepted, no matter what you meant in the particular case.
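In C++ terms this looks more or less like below (a trivial sketch, the classes are invented): the static type of the reference decides which “open” is even being talked about.

#include <iostream>

struct Window
{
    virtual ~Window() {}
    virtual void open() { std::cout << "generic window opens" << std::endl; }
};

struct AboutDialog : Window
{
    void open() { std::cout << "about dialog opens" << std::endl; }  // overrides Window::open
};

void show( Window& w )
{
    w.open();   // can only be an override of Window::open;
                // File::open or Gate::open are unrelated and not even considered
}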

All these things only confirm what I've already stated: the object system in Java is exactly the same as in C++. And static types in an object-oriented system are a burden, not a helping feature.

So, of course, it's sad that it took the Java creators some time to realize that an OO language that features static types must have something like C++ templates. Java poses as a “real OO language”, mainly by making the API depend only on objects so that everything is done “the OO way” – but in practice this has very little to do with OO itself; it's one of: doing things within the frame of the “class” term, creating some weird term to excuse doing something by only playing with objects, or just patching the language with some specific unique feature that helps with one particular idiom.

If you ask a question like “what's the damned reason for this language to have these jinterfaces” or “why must I make a whole new class just to pass a piece of code to a function for execution”, you usually get the answer “because this is an object-oriented language”. It's really exactly the same stupid bullshit as what you can hear from some undereducated C++ fans, that overloading and defining operators are “object-oriented” traits of that language.

Jinterface? Well, this is just something that the Java language understands as an interface – not what an interface in software development really is. The “normal” explanation of what an interface is, is the set of, say, “ways” how to use a particular type or set of types. If this were something like an “interface for a class”, at best it might be something that collects the methods (and their signatures) that a class should define (a “protocol”), which is said to be “conformed to” if a given class defines all of them – but not something the class explicitly declares. If something containing basic definitions is explicitly declared as being part of the class's definition, it's a base class (although only from the static type point of view – in Smalltalk you don't have to declare anything to be able to call a method on an object). Java generally introduces several entities proving that its authors didn't understand their correct meaning – like, for example, the “jackage”. This is something like a namespace, but in the Java world it's called a “package”. Anyway, back to the point.

So, how much does that “jinterface” have to do with OO? From the OO perspective, it's just an abstract class where all methods are abstract (and this definition is still more C++-like than OO-like – in Smalltalk there's no such thing as an “abstract method”; any method can be called without restrictions, in the worst case the call just redirects to doesNotUnderstand:). As classes are just a “helper feature” for OO, not a precondition, so is the jinterface. The fact that this “interface” plays almost the same role as a class in Java (it can be used as a type for references and provide definitions of methods, so it's enough to be logically treated as a class) only confirms that it is just a special kind of class (and a class in the C++ sense). In practice, it's only a method to overcome the limitations of classes, like the lack of multiple inheritance. In Smalltalk we could have at best something like an Objective-C protocol, that is, a set of methods which should all be implemented in a class. But conformance means that all the methods are defined, not that the class explicitly declares it – older classes can be checked against newer protocols.

And what about listeners? If you think that this is more object-oriented than lambda functions, as lately added to C++11 (and Java 8 as well), you're completely wrong. In Smalltalk – and likewise in Objective-C – you can treat a block of code as an object and also call methods on it. This works more or less like lambdas. So, it looks like these “lambdas” are much more OO than listeners. Java 8 has already admitted that, as it has introduced lambdas. And the listeners, in order to be usable, had to be armed with additional language features in Java: anonymous classes and their closures (a method created in an anonymous class automatically has access to the variables of the method in which the object was created). An anonymous class that derives from some explicit class, especially with this additional closure, is something completely unknown to all other OO languages. And it still has nothing to do with OO features. It is just “a set of features to make the use of listeners easier”.

That's not all. If lambdas had been in this language since the very beginning, maybe this could have been done with some special, unique method name. But now the creators have tried to make lambdas usable with existing APIs that use listeners – so they just “adapt” them to the required class. Well, this had to be somehow composed with the existing form of class-based replacement for pointers to functions (actually a virtual method is nothing more than an index into a “virtual table” keeping function pointers) and with overloading – both being traits borrowed from C++ and not existing in Smalltalk.

As a result, all OO traits in Java are:

  • done C++ way and very far from Smalltalk
  • armed with additional specific problem oriented features
  • based on classes, not on objects

And all things that “force using OO style” practically just force to use class-based features.

And I repeat: don't get me wrong. I'm not saying that Java is bad because it's not like Smalltalk and much closer to C++. It would even be funny to say that a language is bad because of that, as I am a C++ developer and a great fan of that language. So let it be obvious that what I mean is that the biggest power of Java comes from the fact that it's based on C++. What makes me laugh is the whole hypocrisy that tries to deny it.

2. Integers

The C language is accused of giving too many roles to integer numbers and the “alike”. This “alike” includes also pointers. And these complaints are also extended to C++. Actually, in this “high level wannabe” C++ we have characters, which are integers; booleans, which are integers; bit flag containers, which also can only be integers; pointers, which are also more or less integers; and of course integers themselves. I bet that every experienced programmer who “feels” what “high level language” really means knows that this was done in C just because it's easier to implement in machine terms, not because it has anything to do with program logic. From a high level language we should expect that it implements bit flag containers as just containers of bits, strings as value types no matter how many characters they have (including 0 or 1), booleans that are just two values of their own type, and pointers that cannot be subjected to arithmetic. Of course, we don't necessarily expect that integers have unlimited range (the “gmp” integers can be added optionally), but at least that integers are used only as integer numbers: we can do arithmetic on them, but nothing else.

So, out of all these things, the only thing that Java has “achieved” over C++ is the lack of pointer arithmetic. All the rest of the stupid things are hilariously incorporated.

Ok, let's even admit that in Java the boolean and char types are completely separated from the integer types. But how many people have noticed that it's the “character type” itself that is characteristic of a low level language, not the fact that it's treated as an integer number?

Could we live without a char type? More than obvious – of course we could. If we had a language-builtin “string” type, which is a value type, may be empty, may contain just one character, and may also contain multiple characters – why would we need a char type? So what would str[i] return (or, say, the “at” method)? A string! A string with just one character. Just like Tcl does in its [string index $str $i] instruction – which is only a simplified version of [string range $str $i $i]. Moreover – thanks to the fact that Tcl doesn't have any “char” representation, it was a piece of cake to add UTF-8 support to this language; it was completely transparent to all existing code, just a matter of changing the implementation. Meanwhile in Java you have the name “char” coming from C (and C++), in which it was an 8-bit integer, but in Java it's 16-bit (ha! see how smart – they declare that it's 16-bit, but not an integer :D). Of course, this doesn't prevent the use of UTF encodings (Java's String uses UTF-16 encoding internally), but what do you expect to get when the input character at the specified position happens to be a 32-bit character? It's impossible to return this character because it wouldn't fit in a char value. So, String has a method named “charAt”, which returns a char value that is either the character at the specified position, or a surrogate, if the character cannot be represented by a char value. This can be checked, of course, and if needed there is another method named “codePointAt”, which this time returns an integer: the numerical value representing the character. As int is declared to be 32-bit, it's enough to represent any Unicode character – but, well, not as a character, though. You can also get a string containing just one character, but heck, to get a one-character string from string s at position N, you have to do s.substring(N,N+1).

Why does Java have this solution? You can look for excuses as to why it uses the UTF-16 representation internally, and there are some reasons for that – but it doesn't matter and doesn't explain why Java contains the charAt() method and the char type. This has nothing to do with converting to an array of bytes, because that should be treated as a “specific representation” into which you shouldn't need to look (and in Java it is so). Why would you need just one character at a specified position? If it's in order to glue it to some other place – you can glue it as a string, too. If it's to convert to bytes – you have a much better solution: “encode” the string. A string is heavier than a character? Smalltalk has already found a solution for that – Java could have set the rightmost 8 bits to 1 and this value would mean the character itself, acting as a one-character “string”. Anyway – there's no “business” reason at all to have a charAt() method that returns an (explicitly!) 16-bit char value. Just one – to resemble C++ as much as possible.

The String type is yet another gem. In a high level language it is also never an object – it's a value. You can assign it to another variable, you can concatenate it and overwrite the existing value. There's no such thing as a “null string”, for the same reason that there can't be something like a “null integer” or a “null colour”. And this is how std::string in C++ works.

Not in Java. In Java you have the same thing as in C, with just the slight exception that, in the case of dynamically allocated strings, in Java you don't have to free() them. A String is just a pointer to something, it can be null, and therefore it should be tested for “nullity” before doing any operation on it. Thanks to that you have lots of occasions to make mistakes and the need to test a string for both nullity and emptiness. Not even mentioning comparisons – even in C++ you just do a == b. Fortunately in Java you don't have to do a.compare(b)==0 and you can't repeat the stupid C-derived “if ( !a.compare(b) )”, but a.equals(b) doesn't look much better, if we are to treat Java as a high level language.

Bit flags are an even funnier thing. The best thing I can imagine for having a set of boolean flags is some container of bits: either a vector of boolean values, or a constant-size bit container with compile-time constant indexing. And this is exactly how C++ does it, with its vector<bool> and bitset. If you want to use a set of binary flags, use bitset. You can easily compare it with a mask, do shifts, selective bit replacements and so on. And you are not limited to a length that is a multiple of 8.

So, this is exactly what I would expect from a high level language. Wanna flags? Take a dedicated type, bitset. Wanna number? Use integers.
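A short sketch of what that looks like (the flag names are invented for the example):

#include <bitset>
#include <iostream>

int main()
{
    enum { Readable, Writable, Hidden, FlagCount };

    std::bitset<FlagCount> flags;   // a dedicated type: a container of bits
    flags.set( Readable );
    flags.set( Writable );

    if ( flags.test( Readable ) && !flags.test( Hidden ) )
        std::cout << "visible and readable" << std::endl;

    std::cout << flags << std::endl;   // prints the bits, here: 011
    return 0;
}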

Not in Java. Not only does Java not feature anything like “bitset” (let's admit, in Java it's not possible to define such a thing librarywise the way C++ does, but it still could have been done as usual in Java – as a builtin type), but all bit-set things are implemented just like in good old C – with integers. It has all the bitwise operators, which are intended to work on a flag set, only for integer numbers, including the bit shifting operators – the right one, moreover, in two flavors: signed, where the leftmost bit is copied into itself, and unsigned, where the leftmost bit is set to 0. Is anyone using this? Of course, bit shifting is one of the operations done on integers at the machine level. But this can at best be used for a better optimized division by 2 (shift right does the same as division by 2, but much faster). Effectively this is for making algorithms as efficient as possible. What is such a feature worth in a language in which performance doesn't really matter? Moreover, Java still has optimizers (even if only as JIT), so this kind of optimization can still be done automatically. The only reason for having the &, |, ^, << and >> operators in C was to provide access to low level assembly instructions. They may make sense in a high level language, as long as you explicitly declare that the value is a set of boolean flags and you are doing an operation on a value with a mask. But not as “and, or, xor, shift” – rather as “set bits”, “clear bits”, “extract bits” and “slice the bitset” (shifting can be used to implement “slicing”).

A similar thing happens with indexOf from String. From a high level language you'd expect that if indexOf informs you that the searched character was not found, it won't just return -1, letting you happily continue doing arithmetic on it. You'd expect to either get an exception (a bad idea in this particular case), or some special value that leads to the result if found, and leads nowhere otherwise. A high level language should afford a concept of “optional” values – actually their role is played perfectly well by the value wrappers. So, if indexOf returned Integer (not int), it would return null when nothing is found. You'd still have to check it, but at least it wouldn't produce a stupid but good-looking positive integer if you blindly add something to the result – it would throw NullPointerException instead.
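In today's C++ this is directly expressible with std::optional (C++17; a small sketch): a search that finds nothing returns an empty optional instead of a magic -1, and blindly adding something to it simply doesn't compile.

#include <cstddef>
#include <optional>
#include <string>

// Returns the position of 'what' in 's', or an empty optional if it's absent.
std::optional<std::size_t> positionOf( const std::string& s, char what )
{
    std::size_t pos = s.find( what );
    if ( pos == std::string::npos )
        return std::nullopt;
    return pos;
}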

And, finally, the integer numbers. The first weird thing is that although they gave all the numeric types the same names as in C, they still haven't added the unsigned modifier. This changes the rules a lot (and this is why Java also has the unsigned right bit shift operator), and it looks ridiculous to have types named “byte”, “short”, “int” and “long” that are 8-, 16-, 32- and 64-bit types respectively. Probably in the future we'll also have “quad”. Ok, I understand that there must be a type named “int”, and that it must be 32 bits. The theoretical freedom of definition for the sizes of integers in C and C++ never made it into practice. Of course, there was a change between 16-bit systems and 32-bit systems, where the “int” type changed its size from being equal to short to being equal to long. But the practice for C++ compilers after the introduction of 64-bit machines is that “int” is still 32-bit, only “long” changed its size to 64-bit, while the C++11 standard introduced a new “long long” type to represent an integer wider than int or long (on 64-bit systems it's actually the same as “long”). So, in practice, what's the point of giving these integers so many different names? Their use is practically none. The most used integer type is int; in some special situations there's a 64-bit type (long in Java, long long in C++ – yes, I know, C++11, but long long existed long before as an extension). Types like “short” or “byte” are something you can only see in some library that interfaces to a C library. So, the only sensible set of integer types for a high level language is: int, which is 4-byte by default, then integers like int1, int2, int4 (== int) and int8, or even int16 – for cases where they are really needed. So why these funny names? The same reason: to be like C++. The “byte” name was already commonly used as a user-defined type assigned to “unsigned char” (although in Java it's still signed), and it was a good enough replacement for “char” from C++, for which the better assignment was to be a UCS-2 character.

I agree that this set of names is just as stupid in C++. Of course. But this was already apparent at the time Java was designed. C++ must keep these names because it's still being implemented for various platforms and still carries some of the C legacy. But even C++ has the int8_t, int16_t, int32_t and int64_t types (the last one defined as long long on 32-bit systems and long on 64-bit systems, which causes problems with printf formats). The Java designers could have made the types like this, adding just a universal “int” equal to int32_t – especially as they intended the language to work on only one platform. They would have done it if their goal had been to make a high level language. But they just wanted to make a better C++.

3. Pointers and null

What is NULL? It's something that was introduced in the C language. If you think that it has anything to do with Smalltalk's nil, you're completely wrong. There's no such thing as “not a pointer” in Smalltalk. Well, you can say that there are no pointers in Smalltalk (I prefer to say that all variables in Smalltalk can only be of a pointer-to-object type), but this is how it works there: this “not an object” is just a unique object that does not respond to any calls. But you can still try to call it. This won't result in any crash or any data destruction.

Some may say that it's obvious. Not exactly. When you have NULL in C, you should check a pointer against NULL before dereferencing it (or somehow be sure by other premises that it isn't NULL). In Smalltalk you can do such a check (for example, when your function allows nil to be passed in the place of an object), but normally you don't have to. You can always blindly try to call a method on the object – and it may fail because the object is nil, or it may fail because the object doesn't understand the method specification (I know it's called a “selector”, but I'm trying not to use terminology that is specific to Smalltalk and different from what Java and C++ use for the same things), or it may even fail because of some runtime condition – and handling for all of these things should be somehow planned. In C you also have all of that, but NULL is special – you shouldn't try to dereference it because it results in undefined behavior (at least on a POSIX system with virtual memory on, we know that it results in termination on SIGSEGV).

So, Java just changed this undefined behavior into NullPointerException (if we agree that SIGSEGV, or something similar on Windows, is what you actually get rather than undefined behavior, this is just a cosmetic change). For example, if you check whether a string designated as s is equal to “equal”, you do the following in the various languages:

  • In Smalltalk, you do s = 'equal'
  • In Tcl, you do $s == "equal"
  • In C++, you do s == "equal"
  • In C, you do s != NULL && 0 == strcmp(s, "equal")
  • In Java, you do s != null && s.equals("equal"). Or some hackers propose "equal".equals(s)

So, compare the way to do that in Java with the rest of the languages, and you'll see which of them is the closest equivalent. Just by the way, the equals() method takes Object as its argument, even though the intent is to compare the string with another string. Well, in C you can also pass a void* value as an argument to strcmp.

4. Reflection

Before stating whether the “reflection” feature in Java makes it high-level or not, you first have to realize what reflection is from the language implementation point of view.

So, in case someone has missed that part, let me remind you that both Java and Smalltalk are languages designed to work on only one platform, which is a virtual machine. It doesn't mean that you can't find reflection in languages designed to be machine-compiled. It does mean, however, that when you have a virtual machine, you can plan it however you wish – if you have a physical platform, you usually have nothing, and the only way to provide any kind of “reflection” is by using some extra layer between the “train” (the language) and the platform. Often at the expense of performance.

But this isn't even the important part. The important thing is what advantage you get from reflection (especially if you reconsider it in the frame of a high-level language). That's why I have to remind you one more time that Smalltalk uses dynamic typing only, and the only “static type” in this language is the reference to an object. Because of that, reflection in Smalltalk is available just incidentally: in this language it is indispensable in order to provide the dynamic type system. If we have a language with a static type system – as are Java, C++ and even, say, Eiffel – things change a bit. In these languages reflection doesn't have the same usefulness as in Smalltalk, and I'd even say that reflection in such a language provides a much more limited advantage to anyone.

The only “usages” of reflection I have found so far for Java are Java Beans and the implementation of some scripting languages that deal directly with Java objects (Jython, Jacl). So, as you can see, things not connected to writing any software in the Java language at all!

And additionally, you have to pay attention to what really happens in this particular case. The java.lang.Object type is already a kind of “object orchestra”. Reflection in Java is limited to exactly this thing – you don't have reflection for builtin value types (Java fans will say that it's because you still can't create standalone objects of these types – I prefer to say that it's rather because it's impossible to provide reflection for them). So, java.lang.Object is simply the core class for the whole object system, and there's just one object system in Java. That's all. The reflection is provided for the standard Java object system (being part of the Java standard library), NOT for the Java language.

Once we realize that, we can simply follow up with the statement that in C++ you can use a variety of object systems, and the designer of such an object system might have provided some form of reflection. This is done in the case of Qt and Gtk+ – reflection provided librarywise.

So, now it should be clear why C++ doesn't feature reflection as a language – because its runtime library doesn't provide for it, and only a very small part of the language depends on its language runtime at all (surprise!). These parts are only exceptions and RTTI.

If you want a comparison with a high level language, here it is: Ada. Does Ada feature reflection? To some very limited extent, yes, but generally not much more than C++ does. So, anyway, this feature does not make a language more or less high level.

5. Threads

Java features threads. Ha ha ha. Good joke.

The Java programming language provides just one thread-related feature in the language itself – the “synchronized” keyword. And it's only needed because this language does not feature RAII – with RAII even this could have been defined librarywise. All the other stuff in this language, despite requiring some support from the language runtime, is defined librarywise anyway.
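This is what I mean by “definable librarywise with RAII”: in C++ the equivalent of a synchronized block is just a scope guard object (a minimal sketch, names invented).

#include <mutex>

std::mutex accountLock;
long balance = 0;

void deposit( long amount )
{
    std::lock_guard<std::mutex> guard( accountLock );  // locks here...
    balance += amount;
}                                                      // ...and unlocks when 'guard' dies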

Of course, being defined librarywise doesn't automatically mean that something is not a high level construct. But it may mean that for some languages, especially when the language provides libraries with very little ability to define an API. This is how it is in C and this is how it is in Java – because all APIs in Java must be defined based on classes. The only special construct, as I have mentioned, is the anonymous class, and this was the most advanced thing you could think of before Java introduced lambdas.

I won't evaluate it. Just look at examples of using the Thread class, as well as some higher level concurrency tools like the Future class. So, the same question as always: what would you like to see in a high level language as an implementation of concurrency?

I would like to see something like:

  1. An ability to define several procedures in place, which will be executed in parallel.
  2. An implementation of futures and promises that can look in my code exactly as if I didn't use any special tool at all: I just call functions as usual, read a value or assign it somewhere else.
  3. A system of running parallel tasks that can pass messages to each other, where the language interface gives me a nice view of how this is running.
  4. Maybe some additional logical parallel features like coroutines.

For example, I'd like my procedure to look exactly the same, maybe with some slight marker, regardless of whether I normally call a function, or perform a request-response cycle while my procedure is waiting (when a timeout kicks in, this “function call” results in an exception). The same thing regardless of whether my value comes from a usual variable, or from a promise.
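For reference, the closest library-level rendering of points 1 and 2 in today's C++ is std::async with std::future – and note that it is still visibly a special tool, not a plain function call, so it doesn't reach what is described above either (a small sketch):

#include <future>
#include <iostream>

int expensiveComputation() { return 42; }   // stands for any longer task

int main()
{
    // Point 1: run a procedure in parallel with the current one.
    std::future<int> result = std::async( std::launch::async, expensiveComputation );

    // ... do other work here ...

    // Point 2: read the value - but .get() betrays that this is a future,
    // not an ordinary variable, which is exactly the complaint above.
    std::cout << result.get() << std::endl;
    return 0;
}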

Why is this important? Because, first, threads are simply low-level system tools, and second, it should be the tool's problem to spread the execution over multiple threads, while I, as a programmer, should only worry about the task getting done. Execution, splitting, joining, synchronization – all these things should be the language system's worry. I should only define a procedure; the language system should worry about parallelizing it.

So, what do we have in Java? Even though this language has lots of things with purpose-built language support, this one doesn't get any dedicated language support at all. Future is just a class, Thread is just a class; if you want to do anything with them, then create or obtain an object of the class and call its methods. You can more or less achieve a “procedure split” lookalike using the listener idiom (let's name it so – Java is such a special language that every idiom in this language must have dedicated language support).

Many things can be tailored to use the object interface (use a class, create an object, call methods, also never delete an object), but there are many exceptions. I have already mentioned String as one of them. Thread is another such exception, because it's not simply “an object” – it's something that comprises a part of the language system; the thread object is just a reflection of it. And not the best representation of it, either. For such things the object-based interface is awkward and looks… well, very low level. Because it's written as “how to use some low level tools to achieve the result” instead of “what the programmer's intent is when writing this code”.

How much does this interface differ from what's in, for example, the POSIX thread interface for the C language? Only in that in Java you don't deal with memory management. But not having to deal with memory management is way too little to be called a high level language.

6. Afterwards

Look: I'm not criticizing Java. I don't say that Java is a bad language or anything like that. Or that Java should not be used because it's not a high level language. I'm just stating a fact: Java is not a high level language, and no matter how many things the JSC packs into this language in the future, it will never get even close to the meaning of “high level language”.

I haven't written about many other things, like exceptions (and why the throw-new word pair in Java reads like sinister-plot in the English language), weak references, or the structure of classes. You can probably find many more. Those I have mentioned are enough to confirm the main statement of this article.

On the other hand, let's note that in many other languages which also pose as high level, you can find design flaws which cause them not to be, or not fully, or which compromise their “highlevelness”. For example, in the Haskell language a string is represented as… a list (the list being the basic, language supported container) of characters. You just get characters and operate on them as on a list. This way you again have a string represented as an array (ok, a list) of characters. I understand that the language needs some way to iterate over each character in the string, but Tcl can do that, too – just do [split $s ""] and you'll get a list of strings (not characters!), each being a one-character string. It's not the same as being able to iterate over the string through the list interface while accessing chars. These single characters are still strings, while in Haskell, just the same as in Java and C++, you have an array of characters.

Note also that “proving itself in practice” is something that is only valuable in commercial software development – academics may like various languages, but in that use programming languages make no money. And the practice is that in commercial development it's still the C language that enjoys the biggest trust (no, I'm not talking about use, I'm talking exactly about trust! – yes, that's sad!), and Java is trusted also because it's much like C. C++ is taking over some parts of this territory, but that's because its primary purpose applies here: let's use C++ to get access to some high level constructs and this way make our work easier and faster, and when a high level construct fails, we can always fall back to low level C-like code. In a high level language you just have nothing to fall back to.

Maybe then, despite the declarations, “we don't need no stinkin' high level language”. Maybe people like low level languages more than high level ones. Maybe the high level language concepts don't “speak” to people. I personally admit that they didn't speak to me in the beginning either. Before learning any languages for today's computers, I had used only some BASIC and then assembly language. This way I was more used to low level concepts than to high level ones. And I also still haven't found any language that can be called high level and is acceptable enough. I prefer C++ not because it is in any way high level, but because it provides the ability to be high level and to develop high level constructs. So, it may be that high level concepts are still not mature enough for a good and widely acceptable high level language to be created.

But the Java designers' attempts were far from any approach like this. Java is as it is – designed to be very similar to C++, designed to resemble the low level things from C and C++, designed to do everything in just one way, designed to provide the user with a dead-simple way of coding to encode complex things. All the concepts that can be called high level, known even at the time when the idea for this language came up, have been ignored. Of course, Java is devoid of memory management, low level memory access, or treating every value as an integer. But this was just cleaning up some of the most dangerous features. There are still lots of low level features in Java, and they don't even have high level replacements, as some of them do in C++.

Is Java still a business success? Then its “lowlevelness” was part of this success. Sad, but true.


The good, the bad, and the dumb

Cameron Purdy, vice president of development for Oracle's application server group, has made a presentation showing how Java supplants C++ and, probably to even things up, also some cases where C++ supplants Java. The problem is not that I disagree with the first ones. They are just explained the wrong way, and the author seems not to even know what “Java” really means – while being a VP of a strongly Java-focused software development group in one of the top software companies in the world.

1. Garbage collection

A lot has already been said about garbage collection, and it always comes either from Java fans (“every modern language must have it, or it's not modern”) or from C++ fans (“this leads to bad practices and causes unacceptable overhead”). To begin with, it has to be said that from the technical point of view GC may have several disadvantages, in the form of using more memory for the same task and adding overhead by running an additional process, sometimes locking the whole system's access to memory. But it also has advantages: it is more convenient for the user, who doesn't have to worry about object ownership (not about “releasing the memory” – if you hear that GC solves the problem of “releasing the memory”, it means you're talking to an idiot), it's also faster than manual (heap-based) memory allocation, and it can lead to less memory fragmentation and stronger object localization. The last advantage is an important one over shared_ptr in C++, which suffers from poor performance and does not help with memory defragmentation.

But the things this guy says are such a pile of bullshit that it’s hard to believe how he became a VP:

Garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim garbage, or memory, occupied by objects that are no longer in use by the program.

No. A programmer writing in a language that imposes GC as the only memory management doesn’t even “think” in terms of memory management. They just create objects that have some contents, and that’s all. For them there is simply no such thing as “memory”. GC was once aptly described as “a simulation of unlimited memory”. The important thing about GC is that it provides this virtually unlimited memory, not that it reclaims memory.

A significant portion of C++ code is dedicated to memory management.

False. Of course, if your program is stupidly written, it may be true, but it’s not true that this is required, or that GC meaningfully solves this problem. For example, a program that reads text from the input stream, processes it, and displays the post-processed result can be written in C++ with no explicit dynamic memory allocation at all (see the sketch below). That’s not possible in C or in Java. So, in this particular case, GC solves a “problem” that does not exist in C++.
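
A minimal sketch of that claim: the whole “read, post-process, display” cycle with no explicit new/delete anywhere in the user code (std::string and the stream manage their own storage):

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

int main()
{
    std::string line;
    while (std::getline(std::cin, line))
    {
        // "post-processing": uppercase the line in place
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char c) { return std::toupper(c); });
        std::cout << line << '\n';
    }
    return 0;
}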

Cross-component memory management does not exist in C++.

Really? How come? My function can return auto_ptr<X> (or, better, unique_ptr<X> with C++11), and I can either assign the result to some other variable or ignore it – either way the object is taken care of. Memory allocated in one component can be deallocated in another. Unless you somehow “play with allocators”, of course; but if you do, be prepared to solve many more problems. Normally you use a unified, universal memory allocation, and it works across components too. I have absolutely no clue what this guy is talking about.
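
A short sketch of that point; the component boundary here is only simulated with a comment, but nothing changes if make_x() lives in a separately compiled library:

#include <iostream>
#include <memory>

struct X { int value = 42; };

// Imagine this factory lives in a separately compiled library ("component A").
std::unique_ptr<X> make_x()
{
    return std::unique_ptr<X>(new X);   // or std::make_unique<X>() in C++14
}

// "Component B": the caller either keeps the result or drops it; either way
// the object allocated across the boundary is deallocated exactly once.
int main()
{
    auto kept = make_x();
    std::cout << kept->value << '\n';
    make_x();                            // result ignored -- still no leak
    return 0;
}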

Maybe this guy is talking about Windows and Visual C++, where the debug configuration is tied to traced memory allocation (which is indeed a stupid idea). When components of your application are compiled in “mixed” configurations, memory allocated in one module of course cannot be deallocated in another. But that is Visual C++’s problem, not C++’s.

Libraries and components are harder to build and have less natural APIs.

WTF? Libraries and components may be hard to build in C++, of course, but that has absolutely nothing to do with memory management. There are problems with modules, distribution, platform specifics, and maybe some module uses its own specific memory management (in other words: in C++ you have more opportunities to write stupid code, but that still has nothing to do with memory management). And what is a “less natural API”? Is it natural that integers are passed by value and cannot modify the value passed, while arrays are passed by pointer and can be modified in place without restriction? Only if by “natural” you mean “Java”.

Garbage collection leads to faster time to market and lower bug count.

Unless you made a stupid design and need to fight design bugs.

The garbage collector can simply be “a good tool for things that need it”; from the implementation point of view it has advantages and disadvantages, and that’s fine. Put this way, of course, C++ can also use GC (provided by a library). There are several things, however, that have to be pointed out:

1. From the programmer’s point of view, objects in a GC-only environment have two clearly defined traits:

  • The GC-ed object must be trivially destructible. You could even say that objects created in Java are never deleted. Some may argue that objects can have finalizers. But destruction means reliable and timely release of a resource, and finalization is nothing even close – it is closer to wishful thinking about what other resource release might be done whenever the memory happens to be reclaimed. And since the memory isn’t guaranteed to actually be reclaimed, the finalizer isn’t guaranteed to be called either. That’s why, if you think about destruction the C++ way, a GC-allocated object must be of a class that can be treated as trivially destructible. It means, for example, that if your object refers to a file and you no longer need the file, you must disconnect the external file resource from the local object explicitly. And indeed this is how things happen in Java – Java doesn’t close the file in a finalizer, does it? (A short C++ sketch of this point follows after this list.)
  • GC means object sharing by default. The pointer to an object can be passed to some other function and stored elsewhere, becoming a second reference to the same object, and (unless it’s a weak reference) it shares ownership with the first one. Some idiots say that with GC you don’t have to worry about an object’s ownership. That’s not true – of course you do. At least you need to think carefully whether your object really has to be owned by a particular reference variable, or whether that variable should only view the object, not own (co-own) it, and simply become null once the object has been reclaimed. In C++ this shared ownership is implemented with shared_ptr (which comes with weak_ptr for exactly this purpose), and despite its poor performance it’s good enough if you restrict its use to the situations that explicitly require it. The important part is: object sharing. As you know, object sharing is the main purpose of “those really, really bad” global variables – something shared by everyone. So with GC-ed objects every object is potentially co-owned. In practice, co-ownership is very rarely required, but you only get convinced of that once you have some experience with a language that also features other-than-GC memory management.
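
The sketch promised above, for the first trait: in C++ the external resource is released reliably and on time by the destructor at scope exit, while under a GC-only regime you have to disconnect the resource explicitly because finalization is not guaranteed to run:

#include <fstream>
#include <string>

void append_line(const std::string& text)
{
    std::ofstream out("log.txt", std::ios::app);  // file resource acquired here
    out << text << '\n';
}                                                 // ...and released here, deterministically,
                                                  // no explicit close() call required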

2. There are also two interesting consequences of the difference between GC and shared_ptr:

  • The shared pointer concept still means timely and reliable resource release. This means any kind of resource (not only memory) and any kind of object. The object is deleted when the pointer variable goes out of scope and it was the last owner. So even if we can’t say for sure that the object will be deleted at the end of a given scope, we can at least be certain about the set of places where it may happen. This way shared_ptr can also manage objects that aren’t trivially destructible – with GC you can’t even rely on the object ever being deleted, nor on which thread it would happen in.
  • Object deletion is synchronous – that is, when the object is deleted because the last owner went out of scope, every weak pointer to that object is cleared immediately. Of course, this still doesn’t change much when the weak_ptr-using procedure runs in a separate thread, but it matters at least when everything happens in a single thread. The weak_ptr becomes null only because it would otherwise be a dangling pointer – but regardless, you should never test this pointer for “nullness”; you just shouldn’t use it unless you are certain the pointed-to object is still owned and alive. At least with shared_ptr this isn’t dangerous, because you can always say that if the pointer isn’t null, the object is still alive. In a GC environment, when the last owner goes out of scope, some time may pass between that moment and the clearing of all weak pointers upon object deletion, during which the object shouldn’t be referred to. This is how GC simply turns the dangling-pointer problem into a “dangling object” problem. (A short sketch of the synchronous case follows below.)
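
A sketch of the synchronous behaviour: when the last shared_ptr owner leaves its scope, the object is destroyed right there, and every weak_ptr observing it expires at that same moment – there is no window with a “dangling object”:

#include <cassert>
#include <memory>

int main()
{
    std::weak_ptr<int> observer;
    {
        auto owner = std::make_shared<int>(7);
        observer = owner;                 // observes, does not co-own
        assert(!observer.expired());      // object still alive here
    }                                     // last owner gone: destruction happens right here
    assert(observer.expired());           // ...and the weak pointer is already cleared
    return 0;
}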

3. GC is hailed as being resistant to ownership cycles, unlike refcount-based solutions (including shared_ptr). But if you research this topic well enough, you quickly come to the conclusion that cyclic ownership is something that should never occur in a well-designed system! In “the real world”, cyclic ownership occurs in basically one case: a company may own another company and that other company may own the first one – of course only as partial ownership (so also shared ownership). I actually don’t know how the law handles this; I rather suspect that governments all over the world don’t know how to handle it either, so they just allow it without restrictions (the real problem appears when the ownership chain is much longer and only after walking through some branches of the hierarchy do you come back to a company you have already visited). But companies are a special case – there is no similar thing in, for example, management hierarchies within a company. GC actually handles this situation with the rule “if A owns B and B owns A and no other object in the system owns either of them, then both are deleted”. In other words, cyclic ownership is treated as if there were no ownership at all (it’s like having two managers A and B in an organization where A manages B, B manages A and neither is managed by anyone else – can you imagine such a situation in reality, unless it’s some state-held company in a banana republic?). So, if you didn’t mean ownership, why did you use it? Shouldn’t you have used a weak reference for one of them? You don’t know which object is more important and which should own which? Then you most likely have no clue what your system is supposed to do – you shouldn’t be participating in the development, or you should first do your homework. For me, if cyclic ownership occurs, there should be some lint-like tool that detects the situation and reports it as an error. And GC should crash your program whenever it detects an ownership cycle, if you really care about helping developers make good software.

So, summing up: these conveniences for the developer, hailed as better “time to market”, only lead to worse designs and are more tolerant of logic errors – instead of helping clear them up. C++ code checked with a good lint-like tool – and, believe me, such a tool can really find all potential memory leaks reliably, and even where it flags only a potential leak, it’s still better to write the code more explicitly – is much better in terms of time-to-market than Java code, where design errors are patched over on the fly by the language runtime.

One more interesting thing about GC probably comes from the inability to understand the difference between “Java the programming language” and “the Java VM”. It may be a little bit shocking, so please hold on to your chair.

Well, Java DOES NOT FEATURE GARBAGE COLLECTION. Surprised? Yes, really. There’s no such thing, at least in the Java programming language. Of course, people think of it as if it were in the language, but it really isn’t. If you want to see GC specified in the language itself, look at functional languages, including ones as far apart as Lisp, OCaml and Haskell. Those are languages that feature garbage collection. The Java programming language does not.

Yes, I’m not kidding. All right: the Java programming language does not provide any ability to delete an object, at least as a built-in language feature. But neither does it guarantee garbage collection as a language. There are Java implementations (or extensions) that use deterministic or even explicit object deletion; this can be done with an external library, not necessarily a language extension. The Java language only requires that a program be well formed even though objects are created and never explicitly disposed of. Nor does this mean that Java can only be implemented on the JVM – gcj provides a native Java compiler using Boehm’s GC.

It’s because garbage collection is a feature of the JVM. This means that any language implemented for the JVM (including C++, if anyone ever does it) will take advantage of garbage collection, even if its natural way of managing objects is manual. In C++, for example, you would still be allowed to use delete on objects; it would just call the destructor and do nothing else. And yes, of course, C++ is expected to be usable with GC – there just is no “standard GC” for C++. (A sketch with one of the available collectors follows below.)
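
A sketch of one such non-standard collector plugged into C++: the Boehm GC. This assumes the libgc development package with its C++ binding is installed (the header is usually gc_cpp.h, possibly under a gc/ prefix depending on the distribution); link with -lgccpp -lgc:

#include <gc/gc_cpp.h>   // provides the 'gc' base class with a GC-aware operator new
#include <iostream>

struct Node : public gc {            // instances are managed by the collector
    int value;
    explicit Node(int v) : value(v) {}
};

int main()
{
    long sum = 0;
    for (int i = 0; i < 1000000; ++i) {
        Node* n = new Node(i);       // never deleted explicitly
        sum += n->value;             // n becomes unreachable at the end of each iteration
    }                                //   and is eventually reclaimed by the collector
    std::cout << sum << '\n';
    return 0;
}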

So, the “GC-related” difference between Java and C++ is that the Java language doesn’t provide any way to delete objects and expects the language runtime to take care of it by itself, while in C++ you can use different memory management policies for objects, although the default policy is to delete objects explicitly.

2. The build process

Guys, come on, come on… How much does it matter for a large project how many build machines you have to prepare and how powerful they should be? It may matter for a home-cooked project (maybe), but not in today’s environment. When running the javac command today sometimes takes more time by itself than compiling the files, what’s the real difference from compiling C++? On today’s machines it mostly just occupies more cores. And it’s really funny to hear this from fans of a language already accused of performance problems, which they excuse by saying that “machines and processors are getting better and faster, so it shouldn’t matter much”.

That’s still not the funniest thing – this is:

Java has more approachable build tools like Ant and Maven, while C++’s Make and NMake are considered less so.

What? Ant is supposed to be a more advanced tool than Make? Come on…

First, feature-wise Ant is at best as advanced as make. And what is “more approachable” about Ant – the XML-based syntax, which most people find horrible? It’s not Ant itself that makes things better (let’s grant that they are). It’s the Java compiler.

Part of a Makefile’s role is to define the dependencies between definitions implemented in separate C++ files. The definitions are contained in header files, and the header files providing definitions used by another file are listed in the Makefile rules (gcc -MM is a command that can generate them automatically). In Java this is completely automated by not having explicit header files – the dependency problem is handed off to the Java compiler. You just pass all the source files that may depend on each other on one compiler command line. You can imagine a tool that in the same way takes all *.cpp files and produces the “result” in the form of *.o files, taking care of the *.h files and the dependencies by itself. There is no such tool only because there are more advanced build tools for C++ that do this and much more.

The only thing Ant handles is associating the name of a target (possibly the default one) with the source files to be passed to the compiler. That’s all. It’s really no more advanced than a shell script containing the command “javac File1.java File2.java … etc”, OK, maybe with some CLASSPATH. Make is no comparison to Ant – Ant cannot even be used to build C++ projects because – surprise – it doesn’t feature targets connected to physical files. That’s precisely because this part is done by the Java compiler.

Maven is different – it really is an advanced, high-level build tool. Listing Ant and Maven in one sentence as “two more approachable build tools” is just WorseThanFailure(tm), and I can’t express how clueless someone must be to say something like that. Maven, first, imposes a strict source-file layout, and second, it manages external Java packages automatically. You just specify where your sources are and which packages you use – the rest is taken care of by Maven, including version control and upgrades.

But if you bring that up, remember that the C++ world also has tools providing advanced build systems. Examples are autotools, Boost.Jam, qmake and CMake. A very important tool that solves the problem of consuming modules in C++ is pkg-config: having it, you just add the package name to the list and you don’t have to worry about specific compile flags and linker flags (unfortunately not all packages provide a pkg-config entry, but it has quickly become a de-facto standard). You still have to include the header in the sources, of course, but that has nothing to do with the build system. And, well, the syntax is awkward? Yes. But I really think the XML syntax is even worse. I once wrote a make-like tool in Tcl that was meant to be extensible into a highly automated, high-level build system; I just had no resources to keep working on it. I mention this only to point out that the lack of a Maven-like system for C++ is probably not the biggest problem of this language.

3. Simplicity of Source Code and Artifacts

Yes, I admit, C++ still needs a lot of work in this area. But if you have seriously worked on some Java projects, you know that the claim that “Java is only *.java and *.class files” can only be true if your background is at best homegrown or academic. Experienced people know that in serious Java projects there are a lot more kinds of files to deal with, like:

  • *.properties files
  • *.xml files
  • *.jar files
  • *.war files

And they really are indispensable in serious Java projects. Additionally, some people claim that having all methods defined inside the class entity makes the class unreadable. I don’t know whether I agree; I just want to point out that having everything in one file is not something that can be unanimously considered an advantage. The Ada programming language, for example, also uses separate specification files (its equivalent of header files), even though it has no #include directive.

You can also quickly point out that header files are the only real addition compared to what Java has. The rest of the mappings are simple: *.cc is like *.java, *.o is like *.class, *.so (*.dll) is like *.jar. If you want to point out that there are also executable files, don’t forget that in Java you either have to create one manually as a script that calls the java interpreter with an explicit class name whose main() is to be run, or you have exactly the same situation if you compile to native code with gcj.

4. Binary standard

And this is my favourite: the guy completely fails to understand the difference between the “Java programming language” and the “Java Virtual Machine”. The JVM is just a platform, a runtime platform that runs programs and for which programs can be compiled. And Java is not the only language in which you can write programs for the JVM platform.

There is a variety of languages implemented for the JVM, maybe not all of the existing ones, but at least there is an interesting choice: Scala, Groovy, JRuby (Ruby), Jython (Python) and Jacl (Tcl). The interesting thing about the last three is that they are normally scripting languages. You can write a program in Tcl, looking like a normal scripting-language program, in which you create Java objects, call their methods and even create classes and interfaces – this is possible thanks to reflection. As long as a language is implemented for the JVM, you can write the program in that language, not necessarily in Java. It’s also no big deal to provide a kind of compiler that produces JVM assembly code from the Jacl source.

On the other hand, Java is also just a programming language, and it can be implemented for any platform; it isn’t fixed to the JVM. One of the compilers that compile Java programs to native code is gcj (from the gcc collection). I have even heard that compiling Eclipse with it may result in much better-performing code than compiling with javac. The resulting code is, obviously, not binary compatible with code compiled by javac.

So, unfortunately for the author, Java (as a programming language) doesn’t have any binary standard. The only thing that has a binary standard is the JVM – but heck, that is just a runtime platform, for God’s sake; what’s the deal with a binary standard? Every runtime platform must have something that counts as a “binary standard”. What, you’d say it has the same form on every physical platform? Well, .NET has that too, LLVM has it, Argante has it, even the Smalltalk VM has it, and so does the OCaml binary representation. Guys, come on, what’s special about the JVM’s “binary standard”?

5. Dynamic linking

This was partially explained with the module problem for C++. Yes, C++ still uses the old C-based dynamic libraries, which simply means that if some feature is “resolved” by the compiler and cannot be put into a *.o file, it automatically cannot be handled by dynamic libraries. But heck, what is this “DLL hell”? Looks like this is one of the many guys who think that C++ only runs on Windows. On POSIX systems there’s really no such thing as “DLL hell”.

But, again, this is also specific to the JVM. Yes, unfortunately: Java programs compiled to native code suffer exactly the same problems as C++ does on native platforms.

Or maybe this is about dependencies and versioning. Oh dear… you really shouldn’t have brought up a problem like that. That’s one of the properties of Java’s “modules”, and one of the problems you quickly and roughly get convinced of when you work with Maven. Imagine: you have a Java library, and you are using a class that this library provides. You do “import com.johndoe.bulb.*” in your code (about 90% of Java “coders” have no clue that this is the same as “using namespace com::johndoe::bulb” in C++, that is, nothing more than name shortening). But you are using some feature that is not provided in some earlier version of the library. Now… can you specify anything in your sources that would require a particular version to be used? Or, if you have multiple versions, slotted versions, specific versions, etc. – well, this can be done, as long as Maven manages things for you. But that only patches over a problem that arose in the programming language itself.

Do you want to see a decent module management system? Look at Tcl. It comes with the “package require” command, which specifies the package and optionally its minimum required version (creating packages before 8.5 was horrible, of course, but 8.5 comes with a new feature called “modules”: the only thing needed to make a package available is to put a *.tm file in one of the search directories). In the Java language you just don’t specify the package (in the physical sense, not in the Java sense – maybe it’s better to call it a “jackage”, just like “jinterfaces”?); instead, every package reachable through the CLASSPATH is searched for one that happens to provide the symbols used in the source file being compiled. In the default environment you are simply unable to specify the version of the package you are actually using. Moreover, you can accidentally have multiple such packages in your system, and the first one found by searching through the CLASSPATH is taken as the required one.

Why so many words about this? Well, mostly because this is exactly what “DLL hell” means, if you just take the time to look the term up. By the same token you have a “CLASSPATH hell” in Java.

6. Portability

This is the funniest fun ever.

Comparing the portability of a programming language – which, in general, describes how it should be implemented, with best effort, to be the fastest programming language on each particular platform – with the portability of a “one-platform language”?

In the English specific to software development, the verb “port” means to take a source project and adjust it to the needs of a platform different from the one it was initially created for. For example, you may have a Linux application and “port” it to Windows or to QNX. So “portable”, in this software-development-specific meaning, means that an application written in the language can be ported elsewhere – preferably easily, or even effortlessly.

If you want to measure the true value of “portability”, the only way is to count how many runtime platforms can be considered currently important on the market and have any higher-level programming language implemented for them, and then check for how many of them the particular programming language has been implemented. If we say there are about 10 such platforms – say, .NET, LLVM, Windows/x86, SunOS/SPARC, SGI, MacOS/PowerPC, JVM, Linux/ARM, Argante and VxWorks/PowerPC – C++ is currently implemented for 8 of them, so its portability is 80%. For the same set of platforms, the Java programming language achieves 10%.

What’s worse, if you really take seriously the comparison of a natively compiled language and a virtual machine, note that the JVM itself is not implemented on all of these platforms either. From this point of view – as long as I haven’t made a mistake – it only achieves 70% portability. I haven’t even mentioned some limited platforms for which C++ is implemented, although with a limited set of features – for example, you cannot dynamically allocate memory (this doesn’t make such a C++ non-standard; it just requires that every call to “new” throw std::bad_alloc and every call to “new(nothrow)” return nullptr).

Well, platform specifics, so many #ifdefs, etc. – guys, come on. Maybe it was true 10 years ago, maybe some very detailed platform specifics are needed for the best performance or, say, for specific lock-free code (CAS) – although in Java you usually don’t think about that, simply because if you use Java you don’t think about performance. I understand that there are lots of #ifdefs used in header files, but please, if you don’t like #ifdefs, just don’t look into the header files. Those headers use platform specifics precisely so that your application code can get the best performance without having to use them itself. Since the C++11 standard appeared, all mainstream compilers are solidly C++98-compliant, and no serious software producer uses a compiler that doesn’t support it.

You may say that this kind of portability means that in C++ you must run the complete test procedure for each prospective new platform (that is, always do some porting job), while Java bytecode will always work and behave the same everywhere it runs, so you don’t have to test the program on every platform. If you really think so, you’re more than childishly naive… For example, think about something as simple as the filesystem. Try to access a file given its path on POSIX systems and on Windows. It’s simply a “no way” in both C++ (at least standard C++) and Java: a Windows-specific path won’t work on POSIX and vice versa. The only way to achieve portability is to take some base path from which the others are derived, and then add path elements so that the full path can be composed. Say we have a “base” variable that holds the base directory – you can do it the portable (yes!) way in Tcl:

set dir [file join $base usr home system]

and in C++ with boost.filesystem:

dir = base/"usr"/"home"/"system";

but in Java you have to work out by yourself how to compose the path. In a string. OK, the system gives you a property that returns the path separator character used on the current system, and you “only” have to worry about gluing the string together properly (Tclers and Boost users will still be ROTFLing). Just pray that your application never runs on VAX/VMS, where the file path “alpha/beta/gamma/mu.txt” looks like “[alpha.beta.gamma]mu.txt”. Do you still want to say something about Java’s portability?
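
The boost.filesystem one-liner above, expanded into a minimal self-contained sketch (assuming Boost.Filesystem is installed; since C++17 the same code works with std::filesystem):

#include <boost/filesystem.hpp>
#include <iostream>

namespace fs = boost::filesystem;

int main()
{
    fs::path base = fs::current_path();
    fs::path dir = base / "usr" / "home" / "system";  // portable composition, no separator gluing
    std::cout << dir.string() << '\n';                // rendered with the platform's own separators
    return 0;
}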

7. Standard Type System

Facepalm. One of the most important things in the C++ standard is the standard type system. Of course, it doesn’t always come with a fixed size for every type, but who said that makes it not a standard type system?

Actually this slide doesn’t even talk about a standard type system; it talks about features of the standard library. Maybe C++ deserves to have an XML parser and a database connectivity library in its standard library, but heck, that would first require the standard to be modularized. It’s very hard to define the C++ standard this way, and hard to convince the standard committee to do it. And it can’t be done without modularizing the standard, because otherwise lots of C++ implementations would have to declare non-compliance – a C++ implementation for some tiny ARM system used in a refrigerator does not feature database connectivity. And even if it were provided, a lame-legged dog wouldn’t use it.

Actually there are lots of C++ libraries that provide XML parsing or database connectivity; maybe just nobody needed them as a standard. I would even guess they are in Java because some initial libraries had to be provided for the language or it wouldn’t have attracted attention. Same with GUI. I really don’t see a problem here. And somehow hardly anyone writes GUI programs in Java. If you think of Eclipse, don’t forget that SWT, which it uses for the GUI, is implemented entirely on top of the native platform (on Linux, for example, it sits on top of Gtk). Why SWT, and not the “standard” Swing or AWT? Well, guess.

8. Reflection

Hello? Is anyone there? Is this reflector really lighting? Ah, well…

Do you remember ever using reflection in your Java application? I mean a really serious Java project, where you write an application running in a web application server; no other kind of Java development is serious. So, did you? Of course not. Why? Simply because it’s a feature “for future use”. There are some tools that use it, for example Java Beans. But user code, application development? Guys, come on – if this language required users to use reflection in their application code, they would quickly kick it in the ass and use another language.

Reflection is something that allows a language to easily implement things like runtime support, adding plugins to an application, or building a more advanced framework. Compared to Java, C++ has a much wider range of uses in the software world, and not every one of them needs this feature. But, again, if it is needed, there are libraries that provide it – for example, the Qt library features reflection, at least for QObject-derived objects, and in practice that’s usually all you need. So, you need reflection? Use the right library. The fact that Java supports reflection by default doesn’t mean it’s such a great and wonderful language; it means the use of the Java language is limited to the cases where reflection makes sense, or at least where it can sensibly be implemented.

For the usual Java programmer, it is of practically no use.
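
For completeness, a minimal sketch of the QObject-level reflection mentioned above (assuming Qt 5 with the QtCore module; any QObject-derived instance carries its meta information):

#include <QObject>
#include <QMetaObject>
#include <QMetaMethod>
#include <QVariant>
#include <QDebug>

int main()
{
    QObject obj;
    const QMetaObject* meta = obj.metaObject();
    qDebug() << "class:" << meta->className();     // class name available at runtime
    for (int i = 0; i < meta->methodCount(); ++i)  // enumerate signals/slots/invokables by name
        qDebug() << meta->method(i).methodSignature();
    obj.setProperty("answer", 42);                 // dynamic property, set and read back by name
    qDebug() << obj.property("answer").toInt();
    return 0;
}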

9. Performance

Although performance is typically not considered one of the benefits Java has over C++, Purdy argues that garbage collection can make memory management much more efficient, thereby impacting performance.

Rather “thereby making the performance a little bit better than horrible”. Of course, GC improves the performance of memory management (this can even easily be shown with a GC for C++, especially if you compare it to shared_ptr). It doesn’t come without a price, though: the price is increased memory use, as there is usually some memory kept around that has not yet been reclaimed. The unified memory management also helps Java a bit. But Java easily wastes this performance through the overhead built into java.lang.Object, from which all user classes implicitly derive – which increases memory usage even more. And think about how performance also degrades from high memory use itself: maybe it doesn’t hit the application directly, but it definitely hits the system, and that way it can hit the application from behind.

In addition, Java is multi-threaded while C++ does not support multi-threading.

Yes, at least if you think of “standard C++98” (this changed in C++11), but somehow it didn’t stop people from writing threaded applications in C++. So what’s the deal? It’s easy to say, suggestively, “does not support multithreading”, but if you said “it’s impossible to write multithreaded C++ programs” you’d be lying. So isn’t it better to just keep quiet?

Do you want to see threads supported by a language? Look at Ada. Do you want to see how multicore programming can work? Look at a Makefile. Yes, really: this simple task-automation tool (because that’s generally what make is) can run tasks in parallel, as long as they are independent. That is real support for concurrency. Java’s thread support is just a thread library plus one little keyword that performs acquire/release around a whole method or block – provided only for convenience, because Java does not feature RAII/RRID; in C++ this thing can be implemented in a library, too (see the sketch below). I completely fail to see why this is any better than Boost.Threads, except that the latter used not to be in the standard library (in C++11 it is).
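
A sketch of that library-level equivalent of a “synchronized” block, using C++11 facilities: the lock is acquired on construction and released on scope exit, RAII-style:

#include <mutex>

class Counter {
    mutable std::mutex m_;
    long value_ = 0;
public:
    void increment()
    {
        std::lock_guard<std::mutex> lock(m_);  // roughly what 'synchronized' does in Java
        ++value_;
    }                                          // released here, even if an exception is thrown
    long get() const
    {
        std::lock_guard<std::mutex> lock(m_);
        return value_;
    }
};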

C++’s thread safe smart pointers are three times slower than Java references.

Well, it does look strange when he first says that C++ does not support multithreading and then that C++ has something thread-safe… Really, only three times slower? They are ten times slower than plain C++ pointers managed by Boehm’s garbage collector. And one more important thing – this guy is probably talking about shared_ptr, not unique_ptr, which uses no synchronization at all and is therefore just like a plain pointer.

I suspect the poor performance of shared_ptr comes from the fact that it “bolts” the reference counter onto the object from outside; std::make_shared is said to give better performance, because it can allocate one big piece of memory at once, keeping the refcount and the object itself in a single block, which at least reduces the number of indirections. Still, compiler support may be required to optimize away unnecessary refcount modifications. (A sketch of the two allocation patterns follows below.)
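
The two allocation patterns just discussed, side by side: the two-allocation form versus std::make_shared, which places the control block and the object in one allocation:

#include <memory>

struct Widget { int id = 0; };

int main()
{
    // two allocations: one for Widget, a separate one for the reference-count block
    std::shared_ptr<Widget> a(new Widget);

    // one allocation holding both, so better locality and fewer indirections
    auto b = std::make_shared<Widget>();

    return a->id + b->id;
}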

And Java has HotSpot Java Virtual Machine (JVM), which features just-in-time (JIT) compilation for better performance.

For “a little bit better than horrible” performance – did I already say that? OK, no kidding: yes, JIT really improves performance; an application running on a long-running application server doesn’t suffer any big performance problems compared to a similar application in C++. There are just three small problems:

  • It doesn’t change the fact that these Java apps still gorge on memory at a terrible pace
  • It has to be a really long run. A freshly started server performs really poorly, and it improves the longer the server runs.
  • C++ can take advantage of JIT compilation too, as long as you use a specific compiler. For example, the “clang” compiler targeting the LLVM virtual machine can use JIT compilation at run time and this way deliver impressive performance. The important thing is that it does so on code that has already been compile-time-optimized.

10. Safety

Well, if you are talking about the Java programming language, it does not provide any safety by itself. The safety is provided by the JVM, so it is available to C++ as well (as long as I can manage the implementation :).

As for the 5 reasons why Java still has ground to make up against C++, there’s nothing for me to say, except to note that there are still many people claiming that Java can be a very good language for real-time programming and that its performance will keep getting better and better. The funny thing is that I heard the same claims 10 years ago. And Java’s limitations are still with us.

11. Conclusion

Don’t take the things on these slides too seriously. They do look like a good summary of the differences between Java and C++. But if you mention them when applying for a C++ job (when you apply for a Java job, you are unlikely to be asked about this), you may be treated as simply not a professional.

Java was supposed to rule the world of software development. It didn’t. Not because it failed to overcome its limitations – it didn’t make it because a market for C++ grew urgently, and the users of consumer electronics, now full of software, had no desire to adjust to Java.


C++ concepts that wouldn’t be considered harmful

0. Background

Templates were accepted into C++98 without concepts (although they were considered), because at that time nobody saw enough advantage in them. However, after some years of long and bloated error messages whenever a template can’t be resolved, concepts now make sense, so a lot of work was done to define such a feature for C++. The final form of concepts, implemented in the experimental ConceptGCC branch of the gcc compiler, is quite well described on Wikipedia.

The work on this feature, as far as I can personally evaluate it, probably drifted in the wrong direction, so it eventually got far away from the main goal. It’s for the better that concepts in this form were removed from C++0x. But the problem still remains and has to be solved.

Bjarne Stroustrup enumerates several problems and misunderstandings concerning concepts, especially their latest version. It looks like the original idea of concepts was only spoiled later. But it also suffered from one problem from the very beginning: the syntax does not coincide with the syntax of templates, which makes a concept a kind of “alien” feature that doesn’t match the rest of the language.

Because of that I have created my own proposal.

1. The basic syntax

A quick example: a template definition using a concept (which applies a constraint to a type) would look like this – just like in the original proposal:

template<class T>
requires LessThanComparable<T>
T min_value(T x, T y) {
   return x < y? x : y;
}

and the LessThanComparable concept is defined this way:

concept<class T> LessThanComparable
{
    bool operator<( T, T );
};

Note the similarity of the syntax between “template” and “concept”. Similarly, if we want to provide this feature for a type that does logically the same thing as operator <, but under a different name (a “concept map”), we do:

concept<> LessThanComparable<std::type_info>
{
    bool operator<(const std::type_info& t1, const std::type_info& t2)
    { return t1.before( t2 ); }
};

Yes, this is also called “partial specialization”. Just like templates, concepts may have (even partial) specializations (which play the role of concept maps) and can also be instantiated (instantiation is required to “invoke” the match on a given type). Although their main use is to be matched against template entities that expect particular features (concepts are always auto-applicable), concepts additionally have the following features:

  • can provide lacking features to types (“patch” a type)
  • can be derived (also multiply)
  • can be abstract (such a concept does not match anything)
  • can be bound in logical expressions
  • can be matched partially
  • can be used to synthesize a code template
  • can have multiple parameters, including variadic

The most general definition of concepts’ syntax is (names in .. are descriptive, fragments in ?{ … }? are optional):

concept<..parameters..> ..ConceptName.. ?{ <..specializationArgs..> }?
   ?{ : ..DerivedConcepts.., ... }?
?{ {
    ?{ ..specificStatements..; ... }?
    ?{ ..requirements..; ... }?
} }?
;

Where:

  • ConceptName: new concept’s name, if master definition, or existing concept name, if specialization
  • parameters: just like template parameters (no overloading or default parameters)
  • specializationArgs: arguments passed to the concept as specialization (just like in template partial specialization), used only when defining a specialization
  • DerivedConcepts: concepts that are prerequisites for the defined concept
  • requirements: definitions that should be available so that the concept can be meant satisfied: function/method definitions, requirement expressions, or requirements for elaborated types
  • specificStatements: just placeholder for future changes 🙂

The concept usage is:

1. When defining an entity template:

template<..args..>
requires|desires|constrains ..ConceptInstantiation.. , ...
..TemplatedEntity..

2. When requesting a code template synthesis:

// synthesize code in current scope as working on given type
requires ..ConceptInstantiation.. ;

// synthesize code in current scope with new type name
using ..NewTypeName.. = typename ..ConceptInstantiation..;

Where:

  • ConceptInstantiation: an expression of the form ConceptName<args…>. A bare “requires” with this expression brings all the type patches provided by this concept into the affected types, within the current scope
  • NewTypeName: the name of the type created by patching. This syntax may only be used with a concept that applies patches to exactly one type. The original type is left untouched and the patched type is identified by this name.

New keywords: concept, requires, desires, constrains.

The definition of a requirement inside the concept must be in one of two forms:

  • interface element declaration (function, method, or type definition)
  • requirement expression (will be covered later)

An interface element declaration only needs to be provided in one of the possible forms; what matters is that there is some way to call it. The real interface element may be defined in a different way, as long as it can still be called with the same syntax. The following equivalences are allowed:

  • Type equivalence: real return type may be convertible to requested type and argument type of requested interface may be convertible to argument type of the real interface (when this is not desired, the type should be preceded by explicit)
  • An operator may be defined either as a method or as an external function
  • Non-explicit constructor A(B) requested in the concept is allowed to be satisfied by B::operator A() (again, use explicit modifier, if this is not desired)
  • The list of function or method parameters may be longer, with the excess parameters having defaults; such a function is callable the same way as the one mentioned in the requirements (again, use the explicit modifier after the closing parenthesis of the function if this is not desired – especially important if, for example, the function is required to have a strict signature because its address is taken)

A syntax like “template<LessThanComparable T>” is not possible, because concepts must always be supplied with parameters where they are used – just like templates. They can also be used with more than one parameter:

requires LessThanComparable<T>

requires Convertible<T, typename U::value_type>

requires TypeIsOneOf<T, int, char, bool>

There is still a simplified syntax for a concept with one parameter, but it’s specified in the place of the type where it is used, not as a prefix in the parameter declaration. This will be covered later.

2. Special variants of the syntax

a. abstract concept and empty concept

The basic form of a concept is a pure abstract concept. Such a concept is never satisfied (because it’s unknown whether there is any requirement):

concept<class A, class B> IsSame;

And this is the empty concept:

concept<class T> Defined {};

Such a concept is always satisfied (because it doesn’t impose any requirements). Note though that incomplete types are not allowed to be concept’s parameters.

The IsSame concept can be then declared as satisfied by specialization:

concept<class T> IsSame<T, T> {};

It means that IsSame is not satisfied in general, but when the type passed as the first argument is the same as the second argument, it matches the partial specialization shown above, and this way the concept is satisfied.

b. concept derivation and partial specialization

Concepts can be derived. The syntax is the same as for structures and the meaning is also the same: the contents of the base concept are incorporated into the derived concept. As a result, the derived concept imposes its own requirements plus all the requirements coming from its base concepts.

You can also derive an abstract concept from another, non-abstract concept. The concept doesn’t become any less abstract because of that, but it carries an additional rule: any partial specialization that wants to redefine the concept for a specific type must also derive from the same set of concepts as the master definition does.

There are two main things that make concepts different than templates:

  • concepts must always have exactly one master concept definition provided
  • partial specializations are only allowed (and required) to cover particular requirements, as defined in the master definition

The first statement means that you cannot “announce a concept and define it later” – once you have created an “announced” (that is, abstract) version, you can later only add specializations.

The second one means that you can neither add requirements in a concept specialization nor remove them (every single requirement must be covered) – a concept specialization does not “derive” any requirements from the master definition.

RATIONALE: You may wonder why lacking definitions cannot be taken as defaults from the master definition. There are two reasons why this shouldn’t be done:

  • The goal of this proposal is to make concepts as easy to understand as templates. Since template partial specializations do not silently inherit the contents of the master definition, neither should concepts.
  • Even though it is sometimes required, it should be explicitly visible for the user that there are defaults used (or every requirement is covered otherwise)

However, as it is sometimes desirable that definitions from the base concept be “silently derived” by the specialization, here are possible ways to provide a syntax for that:

1. “self derivation”:

concept<> Container<MyContainer>: Container<MyContainer>
{
   size_t MyContainer::size() { return this->numberItems(); }
};

Here the concept derives – not from “itself”, though, but from “the instance of this concept that would be generated from the master definition” (because until this partial specialization is complete, that is what this concept instance resolves to).

2. using “default” keyword as derived:

concept<> Container<MyContainer>: default { ... };

3. using “default” keyword in the specialization:

concept<> Container<MyContainer>
{
    using default;
    ...
};

The self-deriving syntax has one important advantage: it suggests to the user that there is a derivation, meaning that the deriving entity incorporates definitions from the derived entity, just like classes do. Some form of deriving syntax is still worth considering. On the other hand, it may end up allowing concepts to be first used and later specialized, which would result in two different concept definitions for a particular type under the same name (and this will most likely lead to problems). It would then probably be better to issue an error when a specialization for a type is provided explicitly after one was already generated.

c. concepts with variadic parameters

Concepts may have also variadic parameters:

concept<class T, class Arg, class... Args> TypeIsOneOf
{
    requires IsSame<T, Arg> or TypeIsOneOf<T, Args...>;
}

How do we make a “termination” version, given that concepts cannot be overloaded? I have no definite idea, but here is a proposal (it may later be considered for variadic templates, too):

Auto-expandable expressions (those that use “…”) are treated in a special way: they are replaced with a special concept instantiation statement – let’s name it “Nevermind” – when “Args…” resolves to nothing. Then:

  • A and Nevermind resolves to A
  • A or Nevermind resolves to A
  • not Nevermind resolves to Nevermind
  • single Nevermind, as resolved from some expression, resolves to nothing
  • requires {…something that resolves to Nevermind…} resolves to nothing
  • Any concept or template instantiated with the use of Nevermind resolves to Nevermind

In other words, the part that contains “Args…”, when it resolves to nothing in a particular case, disappears as a whole and drags the preceding operator with it (and every higher-level expression that contained the resulting Nevermind). If the expression was required to be nonempty for some reason, then an error is reported when it resolves to nothing. So the above concept, when used as:

requires TypeIsOneOf<T, int, char, bool>

will resolve to:

requires IsSame<T, int> or IsSame<T, char> or IsSame<T, bool>
  /* or Nevermind */ ;

3. Code synthesis and type patching

For example: how do we state that external functions begin() and end() are required, while still accepting methods with these names? There are two ways to accomplish that:

  • harder, first define a concept that will match a type that contains begin() and end() methods, then define a partial specialization for such a concept
  • easier, define requirements together with default implementation

Let’s try the easier one first:

concept<class T> Sequence
{
    typename iterator = T::iterator;
    iterator begin(T& c) { return c.begin(); }
    iterator end(T& c) { return c.end(); }
}

This way we have provided a concept with a default implementation. It means that this concept is satisfied for every type for which the external function begin is provided (and the others defined in the concept). However, if there is no such function defined, the compiler will try to instantiate the default implementation. If the instantiation succeeds, the concept is considered satisfied, and – pay attention – the “fix” is applied to the type that was required to satisfy the concept! It means, for example, that:

template <class C, class T>
requires Sequence<C>,
requires Convertible<T, typename C::value_type>
bool has(C& cont, const T& value )
{
   for ( typename C::iterator i = begin( cont ); i != end( cont ); ++i )
      if ( *i == value )
         return true;
   return false;
}

… in this example it was possible to use begin( cont ) and end( cont ), even though there are no begin(C) or end(C) functions! Of course, only inside this function, because only in this function has the C type had the requirements of the Sequence<C> concept imposed on it (together with the patches, by the way). In particular, C here is not exactly the type of the first argument as deduced by the template, but C with the appropriate type patches, as provided by the Sequence<C> concept.

It means that inside an entity that required a particular type to satisfy a particular concept, this type additionally carries all the patches that the concept defines. For types that don’t satisfy the concept, the template entity cannot be instantiated.

The harder way will be shown later, together with overloading.

Let’s try another case. As you know, in the part of the C++ standard library formerly called STL (I personally call it “CIA” – Containers, Iterators, Algorithms – as it still needs to be distinguished from the rest of the standard library), there are the concepts of InputIterator and OutputIterator, which have one common trait: they are both “single-pass iterators”. The normal way to use a single-pass iterator is the “*x++” instruction. What is not allowed for them (and is allowed only for multi-pass iterators) is to use only one of these operators, that is, to use the * operator without doing ++ in the same instruction (in particular, it’s not allowed to perform another * after a previous * without ++ in between, and likewise for subsequent ++).

However, the method used to achieve this is awkward: the * operator simply does the whole “pass”, and ++ does nothing. This makes it practically possible to do *x twice, although it behaves as if *x++ had been done. It would therefore be desirable that these two things, since they must not be separated, be done by exactly one instruction:

 x.input();

It would be much better, then, if InputIterator and OutputIterator defined only input() and output() methods respectively (and not the * and ++ operators). This would be:

concept<class T> InputIterator
{
    typename value_type = T::value_type;
    value_type T::input() { return *(*this)++; }
};

concept<class T> OutputIterator
{
    typename value_type = T::value_type;
    void T::output(const value_type& val) { *(*this)++ = val; }
};

The << and >> operators might also be good for this, although the >> operator gives no chance to return the result by value; on the other hand it would make the std::cout object satisfy the requirements of OutputIterator as well.

You can guess that now only the purely single-pass iterators should have the input/output methods defined. For the others it’s just enough that they define * and ++ operators.

It would be a nice idea to have such a change in the standard library – however, the current algorithms that are allowed to work on single-pass iterators (most of them: for_each, copy*, transform*, merge*, but not sort) would have to be changed. Such a change is possible in the standard library, although the * and ++ operators should still be provided, with an annotation that they are obsolete, so that user algorithms working on single-pass iterators can be adjusted to the new form. (A sketch of such an adjusted algorithm follows below.)
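
A sketch of how one such algorithm could look, written in the proposed (non-standard) concept syntax and using the hypothetical input()/output() methods instead of * and ++; it also assumes the iterator pair can be compared for inequality, which the InputIterator concept above does not spell out:

template<class In, class Out>
requires InputIterator<In>,
requires OutputIterator<Out>
Out copy(In first, In last, Out result)
{
    while (first != last)
        result.output(first.input());   // one instruction performs the whole single pass
    return result;
}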

Thanks to concepts, there will be no need to provide input() and output() methods separately for multi-pass iterators.

4. Lesser concept satisfaction requirement: desires

One of the main reasons concepts were designed was to get better error messages. That’s what we often want; it doesn’t automatically mean that we want to force a type to satisfy the whole concept, since we may not use all the features the concept imposes. In practice we can live with some definitions lacking – for example, we need the type to provide a copying operator=, but not necessarily a copy constructor. We have a CopyConstructible concept that contains both, and we’d like to be able to use it. However, we don’t really want the type to be CopyConstructible; we just use CopyConstructible to check for the copying operator=. Of course, we could always define a new concept containing only operator=, but that sounds like the usual excuse of lazy language designers: “well, you can easily achieve that by writing a new class of 200 lines”. The user should be able to get a fragmentary match, so that existing concepts can be reused. (A sketch follows below.)
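
A sketch of that scenario, in the proposed syntax (the function itself is hypothetical, and CopyConstructible is assumed to be a predefined concept containing both the copy constructor and the copying operator=): only the assignment part is actually exercised, so “desires” lets the existing concept be reused without forcing the full requirement set on T:

template<class T>
desires CopyConstructible<T>
void refill(T* first, T* last, const T& value)
{
    for (T* p = first; p != last; ++p)
        *p = value;        // only operator= is used; a missing copy constructor
                           // would not be reported against T
}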

When you use “desires” instead of “requires”, you still get the same simple and good error messages in case of an error. When the compiler compiles the template entity, it checks which features of the type are actually used and matches them against the “desired” concept; a new unnamed concept is then created, for this particular case, consisting only of those features of the concept that are actually used. It means that the concept matching in this case always succeeds, although the entity is still allowed to use features of the type not covered by the “desired” concept. If a required feature is not covered by the type, then:

  • if the feature is not provided by the concept, a usual “very long and bloated” error message is printed
  • if the feature is provided by the concept, the compiler complains that the type does not satisfy the desired concept of specified name (and some feature of it in particular)

Note, however, that this weak concept matching only means, in practice, that if the type does not provide a feature being used, the error message refers to the uncovered concept rather than just showing which feature is lacking (usually in a long and bloated message). It does not provide the other concept features, such as:

  • overloading by concept type
  • patching the type with additional definitions from the concept (even if the entity uses them)

It means, for example, that if you “desire” an InputIterator as shown in the example above, the entity uses x.input(), and some multi-pass iterator is passed as x, it won’t work, because the mapping from x.input() to *x++ will not be created. The compiler will show an error message that the concept is not satisfied, pointing at this method.

TO CONSIDER: use “desires explicit” to apply only the type fixes, as long as their definitions compile in the particular case. Then the link from x.input() to *x++ would be created as long as * and ++ are defined. Or, more simply, maybe this should be the default behaviour – that is, even with “desires”, the type fixes should be applied. This declaration should still not be allowed to distinguish type parameters for overloading, though.

5. Overloading and partial specialization

Overloading is simple:

template<class Iter>
requires RandomAccessIterator<Iter>
void sort( const Iter& begin, const Iter& end );

template<class Container, class Function>
requires RandomAccessContainer<Container>,
requires CallableAs<Function, bool(typename Container::value_type)>
void sort( Container& cont, Function predicate );

Overload resolution is possible, even for template-parameter types, as long as the requirements imposed on these types are mutually exclusive for at least one argument position. In this particular case both arguments have types whose concepts are mutually exclusive, although theoretically it’s not impossible for a type to both satisfy CallableAs with the given signature and be an iterator (it would be stupid, though).

A type on which a requirement was imposed becomes a kind of new type. That’s why it can be treated as if it were a separate type, though only inside the template that uses it.

Now let’s try similar things with templates’ partial specialization. Normally we define a partial specialization the following way:

template<class T>
class SomeClass<T*>
{
...
};

which means that we provide a specific definition of the SomeClass template for the case where its argument is of type T*, for whatever type T. We can do the same with types that satisfy a particular requirement:

template<class T>
requires SomeConcept<T>
class SomeClass<T>
{
  ...
};

or, simpler:

template<class T>
class SomeClass<typename SomeConcept<T>> // such T that is SomeConcept
{
  ...
};

Of course, please note that the T type used to specialize the SomeClass template is not the same T as in the first line. The latter T is a T that has had the requirements defined in SomeConcept imposed on it, including default implementations. Either way, it’s a definition of a partial specialization only for types that satisfy SomeConcept.

Just as you can provide this additional thing for partial specializations of a template, you can do the same with partial specializations of concepts:

concept<class T> SomeConcept<T>
requires AnotherConcept<T>
{
 ...
};

In this particular case, “requires” does not mean that the concept (specialization) being defined imposes AnotherConcept on T, but that the concept specialization being defined concerns only such types that already satisfy AnotherConcept. All requirements imposed by the concept being defined are always defined exclusively between { and }, plus the requirements taken over through concept derivation.

Now we can try the harder way of providing the information about satisfying the Sequence concept for classes that have begin() and end() methods. First, we define a concept that detects whether the given class has begin() and end() methods.

concept<class T> SequenceClass
{
	typename iterator;

	iterator T::begin();
	iterator T::end();
};

Having that, we can make a specialization of Sequence concept:

concept<class T> Sequence<T>
requires SequenceClass<T>
{
	typename iterator;

	iterator begin(T t) { return t.begin(); }
	iterator end(T t) { return t.end(); }
};

Actually, we did it exactly the same way, as we still needed to provide the implementation that shows how the concept is going to be satisfied. The main difference is that this particular specialization is provided only for classes T that satisfy SequenceClass<T>, not for every possible class for which the default implementation happens to compile.
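
As a usage sketch (hypothetical function name), any class with begin() and end() methods – say, a standard container – now satisfies Sequence through SequenceClass and can be passed here directly:

template<class C>
requires Sequence<C>
bool is_empty( C& cont )
{
    // begin()/end() come from the type interface supplied by Sequence
    return begin( cont ) == end( cont );
}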

6. Check by usage

I’m not sure whether such a feature is a good idea, but it may prove useful. Instead of providing a definition that should be replicated in the user’s code, we can define expressions that the compiler will try to evaluate. The expression to be evaluated also has restrictions on its result type:

  • if the return type is void, it only needs to compile successfully
  • if the return type is bool, it must compile successfully, be evaluable at compile time, and evaluate to true
  • other types are not allowed

For example:
concept<class T> Sequence
{
	typename iterator;

	requires { (void)begin( T() ); }
	requires { (void)end( T() ); }
};

Please note, though, that the practical consequence of the above code is that the imposed requirements are greater than they seem at first glance. We have used the “T()” expression to magically create a value of type T. This magic, however, costs an additional requirement: T is now also required to have a constructor callable with no arguments.

You can prevent this, for example, by saying “begin( *((T*)0) )”. You can safely do it because this expression is forced to void type and so won’t be evaluated (such an expression is never evaluated at runtime anyway, whether bool or void). Using the comma operator would help, but in today’s C++ this operator (like any other operator :)) cannot be passed a void expression as an argument. Because of that, you are still allowed to write a sequence of instructions, of which the last one must evaluate to void type (this is not allowed for the bool case).
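
For illustration, here is the same Sequence concept written with that trick (a sketch; the only difference is that T is no longer implicitly required to be default-constructible):

concept<class T> Sequence
{
	typename iterator;

	requires { (void)begin( *((T*)0) ); }
	requires { (void)end( *((T*)0) ); }
};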

For an expression of boolean value, please note that the expression must be evaluable at compile time. So if you want to “require” an expression that returns a boolean value – even if you know it will always evaluate to true – but it calls some function that is not constexpr, the compilation will fail. You’d better force (void) type for such an expression then.

A boolean expression can also be written the following way:

requires { std::has_trivial_destructor<T>::value; }

This expression can be evaluated at compile time, so it matches the requirement. It’s a requirement similar to that of static_assert.

Check by usage is easier to define, but leaves more room for mistakes. For requirements defined as function or method declarations there are several equivalences and additional rules:

  • an argument or return type may be “auto”, which means that the exact type in this place is not imposed. Note, though, that a concept using this feature is not allowed for constrained matching (see below) because it is then not possible to determine whether the user entity is using the type correctly
  • argument types are normally allowed to undergo all defined conversions; if this is not desired, the argument should be preceded by “explicit”. It means, for example, that if the requirement is “f(int,T)”, then the existence of “f(char,T)” satisfies the requirement; you should use “f(explicit int, T)” to demand that the type be exactly int (see the sketch after this list)
  • the same applies to the return type: if the actual return type can be converted to the return type in the concept declaration, the concept is considered satisfied. As with arguments, you can forbid conversions with the “explicit” modifier. This additionally means that if the requirement declares the return type as void, the matching function’s return type may be anything
  • a constructor statement of the form A::A(B) requires that type A have a constructor with one argument of type B, or that B have a conversion operator to type A. If it is preceded by “explicit”, then only a one-argument constructor of A is allowed (whether implicit or explicit – here “explicit” does not exclude an implicit A constructor; it excludes a conversion operator in B from satisfying the requirement)
  • const T& is equivalent to T in argument types, although note that if T has simultaneously been defined as non-copyable, this causes a conflict. In other words, if T is not required to be copyable, const T& or T&& must be used as the argument type
  • T& has no equivalences
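
A small sketch of these modifiers inside a requirement declaration (hypothetical concept and function names):

concept<class T> Writable
{
    // satisfied by any f whose first parameter merely accepts an int after conversions
    void f(int, T);

    // satisfied only by a g taking exactly int as its first parameter
    void g(explicit int, T);

    // the first argument type and the return type are left open entirely
    auto h(auto, T);
};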

7. Constrained matching

Concept matching can additionally state that the template entity using the concept is only allowed to use those features of the type that are defined in the concept. If a feature is not defined in the concept, the template entity is not allowed to use it, even though the type itself provides it. The difference between “requires” and “constrains” is then that in the first case the template entity can still use features that were not described in the concept (although it will end up with the “usual bloated” error message if the type doesn’t provide them). Here is a summary of what happens when there are problems with matching the concept and finding a feature of the type, for all three keywords, plus one case – when the type is not matched against any concept at all:

1. When a template entity uses a feature that is provided by:

  • neither the concept nor the type:
      – no concept / desires / requires: the usual error message
      – constrains: error: feature not defined in the concept
  • the concept, but not the type:
      – no concept: the usual error message
      – desires / requires / constrains: error: type doesn’t match the concept
  • the type, but not the concept:
      – no concept / desires / requires: not a problem
      – constrains: error: feature not defined in the concept

2. When the type simply doesn’t match the concept:

  • no concept: N/A
  • desires: not a problem
  • requires / constrains: error: type doesn’t match the concept

One case is not covered: when we’d like to forbid using features that weren’t defined in the concept (as constrains does), while not requiring that the type match the whole concept, but only that it provide those of its features that are actually used (as desires does).
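
For illustration, a minimal sketch of “constrains” used on a template entity (Comparable as defined later in this text provides only operator==; the keyword is assumed to sit in the same position as “requires”):

template<class T>
constrains Comparable<T>
bool same( const T& a, const T& b )
{
    return a == b;     // allowed: operator== is declared in the concept
    // return a < b;   // would be rejected: not declared in the concept
}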

8. Type interfaces and logical matching expressions

Let’s first explain one thing. Like templates, concepts are instantiated. When a template is instantiated, the result is a new class or function. When a concept is instantiated, the result is one or more type interfaces. Such a type interface is then used as a “patched” interface of the type on which the concept’s requirements were imposed. As you know, when there are forwarding definitions in a concept that provide fixes to the type (like an external begin(T) function that redirects to T::begin()), they are applied to the type depending on which kind of imposing was used:

  • requires: the type interface contains all things that are normally available for given type plus fixes provided by the concept
  • desires: the type interface contains all things that are normally available for given type, but not fixes provided by the concept
  • constrains: the type interface contains only things that have been defined in the concept, including fixes, but not things provided by the original type not covered by the concept (that is, elements of the type interface that haven’t been defined in the concept are removed from the resulting type interface)

Type interfaces may also be shared – one function may impose requirements on several different argument types, and each such argument type is provided its own part of the type interface. It is therefore not unusual that one concept provides multiple type interfaces at a time (similar to an overloaded function – multiple functions under one name). However, once a type interface has been provided for a particular type, no new type interface can be provided for the same type anymore. This would be something like creating two different typedefs with the same name. Because of that, if you want to create multiple requirements for the same type coming from multiple concepts, you must bind these concepts into a concept expression. It’s especially required if you want to state that a type should not match a concept.

You can then bind particular concepts into logical expressions using and, or and not. Note that this is only allowed for requires and constrains (not desires; and constrains may only be used as a defined requirement), and the expression has a strong influence on the resulting type. The full expression must be passed right after the requires keyword and, exceptionally, only the and, or and not keywords are allowed here, not the equivalent logical operators “&&”, “||” or “!”. For example:

requires not Assignable<T>

requires Assignable<T> and CopyConstructible<T>

(RATIONALE: using the &&, || and ! operators would suggest that the logical operations work on the C++ bool type. This comes from the C++ tradition, inherited from C; moreover, some coding standards disallow the word operators. The common-sense reading is that if the symbolic operators are used, the expression surely operates on boolean values, which would suggest that concepts themselves can be used as boolean expressions – which isn’t true. Because of that it is reasonable to reuse the already existing keywords intended for logical expressions: their use is quite rare today, and they compose better with these more high-level constructs.)

Two separate concepts applied to one type compose its type interface (see below) together, so it must be explicitly stated how it is built – if you use two separate concept matches for the same type, you create two different type interfaces for that type, which causes a conflict. In other words, once you have used some type in one “requires” statement, it can’t be used in any other requires statement in the same entity anymore. If you use concepts with more than one argument, and the types are not evenly distributed between the concepts used, you’ll probably have to compose a complex logical expression to express it:

  requires MutuallyExclusive<A, B> and CopyConvertible<B,C>
       and ReverseIterable<A, C>;

It’s not unusual that one “requires” statement provides three separate type interfaces as a result. It’s not allowed, however, that two separate statements provide two separate sets of requirements for the same type.

The situation gets complicated when the type is a class that defines some type inside it. If a concept does not impose additional requirements on a type defined inside it (and it usually doesn’t), then using the internal type is a separate thing from using the external type. It means, for example, that you can define a concept match for T in one requires statement and for T::value_type in another. This is allowed, but only when the concept being matched for T does not require that T have value_type and additionally impose requirements on it. If it does, then the concept imposed on T is implicitly imposed also on T::value_type (because it’s covered inside the concept definition), so adding a separate concept matching statement for T::value_type will also result in a conflict (and you need to compose them into a logical expression if you want to resolve it).
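
A short sketch of the allowed case (Sequence and Comparable as defined elsewhere in this text; Sequence says nothing about T::value_type, so the two statements target different types and do not conflict):

template<class T>
requires Sequence<T>,
requires Comparable<typename T::value_type>
bool contains( T& cont, const typename T::value_type& v );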

RATIONALE: Why require that a type be used only once? The reason is to keep the user code from becoming bloated and having too many hidden and implicit interconnections (the same reasoning as not allowing classes to be open). In other words, if users want to make bloated and messed-up concept interconnections, let them define it as a complex, long and unreadable expression tree – and if they’re unable to do it, good, because we have probably saved someone a week of work this way.

Of course, there is a special case when it is allowed: when two separate requires statements used for the same type happen to impose exactly the same concept (that is, you can impose the same concept multiple times – two separate type interfaces in one entity are allowed as long as they come out identical, just like two separate typedefs for the same name are allowed as long as they resolve to the same type). This is pointless for normal “requires” statements, of course, but may be helpful when there is some implicit requirement on an internal type.

Using things like “constrains not”, on the other hand, can be used to erase an unwanted feature from a type. Imagine, for example, that you have three classes:

  • A, that contains operator=
  • B, that contains operator= and assign()
  • C, that contains only assign()

Your entity uses operator= on whatever type is passed, and classes like A, B and C may be passed. You’d like to use the class’s operator=, but if the assign() method is provided, it should be preferred over a possible operator=.

You can’t do it the normal way because if you provide a concept that supplies a default implementation of operator= redirecting to assign(), then for class B the native operator= will be used, not assign(), as required. For this particular case you have to use something like this:

concept<class T> AssignOnly
{
     // default implementation: provide operator= by forwarding to assign()
     void T::operator=(auto x)
     { this->assign( x ); }
};

concept<class T> AssignOnly<T>
requires Assignable<T> and HasAssign<T> // operator= and .assign both present
{
     constrains not Assignable<T>; // erase the native operator=
     // now restore operator= as forwarding to assign()
     void T::operator=(auto x) { this->assign( x ); }
};

Here “auto” has been used, which just means that we don’t care what exactly the argument’s type is. Note that when defining the requirements, “requires” may also be followed by a requirement expression, which is effectively the same as deriving a requirement. You should only remember that, by the same rule, there can’t be two separate requirement expressions whose concepts impose requirements on the same type.

Of course, the truth is that in this particular case it would have been simpler to make the user entity use assign() instead of operator= and provide a concept that only defines assign() with a default redirection to operator=. That would work here. But there may still be more complicated cases where various conditions have to be imposed on various parts of the type in a concept, and then such a simple inversion would not be possible.
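
For completeness, a sketch of that simpler inversion (hypothetical concept name): the entity calls assign(), and the concept supplies assign() as a default forwarding to operator= for types that lack it; a class like B then uses its own assign(), while A gets the forwarded one.

concept<class T> Assigning
{
    void T::assign(auto x) { *this = x; }
};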

Concept expressions may also be used to create a combined concept with a simplified syntax:

concept<class T> ValueType:  DefaultConstructible<T>
                            and CopyConstructible<T>
                            and CopyAssignable<T> {};

Note that this is a syntax for deriving concepts, and the rule of not using the same type in separate expressions applies here as well. Note also that you can create a concept that is abstract and still derives from others.

9. Explicit type interfaces and simplified matching syntax

Normally, without concepts, the type produced when instantiating a class template isn’t exactly what the template defines. All internal parts (fields) are created, but of all the features (that is, methods) only those that are actually used are instantiated (the SFINAE rule). It means that the resulting instance of a template is a specific set of definitions composed out of what has actually been used.

It’s not exactly the same when a concept is imposed on the type. In this case, all things needed to satisfy the concept are instantiated, even if the template entity is not going to use some of them. For the remaining features, however, the selective instantiation still applies.

A type interface is something that comprises a full definition of a type’s features, that is, all natural features of the type plus those additionally added to it from the concept. When you define a template and a concept has been imposed on one of the template arguments, this type argument becomes a type interface. It doesn’t matter whether the concept has only one parameter or more. Even if the type is just one of several parameters of a concept, a successful match of the concept means that several additional statements may have been imposed on this type.

It’s also allowed to create such a composed type – the type itself plus all additional definitions provided by the concept – explicitly:

using TypeInfo = std::type_info
         requires LessThanComparable<std::type_info>;

This syntax is required because there may be a case to say something like:

   requires Convertible<MyType, typename OtherType::value_type>;

However, exceptionally for concepts that yield one type interface (usually the same thing as a concept with one parameter – I’m not sure whether there are cases where that’s not true – note that concepts cannot be overloaded nor have default parameters), we have a simplified syntax: we can use the “instantiated” concept as a type:

using TypeInfo = typename LessThanComparable<std::type_info>;

or, the old style:

typedef typename LessThanComparable<std::type_info> TypeInfo;

(CONSIDER: it would be nice to provide this “patched type” availability by declaring it inside the concept. For example, by using the following syntax:

concept<typename T> GoodLooking {
    using type = concept typename T;
    ...
}

you can make the patched type T exposed as GoodLooking<SomeType>::type instead of typename GoodLooking<SomeType>).

Note, though, that the precise requirement for being able to use this syntax is that the concept imposes requirements on only one type at a time (not that it has exactly one argument); in other words, the imposed requirement must yield exactly one type interface.

Let’s repeat also the example from the beginning:

template <class C, class T>
requires Convertible<T, typename C::value_type>
bool has(typename Sequence<C>& cont, const T& value )
{
   for ( C::iterator i = begin( cont ); i != end( cont ); ++i )
      if ( *i == value )
         return true;
   return false;
}

Similarly it can be used with template partial specializations:

template<class T>
class SomeClass<typename SomeConcept<T>>
{
  ...
};

And, of course, it also works with C++14 generic lambdas:

auto L = [](const typename Integer<auto>& x,
             typename Integer<auto>& y)
{ return x + y; };

Please note that the type interface may differ, depending on the concept matching keyword:

  • desires: the type interface is exactly the same as the original type
  • requires: the type interface contains all of that the type provides plus possibly additional declarations provided by the concept
  • constrains: the type interface contains only those declarations that are mentioned in the concept, in a form that is actually provided by the original type (features that do not match anything in the concept are removed)

10. Concept as a code template synthesizer

Normally a template may only be a function template or a class template. When instantiating, you can only get a single class or a single function out of the template. Sometimes, though, you need to generate a whole set of functions or classes with one simple, magic, one-shot definition. In current C++ the only way to achieve this is to use #define macros.

Concepts have an additional feature that allows them to bring the fixes made by a concept into any other scope. Normally such fixes are available only inside the template entity that required the concept to be satisfied. The following statement brings the fixes into the current scope – that is, it makes std::type_info, in the scope where this declaration appears, the same as the TypeInfo declared above:

using LessThanComparable<std::type_info>;

After this is specified (see LessThanComparable in the beginning), you can safely do (until the scope with this “using” ends):

 if ( typeid(X) < typeid(Y) ) ...

Using this method you can also provide a whole set of functions for types whose definitions you cannot change. This includes operators, such as those provided by std::rel_ops. Making the std::rel_ops namespace visible means that, from that declaration on, these operators are available for all types that merely have == and < defined, which is not necessarily what you want. With the “using concept” statement you can provide this set of operators for exactly one specified type. This is a very useful feature for enums:

enum eSize { SMALL, MEDIUM, LARGE };
using OrderableEnum<eSize> // adds operator <
  and EquivalentOperators<eSize>; // adds rest of the operators
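
For illustration, within the scope of that declaration something like this becomes valid (a sketch; <= is assumed to be among the operators added by EquivalentOperators):

eSize s = MEDIUM;
if ( s <= LARGE ) ... // the operators are injected for eSize only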

11. Name clash problems

Of course, names of the types and functions being defined are assigned to a particular namespace. But the concept may be defined in a different namespace than the one in which it is used by a template entity. In practice it means that the names of functions or classes used in the concept’s requirement statements are not bound to the namespace in which the concept was defined.

Names are, then, resolved in the namespace where they are required. It means that the concept is “instantiated” in the namespace in which it is used. The exact namespace in which the statements are instantiated doesn’t matter much; what matters is whether, at the location where the concept is used, the particular functions are already available in the current namespace – no matter whether they are defined there or imported from another one. On the other hand, you can always put a namespace-qualified name in the concept, and in this case the function is looked up in exactly that namespace. Note that this is a case where “constrains” can be made use of: if you wanted to use, for example, std::begin because this is what the concept mentions, but you mistakenly wrote just “begin” and did not add “using namespace std”, the constraint won’t let you use it, even if a begin() function defined for this type exists in the current namespace.
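
A short sketch of the qualified-name case (hypothetical concept name): with std::begin spelled out, the requirement refers to that namespace specifically, and under “constrains” a begin() from the current namespace would not be usable through this concept.

concept<class T> StdIterable
{
    typename iterator;

    iterator std::begin(T& t);
    iterator std::end(T& t);
};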

12. Things that are not going to be supported

One of the main things that is not supported is explicit concepts, that is, concepts that are not satisfied by default unless explicitly specified in a concept map. This was the default in the original proposal; here it is not supported at all.

This feature can be easily achieved the following way: define an abstract concept with the name you’d like your concept to have, then define (empty) partial specializations for the types you explicitly declare to satisfy this concept (the “whitelist” method). Or the opposite way – define an empty concept, which means that every type satisfies it, then define a partial specialization as an abstract concept for types that, exceptionally, should not satisfy it (the “blacklist” method).

You want the whitelisted types to also meet additional requirements? What for? Isn’t it enough that you specify that the type should satisfy the concept? OK, if you really need this, you can use concept derivation: define a concept the normal way, then define another concept – the one you expect to be explicit – as abstract, and derive it from the previous one. This way you’ll have an abstract concept, which can only be satisfied by explicit declaration, and which additionally imposes some requirements on the type.

I also haven’t described any kind of “axiom” in this proposal. I’m not quite convinced that such a feature is worth implementing. But I think it could simply be added to this proposal without change, in its original form. The only problem is that I’m not quite convinced that it should be a part of concepts. It should rather be a part of the class definition, so that the class type itself defines what simplifications the compiler may use. Concepts may then mention such an axiom in their requirements, but for concepts this would still be derived information.

I also haven’t added anything about “late_check”; however, I think this feature may be accomplished with the “desires” keyword, which already means that requirements may be imposed on the underlying expression, but not as a must. I haven’t done any research on that yet.

13. Additional things to be considered

One of the important additional changes to consider is allowing multiple default implementations for a concept’s feature; the first one that works would be selected. For example:

concept<class T>
LessThanComparable
{
    bool T::operator<(const T& oth)
       { return this->before( oth ); }
    or { return this->precedes( oth ); }
};

I’m not convinced that it makes sense. The same result (and a more readable one) can be achieved by using different concepts and making concept maps based on partial specializations for classes that satisfy some other concept.

This brings back the discussion about accidental concept matching: the fact that a class has a method with some name and matching argument types need not mean that this method can be used as a replacement for a lacking feature. It was better, for example, to provide LessThanComparable specifically for std::type_info, because we know this type has the before() method intended for ordering, so for the rules of CIA (STL) it is equivalent to operator<. But it doesn’t mean that any other type that has a before() method, which accidentally returns bool and accepts an object of the same type, carries that meaning.

In practice, if we want to be sure about correct feature recognition, we must rely on “common sense” conventions. Operators like == or < are common sense. Operator << meaning “send to” and >> meaning “receive from” (rather than shift left or shift right) can also be considered common sense. But operator >= for the operation “push on the stack” is not common sense.

However, this discussion may unnecessarily bring back doubts about whether it makes sense to blindly match a concept by method name, while this has already proven to be a minor problem. First, concepts are usually not based on just one method. Second, concepts should be built to help developers, not to protect them against everything. So here, too, it’s better to use common sense when defining concepts: a before() method meaning “precedes” is not common sense – common sense for this operation is operator<. Likewise, common sense is for a size() method to return the number of items the object holds, while numberItems() or length() or anything alike is naming specific to a particular data type, so a concept specialization should be provided for it.

Another thing to consider is making it possible to create requirements “on the fly” when needed for some other definition. For example, you’d like to define a concept specialization only for types that have operator=, but you don’t want to define a separate concept whose only job is to detect whether operator= exists for this type. Then, instead of “requires Assignable<T>”, you’d say “requires { auto T::operator=(auto); }”.
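
In sketch form, mirroring the partial specialization example from section 5:

// a specialization restricted, on the fly, to types that have a copy
// assignment operator, without introducing a separate named concept
template<class T>
requires { auto T::operator=(auto); }
class SomeClass<T>
{
  ...
};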

I’m ambivalent about this. On the one hand, it will allow users to create very bloated definitions. On the other hand, without this feature users will again be forced to “spell out a name and then use it consistently”, which always works against users. That’s why “auto” is such a great feature in C++11: it doesn’t require users to define typedefs and spell out a name.

There is one more thing that comes to my mind. Possibly there should be some way to define a concept for some type, and then some partial specializations, while disallowing this definition from being open to further extensions. For example, it would be nice to be able to define an IsSame concept that cannot later be specialized in a way not intended by the original concept definition – say, to declare that char is the same as int. Similarly, there should be some standard concept stating that the type is one of the built-in integer types. It should not be possible to later extend such a concept to other types, because that way a user could break the basic statement of the concept’s logical definition.

The problem is that the concept together with its specializations can’t be defined as a closed entity (like a class is). The only feasible syntax that comes to my mind is that after all specializations have been provided, you should say at the end:

concept<> BuiltinInteger final;

or something like that. From this definition on, no additional partial specialization of a concept may be provided.

I know this syntax is awkward, so maybe someone else will come up with a better proposal. Note that any better proposal requires that partial specializations can alternatively be provided inside the concept definition as well. If such a syntax can be proposed and well explained, then the concept can be marked final and any external concept specialization will be rejected. The problem is that such an addition increases the complexity of the solution – mostly because it creates yet another difference from template partial specializations.

One more thing that came to my mind is that a concept may additionally help during development when an interface is going to change. For example, in current C++11 we can have code that is independent of a function’s return type:

auto n = GetFreeSpace(device);

You expect that the return type may change in the future or be different in other code configurations. By using auto you make your code independent of the possible return type. However, you still would like to use this value somehow, and you want to be sure that if your code wouldn’t work because some basic assumption about this return type is violated, it will be detected early. So we can say instead:

Integer<auto> n = GetFreeSpace(device);

This way it’s still auto, but it’s additionally checked whether this type satisfies the requirements imposed by the Integer concept. Similarly we could impose a requirement like PointerTo<T, P>, which checks whether T is a pointer to P – not only P*, but also unique_ptr<P> or shared_ptr<P>, and whatever other type declares itself a smart pointer.

14. Summary – can this be easily explained?

Simplicity is one of the most important things for any new feature to be added to C++. C++ is already a complicated language, and since it can’t attract people who simply don’t accept languages that complicated, it’s best to keep new features simple enough.

The explanation for concepts should start with examples of standard concepts, standard entities that use concepts, and how to interpret possible errors reported by the compiler. Then, how to do it yourself. Let’s begin with the well known Comparable:

concept<class T> Comparable
{
    bool operator==(T,T);
};

Objects must be comparable in order to be found. Because of that the find algorithm imposes this requirement on the value type:

template<class Iter>
requires InputIterator<Iter>,
requires Comparable<typename Iter::value_type>
Iter find( Iter from, Iter to, typename Iter::value_type val )
...

If you pass as ‘val’ something that doesn’t have operator ==, the compiler will complain that your type does not satisfy the Comparable concept because it doesn’t have operator== with specified signature – not that “bla bla bla (1000 characters of text) your value_type doesn’t have operator== with bla bla bla (1000 characters of text) signature”.

However, if you have to find an object of SomeClass, which can be compared for equality, but only using some EqualTo method, you can declare this specifically for this class, as a partial specialization:

concept<> Comparable<SomeClass>
{
    bool operator==(SomeClass a, SomeClass b) { return a.EqualTo(b); }
};

This makes objects of type SomeClass always comparable with operator==, as long as the comparison happens inside an entity that requested the Comparable concept be imposed on the type. Of course, you can alternatively define an external == operator for this type, but operator== is in the lucky position of being definable as a standalone function – which is not possible for all operators, let alone for methods. An additional advantage is that you don’t have to provide a global definition of ==, so you don’t leave garbage in other people’s code.
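
As a usage sketch (std::vector used just for illustration):

std::vector<SomeClass> items;
SomeClass wanted;

// find, as declared above, imposes Comparable on the value type; SomeClass
// satisfies it through the specialization, so == works inside find without
// a global operator== for SomeClass.
auto pos = find( items.begin(), items.end(), wanted );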

You can define your own class or function template, stating that you expect some type to satisfy a concept, just as find is shown above. If you have more than one requirement to impose on a type, however, you must bind them into one logical expression, using the ‘and’, ‘or’ and ‘not’ keywords.

Using a concept requirement imposed on a type, you can define more strictly what this type is, and distinguish it from the other types assigned to template parameters. This way you can also use concepts to make partial specializations of templates:

template<class T>
class Wrapper     // master definition (general)
{ ... };

template<class T>
class Wrapper<T*> // partial specialization for pointer types
{ ... }; 

template<class T>
requires ValueType<T> // partial specialization for ValueType
class Wrapper<T> { ...  };

// or simpler:
template<class T>
class Wrapper<ValueType<T>> { ... };

and overloading:

template<class T>
requires SequenceIterator<T>
pair<T,T> range( T a, T b )
{ return make_pair( a, b ); }

template<class T>
requires Integer<T>
integer_range<T> range( T a, T b )
{ return create_int_seq( a, b, 1 ); }

Satisfying a concept is enough to distinguish a type from another one that doesn’t satisfy it. However, overload resolution will fail if you pass a type that satisfies both requirements (from the two overloaded functions) simultaneously.

You can define requirements for types (besides things like nested types or constants inside the type) in your concept in two ways:

  1. Provide a declaration that describes what should be available for your type. In this case the requirement is considered satisfied if the provided feature can be called the same way
  2. Provide a requires {} statement with an expression that should be possible to perform on an object of this type, which can be either a void expression (then it’s only required to compile) or a compile-time constant boolean expression (which then also must evaluate to true)

Please note that incomplete types cannot be checked for satisfying the concept.

This feature has the following main purposes:

  • support overloading of function templates (for arguments that are of template parameter type)
  • provide better error messages
  • create ability to synthesize code templates (without preprocessor and code generator)

Currently that’s all for starters.

So I hope I have covered everything that is needed and expected from concepts in C++. It’s at least a good starting point.
