Defensive vs. Paranoid Programming

0. Introduction

Over the years I have encountered many flavors of programming style. I have already written about things like ridiculous checks for NULL, as well as about how “underdocumented” many libraries are. Among the positively perceived styles of programming there is one important one: Defensive Programming.

I have also already written about how checking for errors all the time, and “doing something” in response, results in code that is never really tested, and when the situation finally arises in which it runs, it fails to do anything that would help solve the problem.

Being defensive involves many different aspects and analyses. Simple-minded programmers use a highly simplified vision of it, and the result is something that the next reader of the source code (especially someone more experienced) perceives as “paranoid programming”. Simple-minded, because the code looks prepared for every slightest error that might occur during operation, while its author forgets what he is actually trying to prevent: some “predictable” runtime situation, or some completely unpredictable situation, that is, a bug. Well, it would be really nice to code something that would prevent a bug, but well… how can I put it… 🙂

1. A gun in a kid’s hands

The underlying layers of software – the kernel, the operating system, the libraries – are not only there to save you from writing overly complicated low-level code; they are also there to provide basic reliability (so that you don’t have to provide it yourself).

If you want to program on the lowest level, dealing directly with the hardware, there are still some domains in which you can do it – but in today’s software you’re rarely allowed to. In practice this happens when you develop the operating system itself, the kernel, or drivers, but not when you have to deliver a business solution. It’s not that there is some law behind it or anything like that. It is, to all appearances, a matter of trust. You can say that you don’t trust the underlying software layers to work without bugs – but then why should the customer trust you? In particular, when you replace an existing lower layer with your own code: do you think the customer will trust you more than somebody whose library is used by multiple software projects all over the world?

If that is so, then why do you worry whether the library is written properly and whether your software will behave correctly in case of a possible problem inside the library? Do you think your own software will be free of bugs? Most likely, if you replaced some library code with your hand-written code, you would at best make the same number of bugs, and usually more.

2. What is really defensive

When some part of the software you’re using – it may even be some other part of the software you’ve written yourself – may result in an error, the first and foremost thing to check is on which level a possible problem may occur. The mere fact that some function returns a special error value or throws an exception (the latter especially if you write in Java) doesn’t yet mean that this is something you should take care of. Some procedures return an error value because in some contexts they may be used on a very low level, or the API anticipates that the function may be implemented in a very unreliable environment as well as in completely reliable ones. So, if the error that would occur in some particular case is something that your underlying software layer should never report, treat the case where it did as something that has occurred outside normal conditions.

A short example from POSIX: the close() function, which closes a file descriptor, may return -1 in case of error. But this can happen essentially in one case: the descriptor refers to a device open for writing, the writes are buffered, and some data is still waiting in the buffer to be written to the device; it is going to be written when the buffer fills up after a subsequent write, on demand when you call fsync(), or – as in our case – before disconnecting from the device when you call close(). When writing the buffered data to the device fails, close() returns -1. But of course, it still closes the descriptor! This -1 doesn’t mean that “closing failed” – closing cannot fail! It means that if you anticipate a possible failure, you had better flush the buffers manually with fsync() and handle the error while you can still do something about it.

So, the conditions for close() to possibly return a failure are: the file is open for writing, the writes are buffered (which they usually are), and there is still some data in the buffer waiting to be written. This simply means that if you call fsync() first (and it succeeded), then the following close() will always return 0 – there is no way for it to do otherwise. Likewise when the file is not open for writing at all (only for reading, for example) – in that case close() will always return 0.
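To make this concrete, here is a minimal sketch (the function name is mine, and I assume a POSIX descriptor open for writing): flush explicitly first, so that a later failure of close() can no longer mean “pending data could not be written”.

#include <unistd.h>
#include <cstdio>

bool sync_and_close(int fd)
{
    if (fsync(fd) == -1) {
        std::perror("fsync");  // the data may not have reached the device
        close(fd);             // the descriptor is released regardless
        return false;
    }
    if (close(fd) == -1) {
        // fsync() succeeded, so this falls into "should never happen":
        // treat it as a broken runtime, not as a recoverable error
        std::perror("close");
        return false;
    }
    return true;
}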

Of course, if you read the man page for close(), you’ll see that it is a common error to ignore its return value – but, at the same time, close() in /usr/include/unistd.h is somehow not declared “__wur” (a shortcut for gcc’s specific __attribute__((warn_unused_result))), as read() is, for example. That’s because there are only specific situations in which close() may return an error.

So, there can be two completely different reasons why close() would return -1. The first is that the final buffer flush it performs has failed. But if you ensure that this cannot be the reason, then -1 from close() means something completely different. It simply means that the operating system has gone out of its mind. There are some theoretical possibilities for this to happen – for example, the file has been reconfigured to be open for writing although you didn’t request it. I say “theoretical” because I can imagine it happening. The only trick is that the system’s basic reliability guarantees state that it never will.

Another example: you’re opening a file. Wait – not opening yet. You first check the file with the “stat” function. You confirm that the file exists and that you have the rights to open it, so opening this file should succeed. So, you open the file just after you have made sure of that. If the opening function reports an error, it can now be one of two things:

  • the file has been deleted (or its rights changed) in between
  • there was some very serious system error

If you want to distinguish these cases, you can call “stat” again. If the second call to “stat” proves that the file has indeed been deleted or its rights changed, you can state that it simply didn’t exist. Otherwise we have the case of a system error.
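A sketch of this two-step diagnosis (the function name is mine):

#include <sys/stat.h>
#include <fcntl.h>
#include <cerrno>
#include <cstring>
#include <cstdio>

int open_checked(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd != -1)
        return fd;

    // open() failed; a second stat() tells us which of the two cases we have
    struct stat sb;
    if (stat(path, &sb) == -1)
        std::fprintf(stderr, "%s: vanished or rights changed (%s)\n",
                     path, std::strerror(errno));
    else
        std::fprintf(stderr, "%s: exists and looks accessible, yet open()"
                     " failed - a serious system error\n", path);
    return -1;
}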

That’s not all. This file might have meant various things. It might be, for example, a file like “/etc/passwd” or “/lib/ld-linux.so.2” on Linux – a file that must exist, or else there is some serious system damage.

The point is that the inability to open the file can mean different things in each particular case, so there can be no universal advice on what to do when a file can’t be opened. On one hand, you can state that this condition, combined with some other call, is the only thing that can give you the full information. On the other hand – why should you care? If the file was vital for your program to continue, you have to crash. If not, just state quickly that the file can’t be opened and continue without it. You may want to log this to a log file, you may want to display the information to the user – but whether to do so should also be decided explicitly, by knowing what the file really means to the program. None of these things has to be done “always”. Even if you have determined that this is a serious system error, and especially if that is only one of the possibilities, displaying it to the user is the wrong thing to do. For the user it doesn’t matter whether your program crashed because the file /x/y/z did not exist, or because it was killed by a SIGSEGV signal, or whatever else you’d like to report.

Being really defensive means, first, realizing on what level you are. Once you realize that, you can decide whether the runtime condition that failed is just one of the alternative ways for the runtime to work, or something that should never occur. In the second case you should crash. But there is also a third case.

3. The Unpredicted

There’s always one more group of error handling: checking for a condition that is explicitly declared never to occur.

For example: there is a function which returns a pointer to an object. The objects are collected in an array, and there is always an object at position 0. For index values greater than 0 that exist in the array, the object at that position is returned; otherwise it’s the object at position 0.

This function, then, has it in its definition that it never returns NULL. And it really never does, according to its definition. What’s the point, then, of checking whether the object returned by this function is NULL?
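Sketched in code (all names are mine), the function could look like this:

#include <vector>
#include <cstddef>

struct Object { /* ... */ };

// There is always an object at position 0, so by definition - not by luck -
// this function can never return a null pointer.
Object* object_at(std::vector<Object*>& objects, std::size_t i)
{
    return (i < objects.size()) ? objects[i] : objects[0];
}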

The first explanation is more or less: you never know. But what does that mean? Does it mean that this is an expected situation, where you are checking for one of the possible results of the function? Or do you rather expect the situation that is explicitly declared never to occur? Of course, it’s the second one. And if so, there can be only one reason for it to occur: the program has gone out of control. So, effectively, what you’re trying to do is keep under control the situation of the program getting out of control.

Do you see the absurdity of the situation now? And, to be clear: no, this does not put the program back under control. If in any context, under any basic conditions, you predict that the program can go out of control, then it could have happened in virtually any part of the program. In short, virtually anything may happen. I say “virtually” because the only things that can really happen are problems at the machine level, coming from the compromised reliability of some software layer. Anyway, even if we shrink this “virtually anything” to the things that are actually possible, it is still a situation where the program has gone out of control. If you consider that any of the things that should never happen has happened, then the number of degrees of freedom in any modern application is so great that such a silly null-check decreases the probability of a crash by… less than 1%. Of a probability which is already very small, precisely because the situation is explicitly qualified as “should never happen”.

Some people say that even this may help in an extreme situation. Oh, really? Then why don’t you resign from your job and gamble instead? Why do you waste your time on something that can decrease the probability of a bug by less than 1%, instead of defining the full state conditions and planning your tests based on reduced degrees of freedom and full-coverage path analysis? Why don’t you simply focus on the bugs that are much more likely to occur?

4. The game and the worth of the candle

If you think you have a situation where there is some static probability of an unexpected result, and you need a tool to catch it at the development stage, use asserts. This is exactly the tool intended for that. But remember: assertions are only meant to work when you run your application for testing, in your incubator, in your environment, compiled with special debugging instrumentation. They are not meant to work in a production environment. Not because they may crash your program – only because they may harm the performance of your software (and if you think they wouldn’t, you can just check the same condition manually).
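A minimal sketch of that intended use (names are mine): the check exists in the development build only, and defining NDEBUG compiles it away, so production pays nothing for it.

#include <cassert>

double mean(const double* data, unsigned n)
{
    assert(data != 0 && n > 0);  // misuse is caught during testing only
    double sum = 0.0;
    for (unsigned i = 0; i < n; ++i)
        sum += data[i];
    return sum / n;
}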

But… well, there can be situations where there is a more than negligible probability of an error in a crucial resource! If so, you should crash.

Come on, this condition check doesn’t cost me much focus, doesn’t cost much work, it’s just one simple instruction – and even if it gives me only a negligible advantage, it’s still better to have it than not to have it…

Really? So, if this is only a simple instruction to write, then you probably don’t test this code, do you? Maybe the condition check is a simple instruction to write. But this instruction causes an additional situation that can potentially happen in your code. It’s not just the condition. It’s the condition plus all the actions that can be taken when it is satisfied, treated globally, together with all the aftermath. This is not just a side path of the program. This is usually a serious problem, and if you want the program to handle this situation and recover from it, it’s really complicated – and as such it must be very thoroughly tested: the program must be run several times in a controlled environment in which you can make this problem appear to happen (at least from the program’s point of view), to see how it recovers and whether it does so properly. Because, certainly, you’d like to prevent the crash of the whole application. You don’t write your error handling just to make it cause another error, do you?

So, ask yourself first: will your program really be able to recover from this error, and are you able to test whether it really does? If your answer is anything other than “yes, certainly” (even if it’s “most probably yes”), doing anything but crashing is a waste of time and resources. Now ask yourself additionally: this whole cost of preparing the recovery is paid only to prevent a problem that can occur with a probability similar to that of your customer being killed by a bomb explosion while using your software – is the game worth the candle?

And again, I remind you: it’s not that “bugs of too little probability should not be prevented”. It’s that “bugs cannot be prevented”. Yes, bugs cannot be prevented by a runtime condition check. If there’s something that could only have happened through a very, very unlucky bug, you can’t do much about it. I say “can’t do much” because you can lie to yourself that you can, and you may do something – but there’s still only a very small probability that the action you take will repair it. With much greater probability this action will cause more problems than the one you have caught (which is why it’s usually better to crash).

5. Unforeseen and unpredictable differ in level

Defensive programming is a method of programming that protects against problems that may occur in the particular way the software is used. It has completely nothing to do with problems that may occur due to the unreliability of any underlying software layer (well, actually we should just say “underlying layer”, because it’s not only about software – hardware can also cause problems, and if not the hardware, then the power supply, and so on). Defensive programming simply prevents the program from going out of control, taking for granted that things declared as “should never happen” really never happen; at most, things that users “should never do” are frequently done anyway. Defensive programming is simply “doing your homework” and writing your software decently, with the widest possible range where this rule applies.

Defensive programming means handling everything within the declared range of possibilities of how a function may work – not preventing the situation where the function did something not described in its documentation, or where the system failed to adhere to rules it guarantees to honor.

I’m pointing this out because there are many developers who think they are doing Defensive Programming, or just making their programs safe, while actually they are doing Paranoid Programming. This doesn’t make the program any better or any safer. It doesn’t make the program have fewer bugs, nor make it harder to destroy in case of a bug. Actually, there are just two practical results that Paranoid Programming can have on software production:

  • no observable value added at all, just time wasted on writing and possibly testing (if you’re lucky!) the useless code; sometimes even performance wasted on doing useless checks
  • making the program harder to test properly, and therefore harder to wipe bugs out of; sometimes it may even increase the vulnerability of the software

I’ll try to describe the second one in more detail.

6. Preventing self-destruction prevents self-destruction

Do you know what the purpose of a “self-destruction” button in a military device was? I admit I know it more from science fiction than from real life, but it still had a purpose: to protect the information and the resources, so that the enemy cannot use them to their advantage (and our disadvantage).

It’s more than obvious that no one is going to press the self-destruction button in daily use. That would be the last day, of course. It is to be used in the ultimate situation, in which all possible protection, recovery, defense, and whatever else can be undertaken have failed, and there is a serious threat that the enemy will reach our resources and use them, which will make them stronger or us more vulnerable. Even though we will be in serious trouble when we destroy our own ship, the whole organization may be in much deeper trouble if the enemy gets access to our resources. In the case of software it’s not even a matter of security: a program running in an environment whose resources contain false data can do completely crazy things and can potentially destroy the other, valid data.

That’s why preventing a crash at all costs is a kind of “software pacifism”. Preventing the crash accomplishes only preventing the crash. If someone blocks the self-destruction button from ever firing, they do not “prevent the ship from being destroyed” (the ship can still be destroyed in many other ways) – they only prevent an intended self-destruction, in the exceptional case when we actually want it.

This is maybe not the main purpose of checking runtime conditions, but it’s what always happens when somebody practices “Paranoid Programming”. The biggest problem is that when some function is called that returns a fail-prone value (not even “a function that fails” – it’s enough that it returns a pointer), the first thing such programmers think about is error checking. I don’t say “error prevention” – it’s only error checking. Having found that the returned pointer is NULL, they at best print something to the log and return NULL themselves. What, our function doesn’t return a pointer? Ah, it’s an integer? Good, let’s return -1. -1 is still a valid return value? Then let’s return, hmm… maybe INT_MAX? And I haven’t even mentioned the question “isn’t it so, by chance, that the function we called already guarantees that it never returns NULL?”.

The real problem is not that this is a “shut up” method of error handling. The real problem is that if the error handling for this function has only been INTRODUCED this way, the rest of the software using this API is probably not prepared to handle this kind of error at all. Often the return value is checked in a different way elsewhere, some kind of error is not caught, or the return value is simply ignored. It happens very often.

7. The program writes you

The “metaproblem”, let’s call it that, is that people often do not “design” their programs. I’m not talking about “doing a complete design before writing”. I’m talking about designing at all: just before you write a function, spend some time thinking about what resources you’ll be using, for what reasons the accessors of these resources may fail, and what a particular failure means to you.

I have a feeling that some people do not “write” their programs – they rather “discover” them. They just have the intent of the function in their head, and then they write down what they think should happen in the optimistic scenario. Whenever they call a function that carries some notion of possible failure, they automatically put in a condition check, effectively discovering this part of the function along the way. The function isn’t being written – it’s being discovered. At various points some failure condition urgently pops up, and they just swat it with a condition check, as if they were swatting a fly.

I don’t always write functions this way – but when I do, I do no error handling at all! I write the function for the optimistic scenario only, that’s all. Later, when I confirm that the function in this form makes sense and works as it should (by testing the positive scenario only), I add appropriate, well-thought-out error handling. This is what I do when I have no idea how something should work and I must code it to see it before my eyes and help my imagination. And this is not always the case. Usually I know which functions I should use to access the other resources, and what the failure conditions are; having that, I can at least plan the API so that the appropriate error-handling paths are anticipated before the function is written. In particular, I know which kinds of failure should be propagated and which handled otherwise.

And, probably, some programmers additionally think they are good coders when they write complete code “in one shot”. That is, they write a function that from the very beginning contains everything it needs: the business logic, the error handling, possibly also logging (but no, of course not the comments!). And once it’s written, it works completely correctly and needs no fixes in the future. Well, if there were people who could do that, they wouldn’t even have a chance to express themselves on internet forums, because they would quickly be snapped up by some software company to write bugless code for them, before anyone else could get to them.

So, another rule: making the function logically complete is much more important than any error handling – and error handling is something you can always add later. This way you are really focused on error handling and on nothing else, so you can think it through, and you are more certain not to miss any part that requires error handling – at least more certain than if you wrote the error handling on the first pass.

Before you write any error handling, you must first (!) – yes, first, that is, before you add any error-handling instruction for any of the functions you call inside – make sure that your function is prepared to PROPAGATE the error, or that the error means an ALTERNATIVE WAY for your function itself. I can only tell you that the second is a very rare case – usually a response of the kind “I can’t do what you ask” means that your function is also unable to perform the task it was written for. And that means this function must report it “accordingly”. Propagation also means that any other part of the code that calls this function must be prepared for the possibility that it reports failure, and must handle that failure properly. I assume you write your functions before you use them, or at least that you plan a function’s API completely before you start using it.
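Here is a minimal sketch of propagation (all names are mine; C++17’s std::optional is used for brevity): the helper cannot produce its result, so it reports that fact upward instead of “handling” it locally.

#include <fstream>
#include <iterator>
#include <optional>
#include <string>

std::optional<std::string> read_file(const std::string& path)
{
    std::ifstream in(path);
    if (!in)
        return std::nullopt;  // propagate: only the caller knows whether
                              // a missing file is fatal or an alternative
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}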

Only this is defensive programming.

8. This function may, but need not, end up with a failure

And the most important thing: never, never ever provide a function to someone else (including distributing a library) and say in the documentation only something like “this function returns XYZ in case of success and -1 in case of failure” (or NULL, or that it throws an exception in case of failure), without stating when the failure can occur. If you are considering using some open-source library that has only this in its documentation, try to find better documentation for it (sometimes it exists!); and if none can be found, then – seriously – don’t use the library and find some alternative. If there’s really no alternative and you must use it, have somebody (or do it yourself) reverse-engineer the source code and find out the exact conditions under which failures may occur, then turn the results of this research into good documentation. If you don’t keep an eye on this, you won’t know what it really means for you that the function has failed (and this is key information for planning the error handling). You’ll end up handling every erroneous value the same way, by abandoning the current path completely. This is the straight road to paranoid programming.

If you are using exceptions to handle problems, you are actually in a better situation – because you’re certain that either your caller will catch the exception and secure the program against the bad behavior of your function, or the program will crash. At least your function will not be responsible for making a mess in the program. The only problem is that, occasionally, your function might have been called just to check whether something works or not. That’s why exceptions should be used only in the following cases:

  • when it concerns a condition that the function explicitly requires to be satisfied before the call, and the user really could have ensured it
  • when the function could not continue and provide the expected results, and reporting by exception was explicitly requested in this case (by a flag or an alternative version of the function)

If exceptions are used in any other way, and especially when the language uses checked exceptions (Java, Vala), exception handling is practically nothing but a different (and usually awkward) way of handling error-meaning return values. They are even easier to ignore: it’s very simple to catch all exceptions with an empty block at the end and… continue the program, just as if all the results this function needed to continue were already available.
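Sketched in C++ for consistency with the other examples here (names are mine), the anti-pattern looks like this:

#include <stdexcept>
#include <string>

void process(const std::string& doc)
{
    if (doc.empty())
        throw std::runtime_error("nothing to process");
}

int main()
{
    std::string document;  // empty, so process() will throw
    try {
        process(document);
    } catch (...) {
        // swallowed: the program continues as if all the results
        // required by process() were already available
    }
}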

9. Catch me! If you can…

It’s very good when assertions are used in the code and can later be used to trace cases where something’s wrong. It’s very bad, though, when you don’t even begin to think about what might cause the error. The fact that you can catch some erroneous situation doesn’t mean that you have to.

Let’s take this example:

Object* ptr = new Object;
assert( ptr != 0 );

Of course, let’s stipulate that overloading operator new is not allowed in this code. What exactly are you checking in this particular assert() call? Are you checking whether you made a mistake in the program? How can this pointer ever be null, if the definition of operator new explicitly states that it never returns one (it throws std::bad_alloc on failure)?

What? You might happen upon a non-compliant compiler? Ok, I guess you’re really serious; so if you are uncertain about the compiler’s compliance, create an additional small test program containing various instructions designed to test the compiler’s standard compliance. This program will be compiled and run before the compilation of the whole project even starts. Everything you have doubts about can be checked beforehand and confirmed to work. Once it’s confirmed, you don’t need any checks for it in your production code!
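Such a probe could be as small as this (entirely my own sketch, run at build time and never shipped): a compliant C++ compiler must throw std::bad_alloc from a failed new-expression instead of returning a null pointer.

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

int main()
{
    try {
        // an absurd request that no allocator can satisfy
        char* p = new char[static_cast<std::size_t>(-1) / 2];
        delete[] p;
    } catch (const std::bad_alloc&) {
        std::puts("compliant: new throws std::bad_alloc on failure");
        return EXIT_SUCCESS;
    }
    std::puts("non-compliant: new returned instead of throwing");
    return EXIT_FAILURE;
}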

Of course, the compiler may turn out to be non-compliant, and you may still want to use it. In that case you can provide some specific workarounds that solve some of the non-compliance problems, BUT, again: it is extremely unlikely that you’ll be doing that. First, because it’s hard to find a non-compliant compiler still in use that returns NULL from new. Second, because even if you find one, you most likely won’t want to use it (you’d rather use a C++-to-C translator, or some other language easily translated to C, and use the C compiler for that platform – it would be easier than dealing with the various non-compliances of the C++ compiler).

This is not the only case – there’s a variety of possibilities here. Let’s take some of the things provided library-wise – like boost::shared_ptr. Ok, in C++11 it’s standard, but I just had a good example for it:

shared_ptr<Object> o( new Object );
assert( o.use_count() == 1 );

Think about it for a while. It’s not just that it’s stupid – that much is obvious, and it’s not the point. The point is that you are not testing your code in this case. You are testing the code of shared_ptr. Believe me, if there were any opportunity for this problem to happen, the guys from the Boost team would have caught it before you ever had a chance to see this code. On the other hand, if there were any error like that in your third-party library, you couldn’t do much about it!

What’s the actual reason for assertions like this? Say, why do people write them, even though they may even know these things are unlikely to happen? Only one thing comes to my mind: they would like to ease their conscience that they “did their best to prevent errors”. Or maybe not even that. They just “found something that is very easy to check”. Because this is easier, much easier, than doing good design, good path analysis, good unit and component tests. Because the latter is boring, while adding such checks is so easy and eases the conscience so much!

Paranoid programming is effectively not the result of wrongly understood rules of defensive programming. It is exactly like all the other bad practices in programming: a lack of thinking and a lack of good work organization. Which makes it not unlike any other domain of work.


Why Java is not a high level language

… and there never was any attempt to make it one.

0. Introduction

Depending on the ranking, Java is either the most popular programming language in the world or one of the most popular. No matter how trustworthy these rankings are, it’s undeniable that Java means big usage and big business. And it gained popularity very rapidly, considering how old it is, and especially how big its performance problems were at the very beginning.

It’s funny, however, that this popularity is attributed to various traits that Java… well, doesn’t have. It’s said that it’s simpler, that it’s more of a high-level language, that it’s a true object-oriented language, and that it’s more efficient for the software business (“time to market”). Actually, all of these explanations are bullshit, except the last one – but the last one is just a forwarding reference, not an explanation.

As far as “simplicity” is concerned, you can of course argue that the strict split into built-in value types (int, float) and library-defined class types makes the language simpler. The default garbage-collected object-and-reference model also makes it simpler. But if you take a closer look at Java 8 and compare it to C++11, you quickly come to the conclusion that you can forget statements like “Java is simple”.

You have to realize that a “high-level language” is a language that uses high-level constructs reflecting the logical constructs of the human mind. The function-based nature of the C language, the class-based nature of Java, and the string-based nature of the Tcl language are all the same as the bytecode-command-based nature of assembly languages: they are simply low-level. A low-level language isn’t necessarily an “assembly language” or a “system language”. It’s a language based on one strict, simple “nature” that is used to implement everything.

So, when we call a language “high-level”, what we really mean is that it should represent some basic logical concepts of data types (such as numbers and strings) and of the execution environment (threads). Not necessarily unlimited integers, but at least the ability to easily create a data type that limits its use to a required range (in contrast to having integers for everything, with the ability to use values outside a given range). One of the best-known models of such a language is Ada. When it comes to a model of an “object-oriented” language, the most famous is Smalltalk.

The problem with Java isn’t that it’s not “like these languages”. Java might have been a high-level language and a true object-oriented language while still having a C++-derived syntax, and possibly other traits borrowed from C++. The only reason this didn’t happen was that the intent of Java’s creators was to make it a better C++. And I didn’t say they succeeded – at best they created a “better C”. Here are the explanations why.

1. A true object-oriented language

I’m ambivalent about whether saying that Smalltalk is the only true object-oriented language is a truism or an exaggeration. Nevertheless, it can serve as a very good model of an object-oriented language and a source of information about what a true object-oriented language should have: it should rely on the object itself to perform a task and expect the object to do it its own way (that’s why it’s called a “method”).

In order to simplify adding common “methods”, the term “class” was introduced, and it is used by the majority of OO languages – not by all of them, though, which means it’s not an obligatory part of an OO language. The most important thing is just this: rely on objects. Hence the name. Of course, the OO languages that rely on classes have evolved lots of rules about using classes and managing software changes around them – however, the central part of OO isn’t the class but the object, however defined.

So, a true object-oriented language is a language that:

  • Relies on objects (not on classes)
  • Doesn’t use any entities other than those referring to objects

What does “relies on objects” mean? For example, that no matter what a particular thing means in the program, all operations are defined per object, including whether the object can perform them at all. So, for example, in every place where some data is required, there can be an integer, or a string, or a file stream, or a book, or a procedure block (say, a lambda function), or even a “nil” – the only limitation is that not every operation will accept it. And not because of the object’s type (there’s no such thing; in some rare cases its class), but because the operation that a particular procedure wants to perform on the object is not supported by it (“duck typing”). For example, in Smalltalk, when you do a+b and assign the result to c, the + method is called on the object designated by ‘a’ with one argument, ‘b’. It really doesn’t matter what they designate; it’s only required that the operation a+b can be performed on them – in particular, that + can be done to ‘a’ with ‘b’ (yes, Smalltalk supports defining operators as methods of objects, should that be any surprise).

The “class” in such a language is only yet another object, used as a delegate that provides the methods to be invoked when an object receives a particular “message”. But the fact that an object is of some class, or of a class derived from some class, should rarely be a checked condition. Maybe in some cases it makes sense to check whether an object understands some set of messages (a “protocol”). But usually the particular method should just be called, with a response prepared for the exception that the object doesn’t understand it.

So, first, the “class” is only a helper for preparing the object in the correct form – and not every language uses them. Self (a Smalltalk dialect) is one exception, but for today’s readers JavaScript is a better example. Yes, JavaScript is an object-oriented language (although not a fully object-oriented one, as it uses integers and strings as plain values), and it doesn’t use classes. To give an object a method, you just assign a function to one of its fields (there’s a small helper in the form of the ‘new’ operator, which simply calls a given function with a newly created object visible through the ‘this’ symbol). The method-call syntax executes a function referred to in the object’s data – and the object through whose field the method was called is seen inside as ‘this’.

Second, in a true object-oriented language everything should be an object, with no “exceptions”. In Smalltalk everything is an object – an integer, a string, even a method, even a class, and even “nil”, which is considered to be “not an object” (say, a universal marker for the case when an object cannot be put in a given place – it has its own unique class). If not, we may still talk about a “true object-oriented flavor”, but not about a true object-oriented language.

Third, as a consequence of relying on what an object can do in response to a method call, the only typing in such a language should be dynamic typing. If you want to make use of anything that relies on objects, at least for that part of the program you should forget static types. So, whatever relies on static types is not object-oriented at all.

There exist various OO systems that can be considered “true object-oriented” at least in part of the language. There’s, for example, Objective-C, where the whole object system is a kind of “alien feature” applied as a patch to the C language, and there’s just one static type of reference-to-object, named “id”. A similar feature exists in Vala and C# – the “dynamic” keyword. You can use a variable of such a type, assign an object to it, and call a method – the call will be resolved at runtime. The method need not be known when the call instruction is compiled.

In Java, on the other hand, there are entities that are not objects, static types are used also for classes, and there is no way to call a method on an object if the static type of the reference does not define it (not even as an alternative, as in Vala and C#). Theoretically you can do it using reflection (by searching through the object’s methods), but there is no direct language syntax dedicated to it (and to some extent C++ libraries can feature reflection too). So, the object system in Java isn’t “true object-oriented” – it’s C++-like.

The creators of the standard libraries in Java seem to have been completely unaware of this. The majority of APIs in Java libraries strongly rely on “OO features”, which in this language means being based on classes. Java has this OO feature as a “central feature”, something the whole API relies on. Such a thing makes sense in Smalltalk, or even Objective-C – but in Java, APIs like this are exactly as clumsy as in C++ because of the weak OO features (MFC is one of the most dire examples of this mistake). From the OO design point of view this is the most stupid language design decision ever made – but it has nothing to do with the business point of view.

The fact that a method can be called only when there is a definition provided for it has important consequences. For example, in Java you can keep an object in a variable of type Object. But you can’t call a method named indexOf on it, can you? Of course not. The only way is to first cast the value to a reference of type String (say that’s what you meant), and only then can you call the method. That’s because no method named indexOf is defined for Object.

This causes trouble, for example, when a framework gives you access to some stored object of a base class that may actually be an object of a class derived from it. Even though you know it’s an object of your class, you can call your new methods only after casting it to your class. This kind of usage is perfectly normal in Smalltalk, but in C++ and Java it results in a clumsy API.

This fact also strongly influences the hierarchical structure of a design, and method naming. For example, if you want to call a method that will later be overridden by the user, in C++ (and Java) you have to have some class that defines it, and you call the method through a pointer to that class. Then your class must be derived from that class, because that’s the only way the method call can effectively be redirected to your implementation. None of this is true in Smalltalk. In Smalltalk you just take the object and call the method; there is no such thing as a “pointer to some type” in Smalltalk – a variable simply designates an object.

But, on the other hand, you cannot name your method just “open”, which – depending on the context – might be expected to open a file, a window, a garage gate, or whatever else. In C++, if you want to open a window, you get the window, which is known to be of at least a class derived from Window, so you know this method can only be an override of Window::open. Any File::open or Gate::open may exist simultaneously, and none of them has anything to do with the others. In Smalltalk, if you have a method named “open” and the code calls “open” on an object, then any possible version of “open” (though the number of arguments matters – here it’s zero) is accepted, no matter what you meant in the particular case.
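A short illustration of that point (class names are mine): in C++ the meaning of “open” is pinned down by the static type through which the call is made.

#include <iostream>

struct Window {
    virtual ~Window() {}
    virtual void open() { std::cout << "showing a window\n"; }
};

struct Dialog : Window {
    void open() { std::cout << "showing a dialog\n"; }  // an override of Window::open
};

struct Gate {
    void open() { std::cout << "opening the garage\n"; }  // an unrelated "open"
};

int main()
{
    Dialog d;
    Window& w = d;
    w.open();  // dispatched to Dialog::open - only overrides of Window::open qualify
    Gate g;
    g.open();  // a completely independent method that merely shares the name
}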

All these things only confirm what I’ve already stated: the object system in Java is exactly the same as the one in C++. And static types for an object-oriented system are a burden, not a helping feature.

So, of course, it’s sad that it took the Java creators some time to realize that an OO language featuring static types must have something like C++ templates. Java poses as a “real OO language” mainly by making the API depend only on objects, so that everything is done the OO way – but in practice this has very little to do with OO itself; it’s always one of: doing things within the frame of the “class” term, creating some weird term to excuse doing something by only playing with objects, or just patching the language with some specific, unique feature that supports one particular idiom.

If you ask a question like “what’s the damned reason for this language to have these jinterfaces”, or “why must I create a whole new class just to pass some code to a function for execution”, you usually get the answer: “because this is an object-oriented language”. It’s really exactly the same stupid bullshit you hear from undereducated C++ fans claiming that overloading and operator definitions are “object-oriented” traits of that language.

Jinterface? Well, that’s just the thing the Java language understands as an interface – not what an interface in software development really is. The “normal” explanation of an interface is: the set of, say, “ways” to use a particular type or set of types. If it were “an interface for a class”, it might at best be something that collects the methods (and their signatures) that a class should define (a “protocol”), which is “conformed to” if a given class defines all of them – but not something the class has to declare explicitly. If something containing basic definitions is explicitly declared part of the class’s definition, it’s a base class (although only from the static-type point of view – in Smalltalk you don’t have to declare anything to be able to call a method on an object). Java generally introduces several entities proving that its authors didn’t understand their correct meaning – like, for example, the “jackage”: something like a namespace, but in the Java world it’s called a “package”. Anyway, back to the point.

So, how much does this “jinterface” have to do with OO? From the OO perspective, it’s just an abstract class in which all methods are abstract (and even this definition is more C++-like than OO-like – in Smalltalk there’s no such thing as an “abstract method”; any method can be called without restrictions, and in the worst case the call is redirected to doesNotUnderstand:). As classes are just a “helper feature” for OO, not a precondition, so is the jinterface. The fact that this “interface” plays almost the same role as a class in Java (it can be used as a type for references and can provide method definitions, which is enough for it to be logically treated as a class) only confirms that it’s just a special kind of class (and a class in the C++ sense). In practice, it’s only a way to overcome the limitations of classes, such as the lack of multiple inheritance. In Smalltalk we could at best have something like an Objective-C protocol, that is, a set of methods all of which should be implemented in a class. But conformance means that all the methods are defined, not that the class explicitly declares it – older classes can be checked against newer protocols.

And what about listeners? If you think they are more object-oriented than the lambda functions lately added to C++11 (and to Java 8 as well), you’re completely wrong. In Smalltalk – and likewise in Objective-C – you can treat a block of code as an object and call methods on it. This works more or less like lambdas. So it looks like these “lambdas” are much more OO than listeners; Java 8 has effectively admitted it by introducing lambdas. And listeners, in order to be usable, had to be supported by additional language features in Java: anonymous classes and their closures (a method created in an anonymous class automatically has access to the variables of the method in which the object was created). An anonymous class deriving from an explicitly named class, especially with this additional closure, is something completely unknown to all other OO languages. And it still has nothing to do with OO features. It is just “a set of features to make the use of listeners easier”.

That’s not all. If lambdas had been in the language from the very beginning, maybe this could have been done with some special, unique method name. But now the creators have tried to make lambdas usable with the existing listener-based APIs – so they just “adapt” them to the required class. Well, this had to be somehow composed with the existing form of class-based replacement for pointers to functions (actually a virtual method is nothing but an index into a “virtual table” of function pointers) and with overloading – both traits borrowed from C++ and not existing in Smalltalk.

As a result, all the OO traits in Java are:

  • done the C++ way, very far from Smalltalk
  • reinforced with additional, specific, problem-oriented features
  • based on classes, not on objects

And everything that “forces the OO style” practically just forces the use of class-based features.

And I repeat: don’t get me wrong. I’m not saying that Java is bad because it’s not like Smalltalk and is much closer to C++. It would be funny to say a language is bad because of that – I am a C++ developer and a great fan of that language. So let it be obvious that what I mean is this: the biggest power of Java comes from the fact that it’s based on C++. It just makes me laugh, this whole hypocrisy that tries to deny it.

2. Integers

The C language is accused of giving too many roles to integer numbers and the “alike”. This “alike” includes pointers. And these complaints extend to C++. Actually, in this “high-level wannabe” C++ we have characters, which are integers; booleans, which are integers; bit-flag containers, which also can only be integers; pointers, which are also more or less integers; and of course the integers themselves. I bet every experienced programmer who “feels” what a “high-level language” really means knows that C did this simply because it was easier to implement in machine terms, not because it has anything to do with program logic. From a high-level language we should expect that it implements bit-flag containers as actual containers of bits; strings as value types no matter how many characters they hold (including 0 or 1); booleans as just two values of their own type; and pointers that cannot be subjected to arithmetic. Of course, we don’t necessarily expect unlimited-range integers (the “gmp” integers can be an option), but at least that integers are used only as integer numbers: we can do arithmetic on them, and nothing else.

So, of all these things, the only one that Java has “achieved” over C++ is the lack of pointer arithmetic. All the rest of the stupidity is hilariously incorporated.

Ok, let’s even grant that in Java the boolean and char types are completely separated from the integer types. But how many people have noticed that it’s the “character type” itself that is characteristic of a low-level language – not the fact that it’s treated as an integer number?

Could we live without a char type? Of course we could – it’s more than obvious. If we had a language-built-in “string” type that is a value type – it may be empty, may contain just one character, and may contain multiple characters – why would we need a char type? So what if str[i] (or, say, an “at” method) returned a string? A string with just one character. Just like Tcl does in its [string index $str $i] instruction – which is only a simplified version of [string range $str $i $i]. Moreover, because Tcl has no “char” representation at all, adding UTF-8 support to the language was a piece of cake, completely transparent to all existing code – just a matter of changing the implementation. In Java, meanwhile, you have the name “char” coming from C (and C++), where it was an 8-bit integer, but in Java it’s 16-bit (ha! see how smart – they declare it 16-bit, but not an integer :D). Of course, this doesn’t prevent the use of UTF encodings (Java’s String uses UTF-16 internally), but what do you expect to get when the character at the specified position happens to be a 32-bit character? It’s impossible to return that character, because it wouldn’t fit in a char value. So String has a method named charAt, which returns a char value that is either the character at the specified position or a surrogate, if the character cannot be represented in a char. This can be checked, of course, and if needed there is another method, codePointAt, which this time returns an int: the numerical value representing the character. The int type is declared to be 32-bit, which is enough to represent any Unicode character – though not as a character, mind you. You can also get a string containing just one character, but heck, to get a one-character string from string s at position N, you have to write s.substring(N, N+1).

Why does Java have this solution? You can look for excuses in its internal UTF-16 representation, and there are reasons for that – but it completely doesn’t matter and doesn’t explain why Java has a charAt() method and a char type. It has nothing to do with converting to an array of bytes, because that should be treated as a “specific representation” into which you shouldn’t need to look (and in Java it is so). Why would you need exactly one character at a specified position? If it’s in order to glue it to some other place – you can glue it as a string, too. If it’s to convert to bytes – you have the much better solution of “encoding” the string. A string is heavier than a character? Smalltalk found a solution for that long ago – Java could have set the rightmost 8 bits to 1 and let the value mean the character itself, acting as a one-character “string”. Anyway, there are no “business” reasons whatsoever to have a charAt() method that returns an (explicitly!) 16-bit char value. There is just one reason: to resemble C++ as much as possible.

The String type is yet another flower. In a high-level language a string is never an object – it’s a value. You can assign it to another variable, you can concatenate it, you can overwrite an existing value. There is no such thing as a “null string”, for the same reason there can’t be a “null integer” or a “null colour”. And this is how std::string in C++ works.
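A minimal illustration of those value semantics:

#include <iostream>
#include <string>

int main()
{
    std::string a = "hello";
    std::string b = a;                    // a copy, not a shared reference
    b += ", world";
    std::cout << a << '\n';               // still "hello"
    std::cout << (a == "hello") << '\n';  // comparison by value: prints 1
}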

Not in Java. In Java you have the same thing as in C, with just the slight exception that in the case of dynamically allocated strings you don’t have to free() them. A String is just a pointer to something; it can be null, and therefore it should be tested for “nullity” before any operation is performed on it. Thanks to that, you have plenty of opportunities to make mistakes, and the need to test a string for both nullity and emptiness. Not to mention comparisons – even in C++ you just write a == b. Fortunately, in Java you don’t have to write a.compare(b) == 0, and you can’t repeat the stupid C-derived “if ( !a.compare(b) )” – but a.equals(b) doesn’t look much better, if we are to treat Java as a high-level language.

Bit flags are an even funnier thing. The best thing I can imagine for keeping a set of boolean flags is some container of bits: either a vector of boolean values, or a constant-size bit container with compile-time constant indexing. And this is exactly what C++ does, with its vector<bool> and bitset. If you want a set of binary flags, use bitset. You can easily compare it with a mask, do shifts, selective bit replacement, and so on. And you are not limited to a fixed multiple-of-8 length.
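A quick sketch of that idea (the flag names are mine): named bits instead of integer masks, with no fixed 8-based width.

#include <bitset>
#include <iostream>

enum Flag { Readable, Writable, Executable, FlagCount };

int main()
{
    std::bitset<FlagCount> perms;
    perms.set(Readable);
    perms.set(Writable);

    std::bitset<FlagCount> mask;
    mask.set(Writable);

    if ((perms & mask).any())
        std::cout << "writable\n";
    std::cout << perms << '\n';  // prints the bits, e.g. "011"
}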

So, this is exactly what I would expect from a high-level language. Want flags? Take the dedicated type, bitset. Want a number? Use integers.

Not in Java. Java does not feature a real equivalent of “bitset” (there is the java.util.BitSet class, but it’s a growable heap object, not a value type with operators; and let’s admit, in Java such a thing can’t be defined library-wise as a value type, though it could have been done the usual Java way – as a built-in type). Instead, all the bit-set business is done just as in good old C – with integers. Java has all the bitwise operators, which are meant to work on a flag set, defined only for integer numbers, including the bit-shift operators; moreover, the right shift comes in two flavors – signed, where the leftmost bit is copied into itself, and unsigned, where the leftmost bit is set to 0. Does anyone use this? Of course, bit shifting is one of the operations available on integers at the machine level. But it can at best be used as a better-optimized division by 2 (a right shift does the same as dividing by 2, only much faster). Effectively this is for making algorithms as efficient as possible. What is such a feature worth in a language in which performance doesn’t really matter? Moreover, Java still has optimizers (even if only in the JIT), so this kind of optimization can be done anyway. The only reason for having the &, |, ^, << and >> operators in C was to provide access to low-level assembly instructions. They may make sense in a high-level language, as long as you explicitly declare that the value is a set of boolean flags and you are operating on it with a mask. But not as “and, or, xor, shift” – rather as “set bits”, “clear bits”, “extract bits” and “slice the bitset” (shifting can be used to implement “slicing”).

A similar thing happens with indexOf in String. From a high-level language you would expect that when indexOf informs you that the searched character was not found, it doesn’t just return -1 and let you keep doing arithmetic on it. You’d expect either an exception (a bad idea in this particular case), or some special value that leads to the right result when the character was found and leads nowhere otherwise. A high-level language should afford a concept of “optional” values – and their role is actually played perfectly well by the value wrappers. If indexOf returned Integer (not int), it could return null in the not-found case. You’d still have to check it, but at least it wouldn’t produce a stupid but valid-looking positive integer when you blindly add something to the result – it would throw a NullPointerException instead.
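Rendered in C++ terms for consistency with the other examples here (the helper is mine), the “optional” idea looks like this: absence of a result is a distinct value, not a -1 you can keep doing arithmetic on by accident.

#include <optional>
#include <string>

std::optional<std::string::size_type> index_of(const std::string& s, char c)
{
    std::string::size_type pos = s.find(c);
    if (pos == std::string::npos)
        return std::nullopt;  // "not found" cannot be mistaken for an index
    return pos;
}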

And, finally, the integer numbers. The first weird thing is that, having given all the numeric types the same names as in C, they still didn’t add the unsigned modifier. This changes the rules a lot (and it’s why Java also has the unsigned right-shift operator), and it looks ridiculous to have types named “byte”, “short”, “int” and “long” that are 8-, 16-, 32- and 64-bit respectively. Probably in the future we’ll also get a “quad”. Ok, I understand there must be a type named “int”, and that it must be 32-bit. The freedom of definition for integer sizes in C and C++ never worked out in practice. Of course, there was a change between 16-bit and 32-bit systems, where the “int” type changed its size from being equal to short to being equal to long. But the practice after the introduction of 64-bit machines is that in C++ “int” is still 32-bit and only “long” changed its size to 64 bits, while the C++11 standard introduced a new “long long” type to represent an integer wider than int or long (on 64-bit systems it’s actually the same as “long”). So, in practice, what’s the point of giving integers so many different names? Their usefulness is practically nil. The most used integer type is int; in some special situations there’s a 64-bit type (long in Java, long long in C++ – yes, I know, C++11, but long long had existed long before as an extension). Types like “short” or “byte” are something you only see in libraries that interface with some C library. So, the only sensible set of integer types for a high-level language is: int, which is 4 bytes by default, and then integers like int1, int2, int4 (== int) and int8, or even int16 – for the cases where they are really needed. So why these funny names? The same reason: to be like C++. The name “byte” was already in use as a user-defined alias for “unsigned char” (although in Java it’s still signed), and it was a good enough replacement for C++’s “char”, whose better assignment in Java was to be a UCS-2 character.

I agree that this set of names is just as stupid in C++. Of course. But this was already known at the time Java was designed. C++ must keep these names because it’s still being implemented for various platforms and still carries some C legacy. But even C++ has the int8_t, int16_t, int32_t and int64_t types (the last one defined as long long on 32-bit systems and long on 64-bit systems, causing problems with printf formats along the way). Java designers could have made them like this, adding just a universal “int” equal to int32_t – especially as they predicted it to work on only one platform. They would have done it, had their goal been to make a high level language. But they just wanted to make a better C++.
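For illustration, a minimal C++ sketch of the fixed-width types mentioned above, including the <cinttypes> macro that papers over the printf-format problem of int64_t being long on some platforms and long long on others:

#include <cstdint>
#include <cinttypes>
#include <cstdio>

int main() {
    int32_t a = 100000;    // exactly 32 bits on every platform
    int64_t b = a * 10LL;  // exactly 64 bits, whether "long" or "long long" underneath
    // PRId64 hides which underlying type int64_t really is:
    std::printf("%" PRId64 "\n", b);
}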

3. Pointers and null

What is NULL? It’s something that was introduced in the C language. If you think that it has anything to do with Smalltalk’s nil, you’re completely wrong. There’s no such thing as “not a pointer” in Smalltalk. Well, you can say that there are no pointers in Smalltalk (I prefer to say that all variables in Smalltalk can only be of pointer-to-object type), but this is how it works there: this “not an object” is just a unique object that does not respond to any calls. You can still try to call it, though. This won’t result in any crash or any data destruction.

Some may say that this is obvious. Not exactly. When you have NULL in C, you should check a pointer against NULL before dereferencing it (or somehow be sure that it isn’t NULL by other premises). In Smalltalk you can do that (for example, when your function allows nil to be passed in place of an object), but normally you don’t have to. You can always blindly try to call a method on an object – and it may fail because the object is nil, or because the object does not understand the method specification (I know it’s called a “selector”, but I’m trying not to use terminology that is specific to Smalltalk and differs from what Java and C++ call the same things), or it may even fail because of some runtime condition – and handling for all of these should somehow be planned. In C you have all of that too, but NULL is special – you shouldn’t try to dereference it because it results in undefined behavior (at least on a POSIX system, with virtual memory on, we know that it results in termination on SIGSEGV).

So, Java just changed this undefined behavior into NullPointerException (if we agree that SIGSEGV, or something similar on Windows, is what you get, rather than undefined behavior, this is just a cosmetic change). For example, to check whether a string designated as s is equal to “equal”, you do the following in various languages:

  • In Smalltalk, you do s = 'equal'
  • In Tcl, you do $s == "equal"
  • In C++, you do s == "equal"
  • In C, you do s != NULL && 0 == strcmp(s, "equal")
  • In Java, you do s != null && s.equals("equal"). Or some hackers propose "equal".equals(s)

So, compare the Java way with the rest of the languages, and you’ll see which of them is the closest equivalent. Incidentally, the equals() method takes Object as its argument, even though its intent is to compare the string with another string. Well, in C you can also pass a void* value as an argument to strcmp.

4. Reflection

Before stating whether the “reflection” feature in Java makes it high-level or not, you first have to realize what reflection is from the language implementation point of view.

So, if someone has missed that part, let me remind you that both Java and Smalltalk are languages designed to work on only one platform, which is a Virtual Machine. It doesn’t mean that you can’t find reflection in languages designed to be machine-compiled. It does mean, however, that when you have a virtual machine, you can plan it however you wish – if you have a physical platform, you usually have nothing, and the only way to provide any kind of “reflection” is by adding some extra layer between the language and the platform. Often at the expense of performance.

But this isn’t even the important part. The important thing is what advantage you get from reflection (especially if you consider it in the frame of a high-level language). That’s why I have to remind you one more time that Smalltalk uses dynamic typing only, and the only “static type” in this language is a reference to an object. Because of that, reflection in Smalltalk is available somewhat incidentally – in this language it’s inevitable in order to provide the dynamic type system. If we have a language with a static type system – as in Java, C++ and even, say, Eiffel – things change a bit. In these languages reflection doesn’t have the same usefulness as in Smalltalk, and I’d even say that reflection in such a language provides a much more limited advantage.

The only “usage” of reflection I have found so far for Java is Java Beans and the implementation of some scripting languages that deal directly with Java objects (Jython, Jacl). So, as you can see, it’s for things not connected to writing any software in the Java language at all!

Additionally, you have to pay attention to what really happens in this particular case. The java.lang.Object type is already a kind of “object orchestra”. Reflection in Java is limited to exactly this thing – you don’t have reflection for builtin value types (Java fans will say that it’s still not possible to create objects of these types – I prefer to say that it’s rather because it’s impossible to provide reflection for them). So, java.lang.Object is simply the core class for the whole object system, and there’s just one object system in Java. That’s all. Reflection is provided for the standard Java object system (being part of the Java standard library), NOT for the Java language.

Once we realize that, it simply follows that in C++ you can use a variety of object systems, and a designer of such an object system might have provided some form of reflection. This is the case with Qt and Gtk+ – reflection provided librarywise.
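As a small illustration of reflection provided librarywise, here is a minimal sketch using Qt’s meta-object system (assuming Qt is available; it sticks to the QObject base class itself, so no moc-generated code is involved):

#include <QObject>
#include <QMetaObject>
#include <QMetaProperty>
#include <iostream>

int main() {
    QObject obj;
    obj.setObjectName("demo");

    // Reflection provided by the library's object system, not by the language:
    const QMetaObject* mo = obj.metaObject();
    std::cout << "class: " << mo->className() << "\n";
    for (int i = 0; i < mo->propertyCount(); ++i)
        std::cout << "property: " << mo->property(i).name() << "\n";
}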

So, now it should be clear why C++ doesn’t feature reflection as a language – because its runtime library doesn’t provide for it, and only a very small part of the language depends on its language runtime at all (surprise!). The only such parts are exceptions and RTTI.

If you want a comparison with a high level language, here it is: Ada. Does Ada feature reflection? To some very limited extent, yes, but it’s generally not much more than what C++ has. So, anyway, this thing does not make a language more or less high level.

5. Threads

Java features threads. Ha ha ha. Good joke.

The Java programming language provides just one thread-related feature in the language – the “synchronized” keyword. And it’s only needed because this language does not feature RAII – with RAII even this could have been defined librarywise. All the other stuff in this language, despite requiring some language system support, is defined librarywise anyway.
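To illustrate the RAII claim: in C++ the equivalent of a synchronized block is just a library class whose destructor releases the lock – a minimal sketch:

#include <mutex>

std::mutex m;
int counter = 0;

void increment() {
    // The library-level equivalent of Java's "synchronized" block:
    // the lock is acquired here and released by the destructor on every
    // exit path (return or exception), which is exactly what RAII provides.
    std::lock_guard<std::mutex> guard(m);
    ++counter;
}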

Of course, being defined librarywise doesn’t automatically mean that something is not a high level construct. But it may mean that for a particular language, especially when the language provides libraries with very limited abilities to define an API. This is how it is in C and this is how it is in Java – because all APIs in Java must be defined based on classes. The only special construct, as I have mentioned, is the anonymous class, and this was the most advanced thing you could think of, before Java introduced lambdas.

I won’t evaluate it. Just look at examples of using the Thread class, as well as some higher level concurrency tools like the Future class. So, the same question as ever: what would you like to see in a high level language as an implementation of concurrency?

I would like to see something like:

  1. An ability to define several procedures in place, which will be done in parallel.
  2. An implementation of futures and promises that can look in my code exactly as if I didn’t specifically use any special tool, just try to call functions as usual, read a value or assign it to somewhere else.
  3. A system of running parallel tasks that can pass messages to each other and the language interface provides me with a nice view of how this is running.
  4. Maybe some additional logical parallel features, like coroutines.

For example, I’d like my procedure to look exactly the same, maybe with some slight marker, regardless of whether I’m making a normal function call or a request-response cycle during which my procedure is waiting (when a timeout occurs, this “function call” results in an exception). The same for a value, regardless of whether it comes from a usual variable or from a promise.
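For comparison, here is roughly how close C++11’s library-level futures get to that wish – note that the timeout-as-exception behavior still has to be hand-written, which only supports the point that the language gives no help here (request_response is a hypothetical stand-in for some remote call):

#include <future>
#include <chrono>
#include <stdexcept>
#include <iostream>

int request_response() { return 42; }  // stands for some request-response cycle

int main() {
    // The call site looks almost like an ordinary function call...
    std::future<int> reply = std::async(std::launch::async, request_response);

    // ...but the "timeout turns into an exception" part is manual plumbing.
    if (reply.wait_for(std::chrono::seconds(1)) != std::future_status::ready)
        throw std::runtime_error("timeout");
    std::cout << reply.get() << "\n";
}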

Why is this important? Because first, threads are simply low-level system tools, and second, it should be the tool’s problem to spread the execution into multiple threads; I, as a programmer, should only worry that the task gets done. Execution, splitting, joining, synchronization – all these things should be the language system’s worry. I should only define a procedure; the language system should worry about parallelizing it.

So, what do we have in Java? Even though this language has lots of things with dedicated, purpose-built language support, this one has none. Future is just a class, Thread is just a class; if you want to do anything with them, you create or obtain an object of the class and call its methods. You can more or less achieve a “procedure split” lookalike using the listener idiom (let’s name it so – Java is such a special language that every idiom in this language must have dedicated language support).

Many different things can be tailored to the object interface (using a class, object creation, calling methods, also never deleting an object), but there are many exceptions. I have already mentioned String as one such exception. Thread is another, because it’s not simply “an object” – it’s something that comprises a part of the language system; the thread object is just a reflection of it. And not the best representation of it, either. For such things the object-based interface is awkward and looks… well, very low level. Because it spells out “how to use some low level tools to achieve the result” instead of “what the programmer’s intent is when writing this code”.

How much does this interface differ from, for example, the POSIX thread interface for the C language? Only in that in Java you don’t deal with memory management. But not having to deal with memory management is way too little to be called a high level language.

6. Afterword

Look: I’m not criticizing Java. I don’t say that Java is a bad language or anything like that. Or that Java should not be used because it’s not a high level language. I’m just stating a fact: Java is not a high level language, and no matter how many things the JSC packs into this language in the future, it will never come even close to the meaning of “high level language”.

I haven’t written about many other things, like exceptions (and why the throw-new word pair in Java is like “sinister plot” in the English language), weak references, or the structure of classes. You can probably find much more. The things I have mentioned are enough to confirm the main statement of this article.

On the other hand, let’s note that in many other languages, which also pose as high level, you can find many design flaws that make them not high level, or not fully, or that compromise their “highlevelness”. For example, in the Haskell language the string is represented as… a list (the list being the basic, language-supported container) of characters. You just get characters and operate on them as a list. This way you, again, have a string represented as an array (ok, list) of characters. I understand that the language needs some way to iterate over each character in a string, but Tcl can do that, too – just do [split $s ""] and you’ll get a list of strings (not characters!), each being a one-character string. That’s not the same as being able to iterate over the string via a list interface while accessing chars. These single characters are still strings, while in Haskell, just as in Java and C++, you have an array of characters.

Note also that “proving in practice” is something that only matters in commercial software development – academics may like various languages, but that use of programming languages makes no money. And the practice is that in commercial development, C is still the language held in the biggest trust (no, I’m not talking about use, I’m talking exactly about trust! – yes, that’s sad!), and Java is trusted also because it’s much like C. C++ is taking over some parts of this, but that’s because its primary purpose applies here: let’s use C++ to get access to some high level constructs and this way make our work easier and faster, and when a high level construct fails, we can always fall back to low level C-like code. In a high level language you just have nowhere to fall back to.

Maybe then, despite the declarations, “we don’t need no stinkin’ high level language”. Maybe people like low level languages more than high level ones. Maybe high level language concepts just don’t “speak” to people. I admit they didn’t speak to me in the beginning, either. Before learning any languages for today’s computers, I had been using only some BASIC and then assembly language. So I was more used to low level concepts than to high level ones. And I still haven’t found any language that can be called high level and is acceptable enough. I prefer C++ not because it is in any way high level, but because it provides the ability to be high level and to develop high level constructs. So, it may be that high level concepts are still not mature enough for a good and widely acceptable high level language to be created.

But the Java designers’ attempts were far from any approach like this. Java is as it is – designed to be very similar to C++, designed to resemble the low level things from C and C++, designed to do everything in just one way, designed to provide the user with a simplistic way of encoding complex things. All the concepts that can be called high level, known even at the time when the idea for this language came up, were ignored. Of course, Java is devoid of manual memory management, low level memory access, or treating every value as an integer. But this was just a cleanup of the most dangerous features. There are still lots of low level features in Java, and they don’t even have high level replacements, as some of them do in C++.

Is Java still a business success? Then its “lowlevelness” was part of that success. Sad, but true.


The good, the bad, and the dumb

Cameron Purdy, vice president of development for Oracle’s application server group, has made a presentation showing how Java supplants C++ and, probably to even things up, also some cases where C++ supplants Java. The problem is not that I disagree with the former. They are just explained the wrong way, and the author doesn’t even seem to know what “Java” really means – despite being a VP of a strongly Java-focused software development group in one of the top software companies in the world.

1. Garbage collection

Lots of things have already been said about garbage collection, and it’s always either Java fans claiming that “every modern language must have it or it’s not modern otherwise” or C++ fans claiming that “this leads to bad practices and causes unacceptable overhead”. To begin with, it has to be said that from the technical point of view GC may have several disadvantages, in the form of using more memory for the same task and possibly adding overhead by running an additional process, sometimes locking the whole system’s access to memory. But it also has advantages: it is more convenient for users, who don’t have to worry about object ownership (not about “releasing the memory” – if you hear that GC solves the problem of “releasing the memory”, you’re talking to an idiot); it’s also faster than manual (heap-based) memory allocation, and it can lead to less memory fragmentation and stronger object localization. The last one is an important advantage over shared_ptr in C++, which suffers from poor performance and does not help with memory defragmentation.

But the things told by this guy are such a pile of bullshit that it’s hard to believe he became a VP:

Garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim garbage, or memory, occupied by objects that are no longer in use by the program.

No. A programmer writing a program in a language that imposes GC as the only memory management doesn’t even “think” in terms of memory management. They just create objects that have some contents, and that’s all. For them there’s simply no such thing as “memory”. GC was once aptly described as “a simulation of unlimited memory”. The important thing about GC is that it provides this virtually unlimited memory, not that it does memory reclamation.

A significant portion of C++ code is dedicated to memory management.

False. Of course, if your program is stupidly written, it may be true, but it’s not true that this is required, nor that GC significantly solves this problem. For example, a program that reads text from the input stream, processes it, and displays the postprocessed result can be written in C++ with no explicit dynamic memory allocation at all. That’s not possible in C or in Java. So, in this particular case, GC solves a “problem” that does not exist in C++.
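A minimal example of such a program – every buffer is managed internally by std::string, with no new or delete in sight:

#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>

// Reads lines, uppercases them, prints them back - no explicit new/delete
// anywhere; std::string manages its own buffer internally.
int main() {
    std::string line;
    while (std::getline(std::cin, line)) {
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char c) { return (char)std::toupper(c); });
        std::cout << line << '\n';
    }
}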

Cross-component memory management does not exist in C++.

Really? How come? My function can return auto_ptr<X> (or, better, with C++11, unique_ptr<X>) and I can either assign the result to some other variable or ignore the result, and the object is taken care of. Memory allocated in one component can be deallocated in another. Unless you somehow “play with allocators”, of course. But if you do, you should be prepared to solve many more problems. Normally you use a unified and universal memory allocation and it works cross-component, too. I have completely no clue what this guy is talking about.
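A hedged sketch of what I mean (make_x and consume are hypothetical components; C++11’s unique_ptr carries the ownership across the boundary):

#include <memory>

struct X { int value = 0; };

// Component A: allocates and hands off ownership.
std::unique_ptr<X> make_x() { return std::unique_ptr<X>(new X); }

// Component B: receives ownership; the object is freed here when p dies,
// even though it was allocated in another component.
void consume(std::unique_ptr<X> p) { p->value = 1; }

int main() {
    consume(make_x()); // ownership flows across the boundary; no leak either way
    make_x();          // result ignored: the temporary still cleans up after itself
}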

Maybe this guy is talking about Windows and Visual C++, where the debug configuration is interconnected with traced memory allocation (which is indeed a stupid idea). When you have components in your application and they are compiled in “mixed” configurations, this of course makes memory allocated in one module impossible to deallocate in another. But this is VC’s problem, not C++’s.

Libraries and components are harder to build and have less natural APIs.

WTF? Libraries and components may be hard to build in C++, of course, but this has completely nothing to do with memory management. There are problems with modules, distribution, platform specifics; maybe there can also be some module that uses specific memory management (in other words: in C++ you have more occasions to write stupid code, but this still has nothing to do with memory management). And what is a “less natural API”? Is it natural that integers are passed by value and cannot modify the value passed, while arrays are passed by pointer and can be modified in place without restrictions? Only if by “natural” you mean “Java”.

Garbage collection leads to faster time to market and lower bug count.

Unless you made a stupid design and need to fight design bugs.

The garbage collector can be just “a good tool for things that need it”; from the implementation point of view it has advantages and disadvantages, and that’s ok. Put this way, of course, C++ can also use GC (provided by a library). There are several things, however, that have to be pointed out:

1. From the programmer’s point of view, objects in a GC-only environment have two clearly defined traits:

  • The GC-ed object must be trivially destructible. You should even assume that objects created in Java are never deleted. Some may argue that objects may have finalizers. But destruction means “reliable and timely reclaiming of the resource”, and finalization is nothing close to that. Finalization is closer to wishful thinking about what other kind of resource reclamation could be done when memory reclamation is done. But as the memory isn’t guaranteed to be actually reclaimed, the finalizer is also not guaranteed to be called. That’s why, if you think about destruction the C++ way, a GC-allocated object must be of a class that is treated as trivially destructible. It means, for example, that if your object refers to a file and you no longer require the file, you should disconnect the external file resource from the local object explicitly. And indeed this is how things happen in Java – Java doesn’t close the file in a finalizer, does it?
  • GC means object sharing by default. It means that a pointer to an object can be passed to some other function and written elsewhere, thus becoming a second reference to the same object, which (unless it’s a weak reference) shares the ownership with the first one. Some idiots say that with GC you don’t have to worry about an object’s ownership. That’s not true – of course you have to. At least you need to think well about whether your object really has to be owned by a particular reference variable, or whether it should only view the object, not own (co-own) it, and just become null when the object has been reclaimed. In C++ this shared ownership is implemented with shared_ptr (which comes with weak_ptr the same way), and despite its poor performance, it’s good enough if you shrink its use to only explicitly required situations. The important part is: object sharing. As you know, object sharing is the main purpose of “those really really bad” global variables. It’s something that is shared by everyone. So, with GC-ed objects you allow every kind of object to be potentially co-owned. In practice co-ownership is very rarely required; however, you can only get convinced of that once you have some experience with a language that also features other-than-GC memory management.

2. There are also two interesting consequences of the difference between GC and shared_ptr (see the sketch after this list):

  • The shared pointer concept still means timely and reliable resource reclamation. This means any kind of resource (not only memory) and any kind of object. The object is deleted when a pointer variable goes out of scope and it was the last owner. So even if we can’t state for sure that the object will be deleted at the end of a given scope, we can at least pin down the potential places where it may happen. This way shared_ptr can also manage objects that aren’t trivially destructible – with GC you can’t even rely on the object ever being deleted, nor on which thread it would happen in.
  • Object deletion is synchronous – that is, when the object is deleted because the last owner went out of scope, every weak pointer to that object is cleared immediately. Of course, this still doesn’t change much when the weak_ptr-using procedure runs in a separate thread, but it at least matters when everything is in a single thread. The weak_ptr becomes null only because otherwise it could only be a dangling pointer – but regardless, you should never test this pointer for “nullarity”; you just shouldn’t use it unless you are certain that the pointed object is still owned and still alive. At least this isn’t dangerous with shared_ptr, because you can always state that if the pointer isn’t null, the object is still alive. When the last owner goes out of scope in a GC environment, some time may pass between it going out of scope and all weak pointers being cleared due to object deletion, during which the object shouldn’t be referred to. This is how GC simply turns the problem of the dangling pointer into the problem of the “dangling object”.
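A minimal sketch of the synchronous clearing described above (note how the weak pointer expires exactly at the closing brace, and how lock() is the safe access pattern):

#include <memory>
#include <cassert>

int main() {
    std::weak_ptr<int> w;
    {
        auto owner = std::make_shared<int>(7);
        w = owner;                // a non-owning view of the object
        assert(!w.expired());     // still alive: the owner is in scope
    }                             // last owner leaves scope: deletion is
                                  // synchronous, so w is cleared right here
    assert(w.expired());
    // The safe access pattern: promote to a temporary owner, or get nothing.
    if (auto p = w.lock()) { /* use *p */ }
}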

3. GC is hailed as being resistant to ownership cycles, unlike the refcount-based solutions (including shared_ptr). But if you research this topic well enough, you quickly come to the conclusion that cycled ownership is something that… should never occur in a well-designed system! “Cycled ownership” occurs “in the real world” in just one case: a company may own another company and that other company may own the first one – of course, only as partial ownership (so this is also shared ownership). Actually I don’t know how this is handled in law; I rather think that governments all over the world just don’t know how to handle it, so they allow it without restrictions (the problem arises when the ownership hierarchy is much deeper and only after passing through some of its branches do you see that you have come back to a company already visited). But companies are a special case – there is no similar thing in, for example, management hierarchies within companies. This situation is handled by GC using the rule that “if A owns B and B owns A and no other object in the system owns either of them, then both have to be deleted”. In other words, cycled ownership is treated as if there were no ownership at all (because this is like having two managers A and B in an organization, where A manages B, B manages A and neither of them is managed by anyone else – can you imagine such a situation in reality, unless it’s some state-held company in some banana republic?). So, if you didn’t mean ownership, why did you use it? Shouldn’t you have used a weak reference for one of them? You don’t know which object is more important and which should own which? Then you most likely have no clue what your system is up to! So you shouldn’t participate in the development, or you should first do your homework. For me, if there is a cycled ownership situation, there should be some Lint-like tool that detects it and reports it as an error. And GC should crash your program whenever it detects ownership cycles, if you really think about helping developers make good software.
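A minimal C++ sketch of the prescription above – the back-link is declared as what it really is, a view, and the “cycle” disappears:

#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // owning link
    std::weak_ptr<Node>   owner;  // back-link as a view, not an owner
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;    // a owns b
    b->owner = a;   // b only views a: no cycle, so when a and b go out
                    // of scope everything is reclaimed by refcounting
    // Had we written b->next = a instead, neither node would ever be freed -
    // precisely the "cycled ownership" design error described above.
}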

So, summing up, these advantages for the developer, hailed as better “time to market”, only lead to worse designs and are more tolerant of logic errors – instead of helping clear them up. C++ code checked with a good Lint-like tool – and, believe me, such a tool can really reliably find all potential memory leaks, and even when something is only a great potential for a memory leak, it’s still better to write it more explicitly – is much better in terms of time-to-market than Java code, where design errors are patched on the fly by the language runtime.

One more interesting thing about GC probably comes from an inability to understand the difference between “Java the programming language” and “Java the VM”. It may be a little bit shocking, so please hold on to your chair.

Well, Java DOES NOT FEATURE GARBAGE COLLECTION. Surprised? Yes, really. There’s no such thing, at least in the Java programming language. Of course, it’s thought of as if it were in the language, but it’s really not. If you want to see GC rooted in the language itself, look at functional languages, including ones as far away from each other as Lisp, OCaml and Haskell. These are languages that feature garbage collection. The Java programming language does not.

Yes, I’m not kidding. Ok, granted, the Java programming language does not provide any ability to delete an object, at least as a language builtin feature. But it doesn’t guarantee garbage collection as a language, either. There are some Java implementations (or extensions) that use deterministic or even explicit object deletion. This can be done by using some external library, not necessarily a language extension. The Java language only requires that a program be well formed even though objects are created and never explicitly disposed of. Nor does it mean that Java can only be implemented on the JVM – gcj provides a native Java compiler using Boehm’s GC.

That’s because garbage collection is a feature of the JVM. It means that any language you’d implement for the JVM (including C++, if anyone ever does it) will take advantage of garbage collection, even though its natural way of managing objects is manual. In C++, for example, you’d still be allowed to use delete on objects; this would just call the destructor and do nothing else. And yes, of course, C++ is prepared to be used with GC – there just is no “standard GC” to use with C++.

So, the “GC-related” difference between Java and C++ is that the Java language doesn’t provide any possibility to delete objects and its language runtime is expected to take care of this by itself, while in C++ you can use different memory management policies for objects, although the default policy is to delete objects explicitly.

2. The build process

Guys, come on, come on… How much does it matter for a large project how many build machines you have to prepare and how strong they should be? It may be significant for a home-cooked project, but not in today’s environment. When today merely running the javac command sometimes takes more time than compiling the files, what’s the real difference from compiling C++? On today’s machines it mostly just occupies more cores. And it’s really funny to hear complaints about build costs from fans of a language that is already accused of performance problems – problems usually excused with “machines and processors are getting better and faster, so it shouldn’t matter much”.

That’s still not the funniest thing – this is:

Java has more approachable build tools like Ant and Maven, while C++’s Make and NMake are considered less so.

What? Is Ant supposed to be a more advanced tool than Make? Come on…

First, feature-wise, Ant is at best as advanced a tool as make. And what is “more approachable” about Ant – the XML-based syntax, which most people find horrible? It’s not Ant itself that makes things better (let’s grant that they are better). It’s the Java compiler.

The Makefile’s role is partially to define the dependencies between definitions implemented in separate C++ files. The definitions are contained in header files, and the header files providing definitions used by another file are declared in the Makefile rules (gcc -MM is a command that can be used to generate them automatically). In Java this is completely automated by not having explicit header files – the dependency problem is handed off to the Java compiler. You just pass all the source files that may have dependencies between each other on one compiler command line. You can imagine a tool that, in the same way, takes all *.cpp files and produces the “result” in the form of *.o files – taking care by itself to properly handle the *.h files and the dependencies. There’s no such tool only because there are more advanced build tools for C++ that do this and much more.

The only thing that Ant handles is associating the name of a target (possibly the default one) with source files that have to be passed to the compiler. That’s all. It’s really no more advanced than a shell script containing the command “javac File1.java File2.java … etc”. Ok, maybe with some CLASSPATH. Make doesn’t even compare to Ant – and Ant cannot even be used to build C++ projects because – surprise – it doesn’t feature targets connected to physical files. That’s merely because this part is done by the Java compiler.

Maven is different – it’s really an advanced, high level build tool. Enumerating Ant and Maven in one sentence as two “more approachable build tools” is just WorseThanFailure(tm) and I can’t even express how stupid someone must be to say something like that. Maven, first, imposes a strict source file layout, and second, manages external Java packages automatically. You just specify where your sources are and what packages you use – the rest is taken care of by Maven, including version control and upgrades.

But if you point that out, you have to remember that the C++ world also has tools that provide advanced build systems. Examples are autotools, Boost.Jam, qmake and CMake. A very important tool that solves the problem of providing modules in C++ is pkg-config. Having that, you just add the package name to the list and you don’t have to worry about specific compile flags and linker flags (not all packages provide an entry for pkg-config, unfortunately, but it has quickly become a de facto standard). You still have to add the include file in the sources, of course, but this has nothing to do with the build system. And, well, the syntax is awkward? Yes. But I really think that XML syntax is even worse. I once wrote a make-like tool in Tcl that was designed to be extensible into a highly automated, high-level build system; I just had no resources to work on it. I mention it only to point out that the lack of a Maven-like system for C++ is probably not the biggest problem of this language.

3. Simplicity of Source Code and Artifacts

Yes, I admit, C++ still needs a lot of work in this area. But if you have really done some serious Java projects, you know that saying “Java is only *.java or *.class files” can only be true if your background is at best homegrown or academic. Experienced people know that in a serious Java project there will be a lot more kinds of files to deal with, like:

  • *.properties files
  • *.xml files
  • *.jar files
  • *.war files

And they are really indispensable in serious Java projects. Additionally, some people state that having all methods explicitly defined inside the class entity makes the class unreadable. I don’t know if I agree with that; I just want to point out that having everything in one file is not something that can be unanimously considered an advantage. The Ada programming language, for example, also uses header files, even though it has no #include directive.

You can also quickly point out that header files are the only addition relative to what is in Java. The rest of the mappings are simple: *.cc is like *.java, *.o is like *.class, *.so (*.dll) is like *.jar. If you want to point out that there are also executable files, don’t forget to mention that in Java’s case you either have to create one manually, as a script that calls the java interpreter passing the explicit name of the class whose main() is to run, or you have exactly the same situation when compiling to native code with gcj.

4. Binary standard

And this is my favourite: this guy completely fails to understand the difference between the “Java programming language” and the “Java Virtual Machine”. The JVM is just a platform – a runtime platform that can run programs, for which programs can be compiled, and so on. And Java is not the only language in which you can write programs for the JVM platform.

There’s a variety of languages implemented for the JVM – maybe not all existing ones, but at least there is an interesting choice: Scala, Groovy, JRuby (Ruby), Jython (Python) and Jacl (Tcl). The interesting thing about the last three is that they are normally scripting languages. You can write programs in the Tcl language, looking like normal scripting language programs, in which you can create Java objects, call their methods and even create classes and interfaces – this is possible thanks to the reflection feature. As long as a language is implemented for the JVM, you can write your program in that language, not necessarily in Java. It also wouldn’t be a big deal to provide a kind of compiler that produces JVM assembly code from Jacl source.

On the other hand, Java is also just a programming language and can be implemented for any platform; it’s not fixed to the JVM. One of the compilers that can compile Java programs to native code is gcj (from the gcc collection). I have even heard that compiling Eclipse with this compiler may result in much more performant code than compiling with javac. The resulting code is, obviously, not binary compatible with code compiled by javac.

So, unfortunately for the author, Java (as a programming language) doesn’t have any binary standard. The only thing that has a binary standard is the JVM – but heck, this is just a runtime platform, for God’s sake; what’s the deal with a binary standard? Every runtime platform must have something that is considered a “binary standard”. What, you’d say it has the same form on every physical platform? Well, .NET has that too, LLVM has it, Argante has it, even the Smalltalk VM has it, and so does the OCaml binary representation. Guys, come on, what’s so special about the JVM’s “binary standard”?

5. Dynamic linking

This was partially explained with the module problem for C++. Yes, C++ still uses the old C-based dynamic libraries, which simply means that if some feature is “resolved” by the compiler and cannot be put into a *.o file, it automatically cannot be handled by dynamic libraries. But heck, what’s this “DLL hell”? Looks like this is one of the many guys who think that C++ works only on Windows. On POSIX systems there’s really no such thing as “DLL hell”.

But, again, this is also specific to the JVM. Yes, unfortunately. Java programs compiled to native code suffer exactly the same problems as C++ on native platforms.

Or maybe this is about dependencies and versioning. Oh dear… you really shouldn’t have suggested a problem like that. That’s one of the properties of Java’s “modules”, and one of the problems you quickly and roughly get convinced of when you work with Maven. Imagine: you have a library for Java, and in your code you use a class that this library provides. You do “import com.johndoe.bulb.*” in your code (about 90% of Java “coders” have no clue that this is the same as “using namespace com::johndoe::bulb” in C++, that is, nothing more than shortening the name). But you are using some feature that is not provided in some earlier version of the library. Now… can you specify anything in your sources that would require a particular version to be used? Or, if you have multiple versions, slotted versions, specific versions etc. – well, this can be done, as long as Maven manages things for you. But that only patches over a problem that arose in the programming language.

Do you want to see a perfect module management system? Look at Tcl. It comes with the “package require” command, which specifies the package and optionally its minimum required version (creating packages before 8.5 was horrible, of course, but version 8.5 comes with a new feature called “modules”: the only thing needed to make a package available is to put a *.tm file in one of the search directories). In the Java language you just don’t specify the package (in the physical sense, not in the Java sense – maybe it’s better to call it a “jackage”, just like “jinterfaces”?); instead, every symbol used in the source file being compiled is searched for through all packages in the system that the CLASSPATH leads to. In the default environment you are simply unable to specify the version of the package you are actually using. Moreover, you can accidentally have multiple such packages in your system, and the first one found by searching through the CLASSPATH is considered the required one.

Why so many words about this? Well, mostly because this is exactly what “DLL hell” means, if you just take the time to look the term up. By the same token you have a “CLASSPATH hell” in Java.

6. Portability

This is the funniest fun ever.

Comparing the portability of a programming language – which generally describes how it should be implemented, with best effort, to become the fastest programming language for a particular platform – with the portability of a “one platform language”?

In the English specific to software development, the verb “port” means to take a source project and adjust it to the needs of a platform different from the one for which it was initially created. For example, you may have a Linux application and you “port” it to Windows or to QNX. So “portable”, in this software-specific meaning, means that an application written in this language can be – preferably easily, or even effortlessly – “ported” elsewhere (or that at least there is a possibility).

If you measure the true value of “portability”, the only way is to state how many runtime platforms can be considered currently important on the market and have any higher level programming language implemented for them, and then count for how many of them the particular programming language has been implemented. If we say there are about 10 such platforms – say .NET, LLVM, Windows/x86, SunOS/Sun SPARC, SGI, MacOS/PowerPC, JVM, Linux/ARM, Argante and VxWorks/PowerPC – C++ is currently implemented for 8 of them, so its portability is 80%. For the same set of platforms, the Java programming language achieves 10%.

Worse, if you really take seriously the comparison of a native-compiled language and a virtual machine, note that the JVM itself is not implemented on all of these platforms either. From this point of view – as long as I haven’t made a mistake – it only achieves 70% portability. I haven’t even mentioned some limited platforms for which C++ is implemented with a limited set of features – for example, ones where you cannot dynamically allocate memory (this doesn’t make such a C++ non-standard; it just requires that every call to “new” results in std::bad_alloc and every call to “new(nothrow)” results in nullptr).

Well, platform specifics, well, so many #ifdefs etc. – guys, come on. Maybe it was true 10 years ago, and maybe it happens where very detailed platform specifics are required for best performance or, say, for specific lock-free code (CAS) – although in Java you usually don’t think about such things, simply because if you use Java you don’t think about performance. I understand that there are lots of #ifdefs used in header files, but please, if you don’t like #ifdefs, just don’t look into the header files. These header files use platform specifics precisely so that your application code can be maximally performant without having to use them itself. By the time C++11 arrived, all compilers were solidly C++98 compliant, and no serious software producer uses a compiler that doesn’t support it.

You may say that this portability means that with C++ you must perform the complete test procedure for a prospective new platform (that is, always do some porting job), while a Java bytecode will always work and behave the same in every place you run it, so you don’t have to test the program on every platform. If you really think so, you’re more than childishly naive… For example, think about something as simple as the filesystem. Try to access a file given its path on POSIX systems and on Windows. This is just a “no way”, in both C++ (at least standard C++) and Java. A Windows-specific path won’t work on POSIX and vice versa. The only way to achieve portability is to take some base path out of which the others should be derived, then add path elements so that the full path can be composed. Say we have a “base” variable that holds the base directory – you can do it a portable (yes!) way in Tcl:

set dir [file join $base usr home system]

and in C++ with boost.filesystem:

boost::filesystem::path dir = base / "usr" / "home" / "system";

but in Java you have to worry by yourself about how to compose the path. In a string. Ok, the system gives you a property that returns the path separator character used on the current system and you “only” have to worry about gluing the string together properly (Tclers and Boost users will still be ROTFLing). Just pray that your application never runs on VAX VMS, where the file path “alpha/beta/gamma/mu.txt” looks like “[alpha.beta.gamma]mu.txt”. Do you still want to say something about Java’s portability?

7. Standard Type System

Facepalm. One of the most important things in the C++ standard is the standard type system. Of course, it doesn’t always come with a fixed size for every type, but who said that this makes it not a standard type system?

Actually this slide doesn’t even talk about a standard type system. It talks about features in the standard library. Maybe C++ deserves to have an XML parser and a database connectivity library in its standard library, but heck, this would first require that the standard be modularized. It’s very hard to define the C++ standard this way and it’s hard to convince the standard committee to do it. And it can’t be done without modularizing the standard, because otherwise lots of C++ implementations would have to declare non-compliance – a C++ implementation for some tiny ARM system to be used in a refrigerator does not feature database connectivity. Even if it were provided, a lame-legged dog wouldn’t use it.

Actually there are lots of C++ libraries that provide XML parsing or database connectivity. Maybe just nobody needed them as a standard. I would even guess that they are in Java only because some initial libraries had to be provided for this language, or it wouldn’t have attracted attention. Same with GUI. I really don’t see a problem here. And somehow hardly anyone writes GUI programs in Java anyway. If you think of Eclipse, don’t forget that SWT, which it uses for its GUI, is completely implemented on the native platform (on Linux, for example, it’s implemented on top of Gtk). Why SWT, and not the “standard” Swing or AWT? Well, guess.

8. Reflection

Hello? Is anyone there? Is this reflector really lighting? Ah, well…

Do you remember ever using reflection in your Java application? I mean a really serious Java project, where you write an application running in a web application server; no other kind of Java development is serious. So, did you? Of course not. Why? Simply because it’s a feature just “for future use”. There are some tools that use it, like Java Beans. But user code, application development? Guys, come on, if this language required users to use reflection in their application code, they would quickly kick it in the ass and use another language.

Reflection is something that allows a language to easily implement things like runtime support, adding plugins to an application, or building some more advanced framework. Compared to Java, C++ has a much wider use in the software world, and not every use requires this feature. But, again, where it’s required, there are libraries that allow for it: for example, the Qt library features reflection, at least for QObject-derived objects – and it’s obvious there’s no need for anything more. So, you need reflection? Use the right library. The fact that Java supports reflection by default doesn’t mean that it’s such a great and wonderful language; it means the use of the Java language is limited to cases where reflection makes sense, or at least where it can be sensibly implemented.

For the usual Java programmer, it is really of no use at all.

9. Performance

Although performance is typically not considered one of the benefits Java has over C++, Purdy argues that garbage collection can make memory management much more efficient, thereby impacting performance.

Rather, “thereby making the performance a little bit better than horrible”. Of course GC improves the performance of memory management (it can even easily be shown with GC for C++, especially if you compare it to shared_ptr). It doesn’t come without a price, though. The price is increased memory use – there is usually some memory kept around that has not yet been reclaimed. The unified memory management also helps Java a bit. But Java easily wastes this performance through the overhead built into java.lang.Object, and all user types can only be defined as classes derived, by default, from this one. This increases the memory usage even more. Think at least about how performance also degrades from high memory use itself! Maybe it does not impact the application directly, but it definitely impacts the system, and that way may hit the application from behind.

In addition, Java is multi-threaded while C++ does not support multi-threading.

Yes – at least if you think of “standard C++98” (this changed in C++11) – but it somehow did not stop people from writing threaded applications in C++. So what’s the deal? It’s easy to suggestively say “does not support multithreading”, but if you said “it’s impossible to write multithreaded C++ programs” you’d be lying. So isn’t it better to just shut up?

Do you want to see how threads can be supported by a language? Look at Ada. Do you want to see how multicore programming can work? Look at a Makefile. Yes, really. This simple task automating tool (because that is generally what make is) can run tasks in parallel, as long as they are independent. That is real support for threads. Java’s thread support is just a thread library plus one slight keyword that performs acquire/release for a whole method or block (provided for convenience only because Java does not feature RAII/RRID – in C++ this thing can be implemented in a library, too). I completely fail to see why this is any better than Boost.Threads, other than that Boost.Threads is not in the standard library (in C++11 it is).

C++’s thread safe smart pointers are three times slower than Java references.

Well, even if it looks strange that he first says C++ does not support multithreading and next that C++ has something thread-safe… Really, only three times slower? They are ten times slower than C++ pointers under Boehm’s Garbage Collector. And, one more important thing – this guy is probably speaking about shared_ptr, not unique_ptr, which uses no synchronization at all and is just like a plain pointer.

I suspect the poor performance of shared_ptr comes from the fact that it “screws” the reference counter into the object from outside; std::make_shared is said to ensure better performance because it can allocate, at once, one big piece of memory that keeps the refcount and the object itself in a solid block; this at least decreases the number of dereferences. Still, compiler support may be required to optimize out unnecessary refcount modifications.
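A small illustration of the difference (the comments state the usual implementation strategy; particular standard libraries may vary):

#include <memory>

struct X { int v = 0; };

int main() {
    // Typically two allocations: one for X, a separate one for the control
    // block (refcount) that gets "screwed in" next to the object.
    std::shared_ptr<X> a(new X);

    // Typically one allocation: control block and object in a single solid
    // block, so the count and the data share locality.
    auto b = std::make_shared<X>();
    (void)a; (void)b;
}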

And Java has HotSpot Java Virtual Machine (JVM), which features just-in-time (JIT) compilation for better performance.

For “a little bit better than horrible” performance – did I already say that? Ok, no kidding: yes, JIT really makes the performance better; an application running on a long-running application server does not suffer any big performance problems compared to a similar application in C++. There are just three small problems:

  • It doesn’t change the fact that these Java apps still gorge on memory at a terrible pace.
  • It should be a really long run. A freshly started server comes with really poor performance, which improves the longer the server runs.
  • C++ can take advantage of JIT compiling too, as long as you use a specific compiler. For example, the “clang” compiler for the LLVM virtual machine uses JIT compiling at run time and this way produces overwhelming performance. The important thing is that it does this with code that was already compile-time-optimized.

10. Safety

Well, if you are talking about the Java programming language, it does not provide any safety by itself. The safety is provided by the JVM, so it’s available for C++ as well (as long as I can manage the implementation :).

As for the 5 reasons why Java still has things to overcome relative to C++, there’s nothing I can say except to remind you that there are still many people stating that Java can be a very good language for real-time programming and that its performance will get better and better. Well, the funniest thing is that I heard these claims already 10 years ago. And Java’s limitations are still with us.

11. Conclusion

Don’t treat the things on these slides too seriously. They really do look like a good summary of the differences between Java and C++. But if you mention them when applying for a C++ job (when you apply for a Java job, you’re unlikely to be asked about this), you may be treated as simply not a professional.

Java was supposed to rule the world of software development. It didn’t make it. Not because it didn’t manage to overcome its limitations. It didn’t make it because a market for C++ urgently grew, and the users of consumer electronics, now widely running software, had no will to adjust to Java.


C++ concepts that wouldn’t be considered harmful

0. Background

Templates were accepted into C++98 without concepts (although concepts were considered) because at that time no one saw any advantage in them. However, after years of long and bloated error messages whenever a template can’t be resolved, concepts now make sense. So lots of work was done to define such a feature for C++. The final form of concepts, implemented in the experimental ConceptGCC branch of the gcc compiler, is quite well described on Wikipedia.

The work on this feature, as far as I can evaluate, probably drifted in the wrong direction, so it eventually got far away from the main goal. It’s much better that the concepts in this form were removed from C++0x. But the problem still remains and has to be solved.

Bjarne Stroustrup enumerates several problems and misunderstandings concerning concepts, especially their latest version. It looks like the first idea of concepts was only spoiled later. But it suffered from one problem from the beginning: the syntax does not coincide with the syntax of templates, which makes a concept a kind of “alien” feature, not matching the rest of the language.

Because of that I have created my own proposal.

1. The basic syntax

A quick example: a template definition using a concept (that applies a constraint to a type) would look like this – just like in the original proposal:

template<class T>
requires LessThanComparable<T>
T min_value(T x, T y) {
   return x < y ? x : y;
}

and the LessThanComparable concept is defined this way:

concept<class T> LessThanComparable
{
    bool operator<( T, T );
};

Note the similarity of syntax between “template” and “concept”. Similarly, if we want to provide this feature to a type that does logically the same as operator <, but under a different name (a “concept map”), we do:

concept<> LessThanComparable<std::type_info>
{
    bool operator<(const std::type_info& t1, const std::type_info& t2)
    { return t1.before( t2 ); }
};

Yes, this is also called “partial specialization”. Just like templates, concepts may have (even partial) specializations (which function like concept maps) and can also be instantiated (instantiation is required to “invoke” the match on a given type). Although their main use is to be matched against template entities that are expected to have particular features (concepts are always auto-applicable), concepts additionally have the following features:

  • can provide lacking features to types (“patch” a type)
  • can be derived (also multiply)
  • can be abstract (such a concept does not match anything)
  • can be bound in logical expressions
  • can be matched partially
  • can be used to synthesize a code template
  • can have multiple parameters, including variadic

The most general definition of the concepts’ syntax is (names in .. are descriptive, fragments in ?{ … }? are optional):

concept<..parameters..> ..ConceptName.. ?{ <..specializationArgs..> }?
   ?{ : ..DerivedConcepts.., ... }?
?{ {
    ?{ ..specificStatements..; ... }?
    ?{ ..requirements..; ... }?
} }?
;

Where:

  • ConceptName: new concept’s name, if master definition, or existing concept name, if specialization
  • parameters: just like template parameters (no overloading or default parameters)
  • specializationArgs: arguments passed to the concept as specialization (just like in template partial specialization), used only when defining a specialization
  • DerivedConcepts: concepts that are prerequisites for the defined concept
  • requirements: definitions that should be available so that the concept can be considered satisfied: function/method definitions, requirement expressions, or requirements for elaborated types
  • specificStatements: just a placeholder for future changes 🙂

The concept usage is:

1. When defining an entity template:

template<..args..>
requires|desires|constrains ..ConceptInstantiation.. , ...
..TemplatedEntity..

2. When requesting a code template synthesis:

// synthesize code in current scope as working on given type
requires ..ConceptInstantiation.. ;

// synthesize code in current scope with new type name
using ..NewTypeName.. = typename ..ConceptInstantiation..;

Where:

  • ConceptInstantiation: an expression of the form ConceptName<args…>. A bare "requires" with this expression brings all type patches into all types for which this concept provides any, within the current scope
  • NewTypeName: the name of the type created by patching. This syntax may only be used with a concept that applies patches to exactly one type. The original type is then left untouched and the patched type is identified by this name.

New keywords: concept, requires, desires, constrains.

The definition of a requirement inside the concept must be in one of two forms:

  • interface element declaration (function, method, or type definition)
  • requirement expression (will be covered later)

The interface element declaration only needs to be provided in one of the possible forms; what matters is that there is some way to call it. The real interface element may be defined in a different way, as long as it can still be called with the same syntax. The following equivalences are allowed (a sketch follows the list):

  • Type equivalence: the real return type may be convertible to the requested type, and an argument type of the requested interface may be convertible to the argument type of the real interface (when this is not desired, the type should be preceded by explicit)
  • An operator may be defined either as a method or as an external function
  • A non-explicit constructor A(B) requested in the concept may be satisfied by B::operator A() (again, use the explicit modifier if this is not desired)
  • A list of function or method parameters may be longer, with the excess parameters defaulted. Such a function is callable the same way as the one mentioned in the requirements (again, use the explicit modifier after the closing parenthesis of the function if this is not desired – especially important if, for example, the function is required to have a strict signature because its address is taken)
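
For illustration, here is a sketch (in the proposed syntax; the types A, B and C are hypothetical) of how these equivalences could play out against a single requirement:

concept<class T> LessThanComparable
{
    bool operator<( T, T );
};

struct A { bool operator<( const A& ) const; };  // operator as a method: matches
struct B {};
int operator<( B, B );                           // int is convertible to bool: matches
struct C {};
bool operator<( C, C, int debug = 0 );           // excess defaulted parameter: matches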

A syntax like "template<LessThanComparable T>" is not possible because concepts must always be supplied with parameters where they are used – just like templates. They can also be used with more than one parameter:

requires LessThanComparable<T>

requires Convertible<T, typename U::value_type>

requires TypeIsOneOf<T, int, char, bool>

There is still a simplified syntax for a concept with one parameter, but it's specified in place of the type where it is used, not as a prefix in the parameter list. This will be covered later.

2. Special variants of the syntax

a. abstract concept and empty concept

The basic form of a concept is a pure abstract concept. Such a concept is never satisfied (because it’s unknown whether there is any requirement):

concept<class A, class B> IsSame;

And this is the empty concept:

concept<class T> Defined {};

Such a concept is always satisfied (because it doesn't impose any requirements). Note, though, that incomplete types are not allowed as concept parameters.

The IsSame concept can be then declared as satisfied by specialization:

concept<class T> IsSame<T, T> {};

It means that in general IsSame isn't satisfied, but when the type passed as the first argument is the same as the second one, it matches the partial specialization shown above, and this way the concept is satisfied.
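
For illustration, such a concept could then be used to tie two template parameters together (a sketch; the function name is hypothetical):

template<class T, class U>
requires IsSame<T, U>
void swap_values( T& a, U& b );  // instantiable only when T and U are the same type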

b. concept derivation and partial specialization

Concepts can be derived. The syntax is the same as for structures and the meaning is also the same: the contents of the base concept are incorporated into the derived concept. As a result, the derived concept imposes its own requirements plus all the requirements that come from the base concepts.

You can also derive an abstract concept from a non-abstract concept. The concept doesn't become any less abstract because of that, but it carries an additional rule: any partial specialization that redefines the concept for a specified type must also derive from the same set of concepts as the master definition does.
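
A sketch of what this could look like (Ordered and ExplicitlyOrdered are hypothetical names):

concept<class T> Ordered
{
    bool operator<( T, T );
};

// Abstract, but deriving: never satisfied by itself...
concept<class T> ExplicitlyOrdered : Ordered<T>;

// ...and every specialization must derive the same set of concepts
// as the master definition does:
concept<> ExplicitlyOrdered<int> : Ordered<int> {};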

There are two main things that make concepts different from templates:

  • concepts must always have exactly one master concept definition provided
  • partial specializations are only allowed (and required) to cover particular requirements, as defined in the master definition

The first statement means that you cannot "announce a concept and define it later" – once you have created an "announced" (that is, abstract) version, later you can only add specializations.

The second one means that you cannot add requirements in the concept specialization nor can you remove them (every single requirement must be covered) – the concept specialization does not “derive” any requirements from the master definition.

RATIONALE: You may wonder why missing definitions cannot be taken as defaults from the master definition. There are two reasons why this shouldn't be done:

  • The goal of this proposal is to make concepts as easy to understand as templates. Since template partial specializations do not silently derive the contents of the master definition, neither should concepts.
  • Even when it is needed, it should be explicitly visible to the user that defaults are used (or that every requirement is covered otherwise)

However, as it is sometimes desirable that definitions from the master definition be "silently derived" by the specialization, here are possible ways to provide a syntax for that:

1. “self derivation”:

concept<> Container<MyContainer>: Container<MyContainer>
{
   size_t MyContainer::size() { return this->numberItems(); }
};

Here the concept derives not from "itself", but from "the instance of this concept that would be generated from the master definition" (because until this partial specialization is complete, that is what this concept instance resolves to).

2. using “default” keyword as derived:

concept<> Container<MyContainer>: default { ... };

3. using “default” keyword in the specialization:

concept<> Container<MyContainer>
{
    using default;
    ...
};

The self-deriving syntax has one important advantage: it suggests to the user that there is a derivation, which means that the deriving entity incorporates definitions from the derived entity, just like classes do. It's still worth considering which form of the deriving syntax to use. On the other hand, it may end up allowing a concept to be first used and later specialized, which would result in two different concept definitions for a particular type under the same name (and this will most likely lead to problems). It would then probably be better to issue an error when a specialization for a type is provided explicitly after one was already generated.

c. concepts with variadic parameters

Concepts may also have variadic parameters:

concept<class T, class Arg, class... Args> TypeIsOneOf
{
    requires IsSame<T, Arg> or TypeIsOneOf<T, Args...>;
};

How to make a "termination" version, given that concepts cannot be overloaded? I have no firm idea, but here is a proposal (it may later be considered for variadic templates, too):

Auto-expandable expressions (those that use "…") are treated in a special way: if "Args…" resolves to nothing, they are replaced with a special concept instantiation statement – let's name it "Nevermind". Then:

  • A and Nevermind resolves to A
  • A or Nevermind resolves to A
  • not Nevermind resolves to Nevermind
  • single Nevermind, as resolved from some expression, resolves to nothing
  • requires {…something that resolves to Nevermind…} resolves to nothing
  • Any concept or template instantiated with the use of Nevermind resolves to Nevermind

In other words, the part that has "Args…", when it resolves to nothing in a particular case, will disappear as a whole and will drag the preceding operator with it (and every higher-level expression that contained the resulting Nevermind). If the expression was required to be nonempty for some reason, then when it resolves to nothing, an error is reported. So the above concept, when used as:

requires TypeIsOneOf<T, int, char, bool>

will resolve to:

requires IsSame<T, int> or IsSame<T, char> or IsSame<T, bool>
  /* or Nevermind */ ;

3. Code synthesis and type patching

For example: how do we define that external functions begin() and end() are required, while we can still live with methods of these names? There are two ways to accomplish that:

  • harder, first define a concept that will match a type that contains begin() and end() methods, then define a partial specialization for such a concept
  • easier, define requirements together with default implementation

Let’s try the easier one first:

concept<class T> Sequence
{
    typename iterator = T::iterator;
    iterator begin(T& c) { return c.begin(); }
    iterator end(T& c) { return c.end(); }
};

This way we have provided a concept with a default implementation. It means that this concept is satisfied for every type for which the external function begin is provided (and the others as defined in the concept). If there is no such function defined, the compiler will try to instantiate the default implementation. If the instantiation succeeds, the concept is considered satisfied, and – pay attention – the "fix" is applied to the type, inside the entity that required it to satisfy the concept! It means, for example, that:

template <class C, class T>
requires Sequence<C>,
requires Convertible<T, typename C::value_type>
bool has(C& cont, const T& value )
{
   for ( typename C::iterator i = begin( cont ); i != end( cont ); ++i )
      if ( *i == value )
         return true;
   return false;
}

… in this example, it was possible to use begin( cont ) and end( cont ), even though there are no begin(C) nor end(C) functions! Of course, only inside this function, because only in this function has the C type been imposed the requirements of the Sequence<C> concept (and its patches, by the way). In particular, C here is not exactly the type of the 1st argument as determined by the template, but C with the appropriate type patches, as provided by the Sequence<C> concept.

It means that inside the entity that required that a particular type satisfy a particular concept, this type additionally has all the patches that this concept has defined. For types that don't satisfy the concept, the template entity cannot be instantiated.

The harder way will be shown later, together with overloading.

Let's try another case. You know that in the part of the C++ standard library formerly called STL (I personally call it "CIA" – Containers, Iterators, Algorithms – as it still needs to be distinguished from the rest of the standard library), there are the concepts of InputIterator and OutputIterator, which have one common trait: they are both "single-pass iterators". The normal way to use a single-pass iterator is the "*x++" instruction. What is not allowed for these (and is allowed only for multi-pass iterators) is to use only one of these operators, that is, to use the * operator without doing ++ in the same instruction (in particular, it's not allowed to perform the next * after a previous * without ++ in between, and the same for subsequent ++).

However, the method used to achieve this is awkward – simply, the * operator does the whole "pass", and ++ does nothing. This makes it practically possible to do *x twice, although it will behave as if *x++ was done. It is therefore desirable that both of these things, since they should not be separated, be done by exactly one instruction:

 x.input();

It would be much better, then, if InputIterator and OutputIterator only defined input() and output() methods respectively (and neither * nor ++ operators). This would be:

concept<class T> InputIterator
{
    typename value_type = T::value_type;
    value_type T::input() { return *(*this)++; }
};

concept<class T> OutputIterator
{
    typename value_type = T::value_type;
    void T::output(const value_type& val) { *(*this)++ = val; }
};

The << and >> operators might also be good for that; the >> operator doesn't give a chance to return the result by value, but on the other hand << would make the std::cout object also satisfy the requirements of OutputIterator.

You can guess that now only the purely single-pass iterators should have the input/output methods defined. For the others it’s just enough that they define * and ++ operators.

It would be a nice idea to have such a change in the standard library – however, the current algorithms that are allowed to work on single-pass iterators (most of them: for_each, copy*, transform*, merge*, but not sort) would have to be changed. Such a change could be made in the standard library, although the * and ++ operators should still be provided, annotated as obsolete, so that user algorithms working on single-pass iterators can be adjusted to the new form.

Thanks to concepts, there will be no need to provide input() and output() methods for multi-pass iterators separately.
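
For illustration, a copy algorithm restricted to single-pass iterators could then look like this (a sketch in the proposed syntax; equality comparison of the iterators is assumed):

template<class In, class Out>
requires InputIterator<In>,
requires OutputIterator<Out>
Out copy_single_pass( In first, In last, Out result )
{
    while ( first != last )
        result.output( first.input() );  // exactly one instruction per pass
    return result;
}

A multi-pass iterator passed here would get input() and output() synthesized from the concepts' default implementations (*x++).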

4. Lesser concept satisfaction requirement: desires

One of the main reasons concepts were designed was to get better error messages. This is what we often want. It doesn't simultaneously mean that we'd like to force a type to satisfy the whole concept, as we may not really use all the features the concept imposes. In practice we can live with some definitions lacking – for example, we need the type to provide a copying operator=, but not necessarily a copy constructor. We have a CopyConstructible concept that contains both of them and we'd like to be able to use it. However, we don't really want the type to be CopyConstructible; we just use CopyConstructible to check for the copying operator=. Of course, we can always define a new concept that contains only operator=, but this sounds like the usual excuse of lazy language designers: "well, you can easily achieve that by making a new class of 200 lines of code". The user should be given fragmentary matching, so that existing concepts can be reused.

When you use "desires" instead of "requires", you'll still get the same simple and good error messages in case of an error. When the compiler compiles the template entity, it checks which features of the type are actually being used and matches them against the "desired" concept; then a new unnamed concept is created, for this particular case, consisting only of those features of the concept that are actually used. It means that concept matching in this case will always succeed, and the entity is still allowed to use features of the type not covered by the "desired" concept. If a used feature is not covered by the type, then:

  • if the feature is not provided by the concept, a usual “very long and bloated” error message is printed
  • if the feature is provided by the concept, the compiler complains that the type does not satisfy the desired concept of specified name (and some feature of it in particular)

Note, however, that weak concept matching only means, in practice, that if the type does not provide a feature being used, the error message will refer to an uncovered concept rather than merely showing what feature is lacking (usually with a long and bloated error message). It doesn't provide other concept features like:

  • overloading by concept type
  • patching the type with additional definitions from the concept (even if the entity uses them)

It means that, for example, if you "desire" an InputIterator, as shown in the example above, the entity uses x.input(), and some multi-pass iterator is passed as x, it won't work, because the mapping from x.input() to *x++ will not be created. The compiler will show an error message that the concept is not satisfied, pointing at this method.
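
To make the operator= example from the beginning of this section concrete, a sketch (the function name is hypothetical):

template<class T>
desires CopyConstructible<T>  // the concept also contains the copy constructor...
void overwrite( T& target, const T& source )
{
    target = source;  // ...but only operator= is used, so only it must be provided
}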

TO CONSIDER: Use "desires explicit" to additionally get the type fixes, as long as their definitions can be compiled in the particular case. In this variant the link from x.input() to *x++ will be created as long as * and ++ are defined. Or, simpler, maybe it's better to make this the default – that is, even with "desires", the type fixes should be applied. Still, this declaration should not be usable to distinguish type parameters for overloading.

5. Overloading and partial specialization

Overloading is simple:

template<class Iter>
requires RandomAccessIterator<Iter>
void sort( const Iter& begin, const Iter& end );

template<class Container, class Function>
requires RandomAccessContainer<Container>,
requires CallableAs<Function, bool(typename Container::value_type)>
void sort( Container& cont, Function predicate );

Overload resolution is possible, even for template-parameter types, as long as the requirements imposed on these types are mutually exclusive for at least one argument position. In this particular case, the types of both arguments satisfy mutually exclusive concepts, although theoretically it's not impossible to satisfy both CallableAs with the given signature and being an iterator (though it would be stupid).

A type on which a requirement was imposed becomes a kind of different type. That's why it's possible to treat it as if it were a separate type, but only inside the template that uses it.

Now let’s try similar things with templates’ partial specialization. Normally we define a partial specialization the following way:

template<class T>
class SomeClass<T*>
{
...
};

which means that we provide a specific definition of the SomeClass template where its argument is of T* type, for whatever type T. We can do the same with types that satisfy a particular requirement:

template<class T>
requires SomeConcept<T>
class SomeClass<T>
{
  ...
};

or, simpler:

template<class T>
class SomeClass<typename SomeConcept<T>> // such T that is SomeConcept
{
  ...
};

Of course, please note that the T type used to specialize the SomeClass template is not the same T as in the first line. The last T is a T that has been imposed the requirements defined in SomeConcept, including default implementations. It is, however, a definition of a partial specialization only for types that satisfy SomeConcept.

As you can provide this additional thing for partial specializations of a template, you can do the same with partial specializations of concepts:

concept<class T> SomeConcept<T>
requires AnotherConcept<T>
{
 ...
};

In this particular case, "requires" does not mean that the concept (specialization) being defined imposes AnotherConcept on T, but that the concept specialization being defined concerns only those types that already satisfy AnotherConcept. The requirements imposed by a concept being defined are always defined exclusively between { and }, plus the requirements provided by the derived concepts.

Now we can try the harder way of declaring that classes with begin() and end() methods satisfy the Sequence concept. First, we define a concept that detects whether a given class has begin() and end() methods.

concept<class T> SequenceClass
{
	typename iterator;

	iterator T::begin();
	iterator T::end();
};

Having that, we can make a specialization of Sequence concept:

concept<class T> Sequence<T>
requires SequenceClass<T>
{
	typename iterator;

	iterator begin(T t) { return t.begin(); }
	iterator end(T t) { return t.end(); }
};

Actually, we did it almost the same way, as we still needed to provide the implementation that shows how the concept is going to be satisfied. The main difference is that this particular specialization is provided only for T classes that satisfy SequenceClass<T>, not for every possible class for which the default implementation can be successfully evaluated.

6. Check by usage

I'm not sure whether this feature is a good idea, however it may prove useful. Instead of providing a definition that should be replicated in the user's code, we can define expressions that the compiler will try to evaluate. The expression to be evaluated also has restrictions on its result type:

  • if the return type is void, it only needs to be successfully compiled
  • if the return type is bool, it requires to be successfully compiled, evaluated at compile time and it should evaluate to true
  • other types are not allowed

concept<class T> Sequence
{
	typename iterator;

	requires { (void)begin( T() ); }
	requires { (void)end( T() ); }
};

Please note, though, that the practical consequence of the above code is that the requirements imposed are greater than they seem at first look. We have used the "T()" expression to magically create a value of type T. This magic, however, costs an additional requirement: now T is also required to have a constructor that can be called with no arguments.

You can prevent this, for example, by saying "begin( *((T*)0) )". You can safely do it because this expression is declared to be of void type (as forced) and this way it won't be evaluated (such an expression is never evaluated at runtime anyway, whether bool or void). The comma operator would help, however in today's C++ it's not possible for this operator (like any other operator, though :)) to be passed a void expression as an argument. Because of that, you are still allowed to make a sequence of instructions, of which the last one must evaluate to void type (this is not allowed for bool type).

For the boolean value of an expression, please note that the expression must be evaluable at compile time. So if you want to "require" an expression that returns a boolean value – even when you know it will always evaluate to true – but it calls some function that is not constexpr, the compilation will fail. You'd better force the (void) type for such an expression then.

The boolean expression can also be done the following way:

requires { std::has_trivial_destructor<T>::value; }

This expression can be evaluated at compile time, so it matches the requirement. It’s a similar requirement as for static_assert.

Check by usage is easier to define, but allows for more mistakes. For requirements defined as function or method definitions there are many equivalences and additional rules (a sketch follows the list):

  • argument or return type may be “auto”, which means that the exact type in this place is not imposed. Please note, though, that a concept that uses this feature is not allowed for constrained matching (see below) because it is not possible to determine whether the user entity is using the type correctly
  • argument types normally are allowed to undergo all defined conversions; if this is not desired, the argument should be preceded by “explicit”. It means, for example, that if the requirement is for “f(int,T)”, then the existence of “f(char,T)” satisfies the requirement; you should use “f(explicit int, T)” in order that the type be exactly int
  • same about the return type: if given return type can be converted to the return type in the concept definition, the concept is meant satisfied. Like with argument, you can force no conversion accepted by “explicit” modifier. This means additionally that if the requirement defines that the return type is void, the matching function’s return type may be anything
  • a constructor statement in a form A::A(B) requires that type A have a constructor with one argument of type B, or that B has a conversion operator to type A. If this is preceded by “explicit”, then only just one-argument constructor of A type is allowed (both implicit and explicit – here “explicit” does not disallow implicit A constructor, but conversion operator in B, from satisfying the requirement)
  • const T& is equivalent to T in argument types, although note that if type T has been simultaneously defined as non-copyable, it causes a conflict. In other words, if T is not required to be copyable, const T& or T&& must be used as argument
  • T& has no equivalences
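
A sketch gathering several of these modifiers in one concept (all names are hypothetical):

concept<class T> Buffer
{
    auto T::data();                     // exact return type not imposed
                                        // (bars this concept from constrained matching)
    void T::resize( explicit size_t );  // argument must be exactly size_t
    explicit bool T::empty();           // return type must be exactly bool
    explicit T::T( size_t );            // only a real one-argument constructor of T
                                        // may satisfy this, not a conversion operator
};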

7. Constrained matching

The concept matching can additionally state that the template entity that uses the concept is only allowed to use those of the type's features that are defined in the concept. If a feature is not defined in the concept, the template entity is not allowed to use it, even though the type itself defines it. The difference between "requires" and "constrains" is, then, that in the first case the template entity can still use features that were not described in the concept (although it will end up with the "usual bloated" error message if the type doesn't provide them). Here is, then, a summary of what happens when there are problems with matching the concept and finding a feature of the type, for all three cases, plus one more – when the type is not matched to any concept at all:

1. When a template entity uses a feature that is:

  • provided by neither the concept nor the type: the usual error message with no concept, with desires and with requires; with constrains: error – feature not defined in the concept
  • provided by the concept, but not by the type: the usual error message with no concept; with desires, requires and constrains: error – type doesn't match the concept
  • provided by the type, but not by the concept: not a problem with no concept, with desires and with requires; with constrains: error – feature not defined in the concept

2. When the type just doesn't match the concept: N/A with no concept; not a problem with desires; with requires and constrains: error – type doesn't match the concept

Not covered is the case when we'd like to forbid using features that weren't defined in the concept (as constrains does), while not requiring that the type match the whole concept, but only provide those features that are actually used (as desires does).
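
Before moving on, a sketch of constrains in action (the function name is hypothetical):

template<class T>
constrains LessThanComparable<T>
T smaller( T a, T b )
{
    if ( a < b )            // fine: operator< is defined in the concept
        return a;
    return a == b ? a : b;  // error: operator== is not in the concept,
                            // even if T itself provides it
}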

8. Type interfaces and logical matching expressions

Let's explain one thing first. Like templates, concepts are instantiated. When templates are instantiated, the result is a new class or function. When a concept is instantiated, the result is one or more type interfaces. A type interface is next used as the "patched" interface of the type on which the concept's requirements were imposed. As you know, when there are forwarding definitions in the concept, which cause some fixes to be provided to the type (like a begin(T) external function that redirects to T::begin()), they are applied to the type depending on what kind of imposing was used:

  • requires: the type interface contains all things that are normally available for given type plus fixes provided by the concept
  • desires: the type interface contains all things that are normally available for given type, but not fixes provided by the concept
  • constrains: the type interface contains only things that have been defined in the concept, including fixes, but not things provided by the original type not covered by the concept (that is, elements of the type interface that haven’t been defined in the concept are removed from the resulting type interface)

Type interfaces may also be shared – one function may require several different argument types, and each such argument type is provided a new part of the type interface for itself. It is not unusual, then, that one concept provides multiple type interfaces at a time (which could be compared to an overloaded function – multiple functions under one name). However, once a type interface has been provided for a particular type, no new type interface can be provided for the same type anymore. That would be something like creating two different typedefs with the same type name. Because of that, if you want to create multiple requirements for the same type coming from multiple concepts, you must bind these concepts into a concept expression. This is especially required if you want to define that a type should not match a concept.

You can then bind particular concepts into logical expressions using and, or and not. Note that this is only allowed for requires and constrains (not desires; and constrains may only be used as a defined requirement), and the expression has a strong influence on the resulting type. The full expression must be passed just after the requires keyword and, exceptionally, only the and, or and not keywords are allowed here, not the equivalent logical operators "&&", "||" or "!". For example:

requires not Assignable<T>

requires Assignable<T> and CopyConstructible<T>

(RATIONALE: using the &&, || and ! operators would suggest that the logical operations are performed on the C++ bool type. This is due to a tradition of C++, also coming from C; moreover, some coding standards disallow the word operators. The common expectation, then, is that where the symbolic operators appear, the expression surely involves boolean types, and it would suggest that concepts themselves can be used as boolean expressions, which isn't true. Because of that it is reasonable to make use of the already existing keywords predicted for logical expressions: their use is quite rare today, and they compose better with these more high-level statements.)

Two separate concepts applied to one type compose its type interface (see below) together, so it must be explicitly stated how it is built – if you use two separate concept matches for the same type, you create two different type interfaces for the same type, which causes a conflict. In other words, once you have used some type in one "requires" statement, it can't be used in any other requires statement in the same entity anymore. If you use concepts that have more than one argument, and the types are not equally distributed between the used concepts, you'll probably have to compose a complex logical expression to express it:

  requires MutuallyExclusive<A, B> and CopyConvertible<B,C>
       and ReverseIterable<A, C>;

It's not unusual that one "requires" statement provides three separate type interfaces as a result. It's not allowed, however, that separate statements each provide a separate set of requirements for the same type.

The situation is more complicated when the type is a class that defines some type inside it. If a concept does not impose additional requirements on a type defined inside it (and it usually doesn't), then using the internal type is a separate thing from using the external type. It means, for example, that you can define a concept match for T in one requires statement and for T::value_type in another. This is allowed, but only when the concept being matched for T does not require that T have value_type while additionally imposing some requirements on it. If it does, then the concept imposed on T is implicitly imposed also on T::value_type (because it's defined inside the concept definition), so adding a separate concept matching statement for T::value_type will also result in a conflict (and you need to compose them into one logical expression if you want to solve the conflict).

RATIONALE: Why require that a type be used only once? The reason is to not allow the user code to become bloated and have too many hidden and implicit interconnections (it's the same as not allowing classes to be open). In other words, if users would like to make bloated and messed-up concept interconnections, let them define it as a complex, long and unreadable expression tree – and if they are unable to do it, that's good, because we have probably saved someone a week of work.

Of course, there is a special case when it is allowed: when, by chance, two separate requires statements used for the same type require exactly the same concept to be imposed (that is, you can impose the same concept multiple times – two separate type interfaces in one entity are allowed as long as they resolve to the same thing, just like two separate typedefs for the same name are allowed as long as they resolve to the same type). It is pointless for normal "requires" statements, of course, but may be helpful when there is some implicit requirement for some internal type.

Using things like "constrains not", on the other hand, may erase some unwanted feature from a type. Imagine, for example, that you have three classes:

  • A, that contains operator=
  • B, that contains operator= and assign()
  • C, that contains only assign()

Your entity uses operator= for whatever type is passed, and classes like A, B and C may be passed. You'd like to use the class's operator=, however if the assign() method is provided, it should be preferred over a possible operator=.

You can't make it the normal way: if you provide a concept with a default implementation of operator= redirecting to assign(), then for class B operator= will be used, not assign(), as required. For this particular case you have to use something like this:

concept<class T> AssignOnly
{
     void T::operator=(auto x)
     { this->assign( x ); } // provide operator= redirecting to assign()
};

concept<class T> AssignOnly<T>
requires Assignable<T> and HasAssign<T> // operator= and .assign
{
     constrains not Assignable<T>; // erase the original operator=
     // now restore operator=, redirecting to assign()
     void T::operator=(auto x) { this->assign( x ); }
};

Here "auto" has been used, which just means that we don't care what exactly the argument's type is. Note that in expressions defining the requirements, "requires" may also be followed by a requirement, which is merely the same as deriving a requirement. You should only remember that, by the same rule, there can't be two separate requirement expressions with concepts that impose requirements on the same type.

Of course, the truth is that in this particular case it would have been simpler to make the user entity use assign() instead of operator= and make a concept that only defines assign with a default redirection to operator=. In that case it would work. But there may still be more complicated cases where various conditions have to be imposed on various parts of the type in a concept, and where such a simple inversion is not possible.

Concept expressions may also be used to create a combined concept using a simplified syntax:

concept<class T> ValueType:  DefaultConstructible<T>
                            and CopyConstructible<T>
                            and CopyAssignable<T> {};

Note that this is the syntax for deriving concepts, and the rule of not using the same type in separate expressions applies here as well. Note also that you can create an abstract, though deriving, concept.

9. Explicit type interfaces and simplified matching syntax

Normally, without concepts, the type produced when instantiating a class template isn't exactly what the template defines. All internal parts (fields) are created, but of all the features (that is, methods) only those that are actually used are extracted (the SFINAE rule). It means that the resulting instance of a template is a specific set of definitions composed out of what has been used.

It's not exactly the same when the type has a concept imposed on it. In that case, all things that are needed to satisfy the concept are instantiated, even if the template entity is not going to use some of them. For the other features, however, the selective use of features still applies.

A type interface is something that comprises a full definition of a type's features, that is, it includes all natural features of the type and those that have been additionally imposed on the type by a concept. When you define a template and one of the template arguments has a concept imposed on it, this type argument becomes a type interface. It doesn't matter whether the concept has only one parameter or more. Even if the type is just one of several parameters of a concept, a successful match of the concept causes several additional statements to be imposed on this type.

It's also allowed that such a composed type – the type itself plus all additional definitions provided by the concept – be created explicitly:

using TypeInfo = std::type_info
         requires LessThanComparable<std::type_info>;

This syntax is required because there may be a need to say something like:

   requires Convertible<MyType, typename OtherType::value_type>;

However, exceptionally for concepts that yield one type interface (usually this is the same as a concept with one parameter – I'm not sure whether there are cases when it's not true – note that concepts cannot be overloaded nor have default parameters), we have a simplified syntax: we can use the "instantiated" concept as a type:

using TypeInfo = typename LessThanComparable<std::type_info>;

or, the old style:

typedef typename LessThanComparable<std::type_info> TypeInfo;

(CONSIDER: it would be nice to provide this “patched type” availability by declaring it inside the concept. For example, by using the following syntax:

concept<typename T> GoodLooking {
    using type = concept typename T;
    ...
};

you can make the patched type T exposed as GoodLooking<SomeType>::type instead of typename GoodLooking<SomeType>).

Note, though, that the full requirement for being able to use this syntax is that the concept imposes requirements on only one type at a time (not that it has exactly one argument); in particular, the imposed requirement must yield exactly one type interface.

Let's also repeat the example from the beginning:

template <class C, class T>
requires Convertible<T, typename C::value_type>
bool has(typename Sequence<C>& cont, const T& value )
{
   for ( typename C::iterator i = begin( cont ); i != end( cont ); ++i )
      if ( *i == value )
         return true;
   return false;
}

Similarly it can be used with template partial specializations:

template<class T>
class SomeClass<typename SomeConcept<T>>
{
  ...
};

And, of course, it also works with C++14 generic lambdas:

auto L = [](const typename Integer<auto>& x,
             typename Integer<auto>& y)
{ return x + y; };

Please note that the type interface may differ, depending on the concept matching keyword:

  • desires: the type interface is exactly the same as the original type
  • requires: the type interface contains all of that the type provides plus possibly additional declarations provided by the concept
  • constrains: the type interface contains only those declarations that are mentioned in the concept, in a form that is actually provided by the original type (features that do not match anything in the concept are removed)

10. Concept as a code template synthesizer

Normally a template may only be a function template or a class template. When instantiating, you can only get a single class or a single function from a template. Sometimes you need to generate a whole set of functions or classes with one simple, magic definition. In current C++ the only way to achieve this is to use #defines.

Concepts have an additional feature that allows them to bring the fixes made to the code into any other scope. Normally such fixes are available only inside the template entity that required some concepts to be satisfied. The following statement brings the fixes into the current scope – that is, it makes std::type_info, in the scope where this declaration is provided, the same as the TypeInfo declared above:

using LessThanComparable<std::type_info>;

After this is specified (see LessThanComparable at the beginning), you can safely do (until the scope with this "using" ends):

 if ( typeid(X) < typeid(Y) ) ...

Using this method you can also provide a whole set of functions for types whose definitions you cannot change. This includes operators, such as those provided by std::rel_ops. Using the std::rel_ops namespace makes these operators available, from the declaration on, for all types that have the == and < operators defined, which is not necessarily what you want. Using the "using concept" statement you can provide this set of operators to exactly one specified type. This is a very useful feature for enums:

enum eSize { SMALL, MEDIUM, LARGE };
using OrderableEnum<eSize> // adds operator <
  and EquivalentOperators<eSize>; // adds rest of the operators

11. Name clash problems

Of course, the names of types and functions being defined are assigned to a particular namespace. But a concept may be defined in a different namespace than the one where it is used by a template entity. In practice it means that the names of functions or classes used in the concept's requirement statements are not tied to the namespace in which the concept was defined.

Names are, then, qualified in the namespace where they are required. It means that the concept is "instantiated" in the namespace in which it is used. The exact namespace in which the statements are instantiated doesn't matter much; what is important is whether, at the location where the concept is used, the particular functions are available in the current namespace – no matter whether they are defined in this namespace or imported from another one. On the other hand, you can always put a namespace-qualified name in the concept, in which case the function is looked up in exactly that namespace. Note that this is a case where "constrains" can be made use of: if the concept mentions, for example, std::begin, and you mistakenly used just "begin" without a "using namespace std", the constraint won't let you use it, even if you have a begin() function in the current namespace defined for this type.
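
For illustration, a concept could pin its requirements to one exact namespace this way (a sketch; the concept name is hypothetical):

concept<class T> StdSequence
{
    typename iterator;
    iterator std::begin( T& );  // looked up in std only, never in the caller's namespace
    iterator std::end( T& );
};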

12. Things that are not going to be supported

One of the main things that are not supported is explicit concepts, that is, concepts that are not satisfied by default unless explicitly specified in a concept map. This was the default in the original proposal; here it is not supported at all.

This feature can be easily achieved the following way: define an abstract concept with the name that you'd like your concept to have, then define (empty) partial specializations for the types that you explicitly declare to satisfy this concept (the "whitelist" method). Or the opposite way: define an empty concept, which means that every type satisfies it, then define a partial specialization as an abstract concept for the types that, exceptionally, should not satisfy it (the "blacklist" method). A sketch of both methods is shown below.
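
A sketch of both methods (BuiltinInteger and Storable are hypothetical names):

// Whitelist: abstract master definition, satisfied only where specialized.
concept<class T> BuiltinInteger;
concept<> BuiltinInteger<int> {};
concept<> BuiltinInteger<long> {};

// Blacklist: empty master definition (satisfied by everything), plus an
// abstract specialization for the exceptions.
concept<class T> Storable {};
concept<> Storable<void*>;  // void* never satisfies Storable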

You want the explicitly specified types to additionally satisfy some requirements? What for? Isn't it enough that you just declare that the type satisfies it? OK, if you really need this, you can use concept derivation: define a concept the normal way, then define another concept – the one you expect to be explicit – as abstract, and derive it from the previous one. This way you'll have an abstract concept, which can only be explicitly declared as satisfied, and which additionally imposes some requirements on a type.

I also haven't described any kind of "axiom" in this proposal. I'm not quite convinced that such a feature is worth implementing, but I think it could just be added to this proposal without change, in its original form. The only problem is that I'm not quite convinced that this should be part of concepts. It should rather be part of the class definition, so that the class type defines what simplifications can be used by the compiler. Concepts may then have axioms mentioned in their requirements, but for concepts this thing would still be derived.

I also haven't added anything about "late_check"; however, I think this feature may be accomplished using the "desires" keyword, which already means that the underlying expression may have requirements imposed, but not as a must. I haven't done any research on that yet.

13. Additional things to be considered

One of the important additional changes to be considered is providing multiple default implementations for a concept's feature. The first one that works would be selected. For example:

concept<class T>
LessThanComparable
{
    bool T::operator<(const T& oth)
       { return this->before( oth ); }
    or { return this->precedes( oth ); }
};

I'm not convinced that it makes sense. The same result (and a more readable one) can be achieved by using different concepts and making concept maps based on partial specializations for classes that satisfy some other concept.

This brings back the discussion about accidental concept matching: the fact that a class has some method with some name and with matching argument types need not necessarily mean that this method can be used as a replacement for some lacking feature. It was better, for example, to provide LessThanComparable specifically for std::type_info, because we know this type has the before() method that is meant to be used for ordering, so by the rules of CIA (STL) it is equivalent to operator<. But it doesn't mean that any other type with a before() method, which accidentally returns bool and accepts an object of the same type, has this meaning.

The practice is that if we want to be sure about correct feature recognition, some "common sense" rules must be used. Operators like == or < are common sense. Also operator << meaning "send to" and >> meaning "receive from" (not shift left or shift right) can be considered common sense. But operator >= for a "push on the stack" operation is not common sense.

However, this discussion may unnecessarily bring back doubts about whether it makes sense to blindly match the concept by method name, while this is already proven to be a minor problem. First, concepts are usually not based on just one method. Second, concepts should be built to help developers, not to protect them from everything. For this, then, it's better to use common sense in defining concepts: the before() method for preceding is not common sense – common sense for this operation is operator<. Likewise, common sense is a size() method that returns the number of items the object holds, while numberItems() or length() or anything alike is specific naming for a particular data type, so a concept specialization should be provided for it.

Another thing to be considered is making it possible to create requirements "on the fly" when needed for some other definition. For example, you'd like to define a concept specialization only for types that have operator=, but you don't want to define a separate concept that would only detect whether operator= exists for this type. This way, instead of "requires Assignable<T>", you'd say "requires { auto T::operator=(auto); }" – see the sketch below.
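
A sketch of the two forms side by side (Wrapped is a hypothetical concept):

// Today's way: a separate helper concept...
concept<class T> Assignable
{
    auto T::operator=( auto );
};

concept<class T> Wrapped<T>
requires Assignable<T>
{ /* ... */ };

// ...versus the considered on-the-fly requirement:
concept<class T> Wrapped<T>
requires { auto T::operator=( auto ); }
{ /* ... */ };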

I have mixed feelings about this. On the one hand it will allow users to create very bloated code definitions. On the other hand, without this feature users will again be forced to "spell out a name and consistently use it", which always works against users. That's why "auto" is such a great feature of C++11: it doesn't require users to define typedefs and spell out names.

There is one more thing that comes to my mind. Possibly there should be some method that allows one to define a concept for some type, together with some partial specializations, but disallows this definition from being open for further extensions. For example, it would be nice to be able to define an IsSame concept that cannot later be specialized in a way not intended by the original definition, that is, to declare that char is the same as int. Similarly, there should be some standard concept stating that a type is one of the built-in integer types. It should not be possible to extend such a concept later for other types, because this way a user could break the basic statement of the concept's logical definition.

The problem is that a concept together with its specializations can't be defined as one closed entity (like a class is). The only feasible syntax that comes to my mind is that after all specializations have been provided, you say at the end:

concept<> BuiltinInteger final;

or something like that. From this definition on, no additional partial specialization of the concept may be provided.

I know this syntax is awkward, so maybe someone else will have a better proposal. Note that any better proposal requires that partial specializations can alternatively be provided inside the concept definition. If such a syntax can be proposed and well explained, then the concept can be marked final and any external concept specialization will be rejected. The problem is that such an additional thing increases the complexity of the solution – mostly because it creates an additional difference from template partial specializations.

One more thing that came to my mind is that a concept may additionally help during development when an interface is going to change. For example, in current C++11 we can have code that is independent of a function's return type:

auto n = GetFreeSpace(device);

You expect that the return type may change in the future or be different in other code configurations. By using auto you make your code independent of the possible return type. However, you still would like to use this value in some way, and you want to be sure that if your code wouldn't work because some basic assumptions about this return type are violated, it will be detected early. So we can say instead:

Integer<auto> n = GetFreeSpace(device);

This way it's still auto, but it's additionally checked whether this type satisfies the requirements imposed by the Integer concept. Similarly we can impose a requirement like PointerTo<T, P>, which checks whether T is a pointer to P – not only when T is P*, but also unique_ptr<P> or shared_ptr<P>, and whatever other type declares that it is a smart pointer.

14. Summary – can this be easily explained?

Simplicity is one of the most important things for any new feature added to C++. C++ is already a complicated language, and as it can't attract people who just don't accept languages this complicated, it's best to keep new features simple enough.

The explanation of concepts should start with examples of standard concepts, standard entities that use concepts, and how to interpret possible errors reported by the compiler. Then, how to do it yourself. Let's begin with the well-known Comparable:

concept<class T> Comparable
{
    bool operator==(T,T);
};

Objects must be comparable in order to be found. Because of that the find algorithm imposes this requirement on the value type:

template<class Iter>
requires InputIterator<Iter>,
requires Comparable<typename Iter::value_type>
Iter find( Iter from, Iter to, typename Iter::value_type val )
...

If you pass as 'val' something that doesn't have operator==, the compiler will complain that your type does not satisfy the Comparable concept because it doesn't have operator== with the specified signature – not that "bla bla bla (1000 characters of text) your value_type doesn't have operator== with bla bla bla (1000 characters of text) signature".

However, if you have to find an object of SomeClass, which can be compared for equality, but only using some EqualTo method, you can declare this specially for this class, as a partial specialization:

concept<> Comparable<SomeClass>
{
    bool operator==(SomeClass a, SomeClass b) { return a.EqualTo(b); }
};

This means that objects of type SomeClass can always be compared with operator==, as long as the attempt happens inside an entity that requested the Comparable concept to be imposed on this type. Of course, alternatively you can define an external == operator for this type – operator== is in the lucky situation of being definable as a standalone function, which is not possible for all operators, let alone methods. The additional advantage is that you don't have to provide a global definition of ==, so you don't leave garbage in others' code.

You can define your own class or function template, stating that you expect some type to satisfy a concept, just as find is shown above. If you have more than one requirement to impose on one type, however, you must bind them into one logical expression using the 'and', 'or' and 'not' keywords.

By using a concept requirement imposed on a type, you can define more strictly what this type is and distinguish it from other types assigned to template parameters. This way you can also use concepts to make partial specializations of templates:

template<class T>
class Wrapper     // master definition (general)
{ ... };

template<class T>
class Wrapper<T*> // partial specialization for pointer types
{ ... }; 

template<class T>
requires ValueType<T> // partial specialization for ValueType
class Wrapper<T> { ...  };

// or simpler:
template<class T>
class Wrapper<ValueType<T>> { ... };

and overloading:

template<class T>
requires SequenceIterator<T>
pair<T,T> range( T a, T b )
{ return make_pair( a, b ); }

template<class T>
requires Integer<T>
integer_range<T> range( T a, T b )
{ return create_int_seq( a, b, 1 ); }

Satisfying a concept is enough to distinguish a type from another that doesn't satisfy it. However, overload resolution will fail if you pass a type that satisfies both requirements (from the two overloaded functions) simultaneously.

You can define requirements for types (besides things like types or constants inside the type) in your concept in two ways:

  1. Provide a declaration that describes what should be available for your type. In this case the requirement is considered satisfied if the provided feature can be called the same way
  2. Provide a requires {} statement with an expression that should be possible to perform on an object of this type; this can be either a void expression (then it's only required to compile) or a compile-time constant boolean expression (in which case it must also evaluate to true)

Please note that incomplete types cannot be checked for satisfying the concept.

This feature has the following main purposes:

  • support overloading of function templates (for arguments that are of template parameter type)
  • provide better error messages
  • create ability to synthesize code templates (without preprocessor and code generator)

Currently that’s all for starters.

So I hope I have covered everything that is needed and expected from concepts in C++. It’s at least a good starting point.


Sugar Tax

I have been searching for a good "shorthand" for "don't pay for what you don't use" for some time. This term came in handy, especially as it carries lots of meanings in the world.

[Image: Sugar Tax album cover]
The "Sugar Tax" is explained on OMD's home website: The title Sugar Tax refers to everything sweet having a price, particularly in relationships. The actual track Sugar Tax, ironically, doesn't appear on the album, as it was unfinished prior to the album's release. (Well, yes, it was released later, on the "Pandora's Box" single – you can also listen to it on Youtube.)

I named my first "book" (let's call it that) about C++ "C++ without cholesterol", referring to something light, without any heavy burdens. Today, however, armed with somewhat better knowledge (for example, that the "cholesterol free" advertisements were for margarines, which are hardened vegetable fat – that is, instead of fattening cholesterol we get carcinogenic hardeners), I think that focusing on sugar when talking about fattening, penalties, burdens etc. is much more appropriate.

Sugar Tax is an interesting topic in general – some time ago, when searching for this phrase on Youtube, I found a spot from some American TV (it might've been this one), where one of the politicians mentioned that a Sugar Tax was being considered in the US. It would have been a "giant leap for mankind", at least the American part of it, in fighting obesity, but unfortunately it eventually wasn't adopted. You can guess why. Or, if you can't, just watch "Food, Inc.".

Anyway, let’s stop this digression. There’s an interesting meaning of this in the world of programming.

The sugar (or, say, a spoon of honey)

Bjarne Stroustrup knows very well what a Sugar Tax is. This knowledge led to the creation of the C++ language. The reality at the time when Stroustrup was writing his Ph.D. thesis was that you had a choice among various programming languages, which usually fell into the following groups:

  • Low-level languages: usually assembler, or something with horrible syntax not much above the assembly level, operating on the machine level (Fortran, Algol, BCPL, later C)
  • Functional languages: really logical and… not coinciding with the thinking habits of the majority of programmers (until today): Lisp, ML, maybe others – I don’t know which of them existed then
  • High-level imperative languages (Simula, Smalltalk, Eiffel, Ada): usually a very useful tool for a programmer, paid for with lots of patience, and sometimes money

The problem at that time was that low-level languages used fairly simple compilers and mapped easily to assembler, so they were usually easily available – but to express strict logical structures you needed to write a lot, and you’d better comment a lot too, unless you wanted to lose the real meaning of your code just after writing it.

This problem was partially solved by high-level languages. They provided a developer with various high-level facilities and, well… as it’s called today, “syntactic sugar”. However, for this “syntactic sugar” you usually had to pay an “implementation tax”. Compilers of these languages were usually very slow (remember that computers were really slow at that time); moreover, many of them ran under some kind of Virtual Machine (Smalltalk, Simula). Not only did it take a long time to compile anything, but there was usually also a big runtime penalty.

How then did it happen that anyone used them? It went differently in different languages, but the general rule was always the same: computers are getting better, faster and cheaper, so we don’t have to be so strict about performance and size; instead, let’s give the programmer a good and useful tool so that they can finish their task quickly. This rule hasn’t disappeared to this day – moreover, I would even say that it didn’t start yesterday, but much, much earlier. This approach has practically not changed since those times.

For example: the Garbage Collector? GC has been known for some 50 years (it appeared with Lisp, around 1960), and the greatest modernizations of GC algorithms and implementations were done maybe 20 years ago. Speaking about GC as something that “a modern programming language must have” is very, very funny.

The same goes for dynamism. Does anyone think that dynamism in languages (like, for example, Python or Ruby) is something invented in recent years? Dynamism, including runtime self-recompilation, was implemented in Lisp in its early implementations, by 1962. It was later used in many other languages (notably Smalltalk). Dynamism isn’t modern at all – quite the opposite, it’s very old-school. Remember that the most primitive way to implement high-level statements on a machine is to interpret them at runtime. The much more complicated thing is to translate these high-level statements into instructions of the execution machine. And this is exactly what compilers do.

You may think, then, that JIT compilation should be the next level of modernization. Well, JIT compilation also goes back to very old solutions in Lisp, and it was later implemented in various dynamic languages (notably Self), even though today Java and C# are the most widely known for JIT compilation (JIT compilation is also possible for C++ – see LLVM).

If we want to speak about something “modern”, that is, developed and implemented recently, and of course widely accepted, it is first of all static type checking, static analysis, early (pre-runtime) checking. Leaving scripting languages aside, this is what Java, C# and C++ all provide. That’s why Java has been recognized as a good replacement for Smalltalk, although the main difference between these two languages, apart from the syntax, is that Java uses static types.

My intent when creating my book was to show that C++ is, exceptionally, a language that doesn’t fall into either of these two groups – high-level languages and low-level languages – and, in particular, that it doesn’t combine their disadvantages. Integrating only their advantages seemed impossible, so it was partially achieved a different way: putting performance in the first place, as low-level languages do, while adding some advantages found so far only in the “high-level languages”.

The principle “don’t pay for what you don’t use” can be easily shorthanded for C++: this language has a minimal “sugar tax”. Although some may argue that it also doesn’t have much of this “sugar”. This way it practically becomes a new representative of the “low-level languages”, though one requiring a great deal of knowledge to be assimilated in order to become productive.

The tax (or, say, a spoon of tar)

It’s not true that all high-level languages are slow. Many optimizations have been made, usually with the use of JIT compilation, that can increase performance significantly. However, it doesn’t mean that the use of languages like Java or Python comes without penalty.

Although these high-level solutions (not only languages – it’s hard to treat things like QML as programming languages) are researched for the best possible performance, usually there isn’t much you can do. Java is a very good example: you can try to emulate a value type with a class of immutable objects, but that doesn’t change the fact that every object is a potential mutex and provides the things derived from java.lang.Object, like the “wait” method. It means that every object occupies more memory than its functionality justifies; additionally, this language works with GC, which – in general – requires some buffer of unused memory for objects that have temporarily become garbage not yet collected.
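
For contrast, a minimal C++ sketch (the Money type is invented for illustration) of a value type that costs exactly its data – no object header, no monitor, no GC bookkeeping:

    #include <cstdio>

    struct Money { long long cents; };   // a plain value type

    int main()
    {
        // sizeof reports only the payload; there is no hidden per-object header
        std::printf("sizeof(Money) = %zu\n", sizeof(Money));   // typically 8
    }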

The name “sugar” for this is apt in yet another way. For example, I am one of the few lucky people who physically cannot get fat. I can eat an enormous amount of sweet stuff and it will leave no fat in my body. I don’t know what happens to all this sugar I eat, but as it can’t be “used” by my organism, I assume it just dumps it. The result is that when I buy any food, it’s often sweet, because the food companies put lots of sugar into everything – yet I make no use of this sugar, because it’s too much for my organism to use (if my organism didn’t dump it, it would turn it into fat tissue). This is a good example of paying for what I don’t use.

But sugar is sweet, sugar makes food a pleasure to eat, and this factor increases the number of people who like this food – despite the fact that most of them get fat because of it. It’s the same with software: programmers like to use it, and you can find more programmers willing to use such a language – however, the software that comes out of their hands (as one of my colleagues recently said about Firefox) has lots of small fat dispersed throughout the whole body.

The penalty of laziness

Ever heard that Java is a high-level language? What fool told you that?

You want to know what a real high-level language is? Look at languages like Eiffel, Ada or Simula. To some extent, when using some specific features, C++ is also a high-level language (especially with the features of the new standard, otherwise known as C++0x), but compared to Eiffel or Ada, C++ is only a high-level wannabe.

Languages like Java or C#, or even Python, are not high-level languages at all. OK, to some extent Python may be considered a high-level language, but only when putting first some specific features, which are practically unused. These languages are really low-level languages, because they just define a machine with very strict rules and provide a language that lets you operate within the rules of this machine. In other words, these languages comprise just a simple mapping to an execution machine. Tcl is also a very low-level language, although it doesn’t even pretend to be any kind of high-level language.

What do high-level languages have? High-level statements, that is, statements that map to logical terms and are located high above the machine definitions. Does Java have at least one such construct? Well, the Java language has only one thing that does not map directly to the definition of the execution machine: gluing strings with operator + in one instruction is translated into the use of StringBuilder objects. OK, maybe also generics, which exist only in the source code (they are all replaced by the Object class), and the nested and in-place-derived classes, which need some tricks to accomplish (they are later extensions to Java, and that’s the reason). All other things are just a direct translation to the machine. Threads are also used in a strictly technical way. High-level threads should be defined directly in a language, or there should be some high-level construct that matches your task, not technical details to accomplish it. Java has some of them (futures and promises), but the limitations of its interface-definition abilities ensure that the same thing can be done in C with a similar result.

So, what is the “sugar”, that is, the thing that the majority of programmers love most in languages?

The creators of well-featured high-level languages believed that what people want from programming languages is syntactic sugar that supports expressing logical statements. The popularity of Java, not overcome by C#, and the range of language features in practical use in Python, prove that support for high-level logical constructs absolutely isn’t what people want.

The popularity of Java, in which specifying an action to execute may only be done with an in-place-derived class:

  x.addListener( new SomethingListener() { public void onSomething() { /* YOUR ACTION */ } } );

while in C# it is:

  x.addListener( () => { /* YOUR ACTION */ } );

and in C++0x it is:

 x.addListener( [](){ /* YOUR ACTION */ } );

and in which the ‘delegates’ proposed by Microsoft (as in C#) were blocked until the eventual court battle – still doesn’t tell the whole truth. Of course, Java and C programmers are not only Koreans, eager to prove that a Korean runner with tied legs will beat a freely running European (the funniest thing is that many Koreans achieve this!). After all, Java is very restrictive about global variables – you cannot just make a normal global variable in Java (although smart programmers have long overcome this limitation by using the Singleton pattern, lauded as an advanced level of programming). There are also programmers who like it for other reasons.

This is the truth – Java + design patterns is exactly what people want. First, they get a language that doesn’t have “pointers” in the C++ sense (that is, a higher level of addressing) – pointers in this language are still used, but their use is mutually exclusive with variables. Second, they don’t need to worry about object ownership (actually, at best, they haven’t yet been convinced that they should worry, but no matter). And finally, they get a language in which everything can be done just one way – even the program’s design may only be object-oriented.

So, did the features of C# work to the users’ disadvantage? It depends for which users. For users programming in the domain where C# is typically used (and competes with Java), they surely did. It even started with Microsoft’s delegates, which were added to J++, an important predecessor of C#. This just made the language more complicated, so there was more to learn before using it. It means that Sun did a really good job by not allowing Microsoft to add delegates to Java, and by preventing the addition of any other new features.

Would lambdas in Java hurt it in a similar way? Actually, they won’t – but only by good luck. It’s because they won’t influence the existing libraries. In existing libraries (and in Java there are lots of them) you’ll still have to add callback actions using in-place-derived classes (or even not in-place). There’s just no way to make any use of this feature in an existing library that uses the standard Java way, that is, by providing an object with an overridden method. If you are using an existing library, you still have to use the old way. You can create a new library and require that callbacks be passed using lambdas – but in that case no one will use it. The problem is that there is no possible translation or interoperation with the existing solution, that is, the in-place-derived class. You can also allow both ways (by overloading), so your library will maybe be used, but 90% of users will still pass callbacks the old way. Effectively, adding lambdas to Java will be exactly as reasonable as adding list comprehensions to Python was – just a super-duper language feature that even a lame-legged dog won’t use (*a Polish saying, although a bit abused :)).

Adding lambdas to Java is not the same as adding lambdas to C++. In C++, for example, the idea of callbacks (first used by STL) is usually accomplished with function objects – that is, just anything on which operator() can be called. Lambdas use exactly the same mechanism, so you can use lambdas in C++ together with a library written 10 years ago that just relies on objects with operator() (maybe it uses boost::function or std::tr1::function). In Java it won’t work this way, because the only way in Java to pass some “procedure to execute later” is to pass an object that has a method of a predefined name (each library uses its own predefined names, and lambda would just introduce yet another one of its own). Even though a lambda is an object of some class, and executing it would be exposed via some method name, the function receiving the callback object always needs an object with a method of a specific name (moreover, of a specific class, and every library defines its own).
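
A minimal sketch of that C++ interoperation (addListener and Handler are made up for illustration): a callback API written years ago against “anything callable” accepts a lambda without any change:

    #include <functional>
    #include <iostream>

    // An "old-style" library entry point: any callable will do.
    void addListener(std::function<void()> action) { action(); }

    struct Handler   // a classic, pre-lambda function object
    {
        void operator()() const { std::cout << "functor called\n"; }
    };

    int main()
    {
        addListener(Handler());                               // the old way
        addListener([]{ std::cout << "lambda called\n"; });   // lambdas drop right in
    }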

Contrary to appearances, Java has lots of traits in common with the C language. I have already mentioned elsewhere that, for example, both these languages have a “string” type that can be both empty and null (and some libraries, like Qt, sometimes repeat this stupid idea). Similarly, the only way to pass arguments to a function in both these languages is by value. Both are also quite easy to learn, although not equally easy to use. The C language is even a bit superior to Java: it allows type aliases using typedef.

Sugar VAT

For some uses it’s a slightly better term: as mentioned above, when I buy sweet food containing sugar that isn’t absorbed by my organism, I practically pay for what I don’t use. VAT is something a bit different – it’s a tax paid by the “end user”. So you can guess why the use of such languages is so easily tolerated: because the programmers aren’t the users of this software. It would be just an income tax in that case. But when it’s a product for some non-programmer end user – the end user pays this tax, so it’s a VAT. Of course, today Java has a much smaller sugar tax than before, so it became possible to create tools for programmers in this language (Eclipse, NetBeans, IntelliJ IDEA). But it’s not only about Java – also about languages used for web pages, like PHP.

Very specific software is produced with a high Sugar VAT. And it’s not an invention of the present day. Many products for this kind of market have been made in Smalltalk, and Smalltalk has quite a big Sugar Tax. The point is that if this Sugar Tax is paid as VAT in a particular situation, software producers are even very keen to use such languages, because it’s not they who will pay this tax.

Of course, an important factor here is that software produced with these languages has lower production costs. So, for systems that have lots of specific cases to manage and rely on well-known and well-defined middleware layers (in particular, systems that need to provide an interface to data kept in a database), the main factor deciding the choice of language definitely isn’t performance. That’s because the most performance-critical thing in the whole system isn’t the interface, but the database engine (and, in web applications, the network capacity). It practically means that the interface can be programmed not only in Java, but also in Tk, Python, Perl… well, no. Not Perl. Too much security risk.

Unfortunately, it pays off well for users that have lots of money to waste. Many people, though, prefer to know what they pay for, and surely prefer to pay only for things of value to them.

The content of sugar in sugar

JIT compilers can wipe out many performance penalties, as their main purpose is to shorten the path through execution points so as to achieve the same final result with fewer instructions. But there is not much they can do about memory consumption – in fact, practically nothing. Oh, let’s say that at best they are able to rearrange memory access requests to enhance locality, so that the real memory access penalties can be decreased. But if an application uses an enormous amount of memory, better memory localization won’t help much anyway.

The claim that GC may provide better performance than manual memory management, which is the reality today in C and C++, is largely overestimated. It doesn’t really matter which performance penalties of manual memory management are avoided by GC. Yes, GC will allocate memory faster than the standard C++ allocator (because it takes less time to find the correct block). Yes, GC will improve locality, because it is allowed to move an object to a different place in memory, and this way it’s even able to decrease memory fragmentation. But does the speedup in memory allocation compensate for the performance penalty of running the mark & sweep cycle? What would be the result of comparing the size of memory lost to fragmentation against the memory overhead of not-yet-recycled objects? I have never seen any comparison of these things, while the hails for GC and its obvious outperforming of manual memory management are heard very often.

And that’s still not all, because the above is only theory – I haven’t yet started talking about practice. And the practice with GC-based languages is that they definitely make more use of dynamically allocated memory than C++ does (not than C – that language uses more dynamic allocation than C++, which is also one of the reasons why its supposedly better performance than C++ is purely mythical). C++ makes wide use of stack allocation, which is really fast – to allocate memory on the stack you need just one simple assembler instruction. Allocation of dynamic memory carries a big penalty anyway – you need to find an appropriate block of free memory, or carve out a new one, and register this block in the memory allocation table. It’s orders of magnitude slower. Proponents of GC counter that stack allocation leads to worse memory locality (which isn’t quite true, because newer processors get additional memory cache for the stack), but this performance penalty is negligible, given that for small types the creation of objects would outnumber the operations on them. And if someone would like to say that, in the best JIT solutions, allocation and deallocation of one small object is as fast as using the stack for it, there’s one thing that stops me from believing it: IBM’s Java compiler was hailed for a really great optimization that can, in some cases, unwind an object used only inside a method into local variables. Also, there must have been reasons why C# has structs and “stackalloc”. And note that Java and C# are considered the best existing JIT-compiled solutions today.

Additionally, in a language with manual memory management you can change the memory manager for either a specific type or a specific portion of code. It means you can use a specialized allocator, that is, an allocator designed to work with objects of exactly one type (or even just with the same size and base class). Allocation of such an object is still faster than with GC, and deallocation is practically the same. It does share one disadvantage with GC: it must keep a pool of unused memory allocated to the process. But as the name shows, it is specialized, so it’s intended for objects whose number should be stable while they are very often deallocated and allocated again. If the number of objects doesn’t change during the system’s run, this disadvantage is insignificant. Whereas when GC is used for the whole process, non-exclusively, for all objects, the amount of unused memory may be significant. So, GC is a kind of sugar – not because you don’t have to worry about releasing memory, but because you don’t have to select the best-matching allocation algorithm for a particular purpose and data type; you always use the universal one.
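
A minimal sketch of such a specialized allocator – a free-list pool for objects of one fixed size (illustrative, not production code; note how it keeps a pool of memory for the process, exactly as described above):

    #include <cstddef>
    #include <vector>

    class Pool
    {
        struct Node { Node* next; };
        std::vector<void*> blocks_;     // all blocks ever carved out, owned here
        Node* free_ = nullptr;          // free list of returned blocks
        std::size_t size_;
    public:
        explicit Pool(std::size_t objSize)
            : size_(objSize < sizeof(Node) ? sizeof(Node) : objSize) {}

        void* allocate()                // O(1): pop the free list or carve a new block
        {
            if (free_ == nullptr)
            {
                void* p = ::operator new(size_);
                blocks_.push_back(p);
                return p;
            }
            Node* n = free_;
            free_ = n->next;
            return n;
        }

        void deallocate(void* p)        // O(1): push back on the free list, no search
        {
            Node* n = static_cast<Node*>(p);
            n->next = free_;
            free_ = n;
        }

        ~Pool() { for (void* p : blocks_) ::operator delete(p); }
    };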

The simplicity of a language is another kind of sugar, one that needn’t contain much actual sugar. For example, operator overloading is one of the features of C++ (although lots of other languages feature it too – C#, Ada, Eiffel, Smalltalk, Haskell, and probably many others) that is evaluated very badly, as something that decreases the readability and comprehensibility of source code. In Java, when you see “a + b”, you know it can only be adding two numbers or gluing two strings, never some crazy function call for which some crazy programmer added such a crazy interface. The consequence is that if you create your own value type, or such a type is provided by some library, the only way to operate on it is something like this:

Ethereal r = a.at( t.number() + y.number() ).update( z.number() + x.number() );

Which can be encoded in C++ as

Ethereal r = a[t+y]->update( z+x );

Yes, operators are something that may have different meanings depending on what takes part in the instruction (their types, in particular). In some cases (like standalone operators defined in a namespace) it may even depend on context (that is, on whether any “using namespace” was declared in the enclosing block). For some people, like me, a reasonable use of overloaded operators increases the readability of the code. However, for the majority of programmers it’s most important that the general meaning of an instruction always be the same, regardless of what participates in the expression (that is, that the meaning be context-free). In Java, for example, when you look at the following instruction:

   System.out.println( "haha" );

you know that:

  • System and out may be classes or variables; if variables, then fields or local variables – never anything else
  • println is a method of the class designated by out (out may be that class itself, or a reference keeping an object of that class)
  • this instruction designates calling a normal method on the object designated by ‘out’, or a static method of the class designated by System.out

This last point is very important, because there is a big difference between calling a method and, say, calling a constructor: when a constructor is called, the ‘new’ keyword always stands before the calling expression.

The same instruction in C++ need not designate exactly the same thing. It’s even easier with the “System.out.println” part, as in C++ these names could never designate a class name – it must always be an object. However, println may be either a method of the object System.out, or a field of it whose class defines operator().

Similarly, if you write such an instruction in the C language:

x->y->perform( a );

then there are several things you are sure of:

  • x and y are of pointer-to-structure types (so the -> operator simply dereferences the pointers)
  • the “perform” field in *y is a pointer to a function
  • if the function-pointer type of “perform” declares that it takes one argument of type ‘int’, and the compiler did not issue a warning about incompatible types, then ‘a’ is definitely of some integer type (int, short, long, char)
  • conversely, if the type of ‘a’ is int, then the pointer is to a function that takes one argument of an integer type (int, short, long, char) or even a pointer type (although the most recent compilers would warn in this case)
  • the ignored return value – provided the call does not return a pointer to some allocated object that would be leaked this way – has no additional meaning in the program

You can’t be sure of any of that in C++. In C++ the above instruction may have quite different variants (a compilable sketch follows the list):

  • x and y may be smart pointers with an overloaded operator -> (which need not do a simple dereference)
  • perform may be either a method of the class y points to, or a field of y that is either a pointer to a function or an object of a class that overloads the () operator
  • assuming perform is a method, it can be the only such method, taking one argument; or a method with default arguments starting at least from the second one; or one of several overloaded methods with this name (fortunately, “perform” can’t be defined simultaneously as a field of a class with an overloaded operator() taking two arguments and as a method taking one argument, even though overload resolution for this case would still work in theory)
  • assuming that the argument of the call to “perform” (whatever it is) is of type ‘int’, the ‘a’ expression may be of any integer type, or of any class that defines a conversion operator to ‘int’
  • and conversely, even assuming that the type of a is ‘int’, the argument type of “perform” may be int or any other integer type (also char and bool), or any class that has a non-explicit constructor taking one argument of type int
  • and additionally, this call might return a value of some class, which is ignored here; as a temporary object it undergoes immediate destruction, which involves calling the destructor, which may perform some additional action
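
A compilable sketch of one such variant (all the types are invented): x is a “smart pointer” with an overloaded ->, and perform is a field whose class overloads ():

    #include <iostream>

    struct Callable                 // 'perform' will be a field of this class
    {
        void operator()(int v) const { std::cout << "called with " << v << '\n'; }
    };

    struct Inner { Callable perform; };
    struct Outer { Inner* y; };

    struct Ptr                      // a smart pointer to Outer
    {
        Outer* p;
        Outer* operator->() const { return p; }
    };

    int main()
    {
        Inner in;
        Outer out{ &in };
        Ptr x{ &out };
        int a = 42;
        x->y->perform(a);           // reads the same: dereference, dereference, call
    }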

From the perspective of the majority of programmers, the cases described above for C++ are reason enough never to use a language featuring operator overloading – all the more so a language that also features destructors and automatic implicit conversions. C++ has all of these features, and Java has none of them.

But from the perspective of a real professional the matter is much simpler: how do you read this instruction? Of course, you read it as: from the object designated by ‘x’ extract the ‘y’ member, from the object so designated extract the ‘perform’ member, and call this member passing ‘a’ as an argument. Despite all the possible variants of what this statement really means in C++, this explanation will always be the same. Of course, crazy programmers may assign various crazy meanings to the -> or () operators, that’s true. But for a professional it means nothing, because crazy programmers do not take part in software production – if only because they are very easy to detect, and they are given the choice of either stopping the crazy things or leaving the team and the organization. Anyway, the number of possible real things done by this instruction doesn’t really matter to a professional C++ programmer. What’s significant is its logical meaning. For example, if we see the -> operator, it means that some member of something designated by x is dereferenced. It really doesn’t matter whether -> is an overloaded operator that maybe does complicated things behind the scenes. It still means the same: dereference. Same with (): it just performs a call. It doesn’t matter whether it’s a function, a method, a pointer to a function, or an object with an overloaded () operator. If () is used, it means “call”, no matter what is being executed.

The straightforward difference is that in the case of Java, the language guarantees that the statement you are reading will always have the same meaning. In any language featuring operator overloading, and especially in C++, the language doesn’t guarantee much – only that there exists a valid definition of an operator, whether built-in or user-defined, that makes this statement valid. In specific cases it may mean that despite having learned the language, you still have to learn again – this time the rules of some particular library. Of course, I’m explaining the general feeling of the majority of people – publicly available libraries make really very little use of operator overloading; usually they overload the () operator for function calls and [] for some “indexing”. The only case I know of really excessive use of operator overloading is Boost.Spirit.

Does that mean that “simple” languages aren’t used in software production organizations? Definitely not, of course. But they are definitely used in very specific organizations: either in “headless” organizations (like bazaar-style headless programming), or in commercial organizations where the performance of the software doesn’t matter, so they don’t care about employing programmers who can be trusted (not to be crazy programmers). This is one of the reasons why Java and PHP rule in software that generally relies on storage in a database and a presentation layer on the web, and also why Java and C are the most popular languages in open-source software. It’s usually cheaper for an organization to employ people who don’t have to be tested for trustworthiness – you can easily let them make software and be sure it won’t run crazy, just by giving them a tool with strict limitations.

And this is the sugar in the case of these languages. And yes, it has to be paid for – by the end user.

But this solution can’t be used in an organization that has to produce software with strict size and performance requirements. Even granting that computers are getting faster and better and have more and more memory. Because maybe computers do, but the requirements for software features increase, too.

The sweetness of sugar

Although C++ has become one of the most popular languages in, say, non-web software, it has lots of opponents, who seem to be more visible than the people who like it. People who value performance and want full control over the program usually say that in C++ you don’t have full control over every part of the software (although I have always said that whenever such a case happens in C++ and can be countered by a similar case in C where you do have full control, it’s only a result of people’s indolence). People who value high-level languages say that the sugar in C++ isn’t sweet enough – that the features C++ supports are too little “logical”.

The real reason is that practically no language “supports real logic”. As I have already pointed out in another article, logic is fuzzy, and thus impossible to define strictly. It follows that there can’t exist a language that better supports “the logic”, or even some specific logic. There can only exist languages that contain high-level statements supporting many various logical ways of programming. The uniqueness of C++ is that it has lots of abilities to create various kinds of APIs, which means it’s easier than in any other language to create an API that lets the user express their logical statements in the terms needed by the module. But of course, it probably doesn’t support the logical structure of any other, say, object-oriented language because, simply, it isn’t just an “object-oriented language” (it’s a multi-paradigm language). No wonder, then, that people who were using other high-level languages will never get used to C++. But people interested in doing software engineering should never listen to what they have to say about C++.

The complexity of C++, on the other hand, may be an advantage for one type of programmer (those who desire the ability to describe logical statements clearly) and simultaneously a disadvantage for another (those who want the language to have simple rules). You cannot make a language that fits both. Another question, though, is whether both types of programmers are equally suitable for software production.

The matter of taste

Well, some languages are sweet because of sugar, others because of an artificial sweetener. For me, an artificial sweetener was just a piece of, say, chemistry meant to fool my sense of taste – well, it never did. Artificial sweeteners have always been something fake to me; they never tasted even remotely like sugar. When I once made the mistake of buying a “cola light”, I quickly learned the names of acesulfame, aspartame and saccharin, and learned to read the list of ingredients before buying.

Maybe there are some people who can’t tell an artificial sweetener from sugar. In the same way, there are people who cannot distinguish between features that make a language useful and features that make a language easier to learn.

Don’t get me wrong – I’m not saying that C++ should be used in every part of software production. I understand that C++ may lack appropriate libraries (no one has ever thought of using C++ in a given domain), or that using shared ownership with no defined place of object deletion can speed up making software. I’m not trying to complain that so many programmers are dumb because they prefer Java over C++. In practice, Java is not a competitor to C++, the same as it isn’t a competitor to Python. I would just like to point out that there is a strict connection between the greater popularity of Java over C# and that of C over C++. I’m not sure, but it also seems to me that this is connected with the shift from Smalltalk to Java.

Remember that many years ago a programmer had to be a really smart guy. Such a programmer had to learn very illogical-looking rules and solve tough problems. But as the requirements of software increase and a greater number of programmers is needed, the number of really smart programmers cannot be increased as easily as the number of all programmers, including those who cannot understand pointers (Joel Spolsky says that’s about 3/4 of all IT students). That’s why today there are lots of people who need not be smart guys; they just have to write software the best way they can.

So, I’m not saying that I have something against languages like Java or C#, despite their amount of sugar tax. I just think that it makes more sense to use C++ than C, the same as there’s more sense in using C# than Java (though probably not from the perspective of some kinds of software business). For the same reason, there’s more sense in eating things that are either not sweet at all or at most naturally sweet (fruits) than candies or carbonated beverages stuffed with aspartame. But life can only teach us that there will always be these two kinds of people: those who care and those who don’t. There will always be an economic way to make money from both of them.

The fattened and the jogging

Although I’m starting to think that if such a thing happens to the programming world, it means that programming has simply become too little of a challenge. What does that mean? Well, something like what a soldier faces when the war is over. On the other hand, if one is well paid for a small challenge, will they be paid more for a big one?

There is one thing that remains challenging in software development. Lately, the speed of processors has been approaching physical limits – something like the speed of light. They just can’t get any faster. However, they may get cheaper. It means that in the near future you can’t count on a faster processor, but you can count on more cores – even 100 or more. In such an environment, programming must be different – you no longer define a sequence of instructions; instead, you define several instructions with dependencies between them, something like what the “make” tool needs in a Makefile (a toy sketch follows below). For such an environment, none of today’s most popular languages is a good fit. Thread libraries are ridiculous when you are encouraged to create 10 threads in a simple program. Even futures and promises can’t help you much if they have to comprise 90% of your instructions.
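
A toy sketch of what such dependency-driven code might look like today, with the futures mentioned above pressed into the role of Makefile rules (which also shows how quickly they start to dominate the code):

    #include <future>
    #include <iostream>

    int main()
    {
        // Two independent "rules" that may run on different cores:
        std::future<int> a = std::async(std::launch::async, []{ return 2; });
        std::future<int> b = std::async(std::launch::async, []{ return 3; });

        // A "rule" that depends on both, like a Makefile target on its prerequisites:
        std::future<int> c = std::async(std::launch::async,
            [&a, &b]{ return a.get() + b.get(); });

        std::cout << c.get() << '\n';   // prints 5
    }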

Will there be any economic justification for such software? Well, it’s at least as probable as the energy sources getting exhausted, some Armageddon happening, or simply the world collapsing under the burden of permanent crisis. For today the best rule would be: don’t try to look into the future, because it makes no promises.


Mind the gap…


MIND THE GAP BETWEEN THE TRAIN AND THE PLATFORM. Well known to people living in London.

In the case of programming there is also one important gap that you should be aware of.

Can I write programs, daddy?

Programmers are, at the beginning, taught methods of programming that are at first not very invasive. Anyway, that’s one of the problems of trying to learn programming techniques starting with low-level languages. This is not usually, but always, a very bad idea. That’s why it’s usually a bad idea to start learning with the C language.

Although I haven’t yet found a programming language that could serve as a first language, just to teach programming. From my point of view, the first language should be imperative (because it’s easier to imagine that a computer just executes commands), well structured (with clean syntax), expression-based (because the majority of languages used in the future will be), shouldn’t contain any weird syntax features, and shouldn’t have any nasty tricks the programmer needs to remember. These rules disqualify the following languages, in order: ML clones, Pascal, Tcl, Python and Perl. Languages like C, C++, C#, Java (and other similar ones) drop out of the candidate pool even earlier.

Probably the only suitable language would be Visual Basic. For some time now I have thought that BASIC, or some mutation of it (featuring structural statements and procedures with local variables – that is, everything considered basic in every imperative programming language today), is the only language that may suit learning the basics of programming. But even BASIC contains one very dangerous feature, the GOTO instruction.

Of course, especially for an “educational” language, we could make a special version castrated of this instruction (there are also other languages that do not have “GOTO”, for example Java and Tcl). But such a language should at least contain enough statements that the use of this instruction is never necessary.

If we don’t have such a language, we should use some other one, used in the “real world”, but the important thing is to teach programmers rules that strongly limit the use of the language’s features. Because at first they would really hurt themselves.

I’m growing!

It’s very good if you try to experiment. That’s the best way to learn a language, and programming in general. It gets worse, though, if you take this experimenting too far and forget that you’re drowning in the pleasure of experimenting, getting farther and farther from common sense.

The problem is that this “getting too far” starts earlier than most people think. For example, many people think that overloading operators is a feature that should be prohibited by an organization’s rule set, because allowing it opens a field for doing crazy things. I have never found this to be a problem in practice. In practice, it’s C programmers, or “they want me to write C++ but I hate it” programmers, who are the source of the majority of problems with obfuscated code in C++ projects.

Experimenting should make you wiser. It shall not focus only on testing language capabilities, but also on checking whether a particular kind of language feature, especially used in some particular way, can be used in real programming. That is, for example, whether you’ll still understand what’s going on when you next look into this code, and whether adding new things to the code will be easier.

The gap

You can compare experimenting to a toy. You can have a toy train and play with it. When you pack some toy people into the train’s car, it’s not a big problem if one of them gets a leg stuck in the gap between the train and the platform. If you repeat the same with a real man, a real train and a real platform, you’ll quickly find out that falling into the real gap really hurts. So, the toy can be used to prevent unnecessary harm, but the fact that it doesn’t hurt may mean that the prospective programmer won’t learn anything.

While experimenting with programming you can find lots of various interesting solutions that can be useful… but need not be. With various weird solutions you should take special care.

Operator overloading, for example, is one of the places where you should take special care. I even think this technique is so rarely used in real programming because people worry whether a given solution is designed well enough. The problem isn’t the operators, but whether the API designed with the use of operators will be a comfortable API. If you look at it from this perspective, you quickly find out that it’s not operators that cause problems – the problem is designing the API well. There are lots of APIs in many libraries that are very badly designed – and they don’t need operators to be badly designed; it’s enough that they use functions and variables in a stupid way.

What can be a cause of incorrectly overloaded operators? Let’s take the signal-slot mechanism (called ‘delegates’ and ‘events’) in C#. The “+=” operator is used there to connect a signal and the “-=” operator to disconnect it (specifying the slot in both cases). The problem is that you practically cannot express the “connect” operation with any operator – this “+=” is only wishful thinking. And even though “+=” is fine for adding numbers, for appending a string to another string, even for adding new elements to a vector – “connect” is something that has no logical interpretation that could be stretched to the “+=” symbol! This idea was definitely stupid, and even the creators of the Vala language, which was strongly based on C#, realized that. Though at first they repeated the “+=” syntax for connecting signals, they later changed it to “connect” (as a method call) and deprecated the “+=/-=” API.

Part of the problem is that the “x += a” expression says “append ‘a’ to ‘x’”. Connecting a signal to a slot isn’t anything even close. Even if there is some appending, it’s only an implementation detail, so it shouldn’t be exposed at all. Connection is a kind of registration – this registration causes a new “node” to be created, and we get a reference to this node. This reference can later be used to disconnect this slot from the signal.

And here is another problem, with the “-=” operator. To make this API possible, the slot must be searched for, compared against every element among the slots connected to the given signal, and only once found can it be disconnected. This operation may also (potentially) end with a search failure, and some way to handle that has to be provided. Whereas it’s much easier and faster to just keep the reference to the node where the particular slot is stored in the signal’s data, and remove that node. But in this case we need something that is the result of the “connect” command, which we can save and later pass as an argument to “disconnect”.
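
A minimal sketch of that design, in C++ for brevity (in the spirit of Boost.Signals2, though all the names here are invented): connect() returns the node, so disconnect() is an O(1) erase with no search and no failure case:

    #include <functional>
    #include <iostream>
    #include <list>

    class Signal
    {
        std::list<std::function<void()>> slots_;
    public:
        using Connection = std::list<std::function<void()>>::iterator;

        Connection connect(std::function<void()> slot)
        {
            return slots_.insert(slots_.end(), std::move(slot));  // returns the node
        }
        void disconnect(Connection c) { slots_.erase(c); }        // O(1), no search
        void emit() const { for (const auto& s : slots_) s(); }
    };

    int main()
    {
        Signal sig;
        Signal::Connection c = sig.connect([]{ std::cout << "hello\n"; });
        sig.emit();          // prints: hello
        sig.disconnect(c);   // uses the saved node reference
        sig.emit();          // prints nothing
    }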

This is one method. Another method is to make the “slot” not a single function (or binder), but a real “slot object”, which also contains additional data, among others a list of the signals this slot is connected to. As disconnecting a single connection is rather a rare case, and disconnecting usually happens when the receiver object is about to be destroyed, an alternative implementation is to store both the signal data in the slot and the slot data in the signal. This way a slot may simply disconnect itself from everything it is connected to, and there’s no need to save the node reference anywhere (this requires the notion of a destructor in a language, so it can’t be done this way in C#).

The problem with the += and -= operators used for signals, then, is not a problem of using operators where functions should be used, but of an inappropriately designed API. An opposite example: the “<<” operator used in C++’s iostreams is, although far from bit shifting, both correctly suggestive and well matched to the needs of the API design.

But operator overloading is not the only thing that may look dangerous – in fact, it’s relatively safe compared to, for example, the GOTO instruction, which I already mentioned regarding BASIC. There’s always someone who would like to use it the following way:

a:
while ( some condition )
{
    if ( something )
        goto b;
    ...
}

b:
while ( some condition )
{
    if ( something )
        goto a;
    ...
}

This stupid thing is even possible in Java, which is theoretically devoid of the dangerous “goto” (its labeled break and continue can be abused to a similar effect).

It’s much the same with the so-called “Singletons” used in Java. As I have already mentioned elsewhere, a Singleton in Java is nothing but a global variable. Usually, when you don’t know how to organize data sharing, the solution is simple – create a global variable. Theoretically, a global variable is just unlimited access to some shared data. In practice, a global variable is the best way to hide any true intentions behind the written code and to contribute more bugs to the software, through incomplete control over what’s going on in the program.
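
A sketch of that equivalence (in C++ for brevity; the names are invented): the “Singleton” accessor and the plain global expose exactly the same thing – one mutable object reachable from everywhere:

    struct Config { int verbosity = 0; };

    Config g_config;               // a plain global variable

    Config& configInstance()       // the same thing wearing a "Singleton" suit
    {
        static Config instance;   // one shared, globally reachable, mutable object
        return instance;
    }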

A programmer must first hurt themselves to gain the experience that this is a really bad way of programming.

A car that drives the driver

It’s often said that when learning to drive, the student should make sure that they drive the car, not that the car drives them. This problem seems to happen to programmers, too.

In particular, programs seem to live their own life, over which the programmer has little influence. It sounds silly, but how else can we describe a situation when the program goes out of control? Of course, nothing bad can happen while the program isn’t running, but programs are not written to be printed on paper and hung on a wall – they are written to be run and to work.

A programmer who is experimenting with new, wonderful methods of programming may forget to stop when getting close to the edge of the platform, and forget to check whether the train is there.

A good way to verify the real results of your programming is simply to stop developing some code at some point, leave it for, say, one month, then return to it and try to continue the development. Even one month is enough to forget what was on your mind when you wrote it. If the time is longer, and in the meantime you did some other development work, returning to this code may result in something like: Well, who the hell wrote this? How drunk was I when writing it? Why is the code written as if the author wanted to make the reader’s life harder?

It’s worth doing this kind of test before you approach real programming. You can also have someone else participate (or even exchange the code you’ve written). This will also help you realize, when you later see some idiotic code, that this idiot might have been you. An experience like that may help you see “the gap”.

Call me nobody

Nobody is perfect, of course, but programming, like any other advanced engineering domain, is “not for children”. Falling into “the gap” may harm not only those who write crazy code, but also the people in the company where the code is developed. Minding the gap is sometimes more important than finding the right solution. Creativity is like the speed you can achieve running through the underground station – but remember, there’s no barrier at the edge of the platform, where the train sometimes stands. And the train doesn’t stick perfectly to the platform. By running too fast you can easily fall in.

Generally, I would say that language rules, good practices, various constraints etc. are not created to replace thinking, but at most to decrease the role of thinking in programming reliability. In other words, these rules exist in order to minimize the size of the gap. But the gap cannot be of zero size. I think it’s obvious that the gap may be zero only when the train is strictly bound to the platform. But it’s rather unlikely that such a train could go anywhere.

The best gap

A simple way of thinking about various engineering domains is that there are some perfect conditions, which are only theoretically defined and rather cannot be achieved. Take the definition of a perfect gas in physics. Such a gas does not exist, because a gas consists of particles, like everything else, and these particles would have to be of zero size for the gas to be perfect. This is simply physically unachievable; yet we usually think about perfection as something we always want.

In programming, and in many other engineering domains, there are lots of cases where we can define the perfect case, but, well, perfection isn’t what we want. One such case: in the perfect situation, communication between two threads is decreased to none, because only then can every thread work freely on its own task without being blocked by anything else. But this is a perfection we don’t want to achieve, because if we did, there would be no need for these two threads to be in one process, as there is no communication between them. This is, again, a gap that we want to be as small as possible – but if there’s no gap, the train becomes one with the platform.

In practice there are lots of similar cases – which isn’t surprising, as in every engineering domain there’s always a battle of various factors, and finding the balance is one of the most important things engineers do. The role of an engineer isn’t to find the biggest gap (because the passengers wouldn’t be able to enter the train) nor the smallest one (because the train wouldn’t be able to move along the platform). The role of an engineer is to find the best gap.

There’s no point in explaining what the best gap is. If you are an engineer, you know it well enough. Just remember, once you’re done, that there are always people who will put a leg into even the best gap. That’s another factor that can mess things up… 🙂


Valuables and objectives

0. Introduction

Whatever method you try to use to describe the world of data, it can never be just one kind of being, treated one single way. Some have tried. They all failed, of course.

Audacious. People think that since they create programming languages as they wish, they can make the language do and look exactly as they like; if there’s anything they don’t like, or find harder to understand, they will replace it with something else. As usual, it results in the “short quilt” effect, which I have already described in my previous article.

No division into objects and values? Sure! Let there be only objects! Wonderful. Now let’s fight the problems of “comparing by identity” vs. “comparing by contents”, the terms “shallow copy” and “deep copy”, etc. – just in order not to have “values” in the language.

One of the main problems, especially in an imperative language (or even a partially imperative one), is how to treat changes done through a variable. That’s why this problem is partially solved by using immutable objects to represent values. Of course, this doesn’t fix the problem as a whole, because this rule cannot be imposed on all value types (especially user-defined ones).

The only way not to introduce distractions into a language is to let the user explicitly define whether a type is to be a value type or an object type. Many languages, though, use value types but do not allow the user to create their own. Also, many programmers (especially those who like these wounded languages) question the existence of values and objects. Let’s try to decompose them all.

1. Welcome to the real world

Do values and objects, then, exist in the real world? Are they only a human invention, or something observed?

To all appearances, it’s very simple. Objects describe things that are material; values describe things that are immaterial. In consequence, an object is something identifiable that may have specimens, while values rest on pure definitions and may only “hang in the air”.

Now it’s time for a serious question – just coming from the OO theory.

Can “green” be one of a car’s attributes?

You’d say that of course it can. Not so fast. Some people would say that only “a color can be a car’s attribute”.

WTF? A car can have a “color” attribute? Right – so a car is color; probably it’s also speed, power and comfort. LOL.

A car can be green, fast, strong and comfortable. Or it can be red, slow, frail and cumbersome. We can at most say that a car has an attribute of type color (which is green), and attributes of type speed, power and comfort, respectively. So, a color can be at most one of the possible types of a car’s attributes.

So: the car is green; in other words, the car’s attribute of type color is green, so one of the car’s attributes is green (not “color”). Color is then an attribute type, not an attribute (a sketch of this distinction in code follows).
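
The distinction maps directly onto types and values in code – a minimal sketch (Car and Color are, of course, invented):

    enum class Color { Red, Green, Yellow };   // Color is the attribute's TYPE

    struct Car
    {
        Color color;                           // the attribute slot has type Color...
    };

    Car car{ Color::Green };                   // ...but the attribute the car has is green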

So, if color is not an attribute, is yellow an attribute? Well, not exactly; we’d rather say that “yellow can be an attribute” – but now we’re closer to the real world.

And now: is color something material? Physicists would say that color is just a light wave of some length, but – unfortunately – this isn’t true. Color is not a wave, nor the wave’s length. Color is a specific signal created in the human brain in response to receiving a wave of some length (in the same way, luminance is a signal created in response to the wave’s amplitude – it’s not the same category as color, but it’s close; luminance can be relative, depending on the current sensitivity of the eye). Anyway, the term “color” is our shortcut – not a description of what an object possesses, but only something that can be a visible trait of an object. It can be the same as an attribute of the same type in another object, or different. But it’s just “something we can say about” this object.

But it’s much the same with numbers, whether ordinal numbers or counts (a curiosity: the Japanese and Korean languages have separate sets of number names for ordinal numbers and for counting; the first set is usually derived from some Chinese language). A number can also be a trait, and it’s also immaterial.

These “immaterials” (I doubt I may say “immaterial things”) are called “values”. Objects, on the other hand, are things that are material and may have attributes, which are these values.

This is how things are in the real world.

The important thing about objects and values is not exactly that they are material or immaterial (I highlighted that just so you get the point), but rather that objects may have specimens (or copies, or *-plicates – sorry, English is not my first language :)). I still don’t think I’ve used the correct word – a duplicate or a copy suggests that two objects are the same, while they may, but need not, be. One object can be a copy of another and can then be modified, so it’s no longer the same as the source object. But even if we can say that the contents of one object are the same as the contents of another, they do not comprise the same object.

2. Creating worlds

In the software world objects are not material anyway, of course. But it’s this nature of existing in specimens that mainly distinguishes objects from values. What makes us conscious of this trait is that objects may be referred to, which means that we may have two references; if two references refer to the same object, it’s the same object. If not, they refer to two separate objects (not different objects, because these objects may still look the same!). By having references we can thus identify objects.

Please note that in the real world objects do not have any references identifying them by default. We may make up some identifier and assign it to particular objects. That’s one of the reasons why (until today!) relational databases allow objects to have a special auto-increment field, which is treated as an identifier for the object (the “primary key”). But we still may have no identifier, and we can have multiple objects with equal contents.

In programming languages, fortunately, objects have unique identifiers by default, so you may treat this as an attribute of every object – that is, every object has an attribute of type “reference”. This reference identifies the object uniquely; if you have two references, they refer to the same object if they are equal, and to two separate objects otherwise. A reference is then nothing more than a property with one special trait: every object has it by default, and its value is unique for every object.

Also, values cannot have traits – they are traits. A value is always one single thing; you can name it, you can identify the trait itself, but a value can only be equal to another value of the same type – values can be different, but not separate. Values may (but need not) belong to some domain with a finite number of possible values – in practice they are rarely finite, if we think about values in the real world. Numbers are infinite, as you know. You may think that colors are finite, but if you think that color values can be finite, you are probably a male who has never discussed colors with a woman. Nonetheless, we can say about values that they are either fuzzy (like floating-point numbers) or finite (like enumerations), but they are still, more or less approximately, contained in some already existing (at least in theory) set of values. Objects are not. Even though the number of existing objects of the same type may be controlled in some special way (keeping the number of objects constant), usually we can create and destroy objects, so their number is usually dynamic. You might ask whether objects whose number is constant can be values. Well, no, although their references can be values of some specific type, as they are all taken from an already known set.

There have been many attempts to make some “common” form of values and objects. The results are usually very sorry.

Theoretically, you may think about a value as about an object with one trait; this way the object symbolizes this trait.

Very good, but now try to compare them. Can you?

How does Smalltalk do it? Well, in a very simple way – by employing pointers in more roles than they are naturally predestined to. In particular, a pointer has some least significant bits reserved for special purposes. These bits define whether it’s a pointer or an integer, while the real value is recorded in the remaining bits. From the language’s point of view, then, there are objects (!) of class Integer, which really exist (!) and which can be compared for identity (that is, the “references” to integer numbers are the integer numbers themselves, shifted by some bits and with some special bits set, which are never set for references to objects). This is a smart approach: effectively, when you, say, add two Integers, you get a new Integer – although such an “Integer” is never created anew; it rather looks as if a matching object were being looked up and found. You cannot create objects of this type, but it looks as if for every existing number there is an object in the system that represents it.
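
Here’s a rough C++ sketch of this tagging idea (my own illustration of the scheme, not Smalltalk’s actual memory layout; I assume a 1-bit tag in the lowest bit, which works because real object pointers are aligned):

    #include <cassert>
    #include <cstdint>

    // Sketch of the Smalltalk-style tagging idea (not the real layout):
    // the lowest bit of a word marks "this is an integer, not a pointer".
    // Real object pointers are aligned, so their lowest bit is always zero.
    using Oop = std::intptr_t;                       // "ordinary object pointer"

    Oop make_int(std::intptr_t v)  { return (v << 1) | 1; }  // shift in the tag bit
    bool is_int(Oop o)             { return (o & 1) != 0; }
    std::intptr_t int_value(Oop o) { return o >> 1; }

    int main() {
        Oop a = make_int(42), b = make_int(42);
        assert(is_int(a));
        assert(a == b);             // identity comparison works: equal integers
                                    // are represented by the very same "reference"
        assert(int_value(b) == 42);
    }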

Yes, it works. For integer numbers. It doesn’t work for any other type of value – for example, floating-point numbers or strings. These, like other value types, are just compared by contents (using the overloaded = operator). In addition, Smalltalk does not use variables, but rather named binders to objects’ references. All these things, together with the garbage collector, allow objects to “emulate” values. In order for this emulation to work, though, the usual variable semantics are not allowed (no modifications in place). For objects that are allowed to be modified in place (for example, collections to which you can add objects) the result is the “shallow copy” and “deep copy” semantics (so this emulation isn’t perfect either).

But anyway, Smalltalk did it best. Many other languages do it only worse. You always need to remember that your type is a value type and use the correct comparison operation (== vs. .equals() in Java, or == vs. equal? in Ruby).
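
The same trap, spelled out in C++ terms (where, conveniently, == on std::string compares by value, and comparing addresses is the closest analogue of an identity comparison):

    #include <cassert>
    #include <string>

    int main() {
        std::string a = "green";
        std::string b = "green";
        assert(a == b);    // value comparison: equal contents
        assert(&a != &b);  // identity comparison: two separate objects
        // In Java the roles are reversed for String: == compares references,
        // and you must remember to call .equals() for the value comparison.
    }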

3. Variance

Things get complicated once we consider the existence of variables.

A variable is something that exists in programming languages (not all of them) and – beware – can be assigned a value. Not an object. A value. This is generally agreed even in languages in which Integer is an object. At least one thing must exist in every programming language: a value of type “reference to object” (and as it’s equally simple, integers belong here, too). This is the only case where values are recognized in Smalltalk.

But variance is not only the statement that a value can be set anew. Variance also means that the value can be changed in place. Not every type of value predestines variables of this type to be changeable in place (reference doesn’t), but there are many types with this property – integer and string belong to them.

Of course, theoretically all such cases can be explained as creating a new value and assigning it to the variable. Practically, operators like ++ or += have been created mainly for convenience; they can theoretically be expanded logically into creating a new value and setting it to the variable anew. Also (in the case of integers, of course) the compiler can optimize them enough that the “expanded” version (a = a + 2) compiles the same as the “compressed” version (a += 2). However, this is still thought of as a “change in place”, even if it’s not implemented this way. Note, however, that the ability to change in place (as an implementation detail) is maybe not needed in the case of integers, but it is indeed needed for values of complicated types – such as a string (or a container of integers). How much effort does it take to add a new item to a list? Not much. But a lot, if it can only be done by creating a new list consisting of the old elements and one new element (unless it’s a language builtin list with dedicated strong optimizations).
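
The cost difference is easy to demonstrate in C++ (a sketch; std::vector stands in here for any list-like container):

    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};

        v.push_back(4);        // change in place: amortized constant time

        // The "pure value" alternative: build a whole new container that
        // consists of the old elements plus one new one - linear time, every time.
        std::vector<int> w(v);
        w.push_back(5);
    }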

A language that does not allow a user to create new value types must at least have builtin management of the most basic value types, such as containers and strings (a container is a meta-type, not a type itself, so it may be a value type only if its element type is a value type – although it is, in most cases). Otherwise we have a mess and compromises. This happens, for example, with Java, which has one funny trait in common with the C language (a string that may be both empty and null). In Java, when you assign the value of one string variable to another string variable, you just copy the reference. To prevent uncontrollable changes, such a string cannot be modified (see above), nor do you have to worry about object deletion (gc); however, the only way to glue strings together is to use the + operator (builtin for java.lang.String), which just creates a new String object using two other strings as a source. This is, of course, very inefficient, so another type is provided: StringBuffer. As this can still be inefficient – being a universal-purpose type it has additional mutex protection – there is one more string type, StringBuilder, which has no mutex protection. Anyway, the StringB* types are exactly modifiable strings, which means that if you assign one StringB* variable to another and modify the object through the first, the change will be visible through the second. But yes, of course, these string types are just simple tools to optimize string operations. And if this still weren’t messy enough, the latest Java compilers have an optimization (!) that detects subsequent string concatenations done by operator + and implements them using StringB* objects (you can check how it is done by decompiling the Java bytecode back to sources using, for example, jad).

Well, and this is the language that is “much easier than C++” – which has just one type dedicated to strings, std::string. And even if some library creates its own, it usually works the same way. As long as interaction with some legacy C code isn’t required, of course.
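
For comparison, the same dilemma expressed with std::string: the + operator builds new string values, while += changes the variable in place – no separate “builder” type required:

    #include <string>

    int main() {
        std::string s = "Hello";

        // Each + builds a brand new string value (what Java's String + does):
        std::string t = s + ", " + "world";

        // In-place append on the variable (the role of Java's StringBuilder),
        // yet s remains an ordinary value - no aliasing surprises:
        s += ", world";
    }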

The situation in C# is almost the same as in Java – however, it has at least one problem less: comparing strings with the == operator compares string values, not their references. The rest of the problems remain, although they can be solved in a similar way as in Smalltalk – that is, it’s enough to forbid changes in place and overload the == operator, and you have a value type.

It would be much better for strings in these languages if they were a totally builtin value type, just as, for example, double is. Why wasn’t it done this way? There can be several explanations. First, because Smalltalk does it with objects. Sure, but Smalltalk uses objects for integers and floats, too. Another: because this is how it is done in C (you should use dynamically allocated memory to manage strings, if you want to operate on them in any advanced way). And this is probably the only sensible explanation, because in every other language older than Java that follows value semantics at least for builtin types, string is a value.

It’s really not a problem to add language features that make a string both modifiable in place and a value. But we know, of course, that the real reason is that people think a string should be an object, and that comparing two strings is similar to comparing two cars. Well, let’s consider why.

4. True nature

What’s the true nature of particular data types that predestines them to be object types or value types? We can state many things about it, like:

  • value types should be easily copyable (by “easy” I mean it does not require any complicated and effort-consuming operations)
  • value types should be easily comparable
  • variables of value types should not have significant identity

But, unfortunately, all these things are secondary. They are all language-dependent, and no matter how justified these requirements are, a language is not forced to follow them, especially a language that suppresses the existence of value types. So the reason why it’s natural for some data types to be value or object types should be sought elsewhere.

You can try to compare it with things in the surrounding world, but the problem is that in the software world nothing is material anyway. A database record, a book title and a color are all exactly the same “material”.

You can talk about changeability – but then Smalltalkers and Javers will quickly kill you with the statement that it’s enough that the class does not allow altering an object. So changeability isn’t the reason either.

The main problem is that values always exist. In every programming language, no matter how strongly the language fights them. Even in Smalltalk there is one value type – the reference to an object. A value type is, in effect, everything that could be the type of a variable.

But this doesn’t explain why, for example, string is not an object type, but a value type.

It’s not easy to explain, but I’ll try.

Even if a string is for you only an array of characters: if you copy this array to another array, then for every possible function ‘f’ that was called with the first array, replacing the first array with the second one in that call should not change the results of calling this function in any way.

Of course, you may say that the same holds for large objects, as long as the function does not write to them (ok, let’s even say: objects of a class that does not allow altering its objects). So I have missed something important.

The value should be passed by value, because if you have a reference, you have an object. So I don’t even make the restriction that the function can’t write to this array of characters. It means that you can pass this array of characters to function ‘f’ by value, which means that function ‘f’ can modify the variable in which it keeps this string, while the string in the original variable is not modified. Alternatively, of course, you can make the receiving variable constant, and in this case it’s not important whether you pass by value or by reference.

But why should the string then be copied by value – that is, why should another object be created, wasting so much space… No. Stop. I haven’t said that you should allocate a new piece of memory and copy the characters. I just said that you should copy a value. Just as C++ does: if you pass std::string by value, the receiving variable can be modified inside the function, while this modification won’t be seen outside the function. It does not always have to mean that a new array is created and copied from the original.
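
A minimal C++ illustration of exactly this point – the function freely modifies its own copy while the caller’s variable stays intact:

    #include <cassert>
    #include <string>

    // Receives the string BY VALUE: 'local' is the function's own variable.
    std::string shout(std::string local) {
        local += "!!!";            // modifies only the local copy
        return local;
    }

    int main() {
        std::string s = "stop";
        std::string loud = shout(s);
        assert(s == "stop");       // the original value is untouched
        assert(loud == "stop!!!");
    }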

You rather wouldn’t do this with large objects. Not only because such copying could be expensive. Also because copying objects should only be done as part of “cloning”, and not every object should be cloneable. And also because you wouldn’t pass a whole object to a function just to pass some data to it – you’d rather pass some of the object’s contents, or pass the object by reference so that the function can retrieve these values by itself (that is, the function will copy the contents of the object that it needs). Also because you may do it many times, even while the object is being altered.

Also, a string’s characters compose one single term; they are not independent parts – they all comprise one single entity, even though you might extract parts of it (not every language lets you extract single characters; in many of them you can at most slice the string – this happens in some early BASIC languages with the MID$ function and, for example, in the Tcl language, which does it via the [string range] and [string index] commands). Practically, the reason why “character” can exist as a separate data type, which in some languages is also a kind of integer, results only from efficiency. The fastest way to operate on a string is to have it organized as an even array of characters, where every character occupies exactly the same number of bytes. Of course, with this method you can’t implement variable-length-character strings with the UTF-8 encoding (which is implemented, for example, by Glib::ustring from glibmm). So, effectively, extracting single characters from a string is just an implementation detail, a kind of “extension” to the normal set of string operations. Normal operations done on a string are: concatenating, extracting a range, splitting by character (or expression), tokenization, search/replace etc. Actually treating a string as an object would put a limitation on all these operations, as they would be unusual for an object (because, for example, objects usually do not have a variable number of properties, even though some languages allow this).

In other words – with objects it’s not possible to take multiple objects of some class, glue them together, and get one “integrated” object of the same class as a result, or to make a new object by “extracting” some part of an object (again of the same class). As you know, extraction of object parts means extracting sub-objects or accessing properties, and every part is different (of a different class). You can’t, for example, extract a sub-object of an object of some class as an object of the same class, because the classes of these objects are always different (a class can neither derive from nor contain itself!). Yet all these operations are normal operations done on a string, and can be done on other value types. You’d say you can do all these things with trees? Yes… because a tree is also a value type, as it’s a container.

Also, please remember that objects usually have stable contents – despite the fact that in many programming languages (Smalltalk, Python, also OO libraries for Tcl) it is possible to create (and delete!) an object’s contents (fields) at runtime. It’s usually because in these languages everything is done at runtime (there is no such thing as a “declarative statement” in them). This, however, does not reflect any real logic; it’s possible only because the language allows it. Creating new fields during an object’s existence cannot be compared to anything in logic. If you think that you may, for example, have several people in a car – I’d say, at most, that to contain several objects we need a container, and containers are value types.

Note, though, that in some programming languages variables are identifiable. This simultaneously means that they are objects. Yes – a variable of type int in C++ is an object. It has contents and an identity, and this identity can be compared. So, can something be a value and an object simultaneously?

Not exactly. A type can’t be a value type and an object type simultaneously. (Not in every programming language, of course: in Java and Smalltalk they can’t, as variables aren’t identifiable in these languages, while it’s possible in C++, Tcl, Perl, Pascal etc., as these languages allow you to identify variables.) A type (both a value type and an object type) may create objects, and variables are objects – of course, only in languages where variables are identifiable. This still doesn’t change the fact that an object type may create only objects, never values. And an object of a value type is something like an object that has exactly one property, which is of this value type. That’s all. The same holds for an array of integers, where you can extract single elements and even modify them in place. Variables of array types are objects (in C++ you can identify a vector<int> variable). However, you should still be able to, for example, return the array by value (in C++ you can return a value of type vector<int>).
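
Both statements can be checked directly in C++:

    #include <cassert>
    #include <vector>

    std::vector<int> make() { return std::vector<int>{1, 2, 3}; }  // array returned by value

    int main() {
        int a = 5, b = 5;
        assert(a == b);    // the values are equal...
        assert(&a != &b);  // ...but the variables are two separate, identifiable objects

        std::vector<int> v = make();  // a whole array can still travel by value
        v[0] = 10;                    // elements modifiable in place: the variable is an object
        assert(v[0] == 10);
    }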

So what about those value types that may be object types simultaneously? In summary:

  • value types: should usually be held by single variables, passed by copying, and identified by value
  • object types: create only objects, and these objects should never be copied when passed, nor identified by their contents

Never copied? Well, yes. Objects are not copied. Objects can at most be cloned. This is a totally different thing. As objects are identified by their references, cloning an object causes a new reference to be created, while copying a value is like copying this “main identifier”, because the value’s identifier is the value itself. In other words, cloning objects increases the number of specimens, while copying a value does not (values do not have specimens). Moreover, copying does not concern values – copying concerns variables. We can at most say that some value can be copied from one variable to another. It’s “copied” because the previous value of the second variable is overwritten by the value read from the first one.

I’d say there is one more thing that makes cloning different from copying. When you copy, you just take the value from one variable and give the other variable the same value (as values are always copyable). When you want to clone, you should first:

  • check whether the whole object’s contents can be cloned; if not, the object can’t be cloned – that’s why the “noncopyable” idiom is often used in C++, as objects are clonable by default (well, mostly because in C++ every class is a value type by default)
  • for each concrete (non-reference) value type contained, the target object should have it copied from the source
  • for each reference value type, you have to decide whether the object referred to by this reference is some private part of the object (so it should be “sub-cloned”, that is, in the target object this reference should be set to a newly created clone of the object in the source) or some “widespread” object to which the current object should only refer – in which case this reference should be copied as a value

Does it sound familiar to a C++ developer? Yes, this is exactly why copy constructors can be defined in C++. Stupid people say that “it’s because if you have a pointer in your class, this pointer usually points to some owned object, so you should define your copy constructor in order to create a new object for this pointer, because the default implementation will just copy the pointer value” (or they just say “it’s because C++ doesn’t have a garbage collector!”). What bullshit! The default implementation just copies all object contents according to their copy constructors, and every pointer is just a value, so it will also be copied by value. So, if you keep a reference to some object which is your private object and should not be referred to by the object’s clone (but rather cloned as well), don’t use plain pointers to keep it; instead use a smart pointer especially predestined to keep pointers to “owned objects”, which will create a clone of the pointed-to object by itself when copied.
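
Here is a toy sketch of such a pointer (the name clone_ptr is made up for this example – real libraries offer similar “value pointers” – and this version handles only the bare minimum):

    #include <cassert>

    // Toy "owning" pointer: copying the holder clones the pointed-to object.
    template <typename T>
    class clone_ptr {
        T* p;
    public:
        explicit clone_ptr(T* raw = nullptr) : p(raw) {}
        clone_ptr(const clone_ptr& other) : p(other.p ? new T(*other.p) : nullptr) {}
        clone_ptr& operator=(const clone_ptr& other) {
            if (this != &other) { delete p; p = other.p ? new T(*other.p) : nullptr; }
            return *this;
        }
        ~clone_ptr() { delete p; }
        T& operator*() const { return *p; }
    };

    struct Engine { int power; };

    struct Car {
        clone_ptr<Engine> engine;  // a private, OWNED part: clones along with the car
        // the default copy constructor now does the right thing - no manual one needed
    };

    int main() {
        Car a{clone_ptr<Engine>(new Engine{100})};
        Car b = a;                         // default copy: the engine gets CLONED, not shared
        (*b.engine).power = 200;
        assert((*a.engine).power == 100);  // a's engine is untouched
    }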

You’d say “you still have to define your copy constructor, you just do it in a different place”. Bullshit again! By using such a special pointer I’m explicitly declaring my intentions: why I’m keeping this pointer there and which of the purposes of a pointer I mean in this exact place. If I used a plain pointer and a dedicated copy constructor, I’d be suggesting “this is only a weak reference”, and then overriding that statement by the special things done in the copy constructor.

People who think that C++ is a stupid language because you must define a copy constructor to specify the way of cloning pointed-to objects usually forget that in languages that do not feature copy constructors, API users must worry about “shallow copy” and “deep copy” semantics. As I have already said, these are implementation details, which API users shouldn’t even see (a C++ copy constructor really is an implementation detail, and this way it’s possible to hide it from the API user). The division into shallow and deep is very arbitrary, and it’s practically impossible to define whether copying should be deep or shallow for a particular data type, while the division into value types and object types is much simpler to define – or at least it’s feasible.

But what about value types that are internally implemented as complicated objects, which need to be copied, or even cloned? Well, please remember that the definition of whether a type is a value type or an object type is purely logical and not connected to the implementation at all. For example, in the case of gcc’s implementation of C++, where the std::string type uses implicit sharing with copy-on-write for its contents, can we talk about complicated copying methods? From the logical point of view, no, because this is an implementation detail. A user can only see a string being a value, and if a user assigns one string to another, they can only see that values are copied. Whether there is some large object “behind the shell” of a string, and whether this large object is shared between multiple “string values”, is still only an implementation detail. From the user’s point of view, they are all independent copies of the same value. Of course, users may be interested in some details when race condition issues arise because of this (for me it’s still an unclear thing and I keep finding various pieces of information about it – I have seen a lock-free algorithm used in some version, but people on forums say that only the refcount atomicity is ensured, not protection against simultaneous access; this problem is going to be partially fixed in C++0x by using move semantics instead of implicit sharing), but this has nothing to do with the value semantics aspect.
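
To make the point concrete, here is a toy sketch of the implicit-sharing idea (it is not GCC’s actual implementation, the CowString name is made up, and it deliberately ignores the thread-safety issues mentioned above):

    #include <cassert>
    #include <memory>
    #include <string>

    // Toy copy-on-write string: copies share one buffer; a write detaches.
    class CowString {
        std::shared_ptr<std::string> buf;
        void detach() {  // clone the shared buffer just before the first write
            if (buf.use_count() > 1) buf = std::make_shared<std::string>(*buf);
        }
    public:
        CowString(const char* s) : buf(std::make_shared<std::string>(s)) {}
        void append(const char* s) { detach(); *buf += s; }
        const std::string& str() const { return *buf; }
    };

    int main() {
        CowString a("hello");
        CowString b = a;      // "copy": observably an independent value...
        b.append(" world");   // ...the shared buffer is actually cloned only here
        assert(a.str() == "hello");
        assert(b.str() == "hello world");
    }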

Some languages treat value and object types in different ways, some do not. One of the problems of this mess is the C++ language. In this language it’s absolutely up to you whether you make your type a value type or an object type. You can make things like “int x = 0” as well as “int* x = new int(0)”. The language does not force anything on you regardless of whether your type should be a value type or an object type.

No language does the right thing in this matter, though. C++ is the best in this matter not because it does the right thing, but because it does nothing – which means it also does not do the wrong thing, in contrast to other languages.

The right thing would be to have separate keywords used to create object types and value types, and also to make them non-interchangeable: objects of value types should never be created by the ‘new’ operator, while objects of object types should only be created by the ‘new’ operator (and variables can only be of value types). Java does it? Well, don’t be ridiculous – Java treats String and arrays as object types. Also, only value types should be copyable and comparable by the == operator, while for object types this shouldn’t be possible (only cloning should be available on demand, if this is something the objects may do).
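
Lacking such keywords, the closest C++ approximation of a strict object type is to delete the value-like operations by hand – a sketch of the convention, not a language feature (Account is a made-up example):

    #include <memory>

    // Approximating a strict "object type" in C++: identity only,
    // no copying, no comparison by contents.
    class Account {
        double balance;
    public:
        explicit Account(double b) : balance(b) {}
        Account(const Account&) = delete;             // objects are never copied...
        Account& operator=(const Account&) = delete;
        // ...and never compared by contents: no operator== is provided.
        std::unique_ptr<Account> clone() const {      // cloning only on explicit demand
            return std::make_unique<Account>(balance);
        }
    };

    int main() {
        auto acc  = std::make_unique<Account>(100.0); // lives behind a reference
        auto twin = acc->clone();                     // a new specimen, a new identity
        // Account a = *acc;   // would not compile: copying is deleted
    }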

All in all, there are some important conditions to be satisfied for value types:

  • every function receiving a value passed from a variable (that is only read) should do the same thing regardless of which variable was passed to it
  • value types do not have contents; at most, there can be operations on them that create new values

For object types, it does not matter what contents the object has – what matters is usually which object you pass to the operation, because the object is identified by its reference.

Should a container then be an object? Well, no. Practically, a container is a value in every case. It doesn’t even matter how it is implemented. Every container consists of nodes, and every node wraps an object of some value type (but these nodes are still an implementation detail). There can be some controversy here in the case of the intrusive list (a list where every element’s class must derive from the list node type). But this container is a value type, too – the only thing it contains is a reference to an element; it does not even allow random access to elements.

I know, you’d say “doing a[2] extracts some part of it, so it must be an object; moreover, you can assign to a[2], that is, you can change parts independently”. Well, actually you don’t change these parts independently, because the values are in a specific order (which does not hold for objects). Also, a[2] performs an operation on a container; it does not extract a part of it. It’s similar with complex numbers: a.re() does not extract the “re” part, but performs the mathematical “Re” function on a complex number, converting it to the Real type containing the “real part” of the number (“part of the number” in the mathematical sense – it has nothing to do with reading a specific field in the “complex” type’s structure). In the same way you can do operations like “abs” and “conj”, which this time do not simply read fields, so this time you wouldn’t doubt that they are conversions rather than extractions. These operations belong to the same category as re() and im(): they are all conversions. In the same way, a[2] is also a conversion of a container to the container’s element type. And what about setting via a[2]? Well, as I have already said, variables are objects. The ‘a[2]’ expression converts the ‘a’ container into a variable of its element type.
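
Both readings can be observed in C++ (std::complex’s real “part” is obtained by an operation, and operator[] yields a variable of the element type):

    #include <cassert>
    #include <complex>
    #include <vector>

    int main() {
        std::complex<double> z(3.0, 4.0);
        double r = z.real();     // Re(z): a conversion to double, not a field read
        double m = std::abs(z);  // |z|: the same category of operation, just lossier
        assert(r == 3.0 && m == 5.0);

        std::vector<int> a{10, 20, 30};
        int v = a[2];            // "conversion" of the container to its element type
        a[2] = 99;               // a[2] denotes a VARIABLE of the element type
        assert(a[2] == 99 && v == 30);
    }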

So, can a value contain multiple elements, then? This operation belongs to the category of “conversion”, and it can be compared to extracting the integer part and fraction from a floating-point value, or extracting a series of bits from an integer value treated as a bitset. So values can also have parts, just as you can compose parts to create a value. These things still do not give values object semantics.

Of course, there is one more controversy concerning containers: they also have something like a “number of elements currently contained”. This is a read-only property. Well, I can’t fit this fact into any theory described above other than by saying that the operation retrieving the number of elements is also a kind of conversion. Whether some operation done on an object or a value is a conversion or the retrieval of a property is defined on the logical level. If we have already decided that our data type’s characteristics predestine it to be a value type, then any such thing, even implemented as calling a method that returns some value loosely related to the contents (note that in some languages integers have methods, too!), is still a conversion. A lossy conversion, of course, but still a conversion (complex::re() is also a lossy conversion). Containers may have lots of things like that. But don’t forget that there can also be things looking like properties in integer numbers – especially if you use infinite-precision numbers, like those provided by libgmp. Such numbers may have a kind of “property” that reports the current number of bytes used to represent the particular value. Under value semantics, of course, this is still a conversion.

Another important trait of value types is that they cannot be hierarchized like usual objects. This is, practically, a simple consequence of not having contents.

Well, of course, you may think that integer and floating-point numbers could have a base type named ‘number’. But, well, not in programming languages. You may factor out some common properties of them, of course, but it doesn’t change much. To be able to be hierarchized, two conditions must be satisfied:

  • there should be some common operations on this type that can be performed the same way
  • there should be some abstract operations declared in some base class, which can only be specified in a derived class

The problem here is that since value types should be kept directly in variables and passed by value, there’s no way to make use of a possible derivation. Of course, we can always define some types by classes deriving from others, but the main problem is that if we want this derivation to be meaningful for anything, we must operate on them via references. Although this is possible in C++, it’s only possible because in C++ value types can be used as object types as well. If you strictly keep to the rule of using value types only in ways typical for value types (and the same for object types), you won’t have any occasion to make the least use of the fact that the types are hierarchized. At most you can save some work when defining them, but that’s all (actually, it won’t be hierarchization – it’s only a C++ feature that allows you to reuse a type when defining another, a tool which is, among other things, used for derivation – in other words, this is extending, but not subclassing).

Of course, you’d say that a string can actually be derived from an array of characters… but why deriving, and not containing one? LOL.

5. Some concrete statements

So, are there any important criteria we can consider in order to make sure that some type should be a value type or an object type? There are several important things to consider:

  • value types do not have contents, while from object types there can be extracted some contents or elements (especially properties)
  • values can undergo conversion to another value type; moreover, there doesn’t have to be a rule that there is only one operation converting between two types – there can be several “operations” for that (an example is converting complex to a floating-point type by using the real() and imag() operations) – while object types cannot undergo any conversion (they can only have parts extracted from them, including base objects)
  • there should never be “references” to value types; there can only be: optional types (a value or not – see my article about references) and the pass-by-variable idiom (in C++ this can only be done using C++ references, though; only C# defines a strict pass-by-variable idiom) – see the sketch after this list
  • as I have already said, value types do not undergo derivation, although this is a simple consequence of not being able to have contents (when deriving, there can always be extracted a “subobject”, that is, a separate, consistent part that is of the base class type)
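
To illustrate the third point, the two legitimate shapes look like this in modern C++ terms (std::optional is a C++17 facility, used here as a stand-in for any optional type; parity_bonus is a made-up example):

    #include <optional>

    // Pass-by-variable idiom: the function operates on the caller's VARIABLE.
    void increment(int& variable) { ++variable; }

    // Optional type: "a value or not" - no reference semantics involved.
    std::optional<int> parity_bonus(int x) {
        if (x % 2 == 0)
            return x + 100;
        return std::nullopt;
    }

    int main() {
        int counter = 0;
        increment(counter);                  // counter == 1 afterwards
        auto bonus = parity_bonus(counter);  // disengaged: 1 is odd
        return bonus.value_or(0);
    }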

Let’s collect them then:

  • Identity – Value: identifiable by value (by itself). Object: identifiable by reference.

  • Contents – Value: does not have contents; any real contents, or even objects implementing everything required for the value (including special tricks when copying or comparing), are at most operated behind the scenes. Object: has contents; parts of the object may be exposed publicly and changed independently (these parts are usually called “properties”, and some languages provide support for them); subobjects in the hierarchic structure of the type are object contents, too.

  • Conversion to another type – Value: may undergo conversion; the conversion may be done implicitly or may be requested by performing a deterministic unary function; there is no limit on how many kinds of operation may convert from one type to another, nor on whether such a conversion is lossy or lossless. Object: does not undergo conversion; note that conversion of the object’s reference to a reference of its base type is a conversion of the reference (that is, of a value type) rather than a conversion of the object; you can’t define any operation that “converts” one object to another, unrelated type (unrelated, because cloning subobjects may be treated as “conversion” to a related type, but it’s still cloning rather than conversion) – whatever operation you’d make up to look like a “conversion” will always be either object cloning or reference conversion.

  • References – Value: if a language allows creating references to variables of a value type, this reference is not significant in the type’s sense; variables being referred to should rather be treated as objects with one single property, and referring to variables is usually used for the pass-by-variable idiom (although this can be deferred by using separate objects, and this way references to variables may sometimes also be parts of objects); note that some languages do not allow referring to variables at all (Java), or allow at most the pass-by-variable idiom (C#). Object: references are indispensable; every object must have a reference, or it does not exist; references are needed both to identify the object and to have at least some “handle” through which any operation on the object can be done.

  • Type hierarchy – Value: cannot be a hierarchized type, which is a straight consequence of not having contents. Object: may undergo type hierarchization; derived object types (classes) may have base subobjects extracted from them.

  • Number of occurrences – Value: either constant and unchangeable (for a type with a finite number of possible values) or infinite (like numbers, for which one more can always be spelled out, or a number can be divided in half); whichever takes place in a particular case, the number of occurrences does not vary at runtime. Object: the number of objects can at most be limited from above, or some definite number of objects can be pre-created, but usually objects can be dynamically created and deleted (even if deletion is implicit), which means that the number of occurrences of objects may vary at runtime.

6. Typical examples of value types

Ok, so what types are good examples of value types? As the rules are still a bit confusing, examples should make things clearer. Let’s enumerate them, then:

  • integer numbers, of course, as always
  • floating-point numbers, too
  • complex numbers – despite the fact that they are usually defined as a structure of two floating-point fields (and there are also methods that return these values), these are not properties of this type – for example, you cannot set only the imaginary part to some other value
  • vectors and matrices (in algebraic sense)
  • sets of number-based data: points, rectangles, shifts etc.
  • strings, as I have already mentioned
  • containers (even though they usually keep large objects, and despite the fact that they seem to have parts – they actually don’t: the elements kept by a container are not parts of the object; the parts of the object are nodes, and nodes are never public; the only exception seems to be the intrusive list, but it’s still not one – only a list or set can be such a structure, and the main object keeping them usually holds only a reference to the first and last element, which is, moreover, still not public)
  • iterators for containers (even though iterators for particular types can be very complicated and not even that cheap to copy)
  • various enumeration types
  • various tokens, descriptors, identifiers and the like (actually they all play a very similar role to references)

That’s actually all that currently comes to my mind.
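
In C++, a type from this family can be written down directly – a minimal sketch (Point is my own example, not a standard type):

    #include <cassert>

    // A typical user-defined value type: copyable, comparable by value, no identity.
    struct Point {
        int x, y;
    };

    bool operator==(Point a, Point b) { return a.x == b.x && a.y == b.y; }

    Point shifted(Point p, int dx, int dy) { return Point{p.x + dx, p.y + dy}; }

    int main() {
        Point a{1, 2};
        Point b = a;                                 // plain copy: no cloning ceremony
        assert(a == b);                              // identified by value alone
        assert(shifted(a, 1, 1) == (Point{2, 3}));   // operations create new values
    }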

7. Foot in yourself shoot

Among the languages that approach this topic, there are not many that make the programmer’s life easier, and only a few that do not make it harder.

Smalltalk seems to be very bad at this. This is a language that would make even references objects, if it were possible. Though at least it allows for easy simulation of value types, by disallowing in-place alteration and overriding the = operator – although it isn’t a perfect emulation, because containers are allowed to be changed in place.

The really evil language is Java. Only some selected builtin types are value types (integers, characters, floats, and of course references); a notable exception is String (don’t try to tell me that it’s not a builtin type – only builtin types have operators in Java). Users cannot create, nor even simulate, value types – even in such an imperfect way as in Smalltalk.

Some approach to this topic was made in C#, with structs being value types. However, what kind of value type is it, if it should still be created using a ‘new’ expression? Another interesting thing is that it always has a default constructor: I may declare an object without calling its constructor and then use it normally, etc. This is only part of my objections to the existing solution – for me it also lacks destructors and derivation (even as extending only). Of course, value types in C# are provided only for optimization, not to support the value type idiom.

It’s a bit better in Vala (in Vala, values of struct types are created without the ‘new’ keyword) – although it’s hard to say anything definite about this language’s concepts, as they usually exist in various mutations controlled by annotations. A small annotation may turn a huge and bloated GObject-based enumeration type, with assigned strings, memory allocation and so on, into a simple, int-based enum type. It can, for example, make a “class” a simple value type by adding the [Compact] annotation. Although this language looks much better than C# (while Vala was based on C#, I would say that C#’s creators could learn a lot from this language), especially regarding nullable and non-nullable types, it still doesn’t have anything that supports the value type idiom.

C++ is weird. It doesn’t support the value type idiom at all – but it also doesn’t disturb anyone creating value types. Actually, every type created in this language can be a value type just by the fact that you can create a variable of this type, and you can always create a function that takes or returns such a type by value (of course, unless the class’s creator has made the destructor private). C++ doesn’t support the idiom of value and object types, but at least it doesn’t get in the way of a user applying it.

The D language seems to add some support for this with a special kind of classes – however, it also allows objects to be copyable and comparable by default, so this has nothing to do with value/object semantics.

What we’d expect from a language that supports value and object semantics is that:

  • there are separate statements for defining value types and object types
  • objects are not copyable and not clonable by default (although there is some support to make cloning easier – for example, if all subobjects are copyable or clonable, they are used to provide default cloning)
  • only a value type can be the type of a variable
  • the language provides the ability to create objects of value types, but it also provides non-nullable references

I have no hope for all these features being supported in any of the existing languages, although I place some hope in Vala. Unless it gets some commercial support, though, it’s hard to expect that.

What you can do about it is to make conscious use of value and object types in your software and make sure you do not mix traits of these two in one type. Ever.

If you don’t, you will be fighting with the very complicated topics of “comparing by value” vs. “comparing by identity”, or “shallow copying” vs. “deep copying” – which are nothing else than digging in the implementation dirt, because someone didn’t understand the logical level of the things they were trying to program. Think about valuables and you’ll complete your objectives.
