Sweeping bugs under the carpet

Unlike the other articles, this one is intended mainly for managers and team leaders, who should be especially interested in increasing software quality, although many programmers may still get something out of it :). The first version was published on blogspot in May 2009.

0. Basic statements about software quality

Software quality is one of the most important concerns of organizations that produce software. Interestingly, it’s much easier to write bad software than, for example, to build a bad bridge or a bad car. The reasons are somewhat similar, of course, but producing software, although it is an engineering domain, involves a high level of complexity, and this spreads to every aspect of the process, including quality.

When you produce software, everything gains extra levels of complexity compared to other engineering domains: planning, creation, and verification too. That’s why every simplification is welcome. However, there are two main problems with simplification:

  1. Many topics are only “felt” or “intuitively understood”, not elaborated
  2. Incorrect approach to simplification results in “re-complication”

The process of “re-complication” can be compared to an attempt to compress a container of water. As you know, water is not air; if you press on it hard from above, the container will expand sideways. Water just can’t be compressed; at most you can heat it until it boils and turns into gas, which can be compressed. But then increasing the pressure of the vapor may shift its boiling point and make it condense again. This is what happens when you try to restructure something whose structure you don’t understand.

Things like that may end in “re-complication” and spawn something that is harder to understand and use. This is what happens to various attempts to simplify software engineering processes.

First of all, too many people forget that the most important characteristic of software engineering is that it’s a process of implementing logical ideas using intermediate statements, often called “programming languages”. In particular, it’s too often forgotten that the “logics” is the core of programming, not the language used to express it. A programming language is only the most convenient method of transforming the logics into a working program, so it’s only a secondary product of software engineering. Whatever process support you come up with, if it is based on the programming language used to write the program, it will be support based on a secondary product, and any help, results, and simplifications it provides will be just as secondary.

This highlights the following non-obvious characteristics of software engineering:

  1. One of the most important things is transformation reliability (how good your translation from the logics to the programming language is)
  2. Learning programming techniques should be based more on logics and less on programming languages. Programmers whose way of thinking is based on programming languages or platform specifics will never make reliable software (thinking at the platform level also often leads to bad code optimizations)
  3. Testing should also be based on the logics stated for the program rather than on what can be determined from the sources

The need to focus on logics has important consequences. For example, it’s best not to rely on the source files alone to keep track of the logics of your program; it’s always much better to provide additional information containing logical statements that cannot be reliably coded in the programming language. That’s why it’s very important that the program consists not only of source files, but also of appropriate descriptions, documentation, diagrams, and many other documents, all provided and maintained accordingly. Also, all automatic tests should rely first of all on the logical statements of how the application should work rather than on whether the program produces correct results.

1. Correctness and incorrectness of the program

The program should behave as desired. This is the first thing programmers get wrong: the program is created not to fulfill the requirements, but to fulfill the customer’s desire.

So, the first level of the software need is the desire. This is what we really want to have.

The next level is the requirement. This is what we described as what we want to have.

The next level is the HLD. This is how the designers understand our description of what we want to have.

The next level is the implementation. This is what programmers have implemented of what designers understood of what the customer described as what they want to have.

And, finally, we have a product. The product is the output of the final testers, who have confirmed the implementation. Effectively, the product is the finally confirmed output of what programmers have implemented as they understood what designers understood of what customers described as what they want to have. In its simplest form:

Desire -> Requirement -> Design -> Implementation -> Product

Every arrow in the description above marks a place where bugs can be introduced. When we talk about software quality, though, we usually talk about the bugs that occur during the implementation phase, because the input for the implementation is usually the logics, and the logics has to be translated into a programming language.

The problem is that, in general, the program is considered 100% correct only if the product is 100% consistent with the desire. Even if we focus on the implementation only, the correctness of the implementation depends on how well it reflects the statements of the design. Verifying this is difficult for the following reasons:

  1. Some things needed for the implementation are not specified by the design. They should not matter for the final results, but a way must be found to exclude them from the final verification.
  2. The desired results of the implementation should be translatable to logics, in order to have something reasonable to compare with the original statements
  3. The complete results of the implementation should be translatable to the complete logics, containing both the things stated by the design and the things that had to be introduced into the implementation to block undesired behavior. Parts of the program that do not work as desired should be blocked.

This is one of the reasons why modern software production processes involve the customer in the process: the real verification should be the verification of both ends of the chain. The product is correct when it is compliant with the desire.

Of course, we need the things that precede the implementation. We need a design to create the implementation; we need requirements to prepare the design. But the first verification of the final version of the implementation can rely only on comparing it with the design, or maybe with the requirements; the software company cannot compare it with the desire, because only the customer knows what the “desire” is.

Of course, some companies think that they did everything correctly if their product is compliant with the requirements. Such companies, however, should not be taken seriously, because such an approach does not take the customer seriously. The customer is satisfied if they received what they desired, not what they were able to describe as their desire.

But even focusing only on the implementation, there are still things that no program should ever do, and that every imaginable customer implicitly requires the program never to do: crash, or change the internal document data in an undefined or untraceable manner.

The bottom line is that in the end we should have a product which is, indirectly, an implementation of the customer’s desire, which does the things that have been stated (this is compliance verification) and does not do things that were not stated (this is often forgotten).

2. Logical input and logical output of the implementation

No matter what theory was used to design and implement the program, the following statements are always true for the software implementation process:

  1. The design defines how the final program should fulfill particular requirements and states the desired scenarios
  2. The executable part of the implementation uses variables and, more generally, various other changeable data
  3. The external and system parts of the application, and the changeable data, introduce their own logics, not always desired, but indispensable for implementing the design statements
  4. The final implementation contains statements required by the design and requirements, plus some additional logics

Every program must use variables. Even if reduced to a very narrow role, they are still variables and they will always be variables.

What is a variable in general? It is some data that may be written and read by the program, which also means that some procedures can behave differently depending on the value of the variable.

Is the variable part of the logics of the program?

I should put a very long, pointless text here, or just leave an empty space, so that you can think the above question over. But I’m too lazy to do that. Just read this text, and that should give you enough time to … eee … know the answer? Now? So, what do you think?

Variables are not part of the logics of the program. A variable is a tool we use to implement the logics of the program.

OK, now a few words about the terms I am using:

  • state: a logical unit whose value may change when modified and from which a value can be read
  • state value: a value that can be assigned to a state at a time

Maybe it would be better to say “a variable is in a state” than “a state has a value”, but then I would have to keep reminding you that I don’t mean a variable in a programming language. The latter sounds better to me, if only because a variable is merely used to express this state, and not always perfectly. In these terms, a “state” is something like a “stateous unit”.

We can also say that the program is in some “state”. It means, in these terms, that the state of the program has a particular state value. Of course, a program can be in many states; the number of values that the program’s state can take is called its “degrees of freedom”. Degrees of freedom are a property of the state, no matter how it is implemented.

As with many other things, states can be arranged in hierarchies, can depend on each other, and so on, so states may be simple (just one single Particle State) or complex (consisting of many other Particle States or complex states). The whole program, then, has its own state, which is a complex state consisting of all the particle states in the program, combined. The number of degrees of freedom of this state is equal to the product of the degrees of freedom of all its particle states, minus tristate-blocked values (if you have a tristate [nonapplicable, true, false], then the nonapplicable value does not produce any new result when multiplied with another state) and self-eliminating values (two values of a complex state may be equal even though some of the particle states differ).
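To make the counting concrete, here is a minimal sketch of one possible reading of the rule above. The two particle states (`Tristate` and `Mode`) and the collapsing rule in `canonical` are assumptions invented for this illustration: it is assumed that when the tristate is nonapplicable, the other state no longer matters, so all such combinations count as one value.

```cpp
#include <cstddef>
#include <initializer_list>
#include <set>
#include <utility>

// Hypothetical particle states, invented for this illustration.
enum class Tristate { NotApplicable, True, False };  // 3 values
enum class Mode { Idle, Loading, Editing, Saving };  // 4 values

// Upper bound: plain product of the particle states' degrees of freedom.
constexpr std::size_t naive_degrees_of_freedom() { return 3 * 4; }

// Assumed rule: when the tristate is NotApplicable the mode is irrelevant,
// so every such combination collapses into one logical value.
std::pair<Tristate, Mode> canonical(Tristate t, Mode m) {
    if (t == Tristate::NotApplicable) return {t, Mode::Idle};
    return {t, m};
}

// Count the logically distinct values of the complex state by enumeration.
std::size_t effective_degrees_of_freedom() {
    std::set<std::pair<Tristate, Mode>> distinct;
    for (Tristate t : {Tristate::NotApplicable, Tristate::True, Tristate::False})
        for (Mode m : {Mode::Idle, Mode::Loading, Mode::Editing, Mode::Saving})
            distinct.insert(canonical(t, m));
    return distinct.size();
}
```

Under these assumptions the plain product is 12, while the enumeration finds 9 distinct values: 2 × 4 for the applicable cases plus 1 for the nonapplicable one.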

There is really a lot that can be said about this topic, but I’ll stop here as it’s not the subject of this article. I’ll try to describe it more thoroughly elsewhere.

Every program must work with states, and its individual execution statements rely on the values of particular particle states, so these states define the possible execution scenarios. How many such scenarios exist depends on the degrees of freedom of the state (taking the states at the level, or within the execution period, for which the scenario is defined).

So, our input was the design, which defines what scenarios should be possible. Our output is an implementation, which contains many potential scenarios, of which some have been allowed to execute and the others have been blocked by deliberately terminating them in a dead end.
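A small sketch may help picture “allowed” and “blocked” scenarios. The document states, the events, and the set of allowed transitions below are all hypothetical; they stand in for whatever a real design would specify. With 3 states and 4 events there are 12 potential steps, of which this assumed design allows 5; the rest end in an explicit dead end.

```cpp
#include <optional>

// Hypothetical document states and events, invented for this illustration.
enum class DocState { Closed, Open, Modified };
enum class Event { OpenDoc, Edit, Save, CloseDoc };

// Returns the next state for scenarios the (assumed) design allows.
// Every other combination ends in an explicit dead end: std::nullopt.
std::optional<DocState> next_state(DocState s, Event e) {
    switch (s) {
    case DocState::Closed:
        if (e == Event::OpenDoc) return DocState::Open;
        break;
    case DocState::Open:
        if (e == Event::Edit) return DocState::Modified;
        if (e == Event::CloseDoc) return DocState::Closed;
        break;
    case DocState::Modified:
        if (e == Event::Edit) return DocState::Modified;
        if (e == Event::Save) return DocState::Open;
        break;
    }
    return std::nullopt;  // blocked scenario, e.g. closing with unsaved changes
}
```

The point of the single `return std::nullopt` at the end is that blocking is done on purpose and in one visible place, instead of being left to whatever the variables happen to allow.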

3. Physical output of the implementation

First, let’s consider what we really need in order to have a product.

To have a product we need a binary file with the program, plus accompanying files.

To have the binary file with the program we need source files.

And we could say “that’s all”. If all we need is whatever can be “compiled” into the final required form, we don’t need anything else.

And that’s the reason why many companies do not create or maintain other things. Creating and maintaining them costs a lot, so we save costs by not having them.

Good. But what would you like to do with your source files next? Your product is ready, you can throw them out.

What are you waiting for? Why do you waste your disk space? Delete!

Why not? What will you use them for?

Ah, you want to improve it with some additional features? You’re going to create a new, enhanced product, with more functions, more useful, so that you can get more money from this?

Good, but you still have only source files! Are they any better than binary files?

What? You think you can just continue with the development? Very funny.

So, again. As I said, requirements are logics, design is logics, and implementation is logics, too. It means that the source files contain only the statements that explain how “some logics” should be implemented by the machine, but they hardly explain what this “some logics” is! Imagine that you have an engine and you’d like to improve it. You get the engine’s schematics. But how would you do any work to improve it if you don’t know what the machine’s purpose is!?

Another example, a bit closer to software engineering: if you have the physical quantities of distance and speed, then given the distance and the speed we can calculate the time as

double t = s/v

But we can write it also as

double a = b/c

What does the above mean? We have used three variables of the ‘double’ type, but what would this ‘double’ mean when all you have is a=b/c?

This is how the majority of statements look. Of course, some identifiers may be self-explanatory, there may be some comments, but it’s usually not enough to get an idea of what logics the code tries to implement.

And at this time, you wish you had some documentation…

Of course, because only this additional “documentation” can contain a direct explanation of the logics that the code implements. So, if you want to make enhancements you need sources, but you also need some explanation of the logics; if you don’t have it, the only way to proceed with the development is to reverse-engineer the code first.

That’s why it’s important that the implementation is created together with the LLD (Low Level Design) AKA DLD (Detailed Level Design).

I don’t mean LLD in the “waterfall sense” (which is indeed a very stupid thing). The LLD should be a part of the implementation work.

The LLD is a bundle of documents that should be created during implementation and should complement the source code with various additional descriptions and diagrams that will later help others understand the logics of the program. Because these two things are so closely interconnected, many parts of the LLD are created by automatic documentation generators (like Doxygen). Of course, it will never be the “complete logics”, because the logics will always be more or less fluid, but it should explain enough of the logics that everyone who later continues the development will know what they’re doing.
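As a rough illustration of how part of the logics can travel with the code, here is the earlier time calculation written as a function with a Doxygen comment. The function name, the parameter names, the units, and the decision to reject non-positive speed are all assumptions made for this sketch; only the formula t = s / v comes from the text above.

```cpp
#include <stdexcept>

/**
 * @brief Time needed to cover a distance at a constant speed.
 *
 * Logic: t = s / v. The units (meters, meters per second, seconds) are an
 * assumption made for this illustration.
 *
 * @param distance_m Distance to cover, in meters.
 * @param speed_mps  Constant speed, in meters per second; must be positive.
 * @return Travel time in seconds.
 * @throws std::invalid_argument if speed_mps is not positive.
 */
double travel_time_s(double distance_m, double speed_mps) {
    if (!(speed_mps > 0.0)) {
        throw std::invalid_argument("speed_mps must be positive");
    }
    return distance_m / speed_mps;
}
```

Compared with `double a = b/c`, the comment tells the next developer what the numbers mean and which inputs were never meant to be accepted, which is exactly the kind of information the source statement alone cannot carry.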

If you have only the source code and, worse still, it is very sparsely commented, you need to reverse-engineer it before you can continue the development.

If you’re going to create unit/component tests for code for which you only have source files, you can only create tests that confirm the correctness of the source code against “whatever logics it may employ”, but not against the logics for which the implementation was created.

If you’re going to confirm the correctness of the program, for which you only have source files… you’d better go to a fortune teller. Or roll a die.

4. The need for reverse-engineering

If you think that you can “start coding” when you only have source files, you have probably never been involved in strictly technical software engineering. Even with full documentation, an engineer still has to spend some time studying it, although this time will be relatively short. When there is no documentation, the engineer has to spend much more time analyzing the source files and, what is much harder, guessing the business logic behind the code. If you think that “this reverse engineering will still be cheaper than writing the program from scratch”, you may be very surprised. You have probably heard at some point that “this code is so tangled that I would spend less time writing it from scratch than fixing it”. Programmers often say so because it’s more satisfying (and less boring) to create something new than to dig in the dirt of existing code. But unfortunately they are often just telling you the truth: writing it from scratch may cost them less time than “restoring” the existing code to full understanding. If you have some extra resources to spend on this project and you think this information is valuable enough to collect, you can give the project to two separate teams, one to create it from scratch and one to reverse-engineer the old code. I would bet, with 80% confidence, that the first team will finish their work faster.

Reverse-engineering by just watching how the program works and trying to duplicate it manually is the simplest method and, theoretically, the most expensive. Any reverse-engineering based on platform code (that is, on disassembly) is complicated and practically impossible to do manually, without very advanced and intelligent tools. Theoretically this would cost less, but tools that can do it reliably are expensive. Finally, reverse-engineering the program’s source code is theoretically the cheapest. But it still takes time to work through the source code to see how it works and what it does in practice, plus guessing what the author intended when writing it.

This job involves not only navigating through the code, but usually also running it several times under various conditions, maybe also under a debugger, and tracing various execution paths in the program. Worse still, you should also consider that some parts of the code may be written incorrectly, or that some parts were a “wrong way”, abandoned and unused, which the developer simply had no time to delete from the source. And of course, you should create the LLD documents concurrently while doing this job; otherwise it is just a waste of time.

The time spent on this isn’t as short as you may think. Restoring the logics from existing code will always take more time than working out the same logics in your head (because when reverse-engineering you try to discover it by trial and error). The only way to save time in the former case is not having to understand all of the logics of the program (according to my observations, if you wanted to analyze everything in the program, you’d have to spend more than twice as much time as it took to write it). That is, you need to understand only some small parts of it and don’t even touch the others (if my observations are fair, you should not analyze more than 1/4 of the total source lines; otherwise you’d be better off writing it anew).

The time spent analyzing the source code may be less than the time spent analyzing the same program when you only have binary files. I say “may be” because this is often an illusion. As an average estimate, it is safest to assume that it really doesn’t matter whether you have source files or not if you don’t have even the slightest documentation. Moreover, a description of what the program does and how it works would help you much more than the source files. Of course, it happens that source files help you “continue development”, but only if you have a big, extensive program and need to modify only a small part of it. Analyzing the source code may have great educational value, but keep in mind that this value does not directly contribute to the work on the software.

This is especially true because publicly available software tools evolve, and today we have at our disposal many advanced and sophisticated tools that make our task easier than it was many years ago. If the first version of your application was released more than 5 years ago, it’s quite probable that a lot of manual coding and private utilities had to be written to make it work, simply because nothing suitable was on the market at the time. These can now be replaced by some existing library, usually even under the LGPL license. That is, even setting aside the fact that, having this application, you are now a bit wiser and can take advantage of lessons learned, a new version written from scratch may take less time to write than the old version did, because far fewer source lines need to be written.

5. Why are simplifications and automation so hard?

Note that a source file in a programming language is an attempt to encode the program’s logics in a form understandable by the machine. The programming language is not a language for encoding the logics, but just another attempt to automate and simplify the process of making the machine “execute” the desired logics.

Let’s even set aside the portability that today’s languages (other than Java and C#) bring us, which is indeed an advantage. Let’s focus on the simplification that programming languages bring us.

Encoding a program’s logics in a language the machine can comprehend requires writing it with the appropriate machine instructions. Raising the level of logics is possible when we have a program that can translate that level of logics into machine language.

But that’s not the only role of a programming language. Its role is also to express the logics of a particular level.

If you ask me what this “logics” is that I’ve been talking about for so long, I wouldn’t be able to tell you in one simple sentence. Because logics is the logics. It is what you think the program should do. But what logics really is (say, “what it would look like”) is very hard to say. There can’t be any strict definition of logics, because the first thing you ask when you attempt to create the implementation is “how would we do it?”. And once you ask that question, you are already translating your logics into computer language (so it’s not “logics” anymore).

That’s why different people may think totally different things about what the logics of a program is, even if they’re going to create the same program. You can think of the levels of logics as you would of the layers of network protocols, or the layers of an application. It is basically the same thing: there is some underlying layer that works for the sake of the higher layer so that the latter can exist and do its job. It is like using a lower-level protocol as the transport layer for some higher-level protocol, for example the SyncML protocol over HTTP.

So, similarly, you use bytes of memory and the FPU to implement your real numbers, and you use real numbers to implement speed, distance and time. The real number does not belong to the computer level; the computer level is only bytes of memory and floating-point operations done by the FPU (or emulated). The real number is already a logical level, and a higher logical level would be the notions of speed, distance and time.

Logics is fuzzy by nature. I’m not talking about fuzzy logic, of course. I’m talking about what various people think about doing things in some specific way. Logics is fuzzy, which means that there are no strict rules about what logics lies behind a specific task. For example, when I’d like to perform some operation on each object in a list, should I define the operation as “do something with the object and do the same for the rest” or as “perform the following thing for every item, one after another”? Is the second only an attempt to implement the first, or are both intermediate descriptions that can be understood by compilers? Is modifying an object in place a normal and quite logical thing, or should I logically treat every object as unmodifiable and create a new object instead of modifying the existing one?

Both can be logical or illogical (that is, an intermediate description). It depends on what you compare them with, on your knowledge of how to handle items created by a computer, on how you perceive the surrounding world, and on how you would create a similar thing without a computer. Yes, it also largely depends on your imagination.
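The two ways of describing “do it for each object” can be written down side by side. The sketch below is only an illustration; the operation (doubling integers) and the function names are invented here. The first function follows the “this object, then the rest” description and builds a new list, while the second follows the “every item, one after another” description and modifies the objects in place.

```cpp
#include <cstddef>
#include <vector>

// "Do something with the first object, then do the same for the rest."
// Also treats the input as unmodifiable and builds a new list.
std::vector<int> doubled_recursive(const std::vector<int>& items, std::size_t from = 0) {
    if (from == items.size()) return {};
    std::vector<int> rest = doubled_recursive(items, from + 1);
    rest.insert(rest.begin(), items[from] * 2);
    return rest;
}

// "Perform the following thing for every item, one after another."
// Modifies the objects in place.
void double_in_place(std::vector<int>& items) {
    for (int& item : items) item *= 2;
}
```

Both produce the same values; which of them counts as “the logics” and which as a mere implementation of it is exactly the question the text leaves open.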

Also, some “rules” about which logics can be used to implement a higher-level logics should be developed and implemented by creating particular programming languages. But whatever level you reach in these attempts, there are two problems:

  1. The highest level of logics for an application is the description of the application itself. Logical terms become more refined as the level of logics increases, but they also become more and more specific to the problem, and this way we’d arrive at the application itself. Such a level is useless for general simplification, because a description language created as the highest level of logics of application X is a language in which only application X can be written.
  2. Describing such thorough logics as statements in the programming language may, unfortunately, take a lot of effort. Using, for example, the ‘double’ type for speed, distance and time is good enough; you write it quickly and get a readable program. You may instead want to use Speed, Distance and Time types, mark each with its unit, and define the dependencies between these types as well as the various equalities and equivalences between particular physical units (like N/kg being the same unit as m/s²). You can do it. Sometimes this work may be really worthwhile, but only in very specific situations. Generally, it will only extend the time needed to create the application and bring no advantages.
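For comparison, this is roughly what the Speed, Distance and Time approach from point 2 could look like in its most minimal form. The struct layout, the field names, and the choice of SI units are assumptions made for this sketch; a full solution with unit equivalences would be considerably longer, which is the cost the point above talks about.

```cpp
// Hypothetical strong types for the three physical quantities.
struct Distance { double meters; };
struct Speed    { double meters_per_second; };
struct Time     { double seconds; };

// Only dimensionally meaningful combinations are defined.
constexpr Time operator/(Distance s, Speed v) {
    return Time{s.meters / v.meters_per_second};
}
constexpr Speed operator/(Distance s, Time t) {
    return Speed{s.meters / t.seconds};
}
constexpr Distance operator*(Speed v, Time t) {
    return Distance{v.meters_per_second * t.seconds};
}

// Time t = Distance{100.0} / Speed{20.0};    // compiles, t.seconds is 5.0
// Time bad = Speed{20.0} / Distance{100.0};  // rejected: no such operator
```

With plain ‘double’ the second, meaningless division would compile silently; here the compiler refuses it, at the price of extra types and operators that someone has to write and maintain.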

So, whether we like it or not, we should be aware of the following:

  1. The application will always be written in a language whose level of logics is lower than the level of logics of the application
  2. Whatever automatic tools we create to help us make software, they will only work with a strictly defined language, which means that they will not do anything that requires going above its level
  3. Any tool that verifies the application’s correctness will always work only at the level of logics for which we have source files, so it will only verify correctness at that very level of logics.

However, it’s really worthwhile to make the language more schematic and to use stricter definitions (for example, strict types for variables rather than allowing a variable to be of any type). Only such languages can help you encode the rules of correctness in the language itself. Even if the level of logics that gets checked for correctness this way is low, for totally dynamic languages it is much lower.

Dynamic languages, especially totally dynamic ones like Tcl, are not higher-level languages at all. They are much like assembly language; only the platform that runs the program is different. Languages that do not employ static types at all, or do not allow you to create restricted types, are all low-level languages.

What I said about being able to start coding when you have source code but no documentation applies similarly to automation. Automatic tools that process the source files can only do research based on those source files; they will not guess the actual business logic. If the difference between the level of the final logics and the level of the source code is very large (and it usually is for the C language), you can only use tools that find “general purpose bugs” in the source code, which is important, but still misses the point of what the software is about.

6. What useful automation can we count on?

OK, now that there are several tools on the market that can do many useful things for us by analyzing the source code, let’s consider how useful they can be, given that they work for a particular programming language.

So we can count on:

  • Determining improper use of language and semantics (that is, potential errors)
  • Checking code complication level
  • Checking for deterministic execution (for languages that have abilities for nondeterministic execution)
  • Checking for non-generic programming
  • Checking for bad style
  • Checking the testing level
  • Creating call graphs and class diagrams (inheritance and collaboration)

The “testing level” check determines which fragments of code are exercised by particular test cases. It’s called “Code Coverage”.

As you can see, these automatic tools can only do things that can be described by the programming language or extract and evaluate the information that can be contained in the program (written in that language).

Among the important tools are those that can create class diagrams based on the source code. These tools are really worthwhile; it’s much better not to draw your code as blocks and then generate it with tools. As a programmer, you should focus on writing the code, nothing else. A good, experienced programmer will write code from which an automatic tool (for example, Doxygen) extracts class diagrams much faster than someone who draws the class diagram and generates the code from it.

And what if there’s something important that cannot be written in the programming language but that the tool needs in order to understand the structure better? Then the tool should provide additional facilities that the user can use to help the tool guess what they meant by writing particular code.

The “lint” tools (like Klocwork) help find many potential errors and dangerous situations, and also help detect unreadable or overly complicated code and many other things that decrease maintainability. The same goes for bad style.

The topic of unit tests is slightly more complicated, though. That’s why it’s hard to evaluate how good a tool that works with unit tests can be.

In general, all these tools have exactly one purpose: to help developers write better code. Nothing more. Developers who use these tools can still write the same bad code anyway; the tool isn’t a magic bullet, nor is it a gauge that can be used to show code quality. By the way, there is no such tool at all. The main problem is that the tool does not operate on the logics, which means that:

  • It can only check for “general purpose bugs” and “direct value-based result”
  • It can only check the correctness of the code that is present (but not detect problems because of a code that wasn’t written)
  • The measurement based on execution path is meaningless because it has nothing to do with the real logical layer – the execution scenarios

This leads us to a simple statement: correctness can only be checked by application-specific tools or manually, given a thorough description of the requirements. Both require effort and both are strictly “human-based”.

7. What things can be done to make our code better?

Well, I even have a better question: what can we do to ensure the maximum reliability of the quality?

Yes, I know this is a tough question. Especially that it doesn’t bring any help to managers.

But this is really all we can do to enhance the quality: make the quality conditions predictable. Once these conditions are predictable enough, we can at least know whether the quality is good or bad. Various tools and techniques can only help us enhance this reliability rather than the quality itself.

You can say: we should have unit tests, we should have testing procedures… yes, seems wise. But if you say that when the program is already written, you are not wise. You are dumb. (Although we have a saying: “A Pole is wise after the damage”.)

The most important factor that may enhance the quality’s reliability is the program’s testability.

You know, the simplest way to debug a program is to put printing hooks inside it, so that we know what the program is doing. If you think that this is an “obsolete” technique, really primitive compared to debuggers, you probably never got technically involved in any software engineering and never, for example, developed an application for a real-time system. Don’t forget that when you are debugging a program, every instruction takes much longer to execute than in “real conditions” (which sometimes really makes a difference); moreover, instructions that normally take about as long as the “external” ones now run with a really great time difference. Programs also often contain timers and handlers for events that are synchronized in time, and a program traced line by line just won’t behave the same way as without the debugger. Not to mention threads, which make even more of a mess here.
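A minimal sketch of such a printing hook, written in Python for brevity. The names `trace`, `TRACE_BUFFER` and `handle_tick` are made up for illustration, and the sketch assumes that appending to an in-memory buffer disturbs timing far less than stopping in a debugger would.

```python
import time
from collections import deque

# Bounded in-memory trace buffer: the oldest entries are dropped
# automatically, so the hook can stay enabled during a long run.
TRACE_BUFFER = deque(maxlen=1000)


def trace(event, **details):
    """Record what the program is doing without stopping it."""
    TRACE_BUFFER.append((time.monotonic(), event, details))


def handle_tick(counter):
    """Hypothetical timer handler instrumented with trace hooks."""
    trace("tick.enter", counter=counter)
    result = counter + 1
    trace("tick.leave", result=result)
    return result


if __name__ == "__main__":
    handle_tick(41)
    # Dump the collected trace after the time-critical part is over.
    for timestamp, event, details in TRACE_BUFFER:
        print(f"{timestamp:.6f} {event} {details}")
```

The buffer is printed only after the interesting part has run, so the output device does not slow down the traced code itself.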

This is the first thing. The second thing is that if you want to test some small part of a big program, you need the program to be in some particular state, and in order to get it into this state you need to prepare the program somehow.

It’s not a challenge to write a big program. It’s a challenge to write a big, testable program. A program is testable if you are able to extract some part of its execution statements, tear it off from the current execution environment, and run it under controlled conditions. If some part of the code is not able to work in a “torn off” environment, this part of the code is not testable. It’s very easy to write a program that contains a lot of hidden, false, or unnecessary dependencies. Not only may such parts be untestable, it may even be very hard to run some real program scenario in a controlled environment. Or at least it would require a lot of effort to “configure” the program state that is the starting point of this scenario.

In short, it’s very easy to write a big program that does things you know absolutely nothing about. This is especially important for environments that give you no way to debug the program, for example device drivers for embedded real-time systems. The truth in this case is: if you can’t reproduce it with the emulator (or even worse, you don’t have any emulator), you are in… a hole. The only thing you can do is print some logs. Assuming that you at least have something to print on.

Anyway, it is important that the environment in which you run the program is close enough to the production environment and provides testability. But much more important is whether your program makes the following easy: take the function you suspect of causing the problem, reinstate the environment (that is, set the global state to the required value), run it, and collect enough information to help you determine the problem. You can do it only if:

  • This function is stateless and deterministic. In this case you just feed the function with input values and expect some result after return.
  • The program provides access to this function via some text-based interface, or even a “scripting engine” built into the program

If neither of the above conditions is satisfied, the function isn’t testable, which means that you have no idea whether it works or not!
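To illustrate the first condition, here is a small Python sketch. The names `CURRENT_TAX_PERCENT`, `gross_price_hidden` and `gross_price` are hypothetical; prices are assumed to be integer cents so that the arithmetic stays exact.

```python
# Hidden dependency: the result depends on a global that any other
# part of the program may change at any time.
CURRENT_TAX_PERCENT = 23


def gross_price_hidden(net_cents):
    return net_cents * (100 + CURRENT_TAX_PERCENT) // 100


# Stateless and deterministic: everything the function needs arrives as
# an argument, so a test only feeds inputs and compares the result.
def gross_price(net_cents, tax_percent):
    return net_cents * (100 + tax_percent) // 100
```

To test `gross_price_hidden` you first have to reinstate the global state; to test `gross_price` you just call it.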

If you have functions like that in your program, it means (at least) that your program has low testability. The program is even less testable if it relies on values in global variables, these global variables are pointers to other variables and objects, and these objects contain more pointers to objects, and so on. It gets worse still if you hide the true nature of the object held by some pointer by using a generic pointer (void*, gpointer, Object, id, or whatever it’s called in your favourite language) and casting. The ideal situation is when the programmer, looking at some pointer variable, always knows what object would be assigned to it, what it would do and why it is there. Ideally, moreover, such pointer variables occur rarely, and objects are obtained from the place that owns and manages them rather than remembered.

At this point there are currently three digressions I should make because of things that came to my mind:

Too much data sharing. This is the true nature of garbage-collection-based languages. The only situation in which the garbage collector helps is when an object is shared between several parts of the code (for objects that have just one owner there’s no problem: the owner takes care of allocating and deallocating the object). Now, it looks like it’s always much better for an object to be owned than shared. That’s why it’s usually better to have a shared pointer available and use sharing only when you really need it than to provide “sharing for everyone” and encourage people to use object sharing on a greater scale, increasing the “unreliability” of the source code this way.

The problem of global variables. Global variables, yes, have always been accused of causing problems, which is correct. So the creators of various programming languages (Java, C#) made it impossible to create global variables and required that every variable or function be part of some class (fortunately, this stupid rule did not spread to the Vala language). But if something is useful (and, moreover, dangerous; I don’t know why, but these two usually come together), programmers will always find a way to simulate it or do something similar. That’s why the idea of Singleton is so widespread throughout Java and C# code, pretending that some “highly educated programming” is in use, while this Singleton is nothing but another mutation of a global variable (that’s a topic for another article: Singleton is one thing in C++, where it’s a tool to solve the problem of predictable dependency-based initialization, and a totally different thing in Java, where it’s just a global variable emulation).

The problem with global variables is not that they are global. The problem is that a global variable is one of the cases where data sharing occurs. The core of the problem is not the global scoping; the problem is sharing. Global variables are only one of the possible specializations of data sharing. OK, I’ll say it much more directly: shared data become really dangerous for software quality when the data are written. Sharing is much less harmful if we have only one writer. The real problem is when there are many readers and many writers.
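One way to get the “one writer, many readers” arrangement can be sketched in Python with the standard `types.MappingProxyType`. The `Settings` class and its `retries` key are hypothetical names chosen for this illustration.

```python
from types import MappingProxyType


class Settings:
    """The only writer: it owns the dictionary and is the single place that changes it."""

    def __init__(self):
        self._values = {"retries": 3}

    def set(self, key, value):
        self._values[key] = value

    def view(self):
        # Readers get a live, read-only view; they can see changes
        # made by the owner but cannot modify the data themselves.
        return MappingProxyType(self._values)
```

The data are still shared, but every write goes through one known place, which is much easier to reason about and to test.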

Dynamicity and expressiveness work against correctness and testability. That’s one of the reasons why Java has become so popular: it provided a good balance between the advantages of Smalltalk (virtual machine, garbage collection, OO) and the advantages of, say, statically typed languages (practically C++, but Java’s creators would respond with indignation to suggestions like that; Java was, after all, based on Smalltalk!). The dynamicity of Smalltalk was the main cause of problems in this language, and introducing static typing, although it strongly works against object orientation and breaks the stability of the language design, helps decrease the bugginess of the software. Static types were created not only to enable more low-level optimizations. Static types are also used to state precisely what you really want this part of the code to do. These types make any code being debugged clearer. The more “universal” and “dynamic” your type is, the more time you’ll waste guessing what’s going on there. That’s why types that are “maximally expressive” are simultaneously “maximally error-prone”.
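A small sketch of the idea in Python, where type annotations are optional and are not enforced at run time, so the benefit described here assumes that a static checker such as mypy is run over the code. `UserId`, `OrderId` and `cancel_order` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    value: int


@dataclass(frozen=True)
class OrderId:
    value: int


def cancel_order(order: OrderId) -> str:
    # The annotation states precisely what this code is meant to work on;
    # a plain 'int' or 'object' parameter would say nothing about intent.
    return f"order {order.value} cancelled"
```

With a bare integer, passing a user identifier where an order identifier is expected looks perfectly fine to both the reader and the tool. With two distinct types, a call like `cancel_order(UserId(7))` is something a static checker can report before the program ever runs.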

So, the following things should be taken into consideration to make the code as testable as possible:

  • Abstain from data sharing
  • Strongly control data recursion and cyclic dependencies (at best avoid if possible)
  • Kill off false dependencies between functions and data (try to decompose the data structures so that functions do not access more data than they actually need)
  • If a function doesn’t require state, do not put any state into the function (try to always make the function stateless and deterministic, if possible)
  • For states (data that must be shared), determine the degrees of freedom and scope
  • Be prepared to provide a command-interpreter functionality so that it can be tested externally
  • Do not hesitate to create more static types that more precisely define your logics
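The command-interpreter point from the list above can be sketched as follows. This is only an illustration in Python: the exposed functions `add` and `clamp`, the `COMMANDS` table and the line format “name arg1 arg2 …” with integer arguments are all assumptions made for the example.

```python
import shlex


def add(a, b):
    return a + b


def clamp(value, low, high):
    return max(low, min(high, value))


# Table of functions exposed for external testing (hypothetical names).
COMMANDS = {"add": add, "clamp": clamp}


def run_command(line):
    """Parse 'name arg1 arg2 ...' and call the named function with integer arguments."""
    parts = shlex.split(line)
    if not parts:
        return "error: empty command"
    name, args = parts[0], parts[1:]
    if name not in COMMANDS:
        return f"error: unknown command {name!r}"
    try:
        return str(COMMANDS[name](*(int(arg) for arg in args)))
    except (TypeError, ValueError) as exc:
        # Wrong number of arguments or a non-integer argument.
        return f"error: {exc}"
```

A test script, or a person at a console, can then feed lines such as `clamp 15 0 10` to the running program and compare the text that comes back.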

8. How do we find out how reliably our code is tested?

Remember one important thing, which I have already mentioned, but I’ll repeat it because it’s too often forgotten:

The source code only describes what the computer platform should do to implement the logics of my program. So if there is any bug in the program, it may be:

  • A “general purpose bug”, that is, a bug that will always be a bug no matter the logics this code implements
  • An “implementation bug”, that is, the implementation does not reflect the desired logics
  • A “logical bug”, that is, bug in the logics itself

There can also be various more detailed bugs, like an “interoperability bug”, where two pieces of code are bug-free by themselves but buggy when composed together. But let’s focus on these simplest kinds (the more complex kinds of incorrectness may result from the bugs above).

Automatic tools may only help you with the first one. The others require either additional code (an application, a library, whatever) to perform specific tests, or just manual verification with a requirements-based scenario description in hand.

If you want to make sure that your implementation does not contain bugs:

  • You can use “lint-like” tools, to check your program for possible “general purpose bugs”
  • You can make and perform test scenarios to check whether your logics is reflected correctly by the implementation, to check for “implementation bugs”
  • You can research your logics to check whether they do not contain “logical bugs” (you can also test your logics working in the implementation – perform logical scenarios)
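As an illustration of the second point, consider a made-up requirement: “orders of 100 units or more get a 10% discount”. The Python sketch below (all names hypothetical) shows an implementation bug that no general-purpose checker has a reason to complain about, and scenarios derived from the requirement that expose it.

```python
def discount_percent_buggy(quantity):
    # Implementation bug: '>' silently excludes the boundary value 100.
    return 10 if quantity > 100 else 0


def discount_percent(quantity):
    # Matches the stated logics: 100 or more.
    return 10 if quantity >= 100 else 0


# Scenarios derived from the requirement, not from the code:
# (quantity, expected discount in percent).
SCENARIOS = [(99, 0), (100, 10), (101, 10)]


def failing_scenarios(function):
    """Return the scenarios for which the implementation disagrees with the logics."""
    return [(q, expected) for q, expected in SCENARIOS if function(q) != expected]
```

Both versions are equally “clean” code; only someone who knows what the logics was supposed to be can tell which one is wrong.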

You can also try to determine the logics that actually exists in the code and compare it with the stated logics. Well, this has the makings of a really good approach. By the way, this is exactly what group code reviews are meant for: let everyone interpret the code that they are seeing and check whether everyone interprets it as the same logics.

But there is another problem: there’s no tool that would translate the source code back to the logics. Moreover, logics cannot really be written down, nor can there be any machine-level translation between the logics and the implementation. If there were, 90% of the problems in software production would already have been resolved.

The source of the problem is that the logics, which is a base for the implementation, must be elaborated by a human (who is unreliable and can elaborate stupid logics). Then, it should be implemented by a human (who is unreliable and can make bugs). Finally, the program is checked for correct logics, by a human (who is unreliable and can verify it incorrectly). Or it can be verified by automatic tests, which would be a strict and totally machine-executed code that reliably performs the verification. Of course, these automatic tests must be first written by a human (and guess what… the human is still unreliable! :)).

Whatever tool you use for verification, quality tracing, statistics, obtaining data, classifying and many, many more… it will always end up with the statement that a) the problem of reliability lies in the human and b) you cannot produce software without involving a human.

And that’s why writing software without bugs is impossible.

The only thing we can do to enhance the reliability of the software is to reduce the number of bugs. And every method that can significantly help in detecting bugs, potential bugs, and everything else that can threaten the code reliability is always welcome.

Effectively, you have two possible approaches to making sure of the software quality:

  1. Use automatic tools to verify the source code. You’ll only learn that the source code does not contain general purpose bugs. Maybe you’ll also learn that the code has some bug because it didn’t pass a testcase. Of course, you don’t have a tool that verifies whether the testcase is correctly written, so even a passing testcase doesn’t make you sure that the code is correct.
  2. Rely on experienced programmers and give them all possible tools that will help them analyze the code.

The advantage of #1 is that you don’t have to rely on those unreliable humans. Whatever tool you use, you are sure that the tool didn’t make any mistake. The disadvantage is that you don’t really get any reliable information about the software quality. Note also that no matter how much you rely on these tools, they still weren’t written by “gods” or computers… Not to mention how naive they are.

The disadvantage of #2 is that by relying on humans you can miss. If you rely on “bad people” you’ll fail. And of course, people are always unreliable. However, the advantage is that if you rely on people and don’t believe the tools more than their words (that is, if you just trust them), there’s a great chance that they will respond with trust in you, and whatever statistics you create, but don’t use against them, will be more reliable.

If you ever try to collect any statistics behind their backs and watch them, don’t ever let them know that you’re doing it. If you do, they’ll always find a way to make these statistics show good results; of course, not by improving the code quality, but by tricks (people are way smarter than the tools). You cannot show even the smallest trace that you are creating these statistics and using them for anything. Not even grounds for suspicion. What it means in practice is that you may only watch and keep the information to yourself. You cannot make any decisions or changes in managing the work based on these statistics; if you do, your programmers will quickly detect it and just as quickly trick the code so that the statistics look better. So, don’t use this tool for yourself; give it to the programmers so that they can make the code better.

So, as you can see, there’s no better method of making sure how well our code is tested than to check the logics of the program as written and try to determine whether this is exactly what we want. And this can be done only by humans.

I don’t say “reliably”. Humans are always unreliable, as you well know :).

9. How to approach testing programs in order to test reliably?

Contrary to what many information technology scientists believe, programs do not have “execution paths”. Programs have “execution scenarios”. And programs are correct only if they are prepared to execute only correct scenarios and they prevent bad scenarios from executing. When testing a program, we do not test whether the program correctly passed some execution path. We test a scenario (either good or bad) to check whether the program lets it pass correctly (if good) or rejects it quickly enough (if bad) and, if so, recovers correctly. Success scenarios are those that should work and are defined by requirements. Failure scenarios have to be created because of logical constraints or the platform’s unreliability, and they are “discovered” during the implementation. Failure scenarios aren’t less important than success scenarios (I would even say, more important!).

You can compare it with an army. Some army got an order to perform some steps, which also include going through some path. Of course, the path is important, it’s also important whether the path can be passed etc. – but it’s not the core. When we want to check whether the army did the right things, we need only to know whether the army performed correct steps and, simply, fulfilled the orders. If all particular elements of the order were fulfilled correctly, we can say that the army has correctly fulfilled the orders (not whether it passed correct paths).

So, the first and most important thing is to define scenarios. The “scenario path” may pass through various execution paths and the program may get into various states (out of all the states contained in its degrees of freedom). That’s why there are points on these scenario paths where we have a kind of “common conditions”. It’s like driving from one place to another through a town. You can go through various crossroads and jump from one highway to another. But you can also meet various conditions on these roads (usually a traffic jam :)).

We write the code in a particular programming language just in order to fulfill all the correct scenarios. So, remember:

  • You want to have particular scenarios passed, so you write the code that should perform this scenario
  • Some of the things you use in the code spawn “common points”, that is, points where various different conditions may occur, and these conditions influence which scenarios can be passed
  • By using functions whose result you have no influence on (like system functions for memory allocation, resource acquisition etc.), you spawn some scenarios that, of course, you did not desire
  • As a result, scenarios get created that were not planned and that may lead your program to do things you don’t want it to do. You must catch every case in which such a scenario has a chance to be spawned

But there is another problem: catch it and then what? Well, this is a different kind of situation: some scenarios are created by conditions that you had to accept when you chose to use particular functions. So effectively you must add new scenarios to your program logics just because these scenarios simply exist: they are situations that you don’t need, but they will occur, so you must plan what you’ll do when facing them.

As a result, you have all the scenarios that you planned, plus all the scenarios that you were obliged to create in response to scenarios “automatically spawned” by the fact that you use particular functions. Once you have all these scenarios documented, you must then select the common points and be sure that:

  • Every entry into a common point has a limited number of degrees of freedom of the global state (limited to the scope of this point)
  • Every exit from a common point takes a path that is assigned to one of the scenarios planned by you (including planned scenarios for error handling, that is, scenarios that do not belong to the initial statements of the logics of your program)
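A scenario spawned by a system function, and the test that forces it, might look like the following Python sketch. The function `read_greeting`, its default value and the two “opener” test doubles are hypothetical; the assumption is that the file-opening function can be passed in as a parameter so that the failure can be produced on demand.

```python
import io


def read_greeting(path, opener=open):
    """Return the first line of a file, or a default greeting if it cannot be read."""
    try:
        with opener(path, encoding="utf-8") as handle:
            return handle.readline().strip() or "hello"
    except OSError:
        # The scenario we never asked for but must plan for:
        # the resource is missing or inaccessible.
        return "hello"


def working_opener(path, encoding=None):
    # Test double for the success scenario: behaves like an opened text file.
    return io.StringIO("good morning\nsecond line\n")


def failing_opener(path, encoding=None):
    # Test double for the failure scenario: the system refuses the resource.
    raise PermissionError("simulated failure")
```

Calling `read_greeting("any.txt", opener=failing_opener)` exercises the failure scenario under controlled conditions, without having to break a real file system first.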

Now, the test cases. The test cases may be either unit tests or component tests, it doesn’t really matter. What matters is that you make sure that every planned scenario and every theoretical scenario is tried and that the execution conditions are under control. And yes, only if all these scenarios are used in these tests can you say that you have the 100% coverage.

The tests are not required to always perform one particular scenario, nor does each test need to test a different scenario. You should just plan the tests as you think most appropriate. In the ideal case, tests should be assigned to particular scenarios that go from one common point to another common point. Note that it’s always best if you have common points and have them in a “well balanced” number. If you don’t have common points, it usually means that you do have them, but they are “little common”, that is, your scenario spans too long a path. A common point is something like a pair of an execution point and a value of the global state (for all particle states that are available at this execution point). So, in order to have the common points well defined, you should have a) a well limited number of particle states at every such point, b) a predictable number of values of the “local state” (all particle states within this execution point’s scope). Note: the identity of a common point is a particular value of the local state plus the execution point. If you have another combination of states, it’s another common point. So if you have too many common points, it usually means that your execution points have access to too many particle states. This is exactly the case I described as “false dependencies” and the overuse of data sharing.

The only problem is that this is not automatically measurable. As I said, scenarios are also part of the logics, so by nature they cannot be handled by automatic tools. In other words, automatic tools, no matter how intelligent, will not write test scenarios for you.

The problems that may cause bugs, then, are not only code that exists and behaves incorrectly, but also code that doesn’t exist although it should. Such code would be required to prevent some scenario from going out of control (that is, to prevent spawning a “singular scenario”). There’s no tool that can detect this just by looking at the source code. When, for example, a function returns an integer value, the tool cannot guess that the values can be -1, 0, and every value greater than 0. Or that one function that receives a pointer can accept null pointers and treats them in some special way, while another function, receiving a pointer of the same type, doesn’t accept null pointers and treats this situation as an error.
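Python’s `str.find` is a convenient real-life case of the “-1, 0, or greater than 0” convention: it returns -1 when the substring is absent. The two functions below are hypothetical and only sketch what “code that doesn’t exist” looks like in practice.

```python
def value_after_colon_unchecked(text):
    # str.find returns -1 when ':' is absent. The code for that case was
    # never written, so the slice starts at index 0 and the whole input
    # is silently returned as the "value".
    return text[text.find(":") + 1:]


def value_after_colon(text):
    position = text.find(":")
    if position == -1:
        # The scenario that the unchecked version forgot to plan for.
        raise ValueError("missing ':' separator")
    return text[position + 1:]
```

A single test with `"key:value"` executes every line of the unchecked version, so a path-based measurement reports full coverage, yet the scenario “there is no colon” remains unhandled.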

Summing up: when you create a program, you should have scenarios describing how the program should behave. Given that you will have to use pieces of code whose behavior and results do not depend on you (they depend on user data, system resources, other libraries or simply other people’s code), you will have to create additional scenarios that manage (prevent) situations that you don’t want to occur.

Effectively, all the test cases you create to test your program do not test the paths or the behavior of the code (contrary to what many theoreticians and managers think). They test scenarios. Scenarios are logics, additional (say “tightening”) scenarios are logics, and singular scenarios are also logics. The program is correct only if it behaves correctly for predicted scenarios, contains tightening scenarios that prevent unwanted situations, and never runs into singular scenarios, that is, scenarios that lead the program to behave in an unpredictable way.

Remember: a program may work correctly for every possible scenario that programmers can think up, but users can think up more scenarios than you can imagine. Note, however, that if you make sure that you have limited the “kind” of data with one “statement”, for which the program will always behave the same, and you always reject data outside this kind, your program will always behave correctly. But in order to do that you need to make sure that the complexity of your data is managed by the complexity of the code; otherwise the above checks for correctness will be useless.
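Limiting the “kind” of data can be sketched like this in Python. The function `parse_percent` and its rule (a whole number from 0 to 100, written with one to three ASCII digits) are assumptions made up for the example.

```python
import re

# One statement defines the accepted kind of data: 1 to 3 ASCII digits.
PERCENT_PATTERN = re.compile(r"[0-9]{1,3}")


def parse_percent(text):
    """Accept only whole percentages from 0 to 100; reject everything else."""
    if not PERCENT_PATTERN.fullmatch(text):
        raise ValueError(f"rejected input: {text!r}")
    value = int(text)
    if value > 100:
        raise ValueError(f"rejected input: {text!r}")
    return value
```

Whatever users think up (signs, spaces, letters, huge numbers, an empty string), it either belongs to the accepted kind or is rejected in the same, predictable way.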

The most important lesson from this is in two points:

  1. Program is logics, correctness of the program is logics, tests are based on logics. Although the program and tests are encoded in some programming language, the languages are nothing but a way to describe the logics. Logics is the key. Always.
  2. There’s no possible automated way to check whether your program is correct. Tools can only help you detect potential problems. The basic thing you have to strongly adhere to when you write your program is that the program is written clearly, predictably and with high testability.

10. How to make sure that programmers are doing their job well?

Yes, it’s malicious.

Of course, there is no way to check it! For example, if you are a businessman and some programmers work for you, and you have absolutely no idea about software engineering, you will not be able to evaluate their work in any way. There are no tools, no technologies, no methods that you can use to evaluate them. There is only one way to do it: employ someone who is a good programmer and whom you trust. Of course, the worse a programmer this guy is, the less reliable his observations can be. You should also know his character and be sure that his reports aren’t biased or over the top.

This is not unusual in any other kind of business either. Trust is something that must be built over years and can be wasted quickly. It’s also one of the most valuable things in business. Of course, you don’t have to fully trust everyone you employ; you just need a key person at a key stage. Someone you trust to understand that you need reliable information to make business decisions, and who also trusts you not to misuse these reports.

If you do not rely on trust and believe the reports produced by automatic tools more than the reports of trusted programmers, you may start receiving reports that do not reflect the true situation. Every report contains something like “white noise”, that is, data that change the nature of the report but come from “independent” causes. When no one knows that these data are collected and no one knows the methods of measurement, there is a great probability that this “white noise” will even out. But when someone knows the method you use and the data you collect and, especially, the conclusions you draw from them, you can be sure that the people being “measured” are much wiser than tools based on simple equations, and this “white noise” will turn into “red noise”, that is, the “disturbing, but evening out” data will turn into data that “disturb, deform, corrupt and distort”. As a result you’ll get nothing but “true lies”. When you get into this situation, your programmers will most likely focus on how to sweep bugs under the carpet, not on fixing problems.

Those who are dishonest will see enemies everywhere around them. Be sure that you know it, especially when many things depend on you.
