Why not kill the people who write documentation?

Well… as one of my friends says: “It’s not worth it – you’ll go to jail as if you murdered a human!”.

You can always say that if you are using a free library, “don’t look a gift horse in the mouth”. If you don’t like the library, don’t use it. Or write your own. Or fix it yourself. The same may happen with a commercially licensed library, but even then you can say: if you don’t like it, don’t buy it, use something else. Theoretically.

But on the other hand, why distribute a library (or tool) whose usage rules are yet to be determined by experiment? What profit can anyone have from it? The cost of using such a library grows well beyond what was expected at first glance, and more importantly, this does not contribute to a good opinion of the library, no matter how useful it is. If you release such a library, you are a liar, because you lie about the real costs of using it. Especially if you release it for free. Are you counting on help? Then say it explicitly: I need help.

I wish every library were as thoroughly documented as, for example, Qt is. Unfortunately there are a lot of libraries that are recommended as a good solution for something, but… well, if you have problems using one, you should experiment (read: waste time), read the documentation a few more times to understand it better (read: waste time) or ask on forums, of course after searching for the information first (read: waste time). How “free” is a library that requires you to pay so much with your time?

Take a look at the GWT API documentation and find a class called “Grid”. Find a method called “prepareCell”. What does this method do?

Checks that a cell is a valid cell in the table.

Checks and then what? Especially since this method’s return type is void (this is Java, so a value cannot be returned through a reference parameter either). I have always thought that if a function is going to “check” something (not “make sure of” it!), it performs some verification and returns a status indicating the result. A name with “check” in it does not necessarily indicate verification only; the function may also make some fixes accordingly. But even if so, nothing is said about it.

Ah! It throws an exception of type IndexOutOfBoundsException. But well… when this is stated so enigmatically, a developer will pay little attention to it and treat this exception as something that should never happen, thrown only in a situation that they could have prevented (because the same is said about, for example, the String::charAt() function).

It’s not said explicitly under which circumstances the exception is thrown (of course, the author of the javadoc comments didn’t specify it, and the documentation mentions the exception only because it’s automatically generated). It may mean various things: either this exception is the function’s way of reporting its result (one of the possible results, or, strictly speaking, the only one), or it is thrown only when the user failed to verify something important and allowed invalid index values to be created.

There are then two possible reasons why such a function exists:

  1. This function does some enigmatic “checks” and must be called before any operation on a particular cell. It is provided only because it would be pointless to repeat these checks before every instruction that manipulates the cell, so you only need to ensure that it was called before you access the cell.
  2. This function verifies whether the cell with the specified coordinates exists; it returns normally if so, and throws the exception otherwise.

I tend to think that the second one is what this function does, but it’s not said explicitly. And this is not the worst example of incomplete documentation I could have shown. The fact that it is also a good example of how idiots use exceptions in a public API, which I have already described in my article about exceptions, is yet another “flower” in this funny library.
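
To make the two readings concrete, here is a minimal sketch in C++ of how the ambiguity could be removed. The Table class and its methods hasCell() and requireCell() are hypothetical names invented for this illustration; this is not the GWT Grid API, only one way such a contract might be written down.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a table widget; not the GWT Grid class.
class Table {
public:
    Table(std::size_t rows, std::size_t cols) : rows_(rows), cols_(cols) {}

    /// Reports whether (row, col) addresses an existing cell.
    /// Never throws and never modifies the table.
    bool hasCell(std::size_t row, std::size_t col) const noexcept {
        return row < rows_ && col < cols_;
    }

    /// Verifies that (row, col) addresses an existing cell.
    /// Returns normally if it does; the table is not modified.
    /// Throws std::out_of_range if row >= rowCount() or col >= columnCount().
    /// Callers can avoid the exception by testing hasCell() first.
    void requireCell(std::size_t row, std::size_t col) const {
        if (!hasCell(row, col)) {
            throw std::out_of_range("no cell at (" + std::to_string(row) +
                                    ", " + std::to_string(col) + ")");
        }
    }

    std::size_t rowCount() const noexcept { return rows_; }
    std::size_t columnCount() const noexcept { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
};
```

With the comments written this way, a reader no longer has to guess whether the exception is the result of the check or a sign of a bug in the caller: it is stated which condition triggers it and how to prevent it.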

It reminds me of a humor page that explained the difference between the ‘memcpy’ and ‘memmove’ functions: “the memcpy function copies the memory, while memmove moves it”.

I have always said that if you want to make sure that your call will not end with unexpected results, read the documentation of the called function and make sure that you have handled every result it may produce. This advice, however, seems futile when you read documentation like this.

So, if you can’t rely on the documentation, the only thing you can do is experiment. Of course, there’s nothing better than just reading the sources (the Qt library has always been distributed together with its sources, just in case the documentation, however unlikely, turned out not to be perfect enough). But unfortunately, many times even the sources aren’t a good source (!) of knowledge. Sometimes the source code is so dirty, complicated and nondeterministic that it’s not possible to get through it by simply reading it. The last resort seems to be just… running your application under a debugger, and… yes, debugging the library.

Yes, I’m not kidding. I was developing a Windows application around 2001, using MFC. It was really fortunate that MFC came with its source code. Otherwise some bugs would have been impossible to find. This is especially important in the case of an object-oriented library. For example: I am overriding some method, and the problem is that at the time my method is called some data are not yet ready. So? How would I determine at exactly what point in the code my method is called if I don’t have the source code of the library that calls it? The library doesn’t even need to be object-oriented; it’s enough that it uses callbacks (polymorphism is actually a callback-based technique).

You say the library should come with good documentation, which states thoroughly at which points my callback function is called. Naïve :). Usually it would just say that the function does some specific thing… and that’s all. People are sometimes too stupid to remember that if a function is virtual (so, overridable), then the most important thing is to describe under which circumstances this function is called and what its purpose is (so that the user’s implementation can be written in a conforming way). If the function is both virtual and implemented (that is, not abstract), then both descriptions should be provided: what the original implementation does and what an overriding implementation should do. Usually, though, the documentation is too weak to provide this complete information.
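
As a rough sketch of what such a description could look like, here is a small C++ example. The Dialog class, its open() method and the onOpened() hook are all hypothetical, made up for this illustration (this is not MFC or Qt); the point is only the shape of the comment on the virtual function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical dialog base class, invented for illustration only.
class Dialog {
public:
    virtual ~Dialog() = default;

    /// Runs the opening sequence: loads the fields, then calls onOpened().
    void open() {
        fields_ = {"name", "email"};  // data is ready from this point on
        onOpened();
    }

    const std::vector<std::string>& fields() const { return fields_; }

protected:
    /// Called exactly once per open(), after the fields have been loaded
    /// and before open() returns. fields() is safe to read here.
    /// Default implementation: does nothing.
    /// An override may inspect fields() but must not call open() again.
    virtual void onOpened() {}

private:
    std::vector<std::string> fields_;
};

// A user-side override that relies on the documented call point.
class CountingDialog : public Dialog {
public:
    std::size_t seen = 0;

protected:
    void onOpened() override { seen = fields().size(); }
};
```

The comment answers both questions: when the function is called (and what is already initialized at that moment), and what the default and the overriding implementations are expected to do. With that in hand, nobody needs a debugger to find out whether the data are ready.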

The same goes for exceptions, return values, statuses etc. For example, there are a lot of APIs whose authors cannot precisely describe why a function may fail. They just write “it returns -1/returns 0/throws an XXX exception in case of failure”. And the function may fail because, well, it may fail. And the function, well, throws an exception, and, well, you should catch the exception in case this function throws it. Yet the basic thing the API user needs to know about failures is the difference between errors that can be prevented (by some early checks) and errors that may occur for independent reasons. This is impossible if a function’s documentation states, practically, that it can, well, throw an exception whenever it feels like it. It’s a pity that it never says that it can raise SIGSEGV whenever it feels like it.
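
Here is a minimal sketch of that distinction in C++. The function readFirstLine() is a hypothetical example of mine, not taken from any library; it only uses the standard exception classes to separate the two kinds of failure and says so in its comment.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

/// Reads the first line of the file at `path`.
///
/// Preventable error (a bug in the caller, avoidable by an early check):
///   throws std::invalid_argument if `path` is empty.
/// Independent error (can happen even in a correct program):
///   throws std::runtime_error if the file cannot be opened, for example
///   because it does not exist or permission is denied.
/// Returns an empty string if the file is empty.
std::string readFirstLine(const std::string& path) {
    if (path.empty()) {
        throw std::invalid_argument("path must not be empty");
    }
    std::ifstream in(path);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::string line;
    std::getline(in, line);  // leaves `line` empty when nothing can be read
    return line;
}
```

A caller reading this knows that the first exception never needs a catch block (just don’t pass an empty path), while the second one must be handled, because no amount of checking beforehand can rule it out.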

I would rather not say anything more about MFC, or this article will become three times longer. Maybe something has changed in the 5 years since I last used this library, but it also had errors in its documentation. A famous one is in the Find dialog’s documentation, which states that the dialog should be deleted when no longer used, while by tracing the source code you can see that in response to WM_NCDESTROY it calls a virtual function, PostNCDestroy, which in this particular dialog class is overridden and implemented as “delete this”.

An example of good documentation (besides Qt) is the set of POSIX man pages. For example, let’s look at the documentation for ‘socket’:

RETURN VALUES
     A -1 is returned if an error occurs, otherwise the return value is a
     descriptor referencing the socket.

ERRORS
     The socket() system call fails if:

     [EACCES]           Permission to create a socket of the specified type
                        and/or protocol is denied.

     [EAFNOSUPPORT]     The specified address family is not supported.

     [EISCONN]          The per-process descriptor table is full.

     [EMFILE]           The per-process descriptor table is full.

     [ENFILE]           The system file table is full.

     [ENOBUFS]          Insufficient buffer space is available.  The socket
                        cannot be created until sufficient resources are
                        freed.

     [ENOMEM]           Insufficient memory was available to fulfill the
                        request.

     [EPROTONOSUPPORT]  The protocol type or the specified protocol is not
                        supported within this domain.

     [EPROTOTYPE]       The socket type is not supported by the protocol.

     If a new protocol family is defined, the socreate process is free to
     return any desired error code.  The socket() system call will pass this
     error code along (even if it is undefined).
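
What such a list buys you is the ability to decide, case by case, what to do about a failure. A rough sketch in C++: the SocketFailure categories and the classifySocketError() function are my own hypothetical grouping of the errno values listed above, not something the man page prescribes.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical categories: what the caller can do about the failure.
enum class SocketFailure {
    CallerMistake,       // wrong arguments: fix the call
    ResourceExhaustion,  // may succeed later, once resources are freed
    PermissionDenied,    // needs different privileges
    Unknown              // not in the documented list
};

// Groups the documented errno values of socket() by the possible reaction.
SocketFailure classifySocketError(int err) {
    switch (err) {
        case EAFNOSUPPORT:
        case EPROTONOSUPPORT:
        case EPROTOTYPE:
            return SocketFailure::CallerMistake;
        case EMFILE:
        case ENFILE:
        case ENOBUFS:
        case ENOMEM:
            return SocketFailure::ResourceExhaustion;
        case EACCES:
            return SocketFailure::PermissionDenied;
        default:
            // For example a code passed along from a new protocol family.
            return SocketFailure::Unknown;
    }
}
```

In real code this would be called with errno right after socket() has returned -1. Note that even the “anything else” case is covered by the man page: it says explicitly that a new protocol family may return any error code it likes.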

Now let’s look at the “other side”. Take, for example, Berkeley DB, a very old library with very simple features, most recently maintained by the Sleepycat company, which was taken over by Oracle some years ago (so, yes, this “product” is now supported by Oracle). To keep it short, let’s take a look at the documentation for BDB’s C++ API:

Opening Databases

You open a database by instantiating a Db object and then calling its open() method. 

Note that by default, DB does not create databases if they do not already exist. To override this behavior, specify the DB_CREATE flag on the open() method. 

The following code fragment illustrates a database open:
#include <db_cxx.h>

...

Db db(NULL, 0);               // Instantiate the Db object

u_int32_t oFlags = DB_CREATE; // Open flags;

try {
    // Open the database
    db.open(NULL,                // Transaction pointer
            "my_db.db",          // Database file name
            NULL,                // Optional logical database name
            DB_BTREE,            // Database access method
            oFlags,              // Open flags
            0);                  // File mode (using defaults)
// DbException is not subclassed from std::exception, so
// need to catch both of these.
} catch(DbException &e) {
    // Error handling code goes here
} catch(std::exception &e) {
    // Error handling code goes here
}

The point is not that the API is incompletely described here; for example, not all open flags are described (they are described later). The point is: there is no description of the function’s return value (it’s not even stated that it is void, although it isn’t :); the example catches exceptions, but nothing explains why an error may occur when calling Db::open().

This is the description signed by Oracle. There is also another description, for example the one that I found in the libdb package on Cygwin. There the Db::open function seems to be much better described:

Errors
ENOENT
The file or directory does not exist. 

The Db::open method may fail and throw DbException, encapsulating one of the
following non-zero errors, or return one of the following non-zero errors:
DB_OLD_VERSION
The database cannot be opened without being first upgraded.
EEXIST
DB_CREATE and DB_EXCL were specified and the database exists.
EINVAL
If an unknown database type, page size, hash function, pad byte, byte order,
or a flag value or parameter that is incompatible with the specified database was
specified; the DB_THREAD flag was specified and fast mutexes are not available
for this architecture; the DB_THREAD flag was specified to Db::open, but was not
specified to the DbEnv::open call for the environment in which the Db handle was
created; a backing flat text file was specified with either the DB_THREAD flag or the
provided database environment supports transaction processing; or if an invalid flag
value or parameter was specified.
ENOENT
A nonexistent re_source file was specified.
DB_REP_HANDLE_DEAD
The database handle has been invalidated because a replication election
unrolled a committed transaction.
DB_REP_LOCKOUT
The operation was blocked by client/master synchronization. 

If a transactional database environment operation was selected to resolve a
deadlock, the Db::open method will fail and either return DB_LOCK_DEADLOCK or
throw a DbDeadlockException exception.

If a Berkeley DB Concurrent Data Store database environment configured for lock
timeouts was unable to grant a lock in the allowed time, the Db::open method will fail
and either return DB_LOCK_NOTGRANTED or throw a DbLockNotGrantedException
exception.

There is one problem here: the description does not make clear in which situation an exception is thrown and in which situation a return value reports the error (this is described in another place, under DbException). Moreover, putting ENOENT on top makes it look special and suggests that exactly this value can only be returned, while only the others can be sent via exception (I guess this isn’t true). The problem is that users usually do not read documentation from cover to cover, but only the fragment that currently interests them. This document does not even contain links or redirections to other parts, and users tend to assume that if something is not described, it works “some default way”; for example, if some special case isn’t mentioned, users think there are no special cases.

It’s hard to say how much this has to do with the poor quality of the library’s API itself (it can be configured either to return errors by value or to throw exceptions) and how much with the documentation alone. So far I have made two attempts at using BDB, as it seemed to be a very simple (so, flexible) library for a very specific purpose; the last one was a memory allocation tracer (based on malloc_hook). Unfortunately, it proved to be useless there due to some strange operations on memory, and it crashed too often for unknown reasons (version 4.5). In the end I had to use a client-server approach and send the data to another process for collection.

Only the things we have done ourselves are free. Well, even this isn’t true. We still have to spend time, and time costs.

In the case of libraries, documentation is all the more important, because problems with it multiply when software is built on top of the library. For example, take a look at this description from the Elementary library (part of EFL):

EAPI char* elm_entry_utf8_to_markup  (const char *s)

This converts a UTF-8 string into markup (HTML-like).
Parameters:
 - s: The string (in UTF-8) to be converted 

Returns:
The converted string (in markup)

Well… ok. Let’s take this at face value: it returns a string; we can use it in our operations without any restrictions.

Ha! No way! Perhaps you suspected something when you saw that the return type is ‘char*’, not ‘const char*’? Yes, of course, you are right: this function allocates memory with malloc() and returns a pointer to a dynamically allocated array of characters. So, once you have finished with the array you got from this function, you must call free() to release it (or store it somewhere where you keep only malloc()’ed memory).

Not a word about this in the documentation? Oh, well… for intelligent programmers it should be obvious that it returns an allocated object, because this pointer could not come from anywhere else (a static local variable used as a buffer is a racy solution). A slightly funnier thing is that in the previous version this function returned the string in a different way and it didn’t have to be freed afterwards… (and the return type was const char*).
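
If you are calling such a function from C++, one defensive habit is to take ownership of the returned buffer immediately. Below is a minimal sketch; make_markup_copy() is a hypothetical stand-in that merely copies its input with malloc() (it does not do any markup conversion and is not the EFL function), and FreeDeleter, CString and markupCopy() are names I made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical stand-in for a C function that returns malloc()'ed memory.
// It only copies its input; the caller must release the result with free().
char* make_markup_copy(const char* s) {
    std::size_t n = std::strlen(s) + 1;
    char* out = static_cast<char*>(std::malloc(n));
    if (out != nullptr) {
        std::memcpy(out, s, n);
    }
    return out;
}

// Deleter that releases memory with free(), matching malloc().
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};
using CString = std::unique_ptr<char, FreeDeleter>;

// Takes ownership at once, so the buffer is freed on every exit path.
CString markupCopy(const char* s) {
    return CString(make_markup_copy(s));
}
```

This does not fix the documentation, of course. You still have to find out somehow that the memory comes from malloc() and not from some library-specific allocator; the wrapper only makes sure that, once you know, you cannot forget.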

And that’s how easy it is to get memory leaks. This is the biggest problem with the C language. The problem is not that it’s so raw and so low level. The problem is that the language does not even let you mark your intentions in the source code and count on any help from the compiler in pointing out things that are errors with regard to those intentions. It means that every smallest thing must be very thoroughly documented, especially in the case of libraries with a C API (including those written in languages other than C).

The function receives a pointer to X as an argument? You must describe whether this pointer is an iterator into an array, or a pointer to the array’s beginning (with the size in another argument or known from elsewhere), or a pointer to a variable, or maybe a pointer to a single object, or maybe a pointer to an object to be consumed (freed afterwards) by the function. A pointer simply plays too many roles in the C language.

For example (this extends to C++ as well, but there you at least have alternatives in the form of smart pointers), you cannot just “compare two pointers” and expect to receive a meaningful result. If you want to compare two pointers, you must know exactly what one of them holds, and also what the value of the other may mean. If you don’t know these things, you are not allowed to compare the pointers; in particular, you are not allowed to use the result of the comparison in any further evaluation. For example, a pointer to a single object or a pointer to a function may contain either a valid pointer value or NULL. An iterator into an array may point to any element of the array, from its beginning up to one element past the end. Moreover, we can impose the constraint that a pointer, whatever it points to, is not allowed to be NULL.

Effectively, whenever a pointer is used in some interface, all of these aspects must be thoroughly described (again: single object, array of objects, nullable or not, producer/consumer).
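
For comparison, here is a small sketch of how C++ lets you put some of these roles into the signature instead of the documentation. The Node type and all four functions are hypothetical, written only to show one role each.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

struct Node { int value; };

// Single object, never null: a reference says so.
int readValue(const Node& n) { return n.value; }

// Single object, may be absent: a raw pointer, documented as nullable.
int readValueOr(const Node* n, int fallback) {
    return n != nullptr ? n->value : fallback;
}

// Array with a known size: pointer to the beginning plus element count.
int sum(const int* first, std::size_t count) {
    return std::accumulate(first, first + count, 0);
}

// Object consumed by the function: ownership moves in and ends here.
int consume(std::unique_ptr<Node> n) {
    return n != nullptr ? n->value : 0;  // the Node is destroyed on return
}
```

Two of the four roles (the nullable pointer and the pointer with a size) still depend on a comment, so this is an improvement, not a cure. In plain C all four would be just Node* or int*, and the comment would be the only thing telling them apart.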

The same goes for int, which is an even more universal type. The fact that two functions return integer values doesn’t mean that it makes sense to compare them. Well, yes, too many people do not even realize this (that’s why I have lately come to think that the C language can be preferred only by people who are, as far as their professional experience goes, programming children who just do not realize the threats this language carries). For example: can you add the values of X and Y coordinates? Yes, you can, but what logical interpretation does such a value have?
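
A language with a stronger type system can refuse such an addition outright. Here is a minimal C++17 sketch; Coord, XTag, YTag and the CanAdd trait are hypothetical names of mine, and the trait exists only so that the rejection can be demonstrated without breaking the build.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Tagged wrapper around int: different tags give different types.
template <class Tag>
struct Coord {
    int v;
    // Addition is defined only between coordinates with the same tag.
    friend Coord operator+(Coord a, Coord b) { return Coord{a.v + b.v}; }
};

struct XTag {};
struct YTag {};
using X = Coord<XTag>;
using Y = Coord<YTag>;

// Detects at compile time whether `A + B` is a valid expression.
template <class A, class B, class = void>
struct CanAdd : std::false_type {};

template <class A, class B>
struct CanAdd<A, B,
              std::void_t<decltype(std::declval<A>() + std::declval<B>())>>
    : std::true_type {};

static_assert(CanAdd<X, X>::value, "X + X is meaningful");
static_assert(!CanAdd<X, Y>::value, "X + Y is rejected by the compiler");
```

Writing `X{1} + Y{2}` directly would simply fail to compile. With two plain ints the compiler has nothing to object to, and the only protection left is, once again, the documentation.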

Every such problem later costs the time spent detecting and removing bugs, and the time spent experimenting costs as well. As I said, in the worst case you would even have to run your application under a debugger together with the library and debug the library itself. Eventually you would have to develop the library yourself. Dear customer, do yourself the pleasure, because the “free developers” prefer to stuff you in…

Sorry.

People who want to give credit to free and open software must first consider that in most cases this software suffers from poor documentation, and this is the main reason why it tends to generate higher costs than software you have to pay for. Currently I can see only one significant good thing introduced by free software: it at least defines the baseline of what you can have for free. It thereby forces those who want to sell software to do something more than just create, test and distribute it in order to make money on it.

Unfortunately the usual situation is that when some kind of software exists as free software, no one is even going to produce a commercial version of it. They prefer to produce something that’s not available in the free software world… with poor documentation, as usual.

That’s why the only software that looks professional is commercial software created for developers, and software produced commercially for a contracted customer, to whom you’d have to pay high penalties if the software were poor. Although in this respect software is no different from any other kind of product.

There wouldn’t be anything wrong with this except for one thing: poor documentation is nothing other than cheating. Let the authors of poorly documented software put a disclaimer at the beginning: “this software is very poorly documented”, and we’ll be fine; we’ll know what we are starting with. While for commercial software this would be shooting oneself in the foot, the authors of open source software should consider this rule very seriously. It would contribute to a fairer approach to users, and might also serve as a call for help in creating better documentation. Only advantages. But not for people who count on making a profit from cheating.

Don’t kill them. Just walk around them. Read the documentation before you decide whether to use a library. And if the documentation is poor, do not hesitate to drop the library.
