Visitors unvisited!

Do you know the Visitor pattern?

If you don’t, it’s bad. If you do, it’s much worse.

Try to find any explanation of the Visitor pattern on the net. For 99% of probability you’ll find a page that shows you example classes, where we have some “object” classes having “accept” methods and a “visitor” class that has a “visit” method.

Imagine that you don’t know what hammer and nail are and someone is going to explain it to you the following way: “Hammer is a tool that can be used to hammer a nail and nail is something that can be hammered using a hammer”. The majority of explanations what Visitor pattern look exactly like that. Moreover, the same holds about explanations of other Design Patterns for software engineering.

Every thinking programmer would first ask an obvious question: why would I ever need something like that?

Let’s try the example that can be found at Maciek Sobczak’s Inspirel page:

class Hammer;
class Drill;
class Saw;

class Visitor
{
public:
    void visit(Hammer & h) = 0;
    void visit(Drill & d) = 0;
    void visit(Saw & s) = 0;
};

// root of the given hierarchy
class Tool
{
public:
    virtual void accept(Visitor & v) = 0;

    // regular operations of Tool omitted
};

class Hammer : public Tool
{
public:
    virtual void accept(Visitor & v) { v.visit(*this); }

    // regular operations of Hammer omitted
};

class Drill : public Tool
{
public:
    virtual void accept(Visitor & v) { v.visit(*this); }

    // regular operations of Drill omitted
};

class Saw : public Tool
{
public:
    virtual void accept(Visitor & v) { v.visit(*this); }

    // regular operations of Saw omitted
};

class DoSomethingVisitor : public Visitor
{
public:
    void visit(Hammer & h)
    {
        // do something with the hammer
    }

    void visit(Drill & d)
    {
        // do something with the drill
    }

    void visit(Saw & s)
    {
        // do something with the saw
    }
};

vector<Tool *> myToolBox; // filled with lots of tools

void doSomethingWithAllTools()
{
    DoSomethingVisitor v;

    for (size_t i = 0; i != myToolBox.size(); ++i)
    {
        Tool & t = *(myToolBox[i]);

        t.accept(v);
    }
}

The following things come in mind when you look at that example: why do we use those confusing names ‘visit’ and ‘accept’? First, let’s try to analyze this case starting from the top-level entity:

vector<Tool *> myToolBox; // filled with lots of tools

void doSomethingWithAllTools()
{
    DoSomethingVisitor v;

    for (size_t i = 0; i != myToolBox.size(); ++i)
    {
        Tool & t = *(myToolBox[i]);

        t.accept(v);
    }
}

What we want to achieve is: “let an object of class Visitor do what this object states is appropriate with the given object of class Tool”. So the most readable form of this code should look like this:

vector<Tool *> myToolBox; // filled with lots of tools

void doSomethingWithAllTools()
{
    DoSomethingVisitor v;

    for (size_t i = 0; i != myToolBox.size(); ++i)
    {
        Tool & t = *(myToolBox[i]);
        v.invoke( t );
    }
}

Which reads: for every tool, invoke the Visitor object with the Tool object as argument.

Nonetheless, the question is still not answered: why would we need something like this? Moreover that finally the call to ‘accept’ method gets into DoSomethingVisitor::visit() for appropriate argument.

However here we have one important hint: we call it directly as a method of DoSomethingVisitor, a derivative of Visitor, which makes us to think (what is not shown here, though) that probably it will end up with calling this ‘invoke’ (or ‘accept’ as in previous example) using a reference for Visitor, not for DoSomethingVisitor. Effectively then, the most simple form would be:

vector<Tool *> myToolBox; // filled with lots of tools

void doSomethingWithAllTools()
{
    DoSomethingVisitor v;

    for (size_t i = 0; i != myToolBox.size(); ++i)
    {
        Tool & t = *(myToolBox[i]);
        invoke( v, t );
    }
}

So we can treat this in two ways:

  1. more generic, as multimethods
  2. a bit specialized, as invoking a generic action on an object of generic type

Interresting, isn’t it? The Visitor is generally a method of solving the problem of multimethods, but no-one says it directly. It wouldn’t even be so extremely stupid, unless the visitor was never reasonable to be used for any other purpose – but let’s not anticipate:). The same example would be rewritten in even simpler way than Maciek has shown using enumerations:

// First, fill in the mappings:
void doSomething_Hammer( Tool& t )
{
   Hammer& h = static_cast<Hammer&>( t );
   // do this something
}

void doSomething_Saw( Tool& t )
{
   Saw& h = static_cast<Saw&>( t );
   // do this something
}

void doSomething_Drill( Tool& t )
{
   Drill& h = static_cast<Drill&>( t );
   // do this something
}

map<type_info*, void (*)( Tool& )> tool_dispatch_map;

// For convenience
#define REGISTER_TOOL( Toolname ) tool_dispatch_map[&typeid(Toolname)] = \
          &doSomething_##Toolname

void register_tools()
{
    REGISTER_TOOL( Hammer );
    REGISTER_TOOL( Drill );
    REGISTER_TOOL( Saw );
}
#undef REGISTER_TOOL

// root of the given hierarchy
class Hammer : public Tool
{
public:
    // regular operations of Hammer omitted
};

class Drill : public Tool
{
public:
    // regular operations of Drill omitted
};

class Saw : public Tool
{
public:
    // regular operations of Saw omitted
};

vector<Tool *> myToolBox; // filled with lots of tools

void doSomethingWithAllTools()
{
    for (size_t i = 0; i != myToolBox.size(); ++i)
    {
        Tool & t = *(myToolBox[i]);

        void (*handler)( Tool& ) = tool_dispatch_map[&typeid(t)];
        if ( !handler )
        {
             clog << "INTERNAL ERROR: no handler for type " << typeid(t).name() <<
                     "\nPlease review register_tools() function\n";
             return;
        }

        (*handler)( t );
    }
}

What’s better here? Well, first of all it doesn’t need an additional enumeration type. Second, there can be exactly one place in the code where all the expansive definitions reside. However both these examples have one important common thing: they don’t use any “Visitor” class. Guess, why?

Indeed, when we are using the Visitor class, the path through which the invocation goes from the shown ‘invoke’ or ‘accept’ method calls to DoSomethingVisitor::visit() implementation relies on selecting an appropriate implementation basing on two objects – in contrast to usual virtual methods, where the implementation is selected basing on one object (the main object in this case). As languages like C++ or Java don’t have such thing as multimethods, they usually use Visitor pattern to achieve it. (For Java, there is an extension called MultiJava, which supports multimethods; indeed the Visitor pattern is easier to be used in C++ because Java needs that the identical body of method, that redirects accept to visit, be manually repeated in every deriving class.)

But the second way is even more interresting because, as it’s written on the Wikipedia’s Visitor page, it is a way of separating an algorithm from an object structure upon which it operates. Of course the ‘algorithm’ term is not treated as ‘group algorithm’, let’s name it so, as it is used by STL – this is an algorithm that defines what to do with particular object rather than a collection of objects.

So why such a confusing way of getting from the invocation on two generic objects to the actual implementation? This is specific for C++ (and alike) because we want that its mechanisms help us determine the required implementation – in particular, we use the virtual call (indirect call) and overloading.

First, we use virtual call so that the ‘accept’ method is selected for the actual type of particular Tool object, not as of Tool class (although this is the only reason why we override this method in a class derived from Tool because the body is identical as in the base class). Then, this object in ‘accept’ method is calling ‘visit’ on a (given in ‘accept’) generic Visitor, passing itself in concrete type, so that we use overloading. Then, the generic Visitor is virtually calling its ‘visit’ method, overloaded for a concrete type of Tool. Finally, we get into the implementation of ‘visit’ method in a concrete Visitor having the concrete Tool. In other words, this mechanism selects a method that matches only and exclusively this pair of types.

The question is, however, whether this kind of complicated dispatching is really worthwhile. Indeed, the virtual call is nothing else than indirect call, so IOW nothing else than setting a pointer to a function to particular pointer variable and then mapping this variable to the pair of two types! I have already shown, how the mapping a type_info object to a pointer to function can be used to fulfill this.

This is the hint how to solve the problem of multimethods. Visitors seem to be a bit specialized, as I said, because we have the ‘operation’ object and the ‘subjective’ object. We can state some special cases because of that.

The most simple way of creating multimethods is shown in Bjarne Stroustrup’s “Design and Evolution of C++”, where he shows a multimethod selecting an implementation for two objects of the same base class, but various concrete classes. This solution can be used also for two hierarchies and don’t need things like “visitor pattern”. Indeed, the visitor pattern isn’t any better – it holds exactly all the disadvantages of multimethods by virtuality/overloading because it’s intrusive as well, which isn’t wondering given that both solutions base on virtuality and overloading, too.

So, how can we fix this, or better, what problems can be solved? There are two: how we can decrease the number of required changes when adding a new class to any of these hierarchies, and how we can make this solution totally non-intrusive.

The first problem can be solved in C++ the following way: we create an intermediate class that will derive from Tool and needs that the name of the currently created class be put as its parameter. This intermediate class will have appropriate method that redirects to given class, making it first cast to the concrete type (this “concrete type” is passed as a template parameter, which is the same as the class you define). This way you solve the problem of the needs to repeat the same body of redirecting method in every class that derives from Tool. I don’t know how this solution can be provided in Java – maybe. I didn’t try.

Let me show this solution. Note that the class hierarchy here is a bit different, because I’m going to show how Visitors should be really used, so there are two hierarchies with two class each. This is a complete example – you can compile and try.

#include <iostream>
#include <vector>

using namespace std;

class Hammer;
class Saw;

class Action
{
public:
    virtual void invoke_for(Hammer & h) = 0;
    virtual void invoke_for(Saw & s) = 0;
};

// root of the given hierarchy
class Tool
{
public:
    virtual void chain_dispatch(Action & v) = 0;

    // regular operations of Tool omitted
};


void invoke( Action& a, Tool& t )
{
 t.chain_dispatch( a );
}


template <class Derived>
class DispatchActionTool: public Tool
{
public:
 void chain_dispatch( Action& a )
 {
  a.invoke_for( static_cast<Derived&>( *this ) );
 }
};

class Hammer : public DispatchActionTool<Hammer>
{
public:

    // regular operations of Hammer omitted
};

class Saw : public DispatchActionTool<Saw>
{
public:

    // regular operations of Saw omitted
};


class CleanAction: public Action
{
 virtual void invoke_for( Hammer& h );
 virtual void invoke_for( Saw& s );

 // regular operations of CleanAction omitted
};

class UseAction: public Action
{
 virtual void invoke_for( Hammer& h );
 virtual void invoke_for( Saw& s );

 // regular operations of UseAction omitted
};

//////////////////////

void CleanAction::invoke_for( Hammer& h )
{
 cout << "Cleaning hammer\n";
}

void CleanAction::invoke_for( Saw& s )
{
 cout << "Cleaning saw\n";
}

void UseAction::invoke_for( Hammer& h )
{
 cout << "Using hammer\n";
}

void UseAction::invoke_for( Saw& s )
{
 cout << "Using saw\n";
}

vector<Tool*> myToolBox; // filled with lots of tools

void performActionOnWholeBox( vector<Tool*>& toolbox, Action& a )
{
    for (size_t i = 0; i != toolbox.size(); ++i)
    {
        Tool & t = *(toolbox[i]);
  invoke( a, t );
    }
}

struct FDelete
{
 typedef void* argument_type;
 typedef void return_type;

 template <class T>
 void operator()( T* x )
 {
  delete x;
 }
};

int main( int argc, char** argv )
{
 myToolBox.push_back( new Hammer() );
 myToolBox.push_back( new Saw() );

 Action* a1 = new CleanAction();
 Action* a2 = new UseAction();

 performActionOnWholeBox( ::myToolBox, *a2 );
 performActionOnWholeBox( ::myToolBox, *a1 );

 for_each( ::myToolBox.begin(), ::myToolBox.end(), FDelete() );

 return 0;
}

What is important in this example?

So, start from performActionOnWholeBox(). This function walks thru the whole container and performs given action on all objects in the toolbox. What operation will be performed for particular object, it depends on the details of particular action object and particular tool object. I used a standalone function invoke() to show that this is not so an obvious thing which of its argument should be an object for which to call a virtual method.

Now: what you need to add a new Action:
– create a new class derived from Action
– add overloaded invoke_for() methods for all Tool classes as others have
– create implementations for invoke_for() with all available Tools

And now, what you need to add a new Tool:
– create a new class derived from DispatchActionTool<new class> (not from Tool!)
– review all Action classes and add an entry for your new tool class
– add an implementation for invoke_for() method for all Actions with your type

Too much work? You’re kidding. You always have to add an implementation for a crossing of two types; this is such a nature of multimethods. So, if you have a UseAction and CleanAction, and you create a new DestroyAction, you have to add this class with implementations for already existing Tools, so you have to implement DestroyAction for Hammer and DestroyAction for Saw. Similarly, if you want to add a new tool, say, Drill, you have to create a CleanAction for Drill and UseAction for Drill. This is the minumum you have to do stating that you have two interoperating class hierarchies. Minimizing work can be things you can shorten only behind this one.

Note also that this solution provides still the fastest code because the most important part of dispatching occurs at compile time, and any run-time dispatching is only dereferencing pointers. We still can use mappings that will search for the implementation (that is, select appropriate pointer to function) basing on a pair of two types. Types specified in run-time, of course, for example by type_info. This requires nothing more but creating these implementations and mapping them for specified pair of types. Worthwhile, if means really less work – whether it is indeed, evaluate by yourself…

So, is the Visitor over-emphasized? Yes, but stating that even the lazy initialization is called a Design Pattern, let’s say that Visitor is also a Design Pattern. But if you are going to use any Design Pattern it’s best that you first get known of possible problems they solve and possible schemes it can be used for. Treating Visitor as a Design Pattern itself makes a developer more stupid that they can’t even eventually develop in Java language.

(This topic is partially continued in the next article, Multimethods considered harmful.)

Advertisements
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s