Visitor Pattern's Purpose with Examples

Once upon a time...

class MusicLibrary {
    private Set<Music> collection ...
    public Set<Music> getPopMusic() { ... }
    public Set<Music> getRockMusic() { ... }
    public Set<Music> getElectronicaMusic() { ... }
}

Then you realize you'd like to be able to filter the library's collection by other genres. You could keep adding new getter methods. Or you could use Visitors.

interface Visitor<T> {
    void visit(Set<T> items);
}

interface MusicVisitor extends Visitor<Music> { }

class MusicLibrary {
    private Set<Music> collection ...
    public void accept(MusicVisitor visitor) {
       visitor.visit( this.collection );
    }
}

class RockMusicVisitor implements MusicVisitor {
    private final Set<Music> picks = ...
    public void visit(Set<Music> items) { ... }
    public Set<Music> getRockMusic() { return this.picks; }
}
class AmbientMusicVisitor implements MusicVisitor {
    private final Set<Music> picks = ...
    public void visit(Set<Music> items) { ... }
    public Set<Music> getAmbientMusic() { return this.picks; }
}

You separate the data from the algorithm. You offload the algorithm to visitor implementations. You add functionality by creating more visitors, instead of constantly modifying (and bloating) the class that holds the data.
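As a rough illustration of that separation, here is a minimal Python sketch of the same design. The Music record, its genre field, and the GenreVisitor class are assumptions made for this example; the outline above does not show how Music is defined or how each visitor picks its items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Music:
    # Hypothetical record: the original does not show Music's fields.
    title: str
    genre: str


class GenreVisitor:
    """Collects every item whose genre matches the one given at construction."""

    def __init__(self, genre):
        self.genre = genre
        self.picks = set()

    def visit(self, items):
        self.picks = {m for m in items if m.genre == self.genre}


class MusicLibrary:
    def __init__(self, collection):
        self._collection = set(collection)

    def accept(self, visitor):
        # The library only hands over its data; the filtering lives in the visitor.
        visitor.visit(self._collection)


library = MusicLibrary([
    Music("Track A", "rock"),
    Music("Track B", "ambient"),
    Music("Track C", "rock"),
])
rock = GenreVisitor("rock")
library.accept(rock)
rock_titles = sorted(m.title for m in rock.picks)
print(rock_titles)
```

Supporting a new genre, or a completely different selection rule, means writing another visitor; MusicLibrary itself stays unchanged.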

Visitor Pattern, why is it useful?

The article you link to is pretty clear as to why you'd want to use a visitor pattern: when you can't alter the objects because they come from a third party:

The assumption is that you have a primary class hierarchy that is fixed; perhaps it’s from another vendor and you can’t make changes to that hierarchy. However, your intent is that you’d like to add new polymorphic methods to that hierarchy, which means that normally you’d have to add something to the base class interface. So the dilemma is that you need to add methods to the base class, but you can’t touch the base class. How do you get around this?

Sure, if you can just add a visit method to bees, flies and worms, then that's fine. But when you can't, using the visitor pattern is the next best option.

Note that in the article the relationship is reversed; you can't alter the Flower hierarchy:

# The Flower hierarchy cannot be changed:

but the class does support the visitor dispatch pattern via the accept method:

class Flower(object):
    def accept(self, visitor):
        visitor.visit(self)

That implementation could be much more complex; the example has been simplified down to a simple visitor.visit() call here, but in practice a real visitor pattern can and does do much more at this stage.

For example, there could be composite classes, which contain multiple subcomponents. The accept() method would then delegate further down to those sub-elements to then call accept on them all, as needed. Keeping with the flower theme, perhaps there's a Chrysanthemum or Dahlia class, where some visitors would eat the ray components, while others would like to visit the components in the eye to pollinate. It's up to the composite object to direct each visitor to those parts individually.
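A composite accept() of that kind might look like the following sketch. All class and method names here (Petal, Floret, CompositeFlower, visit_petal, visit_floret) are invented for illustration and do not come from the article.

```python
class Petal:
    def accept(self, visitor):
        visitor.visit_petal(self)


class Floret:
    def accept(self, visitor):
        visitor.visit_floret(self)


class CompositeFlower:
    """A flower head made of ray petals and central florets (hypothetical model)."""

    def __init__(self, petals, florets):
        self.parts = list(petals) + list(florets)

    def accept(self, visitor):
        # Delegate to every sub-element; each part picks the matching visit_* hook.
        for part in self.parts:
            part.accept(visitor)


class PetalEater:
    def __init__(self):
        self.eaten = 0

    def visit_petal(self, petal):
        self.eaten += 1

    def visit_floret(self, floret):
        pass  # not interested in the eye of the flower


class Pollinator:
    def __init__(self):
        self.pollinated = 0

    def visit_petal(self, petal):
        pass  # ignores the rays

    def visit_floret(self, floret):
        self.pollinated += 1


flower = CompositeFlower([Petal() for _ in range(5)], [Floret() for _ in range(3)])
eater, bee = PetalEater(), Pollinator()
flower.accept(eater)
flower.accept(bee)
print(eater.eaten, bee.pollinated)
```

The composite knows how to walk its own parts, while each visitor decides which parts it cares about.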

If you are looking for specific examples, take a look at the ast module, which offers a NodeVisitor class that you subclass, adding methods to customise how the AST passed in is processed. I've used the NodeTransformer subclass to alter how Python code works on several occasions. Here the visitor pattern is used to effectively filter out certain types in a larger hierarchy, greatly simplifying AST-handling code without having to alter any of the AST node classes themselves.
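As a small example of the ast approach, the sketch below subclasses ast.NodeVisitor to collect function names. The FunctionNameCollector class and the sample source are made up for illustration; only ast.parse, NodeVisitor, visit and generic_visit come from the standard library.

```python
import ast


class FunctionNameCollector(ast.NodeVisitor):
    """Records the name of every function definition it encounters."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        self.names.append(node.name)
        # Keep walking so that nested function definitions are found too.
        self.generic_visit(node)


source = """
def outer():
    def inner():
        pass

def other():
    pass
"""

collector = FunctionNameCollector()
collector.visit(ast.parse(source))
print(collector.names)
```

Only FunctionDef nodes get a dedicated method; every other node type falls through to generic_visit, which is what makes this style good at filtering a large hierarchy.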

Visitor Pattern Explanation

The Visitor pattern is used to implement double dispatch. In plain words, this means that the code that gets executed depends on the runtime types of two objects.

When you call a regular virtual function, it is a single dispatch: the piece of code that gets executed depends on the runtime type of a single object, namely the one whose virtual method you are calling.

With the visitor pattern, the method that is being called ultimately depends on the type of two objects - the type of the object implementing the equipmentVisitor, and the type of the object on which you call accept (i.e. the equipmentVisited subclass).
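The sketch below shows the two dispatch steps in miniature. The Circle and Square elements and the two visitors are hypothetical stand-ins, not the equipment classes from the question.

```python
class Circle:
    def accept(self, visitor):
        # First dispatch: Python picks this accept() from the element's runtime type.
        # Second dispatch: the visitor's runtime type picks the visit_circle() body.
        return visitor.visit_circle(self)


class Square:
    def accept(self, visitor):
        return visitor.visit_square(self)


class NameVisitor:
    def visit_circle(self, circle):
        return "circle"

    def visit_square(self, square):
        return "square"


class CornerCountVisitor:
    def visit_circle(self, circle):
        return 0

    def visit_square(self, square):
        return 4


shapes = [Circle(), Square()]
visitors = [NameVisitor(), CornerCountVisitor()]

# Four (element, visitor) combinations reach four different method bodies.
results = [shape.accept(visitor) for shape in shapes for visitor in visitors]
print(results)
```

The calling code never inspects a type itself; both choices are made by ordinary method dispatch.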

There are other ways to implement double dispatch in C++. Item 31 of Scott Meyers' "More Effective C++" treats this subject in depth.

Example with Visitor Pattern

It is mainly because the example is a bad example of the visitor pattern. The purpose of the visitor pattern is to add common functionality to a group of objects without having to derive from the same class. It lets you keep adding functionality to classes without having to change the classes themselves. The longer fruit example in the answer that you quoted is a better explanation of the visitor pattern.

Read the quoted Wikipedia article: for the visitor to pay off, you should have a group of classes. In your case, different classes are not really warranted, so there is no need for the visitor pattern. Given a more heterogeneous class structure, the visitor pattern might become useful.

Confused about the Visitor Design Pattern

The code in the OP resembles a well-known variation of the Visitor design pattern known as an Internal Visitor (see e.g. Extensibility for the Masses. Practical Extensibility with Object Algebras by Bruno C. d. S. Oliveira and William R. Cook). That variation, however, uses generics and return values (instead of void) to solve some of the problems that the Visitor pattern addresses.

Which problem is that, and why is the OP variation probably insufficient?

The main problem addressed by the Visitor pattern is when you have heterogeneous objects that you need to treat the same. As the Gang of Four (the authors of Design Patterns) state, you use the pattern when

"an object structure contains many classes of objects with differing interfaces, and you want to perform operations on these objects that depend on their concrete classes."

What's missing from this sentence is that while you'd like to "perform operations on these objects that depend on their concrete classes", you want to treat those concrete classes as though they have a single polymorphic type.

A period example

Using the animal domain is rarely illustrative (I'll get back to that later), so here's another more realistic example. Examples are in C# - I hope they're still useful to you.

Imagine that you're developing an online restaurant reservation system. As part of that system, you need to be able to show a calendar to users. This calendar could display how many remaining seats are available on a given day, or list all reservations on the day.

Sometimes, you want to display a single day, but at other times, you want to display an entire month as a single calendar object. Throw in an entire year for good measure. This means that you have three periods: year, month, and day. Each has a different interface:

public Year(int year)

public Month(int year, int month)

public Day(int year, int month, int day)

For brevity, these are just the constructors of three separate classes. Many people might just model this as a single class with nullable fields, but this then forces you to deal with null fields, or enums, or other kinds of nastiness.

The above three classes have different structures because they contain different data, yet you'd like to treat them as a single concept - a period.

To do so, define an IPeriod interface:

internal interface IPeriod
{
    T Accept<T>(IPeriodVisitor<T> visitor);
}

and make each class implement the interface. Here's Month:

internal sealed class Month : IPeriod
{
    private readonly int year;
    private readonly int month;

    public Month(int year, int month)
    {
        this.year = year;
        this.month = month;
    }

    public T Accept<T>(IPeriodVisitor<T> visitor)
    {
        return visitor.VisitMonth(year, month);
    }
}

This enables you to treat the three heterogeneous classes as a single type, and define operations on that single type without having to change the interface.

Here, for example, is an implementation that calculates the previous period:

private class PreviousPeriodVisitor : IPeriodVisitor<IPeriod>
{
    public IPeriod VisitYear(int year)
    {
        var date = new DateTime(year, 1, 1);
        var previous = date.AddYears(-1);
        return Period.Year(previous.Year);
    }

    public IPeriod VisitMonth(int year, int month)
    {
        var date = new DateTime(year, month, 1);
        var previous = date.AddMonths(-1);
        return Period.Month(previous.Year, previous.Month);
    }

    public IPeriod VisitDay(int year, int month, int day)
    {
        var date = new DateTime(year, month, day);
        var previous = date.AddDays(-1);
        return Period.Day(previous.Year, previous.Month, previous.Day);
    }
}

If you have a Day, you'll get the previous Day, but if you have a Month, you'll get the previous Month, and so on.

You can see the PreviousPeriodVisitor class and other Visitors in use in this article, but here are the few lines of code where they're used:

var previous = period.Accept(new PreviousPeriodVisitor());
var next = period.Accept(new NextPeriodVisitor());

dto.Links = new[]
{
    url.LinkToPeriod(previous, "previous"),
    url.LinkToPeriod(next, "next")
};

Here, period is an IPeriod object, but the code doesn't know whether it's a Day, a Month, or a Year.

To be clear, the above example uses the Internal Visitor variation, which is isomorphic to a Church encoding.
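For readers who don't use C#, the following is a rough Python rendering of the same idea. It is a sketch only: the dataclasses and snake_case method names are adaptations, and the month rollover is done by hand because Python's datetime has no equivalent of AddMonths.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Year:
    year: int

    def accept(self, visitor):
        return visitor.visit_year(self.year)


@dataclass(frozen=True)
class Month:
    year: int
    month: int

    def accept(self, visitor):
        return visitor.visit_month(self.year, self.month)


@dataclass(frozen=True)
class Day:
    year: int
    month: int
    day: int

    def accept(self, visitor):
        return visitor.visit_day(self.year, self.month, self.day)


class PreviousPeriodVisitor:
    def visit_year(self, year):
        return Year(year - 1)

    def visit_month(self, year, month):
        # January rolls back to December of the previous year.
        return Month(year - 1, 12) if month == 1 else Month(year, month - 1)

    def visit_day(self, year, month, day):
        previous = date(year, month, day) - timedelta(days=1)
        return Day(previous.year, previous.month, previous.day)


periods = [Year(2024), Month(2024, 1), Day(2024, 3, 1)]
previous = [p.accept(PreviousPeriodVisitor()) for p in periods]
print(previous)
```

Each accept() passes the period's raw data to the visitor and returns whatever the visitor returns, which is the part that distinguishes this variation from a void-returning Visitor.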

Animals

Using animals to understand object-oriented programming is rarely illuminating. I think that schools should stop using that example, as it's more likely to confuse than help.

The OP code example doesn't suffer from the problem that the Visitor pattern solves, so in that context, it's not surprising if you fail to see the benefit.

The Cat and Dog classes are not heterogeneous. They have the same class field and the same behaviour. The only difference is in the constructor. You could trivially refactor those two classes to a single Animal class:

public class Animal {
    private int health;

    public Animal(int health) {
        this.health = health;
    }

    public void increaseHealth(int healthIncrement) {
        this.health += healthIncrement;
    }

    public int getHealth() {
        return health;
    }
}

Then define two creation methods for cats and dogs, using the two distinct health values.

Since you now have a single class, no Visitor is warranted.

What is the point of accept() method in Visitor pattern?

The visitor pattern's visit/accept constructs are a necessary evil due to C-like languages' (C#, Java, etc.) semantics. The goal of the visitor pattern is to use double-dispatch to route your call as you'd expect from reading the code.

Normally when the visitor pattern is used, an object hierarchy is involved where all the nodes are derived from a common base type, referred to henceforth as Node. Instinctively, we'd write it like this:

Node root = GetTreeRoot();
new MyVisitor().visit(root);

Herein lies the problem. Suppose our MyVisitor class were defined like this:

class MyVisitor implements IVisitor {
  void visit(CarNode node) { ... }
  void visit(TrainNode node) { ... }
  void visit(PlaneNode node) { ... }
  void visit(Node node) { ... }
}

Then, regardless of the actual runtime type of root, our call would go into the overload visit(Node node). This would be true for all variables declared of type Node. Why is this? Because Java and other C-like languages only consider the static type of the argument, that is, the type the variable is declared as, when deciding which overload to call. Java doesn't take the extra step to ask, for every method call, at runtime, "Okay, what is the dynamic type of root? Oh, I see. It's a TrainNode. Let's see if there's any method in MyVisitor which accepts a parameter of type TrainNode...". The compiler determines, at compile time, which method will be called. (If Java did inspect the arguments' dynamic types on every call, performance would be pretty terrible.)

Java does give us one tool for taking into account the runtime (i.e. dynamic) type of an object when a method is called -- virtual method dispatch. When we call a virtual method, the call actually goes to a table in memory that consists of function pointers. Each type has a table. If a particular method is overridden by a class, that class' function table entry will contain the address of the overriding function. If the class doesn't override a method, the entry will contain a pointer to the base class' implementation. This still incurs a performance overhead (each method call will basically be dereferencing two pointers: one pointing to the type's function table, and another pointing to the function itself), but it's still faster than having to inspect parameter types.

The goal of the visitor pattern is to accomplish double-dispatch -- not only is the type of the call target considered (MyVisitor, via virtual methods), but also the type of the parameter (what type of Node are we looking at?). The Visitor pattern allows us to do this by the visit/accept combination.

By changing our line to this:

root.accept(new MyVisitor());

We can get what we want: via virtual method dispatch, we enter the correct accept() call as implemented by the subclass -- in our example with TrainNode, we'll enter TrainNode's implementation of accept():

class TrainNode extends Node implements IVisitable {
  void accept(IVisitor v) {
    v.visit(this);
  }
}

What does the compiler know at this point, inside the scope of TrainNode's accept? It knows that the static type of this is TrainNode. This is an important additional shred of information that the compiler was not aware of in our caller's scope: there, all it knew about root was that it was a Node. Now the compiler knows that this (root) is not just a Node, but actually a TrainNode. In consequence, the one line found inside accept(), v.visit(this), means something else entirely. The compiler will now look for an overload of visit() that takes a TrainNode. If it can't find one, it'll then compile the call to an overload that takes a Node. If neither exists, you'll get a compilation error (unless you have an overload that takes Object). Execution will thus enter what we had intended all along: MyVisitor's implementation of visit(TrainNode node). No casts were needed, and, most importantly, no reflection was needed. Thus, the overhead of this mechanism is rather low: it only consists of pointer dereferences and nothing else.

You're right in your question -- we can use a cast and get the correct behavior. However, often, we don't even know what type Node is. Take the case of the following hierarchy:

abstract class Node { ... }
abstract class BinaryNode extends Node { Node left, right; }
class AdditionNode extends BinaryNode { }
class MultiplicationNode extends BinaryNode { }
class LiteralNode extends Node { int value; }

Suppose we were writing a simple compiler that parses a source file and produces an object hierarchy conforming to the specification above. An interpreter for that hierarchy, implemented as a Visitor, could look like this:

class Interpreter implements IVisitor<Integer> {
  // Java generics cannot take primitives, so the result type is Integer;
  // auto-unboxing turns the results of accept() back into int.
  Integer visit(AdditionNode n) {
    int left = n.left.accept(this);
    int right = n.right.accept(this);
    return left + right;
  }
  Integer visit(MultiplicationNode n) {
    int left = n.left.accept(this);
    int right = n.right.accept(this);
    return left * right;
  }
  Integer visit(LiteralNode n) {
    return n.value;
  }
}

Casting wouldn't get us very far, since we don't know the types of left or right in the visit() methods. Our parser would most likely also just return an object of type Node which pointed at the root of the hierarchy as well, so we can't cast that safely either. So our simple interpreter can look like:

Node program = parse(args[0]);
int result = program.accept(new Interpreter());
System.out.println("Output: " + result);
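Since the Java above is only an outline, here is a self-contained Python sketch of the same interpreter that can actually be run. There is no parser here, so the tree is built by hand, and the Printer visitor is an extra, hypothetical operation added to show a second visitor over the same nodes.

```python
class Node:
    def accept(self, visitor):
        raise NotImplementedError


class Literal(Node):
    def __init__(self, value):
        self.value = value

    def accept(self, visitor):
        return visitor.visit_literal(self)


class Addition(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def accept(self, visitor):
        return visitor.visit_addition(self)


class Multiplication(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def accept(self, visitor):
        return visitor.visit_multiplication(self)


class Interpreter:
    """Evaluates the tree to a number."""

    def visit_literal(self, n):
        return n.value

    def visit_addition(self, n):
        # The types of n.left and n.right are unknown here; accept() sorts it out.
        return n.left.accept(self) + n.right.accept(self)

    def visit_multiplication(self, n):
        return n.left.accept(self) * n.right.accept(self)


class Printer:
    """Renders the same tree as a fully parenthesised string."""

    def visit_literal(self, n):
        return str(n.value)

    def visit_addition(self, n):
        return f"({n.left.accept(self)} + {n.right.accept(self)})"

    def visit_multiplication(self, n):
        return f"({n.left.accept(self)} * {n.right.accept(self)})"


# Hand-built tree standing in for the parser's output: (2 + 3) * 4
program = Multiplication(Addition(Literal(2), Literal(3)), Literal(4))
result = program.accept(Interpreter())
text = program.accept(Printer())
print("Output:", result, "from", text)
```

Both operations work on the same node classes, and neither required adding an evaluation or printing method to the nodes themselves.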

The visitor pattern allows us to do something very powerful: given an object hierarchy, it allows us to create modular operations that operate over the hierarchy without requiring us to put the code in the hierarchy's classes themselves. The visitor pattern is used widely, for example, in compiler construction. Given the syntax tree of a particular program, many visitors are written that operate on that tree: type checking, optimizations, machine code emission are all usually implemented as different visitors. In the case of the optimization visitor, it can even output a new syntax tree given the input tree.

It has its drawbacks, of course: if we add a new type into the hierarchy, we need to also add a visit() method for that new type into the IVisitor interface, and create stub (or full) implementations in all of our visitors. We also need to add the accept() method too, for the reasons described above. If performance doesn't mean that much to you, there are solutions for writing visitors without needing the accept(), but they normally involve reflection and thus can incur quite a large overhead.
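To give an idea of what such a reflection-based visitor looks like, here is a minimal Python sketch that looks up the handler by class name at runtime, so the node classes need no accept() at all. The ReflectiveVisitor and Describer classes are hypothetical; the lookup-by-name technique is the same one Python's ast.NodeVisitor uses.

```python
class CarNode:
    pass


class TrainNode:
    pass


class PlaneNode:
    pass


class ReflectiveVisitor:
    def visit(self, node):
        # Build the handler name from the node's runtime class and look it up
        # on the visitor; fall back to generic_visit when no handler exists.
        handler = getattr(self, "visit_" + type(node).__name__, self.generic_visit)
        return handler(node)

    def generic_visit(self, node):
        return "unknown"


class Describer(ReflectiveVisitor):
    def visit_CarNode(self, node):
        return "car"

    def visit_TrainNode(self, node):
        return "train"


nodes = [CarNode(), TrainNode(), PlaneNode()]
labels = [Describer().visit(n) for n in nodes]
print(labels)
```

The price is the one mentioned above: a name lookup on every call, and no compile-time check that a handler exists for each node type.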
