Ruby Design Pattern: How to Make an Extensible Factory Class

You don't need a LogFileReaderFactory; just teach your LogFileReader class how to instantiate its subclasses:

class LogFileReader
  def self.create type
    case type 
    when :git
      GitLogFileReader.new
    when :bzr
      BzrLogFileReader.new
    else
      raise "Bad log file type: #{type}"
    end
  end
end

class GitLogFileReader < LogFileReader
  def display
    puts "I'm a git log file reader!"
  end
end

class BzrLogFileReader < LogFileReader
  def display
    puts "A bzr log file reader..."
  end
end

As you can see, the superclass can act as its own factory. Now, how about automatic registration? Well, why don't we just keep a hash of our registered subclasses, and register each one when we define them:

class LogFileReader
  @@subclasses = { }
  def self.create type
    c = @@subclasses[type]
    if c
      c.new
    else
      raise "Bad log file type: #{type}"
    end
  end
  def self.register_reader name
    @@subclasses[name] = self
  end
end

class GitLogFileReader < LogFileReader
  def display
    puts "I'm a git log file reader!"
  end
  register_reader :git
end

class BzrLogFileReader < LogFileReader
  def display
    puts "A bzr log file reader..."
  end
  register_reader :bzr
end

LogFileReader.create(:git).display
LogFileReader.create(:bzr).display

class SvnLogFileReader < LogFileReader
  def display
    puts "Subversion reader, at your service."
  end
  register_reader :svn
end

LogFileReader.create(:svn).display

And there you have it. Just split that up into a few files, and require them appropriately.

You should read Peter Norvig's Design Patterns in Dynamic Languages if you're interested in this sort of thing. He demonstrates how many design patterns are actually working around restrictions or inadequacies in your programming language; and with a sufficiently powerful and flexible language, you don't really need a design pattern, you just implement what you want to do. He uses Dylan and Common Lisp for examples, but many of his points are relevant to Ruby as well.

You might also want to take a look at Why's Poignant Guide to Ruby, particularly chapters 5 and 6, though only if you can deal with surrealist technical writing.

edit: Riffing off of Jörg's answer now: I do like reducing repetition, so I'd rather not repeat the name of the version control system in both the class name and the registration. Adding the following to my second example lets you write much shorter class definitions while still being simple and easy to understand.

def log_file_reader name, superclass=LogFileReader, &block
  Class.new(superclass, &block).register_reader(name)
end

log_file_reader :git do
  def display
    puts "I'm a git log file reader!"
  end
end

log_file_reader :bzr do
  def display
    puts "A bzr log file reader..."
  end
end

Of course, in production code, you may want to actually name those classes, by generating a constant definition based on the name passed in, for better error messages.

def log_file_reader name, superclass=LogFileReader, &block
  c = Class.new(superclass, &block)
  c.register_reader(name)
  Object.const_set("#{name.to_s.capitalize}LogFileReader", c)
end
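
As a rough usage sketch of the named-class version, here is the second example plus the helper in one self-contained file. The reader text is taken from the examples above; the two lines at the end are only there to show that the generated class ends up with a real name.

```ruby
class LogFileReader
  @@subclasses = {}

  def self.create(type)
    c = @@subclasses[type]
    raise "Bad log file type: #{type}" unless c
    c.new
  end

  def self.register_reader(name)
    @@subclasses[name] = self
  end
end

# Builds an anonymous subclass, registers it, then assigns it to a constant
# so that it shows up by name in error messages and inspect output.
def log_file_reader(name, superclass = LogFileReader, &block)
  c = Class.new(superclass, &block)
  c.register_reader(name)
  Object.const_set("#{name.to_s.capitalize}LogFileReader", c)
end

log_file_reader :git do
  def display
    puts "I'm a git log file reader!"
  end
end

reader = LogFileReader.create(:git)
reader.display
puts reader.class.name   # prints "GitLogFileReader"
```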

Factory pattern with a class that can have different class sub types

If the types really have nothing in common, then you need no explicit base class. System.Object suffices, just as with many other generic types (i.e. any generic type lacking a constraint).

In other words, you could declare as:

class Transaction<T>
{
    public bool Success { get; private set; }
    public T Entity { get; private set; }

    public Transaction(bool success, T entity)
    {
        Success = success;
        Entity = entity;
    }

    public void GenerateOutput() { /* something goes here */ }
}

Personally, I would avoid adding a "department type" member. After all, that's implicit from the type parameter T. But you could add that easily to the above if you want.

If and when you find that the types do have something in common, such that your Transaction<T> type needs to do more than simply hold onto an instance of one of those types (which is about all it can do without a constraint), then you will be able to put that commonality into an interface or base class (depending on the specific need), and specify that in a constraint for the Transaction<T> class.

Note that it's not clear what you mean GenerateOutput() to do, or how it should work. But assuming that you want output that is specific to each Entity value, it seems to me that that is your "something in common". I.e., it's not the Transaction<T> class at all that needs to implement that method, but rather each entity type. In that case, you have something like this:

interface IDepartmentEntity
{
    void GenerateOutput();
}

class Office : IDepartmentEntity
{
    public void GenerateOutput() { /* department-specific logic here */ }
}

// etc.

Then you can declare:

class Transaction<T> where T : IDepartmentEntity
{
    public bool Success { get; private set; }
    public T Entity { get; private set; }

    public Transaction(bool success, T entity)
    {
        Success = success;
        Entity = entity;
    }

    public void GenerateOutput() { Entity.GenerateOutput(); }
}

EDIT:

Per Prasant's follow-up edit, with a request for advice on the GetTransactionObject()…

The right way to do this depends on the caller and the context, a detail not provided in the question. IMHO, the best scenario is where the caller is aware of the type. This allows the full power of generics to be used.

For example:

class TransactionFactory
{
    public Transaction<T> GetTransactionObject<T>()
        where T : IDepartmentEntity, new()
    {
        return new Transaction<T>()
        {
            Data = new T(),
            params = null
        };
    }
}

Then you call like this:

Transaction<FireData> transaction = factory.GetTransactionObject<FireData>();

The caller, of course already knowing the type it is creating, then can fill in the appropriate properties of the transaction.Data object.

If that approach is not possible, then you will need for Transaction<T> itself to have a base class, or implement an interface. Note that in my original example, the IDepartmentEntity interface has only one method, and it's the same as the GenerateOutput() method in the Transaction class.

So maybe, that interface is really about generating output instead of being a data entity. Call it, instead of IDepartmentEntity, something like IOutputGenerator.

In that case, you might have something like this:

class Transaction<T> : IOutputGenerator
{
    // all as before
}

class TransactionFactory
{
    public IOutputGenerator GetTransactionObject(string org)
    {
        // Stays null when the organization or its type is not recognized.
        IOutputGenerator transactionObject = null;

        if (typeClassLookup.TryGetValue(org, out typeValue))
        {
            switch (typeValue.ToString())
            {
                case "policeData":
                    transactionObject = new Transaction<PoliceData>() { Data = new PoliceData(), params = null };
                    break;
                case "FireData":
                    transactionObject = new Transaction<FireData>() { Data = new FireData(), params = null };
                    break;
            }
        }
        return transactionObject;
    }
}

This is an inferior solution, as it means the caller can only directly access the IOutputGenerator functionality. Anything else requires doing some type-checking and special-case code, something that really ought to be avoided whenever possible.

Note: if the Transaction type has other members which, like the GenerateOutput() method, are independent of the contained type T here, and which would be useful to callers who don't know T, then a possible variation of the above is to not reuse the interface used for the department-specific data types, but instead declare a base class for Transaction<T>, named of course Transaction, containing all those members not related to T. Then the return value can be Transaction.

Design pattern to return appropriate class depending if autoloader detects file? Factory, Service Locator, etc?

The Factory-Pattern

You were already on the right track. What you might want to take a look at is the Factory pattern.

By its definition, the Factory pattern is a way to instantiate objects not through the class's constructor, but through another class or method.

That's probably how you want to roll. In its simplest implementation, the factory you need here takes two arguments: the main class that you want to use, and the fallback class in case the main class does not exist.

Later I will show a few options for improving the factory so that it can automate things even further.

Setting up everything

For simplicity's sake I will not use an autoloader here, but using one will work just as well.

Filesystem Structure

-Dependencies
-- DependencyRouter.php
-Fallbacks
-- FallbackRouter.php
-Interfaces
-- RouterInterface.php
-FallbackFactory.php
-index.php

The Dependencies directory contains the main classes that you want to instantiate in the first place. The Fallbacks directory contains the corresponding fallback classes in case a main class cannot be instantiated.

Since objects of the main class and of the fallback class should be usable in the same way, we define contracts for them. We do this by creating interfaces. That's what the Interfaces folder is for.

To save space, the gist with the actual implementations (which are not really part of the question) can be found here.

Let's now have a look at the actual factory.

<?php

class ClassNotFoundException extends \Exception {}

class FallbackFactory {

    public function createInstance( $main, $fallback, $instanceArgs = [] )
    {

        if( class_exists( $main) )
            $reflectionClass = new \ReflectionClass( $main );

        else if ( class_exists( $fallback ) )
            $reflectionClass = new \ReflectionClass( $fallback );

        else
            throw new ClassNotFoundException('The Class ' . $main . ' does not exist and neither does the fallback class ' . $fallback);

        return $reflectionClass->newInstanceArgs($instanceArgs);

    }

}

There is really nothing special going on. First we check whether the main class exists. If it does, we store an instance of ReflectionClass for it. If it does not, we check whether the fallback class exists, and if so we do the same. You could instantiate the object directly, but let's use a ReflectionClass here to leave room for some neat magic we can add later.

When neither the main class nor the fallback class exists, we throw an exception and pass the responsibility for handling the error to the developer.

That's really the whole magic.

index.php

<?php

require_once 'FallbackFactory.php';
require_once 'Interfaces/RouterInterface.php';
require_once 'Dependencies/DependencyRouter.php';
require_once 'Fallbacks/FallbackRouter.php';

$factory = new FallbackFactory();

$router = $factory->createInstance(
    '\Dependencies\DependencyRouter',
    '\Fallbacks\FallbackRouter'
);

$router->route();

That is how you use the factory. If the main DependencyRouter can be found, the output will be:

I am the main Router

If the DependencyRouter cannot be found but the FallbackRouter can, the output will be:

I am the Fallback router

Otherwise an Exception will be thrown.

Extending the Factory

If you want to make the factory act more dynamically you could do a few things.

1. Make use of namespacing and name your classes consistently

If you want to avoid specifying a fallback every time, you could give your main and fallback classes the same name but put them in different namespaces. Then you would only need to pass the main class; the fallback class would be determined automatically, e.g.:

$router = $factory->createInstance(
    '\Dependencies\Router',
);

If \Dependencies\Router is not present, the factory would automatically look for a \Fallbacks\Router class, for example.

2. Automatically resolve Dependencies

At the moment we pass in the constructor arguments as a parameter to the factory method. But by making use of Type Hinting and Reflection you could automagically resolve dependencies.

3. Specifying a Fallback-Hierarchy

Instead of passing one fallback class, you could pass an array of several classes that are looked up in turn, and resolve the first one that is found.
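
The fallback idea, including a list of candidates, can also be sketched in Ruby, the language of the first answer on this page. This is only an illustration under assumptions: `Dependencies::Router`, `Fallbacks::Router`, `ClassNotFoundError` and `FallbackFactory.create_instance` are hypothetical names, and `Object.const_defined?` with `Object.const_get` stands in for PHP's `class_exists` and `ReflectionClass`.

```ruby
class ClassNotFoundError < StandardError; end

# Hypothetical setup: the Dependencies namespace exists, but its Router
# class is deliberately missing so that the fallback gets used.
module Dependencies; end

module Fallbacks
  class Router
    def route
      "I am the Fallback router"
    end
  end
end

class FallbackFactory
  # Tries each class name in order and instantiates the first one defined.
  def self.create_instance(candidates, *args)
    name = candidates.find { |n| Object.const_defined?(n) }
    raise ClassNotFoundError, "None of #{candidates.join(', ')} exist" unless name
    Object.const_get(name).new(*args)
  end
end

router = FallbackFactory.create_instance(
  ["Dependencies::Router", "Fallbacks::Router"]
)
puts router.route
```

Passing the candidates as an ordered list keeps the two-class case and the longer hierarchy on the same code path.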

Convert a class to a subclass on instantiation

Ok, couple of things:

You can't convert an instance of a class A to an instance of A's subclass B. At least, not automatically. B can (and usually does) contain attributes not present in A, it can have a completely different constructor, etc. So, AFAIK, no OO language will allow you to "convert" classes that way.

Even in statically typed languages, when you instantiate B and then assign it to a variable a of type A, it is still an instance of B; it is not converted to its ancestor class in any way.

Ruby is a dynamic language with powerful reflection capabilities, so you can always decide which class to instantiate at runtime. Check this out:

puts "Which class to instantiate: "
class_name = gets.chomp
klass = Module.const_get class_name
instance = klass.new

So, no need for any conversion here - just instantiate the class you need in the first place.

Another thing: as I mentioned in the comment, a category? method is simply wrong, as it violates OOP principles. In Ruby, you can (and should) use the is_a? method, so your check will look like:

if instance.is_a? Category
  puts 'Yes, yes, it is a category!'
else
  puts "Nope, it's something else."
end

This is just the tip of the iceberg; there's a lot more to instantiating different classes. The other question I linked in the comment can be a great starting point, although some of the code examples there might confuse you. They are definitely worth understanding, though.

Edit: After re-reading your updated question, it seems to me that the right way for you would be to create a factory class and let it detect and instantiate the different page types. So the user wouldn't call Page.new directly, but would rather call something like

MediaWikiClient.get_page "Category:My Category"

and the get_page method would instantiate the corresponding class.
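
A minimal sketch of such a factory method, assuming the page type can be read from the title prefix before the first colon. `Page` and `Category` follow the names used above; the `PAGE_TYPES` table and the body of `get_page` are hypothetical and would need adapting to the real client.

```ruby
class Page
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Category < Page; end

module MediaWikiClient
  # Maps a title prefix to the class that represents that kind of page.
  # More page types can be added here without touching get_page.
  PAGE_TYPES = {
    "Category" => Category
  }.freeze

  # Picks the class from the prefix before the first colon and falls back
  # to a plain Page when there is no prefix or the prefix is unknown.
  def self.get_page(title)
    prefix = title.include?(":") ? title.split(":", 2).first : nil
    PAGE_TYPES.fetch(prefix, Page).new(title)
  end
end

page = MediaWikiClient.get_page("Category:My Category")
puts page.class              # prints "Category"
puts page.is_a?(Category)    # prints "true"
```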

How to eliminate switch-case in code?

You are using the switch statement to create an instance based on a type argument. In this case, or basically whenever you use conditional operations on data types, each time you want to extend your application you have to touch these statements by adding and modifying conditions.
It is easy to imagine how this can make your condition checks explode.
Note that your example hides dependencies by introducing a factory or factory method to create the concrete type, instead of creating it directly in the class where the dependency actually exists.

You should try to use a factory only when instantiating a type that is complex to construct (e.g. you have to instantiate additional types in order to satisfy all dependencies) or that needs additional configuration. In this situation a switch could be used to check a parameter that determines which configuration to apply to the created instance. The Builder pattern can also be useful in this situation.
By encapsulating complex construction or configuration, you eliminate duplicated code as well, since everything is in one place now.

To eliminate this switch I would extract each type's construction into (if needed) a separate factory that knows how to construct this type only. So instead of having a method ConnectorFactory.CreateDbConnector(type) you should have a MySqlFactory.CreateConnector(args) and an OracleDbFactory.CreateConnector() method.
When the type switch is eliminated this way (one factory per type), introducing a new type only requires creating a new factory dedicated to that type. No modifications to existing code are needed.

Given that you can modify the connector types from your example, an even better solution would be to drop the separate factories and move each factory method to the type it actually creates: MySqlFactory.CreateConnector(args) becomes MySqlConnector.CreateInstance(args). Here, too, each method encapsulates one condition and, in the case of a factory method, also the configuration applied to the instance. This means that if we had a case where we created a read-only connector, we would now add an additional method to the connector called MySqlConnector.CreateReadOnlyInstance(args) (the same applies if we had stayed with the factories). If you would like to expose a singleton, you would introduce a MySqlConnector.CreateSharedInstance(args).
Since each type now has its own factory method, adding a new type to the application won't break anything that already exists.
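
To make this concrete, here is a minimal Ruby sketch of factory methods that live on the type they create. The snake_case method names are Ruby renderings of CreateInstance and CreateReadOnlyInstance; the `host` argument and the `read_only` flag are hypothetical stand-ins for whatever the real connector needs.

```ruby
class MySqlConnector
  attr_reader :host, :read_only

  # Each named factory method replaces one case of the former type switch.
  def self.create_instance(host)
    new(host, read_only: false)
  end

  def self.create_read_only_instance(host)
    new(host, read_only: true)
  end

  def initialize(host, read_only:)
    @host = host
    @read_only = read_only
  end

  # Callers must go through the factory methods above, not through new.
  private_class_method :new
end

class OracleDbConnector
  # Adding this type required no change to MySqlConnector.
  def self.create_instance
    new
  end
end

connector = MySqlConnector.create_read_only_instance("db.example")
puts connector.read_only   # prints "true"
```

The caller states its intent by choosing a method, so no parameter has to be inspected to decide what to build.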

In case the factory is not required I would always instantiate the type directly in place.

The "Builder" pattern as a solution also offers flexible control over instantiation and encapsulates the procedure, all without the need of a switch statement. But the builder again should be associated with one single type only.

If you need to be extensible, e.g. you want to substitute a type (or implementation), I would refactor it the same way, but then you should implement all factories according to the "Abstract Factory Pattern" and use dependency inversion throughout. All type instantiation should then occur in factories or factory methods only. The "new" keyword is not allowed anywhere else except for built-in types. This reduces the effort needed to modify existing code by making the factories the only place where the concrete type has to be changed. Once you have done so, you are also well prepared to use dependency injection if desired.
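
A small Ruby sketch of that arrangement, under stated assumptions: Ruby has no interface keyword, so the "abstract factory" here is just the convention that every factory responds to `create_connector`. `ReportService`, `describe` and the returned strings are hypothetical names used only for illustration.

```ruby
class MySqlConnector
  def describe
    "MySQL connector"
  end
end

class OracleDbConnector
  def describe
    "Oracle connector"
  end
end

# Each factory is the only place that names its concrete connector type.
class MySqlFactory
  def create_connector
    MySqlConnector.new
  end
end

class OracleDbFactory
  def create_connector
    OracleDbConnector.new
  end
end

# Depends only on "something that responds to create_connector",
# never on a concrete connector class.
class ReportService
  def initialize(connector_factory)
    @connector_factory = connector_factory
  end

  def run
    @connector_factory.create_connector.describe
  end
end

puts ReportService.new(MySqlFactory.new).run    # prints "MySQL connector"
puts ReportService.new(OracleDbFactory.new).run # prints "Oracle connector"
```

Swapping the implementation means passing a different factory to the constructor; `ReportService` itself stays untouched.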

Long story short: generally you can extract every condition of a switch into a separate method with a descriptive name that describes the case. The caller is then forced to choose the required method (representing a case). The caller of course already knows everything about itself (including its own type, as in your example). It also knows which actions it wants to perform. No switching is necessary to choose the right action.
And considering the "Tell-Don't-Ask" principle, no switch should be placed inside a second type to decide which action to perform on the first type. The first type must call the appropriate method to perform an intended action on its data or on its behalf.

That said, not every switch statement is a violation, and violating a principle is fine if you know the consequences. E.g. it would be fine to use a switch to check a private flag. A switch statement is equivalent to a nested if-then-else statement. Switching on type should be avoided, e.g. by using inheritance and defining good interfaces for boundaries. When you use polymorphism and dependency inversion, you will most likely have no need for type checks.

How to use a design pattern if I have a parameter to pass to the product class generator?

Looks to me as if you want to delegate some behaviour to other classes, which are usually not "subclasses" (in terms of inheritance).

A delegate may depend on its caller: you may pass a reference to the caller to the delegate's constructor if the delegate needs to use some of its caller's members.
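
A minimal Ruby sketch of a delegate that holds a reference to its caller. `Report` and `ReportFormatter` are hypothetical names chosen for illustration; the point is only that the caller passes `self` to the delegate's constructor.

```ruby
# Hypothetical delegate: builds output from data it reads off its caller.
class ReportFormatter
  def initialize(owner)
    @owner = owner   # reference back to the caller
  end

  def render
    "#{@owner.title}: #{@owner.items.join(', ')}"
  end
end

class Report
  attr_reader :title, :items

  def initialize(title, items)
    @title = title
    @items = items
    @formatter = ReportFormatter.new(self)   # pass ourselves to the delegate
  end

  # Report hands the formatting work to the delegate instead of
  # relying on a subclass for it.
  def to_s
    @formatter.render
  end
end

puts Report.new("Readers", %w[git bzr svn])   # prints "Readers: git, bzr, svn"
```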

Please do not confuse "design" with "design pattern". You develop a design for your application, and that design should be built upon common design patterns. Design patterns guide you to solutions for the most common problems.
