Tuesday 20 November 2012

Reusing code in Test Data Builders in C++

Test Data Builders

Sometimes, when writing unit or acceptance tests, it's a good idea to use a Test Data Builder. For example, let's take a network frame that has two fields: one for the source and one for the destination. A builder for such a frame could look like this:

class FrameBuilder
{
protected:
  std::string _source;
  std::string _destination;
public:
  FrameBuilder& source(const std::string& newSource)
  {
    _source = newSource;
    return *this;
  }

  FrameBuilder& destination(const std::string& newDestination)
  {
    _destination = newDestination;
    return *this;
  }

  Frame build()
  {
    Frame frame;
    frame.source = _source;
    frame.destination = _destination;
    return frame;
  }
};

and it can be used like this:

auto frame = FrameBuilder().source("A").destination("B").build();
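
The snippets above leave out the Frame type itself. Here is a minimal compilable sketch, assuming Frame is a plain struct with two public string fields (the real type may well look different). The buildsFrameFromChainedCalls function is a hypothetical name, written the way a unit test might use the builder:

```cpp
#include <cassert>
#include <string>

// Assumed shape of Frame; its definition is not shown in this post.
struct Frame
{
  std::string source;
  std::string destination;
};

class FrameBuilder
{
protected:
  std::string _source;
  std::string _destination;
public:
  // Each setter stores the value and returns the builder to allow chaining.
  FrameBuilder& source(const std::string& newSource)
  {
    _source = newSource;
    return *this;
  }

  FrameBuilder& destination(const std::string& newDestination)
  {
    _destination = newDestination;
    return *this;
  }

  // Copies the collected values into a fresh Frame.
  Frame build() const
  {
    Frame frame;
    frame.source = _source;
    frame.destination = _destination;
    return frame;
  }
};

// Hypothetical check showing the builder inside a test.
void buildsFrameFromChainedCalls()
{
  auto frame = FrameBuilder().source("A").destination("B").build();
  assert(frame.source == "A");
  assert(frame.destination == "B");
}
```

The setters can be called in any order, and a test only has to mention the fields it actually cares about.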

The issue with Test Data Builder method reuse

The pattern is fairly easy, but things get complicated when we have a whole family of frames that share a common set of fields. If we wrote a separate builder for each frame, we'd end up duplicating a lot of code. The obvious alternative is inheritance. However, the naive approach gets us into some trouble. Let's see it in action:

class FrameBuilder
{
protected:
  std::string _source;
  std::string _destination;
public:
  FrameBuilder& source(const std::string& newSource)
  {
    _source = newSource;
    return *this;
  }

  FrameBuilder& destination(const std::string& newDestination)
  {
    _destination = newDestination;
    return *this;
  }

  virtual Frame* build() = 0;
};

class AuthorizationFrameBuilder : public FrameBuilder
{
private:
  std::string _password;
public:
  AuthorizationFrameBuilder& password(const std::string& newPassword)
  {
    _password = newPassword;
    return *this;
  }

  Frame* build()
  {
    auto authorizationFrame = new AuthorizationFrame();
    authorizationFrame->source = _source;
    authorizationFrame->destination = _destination;
    authorizationFrame->password = _password;
    return authorizationFrame;
  }
};

Note that there are two difficulties with this approach:

  1. The build() method has to be virtual and return a pointer to Frame, or we'd never be able to finish a chain that uses methods from FrameBuilder. Each of those methods returns a reference to FrameBuilder, which knows nothing about authorization frames, so we need polymorphism to make chains like this one work:
    AuthorizationFrameBuilder().password("a").source("b").build()
  2. Because FrameBuilder methods return a reference to FrameBuilder, not to AuthorizationFrameBuilder, we cannot call methods of the latter after methods of the former. For example, we cannot make a chain like this:
    AuthorizationFrameBuilder().source("b").password("a").build()
    This is because source() returns a FrameBuilder&, and FrameBuilder has no method called password() at all. Such chains end in compile errors.
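
The second difficulty can be checked at compile time. The sketch below is a trimmed-down version of the naive hierarchy. The names NaiveFrameBuilder, NaiveAuthorizationFrameBuilder and AfterSource are made up for this illustration, and build() is left out because it doesn't matter here:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Trimmed-down naive base builder (hypothetical name).
class NaiveFrameBuilder
{
protected:
  std::string _source;
public:
  NaiveFrameBuilder& source(const std::string& newSource)
  {
    _source = newSource;
    return *this;
  }
};

// Trimmed-down naive derived builder (hypothetical name).
class NaiveAuthorizationFrameBuilder : public NaiveFrameBuilder
{
private:
  std::string _password;
public:
  NaiveAuthorizationFrameBuilder& password(const std::string& newPassword)
  {
    _password = newPassword;
    return *this;
  }
};

// The type we get back after calling source() on the derived builder.
using AfterSource =
  decltype(std::declval<NaiveAuthorizationFrameBuilder&>().source("b"));

// It is the base builder, so password() is no longer reachable from it.
static_assert(std::is_same<AfterSource, NaiveFrameBuilder&>::value,
              "source() returns a reference to the base builder");

// This line would not compile, because NaiveFrameBuilder has no password():
// NaiveAuthorizationFrameBuilder().source("b").password("a");
```

Calling password() first and source() second still compiles, which is why the problem is easy to miss until someone reorders a chain.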

Templates to the rescue!

Fortunately, there's a solution for this. Templates! Yes, they can help us here, but to make it work we have to use the Curiously Recurring Template Pattern (CRTP). This way we make the FrameBuilder methods return a reference to the subclass, which lets us mix methods from FrameBuilder and AuthorizationFrameBuilder in any order in a chain.

Here's example code for the solution:

template<typename T> class FrameBuilder
{
protected:
  std::string _source;
  std::string _destination;
public:
  T& source(const std::string& newSource)
  {
    _source = newSource;
    // T derives from FrameBuilder<T>, so static_cast is the correct downcast.
    return *(static_cast<T*>(this));
  }

  T& destination(const std::string& newDestination)
  {
    _destination = newDestination;
    return *(static_cast<T*>(this));
  }
};

class AuthorizationFrameBuilder 
: public FrameBuilder<AuthorizationFrameBuilder>
{
private:
  std::string _password;
public:
  AuthorizationFrameBuilder& password(const std::string& password)
  {
    _password = password;
    return *this;
  }

  AuthorizationFrame build()
  {
    AuthorizationFrame frame;
    frame.source = _source;
    frame.destination = _destination;
    frame.password = _password;
    return frame;
  }
};

Note that in FrameBuilder, the this pointer is cast to the template type T, which happens to be the subclass on which the methods are actually called. This cast is identical in every method of FrameBuilder, so it can be extracted into a separate method like this:

  T& thisInstance()
  {
    return *(static_cast<T*>(this));
  }

  T& source(const std::string& newSource)
  {
    _source = newSource;
    return thisInstance();
  }
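
To see the whole thing work end to end, here is a self-contained sketch that puts the pieces together. A few things in it are assumptions and not part of the code above: the frame types are plain structs, PayloadFrame and PayloadFrameBuilder are hypothetical and exist only to show a second builder reusing the same methods, thisInstance() is made protected, and the downcast uses static_cast, which is the usual cast for this pattern.

```cpp
#include <cassert>
#include <string>

// Assumed frame types; their definitions are not shown in this post.
struct AuthorizationFrame
{
  std::string source, destination, password;
};

// Hypothetical second frame type, added only to demonstrate reuse.
struct PayloadFrame
{
  std::string source, destination, payload;
};

template<typename T> class FrameBuilder
{
protected:
  std::string _source;
  std::string _destination;

  // T derives from FrameBuilder<T>, so this downcast is valid.
  T& thisInstance()
  {
    return static_cast<T&>(*this);
  }
public:
  T& source(const std::string& newSource)
  {
    _source = newSource;
    return thisInstance();
  }

  T& destination(const std::string& newDestination)
  {
    _destination = newDestination;
    return thisInstance();
  }
};

class AuthorizationFrameBuilder
: public FrameBuilder<AuthorizationFrameBuilder>
{
private:
  std::string _password;
public:
  AuthorizationFrameBuilder& password(const std::string& newPassword)
  {
    _password = newPassword;
    return *this;
  }

  AuthorizationFrame build() const
  {
    AuthorizationFrame frame;
    frame.source = _source;
    frame.destination = _destination;
    frame.password = _password;
    return frame;
  }
};

// Hypothetical second builder that reuses source() and destination().
class PayloadFrameBuilder : public FrameBuilder<PayloadFrameBuilder>
{
private:
  std::string _payload;
public:
  PayloadFrameBuilder& payload(const std::string& newPayload)
  {
    _payload = newPayload;
    return *this;
  }

  PayloadFrame build() const
  {
    PayloadFrame frame;
    frame.source = _source;
    frame.destination = _destination;
    frame.payload = _payload;
    return frame;
  }
};

// Hypothetical check: base and derived setters mixed in both orders.
void mixesBaseAndDerivedCallsInAnyOrder()
{
  auto auth = AuthorizationFrameBuilder().source("b").password("a").build();
  assert(auth.source == "b");
  assert(auth.password == "a");

  auto data = PayloadFrameBuilder().payload("x").destination("c").build();
  assert(data.destination == "c");
  assert(data.payload == "x");
}
```

Note that build() returns each frame by value again, so the virtual method and the raw pointer from the naive version are no longer needed.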

Summary

This solution makes it easy to reuse any number of methods in any number of different builders, so it's a real treasure when we've got many data structures that share some common fields.

That's all for today - if you'd like to, please use the comments section to share your solution to this problem for other programming languages.

Bye!
