The C++ Class Literal Idiom

In C or C++, almost every name that can be used as a valid expression is an object in the sense of the C standard. That is, it has a memory address and a size, and its storage is filled with some bit pattern at any given time. However, literals are an exception. Consider the literal 7. Variables of type int have an address, but 7 has no address. The expression &7 is meaningless. 7 doesn't have a value. It is a value. 7 is eternal. 7 just is.

We would like to create objects that act like literals for a user-defined C++ class. We choose as our running example a typesafe Length class.

    class Length
    {
    private:
      double meters_;
      ...
    public:
      friend Length operator* (double, Length);
      ...
    };

A proper Length class should hide the actual units stored in the private double data field. We do not provide a public constructor from double (not even an explicit one) because that would be meaningless ("What units?"). Likewise, there is no conversion operator from Length to double.

    public:
      Length (double);      // Evil, because semantically wrong
      operator double ();   // Evil, because semantically wrong

Instead, we want to provide literals with names like METER and YARD, and force all users to manipulate Length objects as follows:

    Length x = 3.0 * METER;
    Length y = 2.0 * YARD;
    double xyards = x / YARD;
    Length xx = 3.0;         // COMPILE ERROR
    double d = x;            // COMPILE ERROR
    Length z = METER * YARD; // COMPILE ERROR

Implementation as Global Variables

The obvious approach is to make METER and YARD immutable global objects of type Length:

    // In Length.hpp
    class Length
    {
    private:
      double meters_;
      Length (double meters) : meters_ (meters) {}

    public:
      static Length make_METER ();
      static Length make_YARD  ();
      ...
    };

    extern const Length METER;
    extern const Length YARD;

    // In Length.cc
    Length Length::make_METER () { return Length (1.000); }
    Length Length::make_YARD  () { return Length (0.914); }

    const Length METER = Length::make_METER();
    const Length YARD  = Length::make_YARD();

This is not bad. But it's not perfect either.

It's not maximally efficient. Because the internal data of the literals are only visible in a separate compilation unit, constant folding is likely to be beyond the reach of the compiler.

User code can access the address where the literals are stored. Their object-ness is exposed. It is impossible to forbid using operator& on Length literals, without also forbidding it on all Length objects.

In a horrifying replay of ancient FORTRAN compilers, it is possible for Murphy or Machiavelli to modify the literal:

    (const_cast<Length &>(METER)) = YARD; // Evil incarnate

This kind of abuse is likely to succeed without the desired core dump, since the implementation is unlikely to actually store const objects with non-trivial constructors in read-only storage.
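
A minimal single-file sketch of this approach makes the exposed object-ness concrete. It folds Length.hpp and Length.cc into one translation unit. The `operator/` and the helper `address_of_meter` are assumptions added for illustration and are not part of the original class.

```cpp
class Length
{
private:
  double meters_;
  Length (double meters) : meters_ (meters) {}

public:
  static Length make_METER () { return Length (1.000); }
  static Length make_YARD  () { return Length (0.914); }

  // Hypothetical operator, assumed for this sketch: ratio of two lengths.
  friend double operator/ (Length a, Length b) { return a.meters_ / b.meters_; }
};

const Length METER = Length::make_METER ();
const Length YARD  = Length::make_YARD ();

// Nothing stops user code from treating the "literal" as an object:
// its address can be taken and compared like any other address.
const Length* address_of_meter () { return &METER; }
```

The `const_cast` assignment shown above also compiles against this sketch, but modifying an object that was defined const is undefined behavior, so it is deliberately not exercised here.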

Implementation as Inline Functions

For small value types, it is more efficient to create literal objects "Just In Time" whenever they are needed than to store them in memory. This naturally leads to the idea of using inline functions to represent literals.

    // In Length.hpp
    class Length
    {
    private:
      double meters_;
      Length (double meters) : meters_ (meters) {}
    public:
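      // A friend defined inside a class is found only by argument-dependent
      // lookup, so both functions are redeclared at namespace scope below.
      friend inline Length METER () { return Length (1.000); }
      friend inline Length YARD  () { return Length (0.914); }
      ...
    };

    inline Length METER ();
    inline Length YARD  ();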

This solves all the above deficiencies with static storage objects. A good compiler will be able to optimize the code as well as if fundamental types were used directly. (Perhaps more efficiently, since the compiler can assume that objects of different types are not aliased.)

There is an added advantage that the implementation is entirely contained in a header file, so there is no need to provide an object code library. This is particularly useful when working with templates.

The only deficiency of the inline function approach is syntactic. User code now looks like this:

    Length x = 3.0 * METER();
    Length y = 2.0 * YARD();
    double xyards = x / YARD();
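
Putting the pieces together, the inline-function variant might look like the following self-contained sketch. The two operators and the helper `three_meters_in_yards` are assumptions for illustration. The namespace-scope redeclarations are there because a friend defined inside a class is otherwise found only by argument-dependent lookup, which cannot apply to a call with no arguments.

```cpp
class Length
{
private:
  double meters_;
  Length (double meters) : meters_ (meters) {}

public:
  friend inline Length METER () { return Length (1.000); }
  friend inline Length YARD  () { return Length (0.914); }

  // Hypothetical arithmetic, assumed for this sketch.
  friend inline Length operator* (double s, Length l) { return Length (s * l.meters_); }
  friend inline double operator/ (Length a, Length b) { return a.meters_ / b.meters_; }
};

// Make the in-class friends visible to ordinary (non-ADL) lookup.
inline Length METER ();
inline Length YARD  ();

// Example use: how many yards are in three meters?
double three_meters_in_yards () { return (3.0 * METER ()) / YARD (); }
```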

The function call parentheses are highly misleading for something that is really just a compile-time constant. It would be much nicer if those could be eliminated. We've sworn off using macros, which work well, but don't play nicely in a namespace-aware world...

    // Avert thine eyes...
    #define METER METER()
    #define YARD  YARD()

Implementation as Factory Methods

A variation on the inline function approach is to use factory methods which require the user to specify the units to be used.

    // In Length.hpp
    class Length
    {
    private:
      double meters_;
      Length (double meters) : meters_ (meters) {}
    public:
      static inline Length meters (double x) { return Length (1.000 * x); }
      static inline Length yards  (double x) { return Length (0.914 * x); }
      ...
    };

User code is now syntactically very different:

    Length x = Length::meters (3.0);
    Length y = Length::yards (2.0);
    double xyards = x / Length::yards(1.0);
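
As a self-contained sketch, the factory-method variant could be exercised as follows. The `operator+`, the `operator/`, and the helper `total_in_yards` are assumptions added for illustration.

```cpp
class Length
{
private:
  double meters_;
  Length (double meters) : meters_ (meters) {}

public:
  static inline Length meters (double x) { return Length (1.000 * x); }
  static inline Length yards  (double x) { return Length (0.914 * x); }

  // Hypothetical operators, assumed for this sketch.
  friend Length operator+ (Length a, Length b) { return Length (a.meters_ + b.meters_); }
  friend double operator/ (Length a, Length b) { return a.meters_ / b.meters_; }
};

// Mixed units combine safely because every value names its unit on entry.
double total_in_yards ()
{
  Length total = Length::meters (0.914) + Length::yards (1.0);
  return total / Length::yards (1.0);
}
```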

Although this is an excellent solution, the syntax is still a little odd, especially for native speakers of languages that put numbers before nouns. We continue to search for alternatives.

Implementation as Stateless Singletons

Let's explore a different path. One of our principles is to create new types where appropriate to represent new concepts. This leads to the idea of making METER a singleton object of type Meter, instead of an instance of type Length. To be used in a context requiring a Length, METER merely needs to be convertible to Length, not to be of type Length (but see below...). By making METER a singleton object, it no longer needs any state — that is encoded in the type.

    // In Length.hpp
    class Length
    {
    private:
      double meters_;
      Length (double meters) : meters_ (meters) {}

    public:
      class Meter {}; Length (const Meter&) : meters_ (1.000) {}
      class Yard  {}; Length (const Yard &) : meters_ (0.914) {}
      ...
    };

    extern Length::Meter METER;
    extern Length::Yard  YARD;

    // In Length.cc
    Length::Meter METER;
    Length::Yard  YARD;

Observe how the state of the METER literal has been transferred into the  Length (const Meter&)  constructor. Class Meter has no instance data.
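
A single-file sketch of the stateless singletons in use, under the assumption that Length provides the two operators shown (their bodies are ours, as is the helper `two_yards_in_meters`). The literals reach the operators through the converting constructors.

```cpp
class Length
{
private:
  double meters_;
  Length (double meters) : meters_ (meters) {}

public:
  class Meter {}; Length (const Meter&) : meters_ (1.000) {}
  class Yard  {}; Length (const Yard &) : meters_ (0.914) {}

  // Hypothetical operators, assumed for this sketch.
  friend Length operator* (double s, Length l) { return Length (s * l.meters_); }
  friend double operator/ (Length a, Length b) { return a.meters_ / b.meters_; }
};

Length::Meter METER;
Length::Yard  YARD;

double two_yards_in_meters ()
{
  Length y = 2.0 * YARD;   // YARD converts via Length (const Yard&)
  return y / METER;        // METER converts via Length (const Meter&)
}
```

Because no `operator*` taking two Length arguments is declared, `METER * YARD` still fails to compile, as desired.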

One of the advantages of introducing new types is being able to specify different operations on those types. Taking the address of METER is semantically meaningless, and we want to outlaw it. Now that the type of METER is no longer Length, we can do this.

    // In Length.hpp
    class Length
    {
    private:
      double meters_;
      Length (double meters) : meters_ (meters) {}

      class Literal
      {
      protected:
        Literal () {}
      private:
        void* operator new (size_t);      // outlawed
        void  operator delete (void*);    // outlawed
        void  operator& ();               // outlawed
        void  operator= (const Literal&); // outlawed
      };

    public:
      class Meter; Length (const Meter&) : meters_ (1.000) {}
      class Yard;  Length (const Yard &) : meters_ (0.914) {}
      ...
    };

    class Length::Meter : private Literal {}; extern Length::Meter METER;
    class Length::Yard  : private Literal {}; extern Length::Yard  YARD;

    // In Length.cc
    Length::Meter METER;
    Length::Yard  YARD;

We have succeeded in partially suppressing the object-ness of our literals.

The constants defined in the Length.cc translation unit contain no data, are never used, cannot have their address taken (without cheating), and so the compiler will never reference them. This means the linker can discard them.
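
With a C++11 compiler the same restrictions can be spelled with deleted functions instead of private, undefined declarations. The sketch below is a variation under that assumption, not the code above: it defines Meter and Yard inside Length for brevity, and the `operator/`, the `static_assert`, and the helper `yard_in_meters` are ours.

```cpp
#include <cstddef>
#include <type_traits>

class Length
{
private:
  double meters_;
  Length (double meters) : meters_ (meters) {}

  class Literal
  {
  protected:
    Literal () {}
  public:
    void* operator new (std::size_t) = delete;   // no heap allocation
    void  operator delete (void*) = delete;
    void  operator& () = delete;                  // no address-of
    void  operator= (const Literal&) = delete;    // no assignment
  };

public:
  class Meter : private Literal {}; Length (const Meter&) : meters_ (1.000) {}
  class Yard  : private Literal {}; Length (const Yard &) : meters_ (0.914) {}

  // Hypothetical operator, assumed for this sketch.
  friend double operator/ (Length a, Length b) { return a.meters_ / b.meters_; }
};

Length::Meter METER;
Length::Yard  YARD;

// Assignment between literals is rejected at compile time.
static_assert (!std::is_copy_assignable<Length::Meter>::value,
               "literals must not be assignable");

double yard_in_meters () { return Length (YARD) / METER; }
```

Expressions such as `&METER` or `new Length::Meter` are rejected by the compiler in this sketch, while conversion to Length keeps working.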

Implementation as Derived Singletons

There is a fundamental problem in C++ that we call the Proxy Problem — given a type T, it is difficult to create a corresponding proxy type which can be used in all contexts that T can be used. Creating a user-defined conversion from Proxy to T seems like the obvious approach to implementing such a class. One of the problems that arises is the limit of one user-defined conversion per implicit conversion sequence. Our stateless singleton types above exhibit this problem. They are Length proxies. For example:

    struct S { S (Length) {} };
    void foo (S);
    ...
      Length x;
      foo (x);     // OK -- one user-defined conversion
      foo (METER); // COMPILE ERROR -- chained user-defined conversions not allowed

Curiously, there are circumstances where C++ does allow chaining of user-defined conversions. The user-defined conversion of a derived class to its base class is a conversion favored by the C++ standard (the C++ standard is too opaque for me to quote chapter and verse, but three out of three compilers agree). From this we get the idea of implementing our proxy literal classes as classes derived from Length. We use inheritance in a perhaps novel way — not to inherit interface or implementation, but merely to improve the treatment of our class by the C++ type system. We use private inheritance since that suffices for our goal.

(This is a general approach to implementing proxy classes where the type being proxied is of class type. It won't work for, e.g. pointer proxies.)

    // In Length.hpp
    class Length
    {
    private:
      double meters_;
      Length (double meters) : meters_ (meters) {}
      class Literal;

    public:
      class Meter;
      class Yard;

      Length (const Meter&) : meters_ (1.000) {}
      Length (const Yard &) : meters_ (0.914) {}
      ...
    };

    class Length::Literal : private Length
    {
    protected:
      Literal () : Length (99.44 /* unused */) {}
    private:
      void* operator new (size_t);
      void  operator delete (void*);
      void  operator& ();
      void  operator= (const Literal&);
    };

    class Length::Meter : public Literal {}; extern Length::Meter METER;
    class Length::Yard  : public Literal {}; extern Length::Yard  YARD;

    // In Length.cc
    Length::Meter METER;
    Length::Yard  YARD;

Since we inherit from Length, we inherit Length's private data member. But we don't use it, and so we initialize it to an arbitrary value. Just as with stateless singletons, the linker will be able to discard derived singleton objects.
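
The following single-file sketch shows the derived singletons passing through the `foo (S)` call that the stateless singletons could not. It leaves out the outlawed operators for brevity. The `operator/`, the `meters` member of S, the return value of `foo`, and the two helper functions are assumptions added so that the effect can be observed. We have reasoned through the overload resolution but make no claim about every compiler.

```cpp
class Length
{
private:
  double meters_;
  Length (double meters) : meters_ (meters) {}
  class Literal;

public:
  class Meter;
  class Yard;
  Length (const Meter&) : meters_ (1.000) {}
  Length (const Yard &) : meters_ (0.914) {}

  // Hypothetical operator, assumed for this sketch.
  friend double operator/ (Length a, Length b) { return a.meters_ / b.meters_; }
};

class Length::Literal : private Length
{
protected:
  Literal () : Length (99.44 /* unused */) {}
};

class Length::Meter : public Length::Literal {};
class Length::Yard  : public Length::Literal {};

Length::Meter METER;
Length::Yard  YARD;

// S is constructible from Length, as in the Proxy Problem example.
struct S
{
  double meters;
  S (Length l) : meters (l / METER) {}
};

double foo (S s) { return s.meters; }

// Both calls compile: the literal is matched to S (Length) as a derived
// class of Length, and the Length parameter itself is then built by
// Length (const Meter&) or Length (const Yard&).
double meter_via_proxy () { return foo (METER); }
double yard_via_proxy ()  { return foo (YARD); }
```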

There may be additional work required to make METER a more complete Length proxy. A call to a member function using the `.' operator is not subject to conversion, so we will have to implement forwarding functions for all of Length's member functions.

We call this use of private inheritance the Class Literal Idiom.

Literal Dispatching — an Esoteric Technique

If class literals are of a different type from their base class, then we can do function overloading on the class literals.

Imagine that we have a Bool type, with literals TRUE (of type True) and FALSE (of type False). We could write this code:

    // In foo.hpp
    void foo (True);
    void foo (False);
    inline void foo (Bool b) { if (b) foo (TRUE); else foo (FALSE); }

    // In foo_True.cc
    void foo (True) { /* lots of code here */ }

    // In foo_False.cc
    void foo (False) { /* lots of code here */ }

The advantage of this technique is that, if every call to foo uses the constant argument TRUE, then the code for  foo (False)  can be discarded by the linker. Too esoteric? Perhaps not for template-intensive code.
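
A compact sketch of literal dispatching, collapsed into one file. The Bool class is hypothetical, since the text does not define it. The explicit conversion to bool assumes C++11, and the `char` return values are added only so that the chosen overload can be observed.

```cpp
class Bool
{
private:
  bool value_;

public:
  class True  {}; Bool (True)  : value_ (true)  {}
  class False {}; Bool (False) : value_ (false) {}

  // Needed for `if (b)`; explicit conversion operators require C++11.
  explicit operator bool () const { return value_; }
};

typedef Bool::True  True;
typedef Bool::False False;

True  TRUE;
False FALSE;

// Overloads selected at compile time when the argument is a literal.
char foo (True)  { return 'T'; }
char foo (False) { return 'F'; }

// Run-time fallback: inspects the value and forwards to a literal overload.
inline char foo (Bool b) { if (b) return foo (TRUE); else return foo (FALSE); }
```

A call such as `foo (TRUE)` binds directly to `foo (True)` because an exact match is preferred over the user-defined conversion to Bool.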

The Real World

Is the Class Literal Idiom a useful technique? We believe so, but practical experience is needed. One current problem is that not all C++ compilers are smart enough to optimize uses of proxy objects away completely, as they should.

Complete Source Code

Complete sample programs for all the above variants are available here.


Last modified: Fri Jan 17 13:09:57 PST 2003
Copyright © 2003 Martin Buchholz