Syncing Enums and Arrays

Here is a basic technique that may not be familiar to all…

Often, the need arises to define a symbolic enumeration, along with a “companion” array of complementary data such as string names.  With default compiler-assigned values, the enumeration serves as a convenient index into the companion array, as in the following example:

#include <iostream>

// Enumeration and companion array, often 
// accompanied by a comment like this:
// WARNING: Must keep ColorEnum and ColorNames in sync!
enum ColorEnum
{
	Red,
	Green,
	Blue
};

static const char * ColorNames[] = 
{ 
	"Red",
	"Green",
	"Blue"
};

// The enumerator doubles as an index into the companion array
void PrintColor(ColorEnum c)
{
	std::cout << ColorNames[c];
}

The risk of this technique lies in keeping the symbols and the array slots coordinated, in both sequence and count.  It also violates the DRY principle ("don't repeat yourself"): every element is written out twice, which creates more opportunities for things to get out of sync.

The solution is to use a pair of macros.  The first, which we'll call sequence, defines an abstract sequence of elements.  The second, which we'll call select, defines the attributes of each element.  Often, the function of the select macro is simply to select one or more entities from the abstract definition in sequence.  But the capability is general and can be used to define any sequence, including enums, static C-style arrays, static const members, parse tables, etc.

Following is a code sample demonstrating the basic idea.

// Define abstract sequence of elements, each of which 
// programmatically "selects" appropriate attributes
#define sequence \
	select( Red, 0xFF0000 ) \
	select( Green, 0x00FF00 ) \
	select( Blue, 0x0000FF )

// Generate symbolic enum: Red, Green, Blue
#undef select
#define select(symbol, value, ...) symbol,
enum ColorEnum { sequence };

// Generate color name sequence: "Red", "Green", "Blue"
// Indexed by ColorEnum above
#undef select
#define select(symbol, value, ...) #symbol, 
static const char * ColorNames[] = { sequence };

// Generate color value sequence: 0xFF0000, 0x00FF00, ...
// Indexed by ColorEnum above
#undef select
#define select(symbol, value, ...) value,
static int ColorValues[] = { sequence };

// Generate color struct sequence: {"Red", 0xFF0000}, {"Green", 0x00FF00}, ...
// Indexed by ColorEnum above
// Note the definition of select as a variadic macro, 
// to conveniently pick up all remaining element values.
#undef select
#define select(symbol, ...) { #symbol, __VA_ARGS__ }, 
static struct Color
{
	const char * symbol;
	int value;
}
Colors[] = { sequence };

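// Define public static color symbols with numeric values.
// Note: this struct shares its name with the Colors array above. C++ permits
// that (the array hides the struct name), and Colors::Red still resolves to
// the struct's member.
#undef select
#define select(symbol, value, ...) static const int symbol = value;
struct Colors { sequence };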
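
As a usage check, here is a condensed sketch of the same pattern with compile-time guards added. The ColorCount sentinel, the static_assert checks, and the FindColor helper are illustrative additions assumed for this example, not part of the listing above.

```cpp
#include <cstring>

// Abstract sequence, as in the listing above
#define sequence \
	select(Red,   0xFF0000) \
	select(Green, 0x00FF00) \
	select(Blue,  0x0000FF)

// Enum, plus a hypothetical ColorCount sentinel as the last enumerator
#define select(symbol, value) symbol,
enum ColorEnum { sequence ColorCount };
#undef select

// Names, indexed by ColorEnum
#define select(symbol, value) #symbol,
static const char * const ColorNames[] = { sequence };
#undef select

// Values, indexed by ColorEnum
#define select(symbol, value) value,
static const int ColorValues[] = { sequence };
#undef select
#undef sequence

// Every generated array must have exactly one slot per enumerator
static_assert(sizeof(ColorNames) / sizeof(ColorNames[0]) == ColorCount,
              "ColorNames out of sync with ColorEnum");
static_assert(sizeof(ColorValues) / sizeof(ColorValues[0]) == ColorCount,
              "ColorValues out of sync with ColorEnum");

// Hypothetical helper: map a name back to its enumerator.
// Returns ColorCount when the name is unknown.
inline ColorEnum FindColor(const char * name)
{
	for (int i = 0; i < ColorCount; ++i)
		if (std::strcmp(ColorNames[i], name) == 0)
			return static_cast<ColorEnum>(i);
	return ColorCount;
}
```

Because everything is generated from one sequence, the assertions can never fire here; they earn their keep if someone later adds a hand-written array next to the generated ones.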

For a deeper dive into code-gen with the preprocessor, I recommend reading Appendix A, "An Introduction to Preprocessor Metaprogramming," from the excellent "C++ Template Metaprogramming" by David Abrahams and Aleksey Gurtovoy.

 

Code Progressions

I love jazz.  Especially when a musician improvises over the changes – the “changes” being chord progressions in a given section of a song.  One of my favorites is Coltrane’s Giant Steps.  I’m in awe when a soloist can float above complex chord changes without getting lost.  The key, I’m sure, is having the progression memorized.

Here’s an analogy: I’ve found it helpful to memorize several progressions in my programming. This helps ensure that my code is safe, efficient, and loosely coupled, without conscious effort – like a musician soloing over the changes. The following code progressions might be helpful for a programmer who is new to the art and seeking to develop an idiomatic approach.  Each sequence begins with the safest, most restrictive option, progressing to greater complexity and capability as necessary.

Member Accessibility

  • Private
  • Protected
  • Public

This is the progression every object-oriented programmer begins with. Keeping the surface area of a class as small as possible reduces complexity, the number of test cases needed to cover it, and cognitive load for consumers of the class (i.e., it does your peers a favor).

Member Declaration

  • Module Static
  • Class Static Inline
  • Class Static Out-of-line
  • Const Non-Virtual Inline
  • Const Non-Virtual Out-of-line
  • Modifiable Non-Virtual Inline
  • Modifiable Non-Virtual Out-of-line
  • Const Virtual
  • Modifiable Virtual

Each degree in this progression increases cost or complexity, or reduces generality. This one is a little meatier, because it combines several concerns, all related to member declaration.  But we can break it down into “atomic” sub-progressions, each of which factors into the overall progression:

  • Module → Class
  • Static → Member
  • Const → Modifiable
  • Non-Virtual → Virtual
  • Inline → Out-of-line

A private Class Static might be useful with header-only implementations, or when it’s hard to identify a source code module for a member to live in. But generally, Module Static is preferable to Class Static because it never appears in the class declaration, and so introduces zero coupling. One pattern for implementing Module Static is to forward declare an opaque helper class and wrap it with a smart pointer.
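
The simplest form of Module Static is a free function in an unnamed namespace, which is a lighter alternative to the opaque helper class mentioned above. The following is a minimal sketch; Widget, Label, and Decorate are hypothetical names, and the two "files" are shown together only so the example is self-contained.

```cpp
#include <string>
#include <utility>

// --- widget.h (hypothetical header) ---
class Widget
{
public:
	explicit Widget(std::string name) : name_(std::move(name)) {}
	std::string Label() const;   // out-of-line; defined below
private:
	std::string name_;
};

// --- widget.cpp (hypothetical source file) ---
namespace
{
	// Module Static: visible only inside this translation unit,
	// and never mentioned in the header
	std::string Decorate(const std::string & s)
	{
		return "[" + s + "]";
	}
}

std::string Widget::Label() const
{
	return Decorate(name_);
}
```

Changing or removing Decorate touches only the source file, so clients that include the header never need to recompile.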

Const methods have a broader domain than Modifiable methods, given the compiler’s ability to bind a const reference to a temporary object, and so should be preferred.
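
A small sketch of that broader domain, using a hypothetical Name class: the const method can be reached through a const reference, including one bound to a temporary, while the modifiable method cannot.

```cpp
#include <cstddef>
#include <string>
#include <utility>

class Name
{
public:
	explicit Name(std::string text) : text_(std::move(text)) {}

	// Const: callable on const objects, const references, and temporaries
	std::size_t Length() const { return text_.size(); }

	// Modifiable: callable only on non-const objects
	void Append(const std::string & s) { text_ += s; }

private:
	std::string text_;
};

// Accepts named objects, const objects, and temporaries alike
inline std::size_t Measure(const Name & n)
{
	// n.Append("!");   // would not compile: Append() is not const
	return n.Length();
}
```

A call such as Measure(Name("abc")) binds the parameter directly to a temporary, which works only because Length() is declared const.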

Inline methods can be handled more efficiently by the compiler, and should be preferred over out-of-line methods, all things being equal. For example, simple get/set accessors which add no coupling should always be implemented inline. In this case, the narrower context also aids readability.

Member (non-static) methods require the additional pushing/en-registering of the ‘this’ pointer when dispatching the call, and should be used only when access to the instance is required.

Finally, Virtual methods require a level of indirection (pointer table lookup) when dispatching the call, and should be used only as necessary.
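
Beyond the cost of the lookup, the two kinds of dispatch behave differently, which is worth seeing once. In this sketch (Shape and Triangle are hypothetical types), the non-virtual call is resolved from the static type of the reference, while the virtual call is resolved through the table at run time.

```cpp
struct Shape
{
	virtual ~Shape() {}

	// Non-virtual: bound at compile time from the static type
	int Sides() const { return 0; }

	// Virtual: bound at run time from the dynamic type
	virtual int VirtualSides() const { return 0; }
};

struct Triangle : Shape
{
	int Sides() const { return 3; }                  // hides Shape::Sides
	int VirtualSides() const override { return 3; }  // overrides
};

// Both helpers see only a Shape reference
inline int CountStatic(const Shape & s)  { return s.Sides(); }
inline int CountVirtual(const Shape & s) { return s.VirtualSides(); }
```

Passing a Triangle to CountStatic yields 0, because the compiler picks Shape::Sides; passing it to CountVirtual yields 3.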

Data Storage

  • Stack Variable
  • Function Parameter
  • Class Member (Aggregation)
  • Heap

The scope of a variable’s existence and visibility should be kept as narrow as possible. This doesn’t just aid readability. For every degree of scope broadening, there is increased cost in managing memory or communicating the data between parts of the program. Whenever possible, a developer should strive for a “stack-based discipline.”  This might entail rewriting an algorithm to be recursive, or simply relocating a variable to the topmost stack frame in which it must be visible. The heap should be considered a means of last resort. It certainly has its place, such as a factory method returning a collection, but it’s relatively slow and adds complexity and risk.
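
A brief sketch of both ends of this progression, with hypothetical function names: an accumulator that never leaves its stack frame, and a factory method whose returned collection is one of the cases where heap storage (managed here by std::vector) is justified.

```cpp
#include <cstddef>
#include <vector>

// Stack variable: total exists only for the duration of the call
inline int Sum(const std::vector<int> & values)
{
	int total = 0;
	for (std::size_t i = 0; i < values.size(); ++i)
		total += values[i];
	return total;
}

// Factory method returning a collection: the elements live on the heap,
// but ownership stays with the vector, so no manual cleanup is needed
inline std::vector<int> MakeRange(int count)
{
	std::vector<int> result;
	if (count > 0)
		result.reserve(static_cast<std::size_t>(count));
	for (int i = 0; i < count; ++i)
		result.push_back(i);
	return result;
}
```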

Code Reuse

  • Aggregation (or Private Inheritance)
  • Single Inheritance
  • Single Concrete & Multiple Interface
  • Multiple Concrete

Private Inheritance might be useful if a developer wants to expose a small subset of an inherited interface publicly, in which case a few “using” declarations can come in handy. Otherwise, Aggregation is typically preferred, as it reduces complexity in constructor implementations.
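
For instance, here is a sketch of a stack that reuses std::vector through Private Inheritance and re-exports just two of its members. IntStack, Push, and Pop are hypothetical names, and Pop assumes the stack is not empty.

```cpp
#include <vector>

class IntStack : private std::vector<int>
{
	typedef std::vector<int> Base;

public:
	// Expose a small subset of the inherited interface
	using Base::empty;
	using Base::size;

	void Push(int v) { push_back(v); }

	// Precondition: the stack is not empty
	int Pop()
	{
		int v = back();
		pop_back();
		return v;
	}
};
```

Everything else in the vector interface (indexing, iterators, insert) stays inaccessible to callers, so the stack discipline cannot be bypassed.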

Some languages enforce single concrete inheritance, permitting implementation of multiple abstract interfaces. This design eliminates the possibility of the “dreaded diamond hierarchy”, with multiple instances of the same base. Other languages permit multiple concrete inheritance, and must address multiple bases (typically through virtual inheritance).
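
In C++, the diamond can be sketched as follows, with hypothetical Device, Scanner, Printer, and Copier types. Declaring the shared base as virtual leaves a single Device instance in Copier, at the price of the most-derived class having to initialize it.

```cpp
struct Device
{
	int id;
	explicit Device(int i) : id(i) {}
};

// Both intermediate classes inherit the base virtually
struct Scanner : virtual Device
{
	Scanner() : Device(0) {}
};

struct Printer : virtual Device
{
	Printer() : Device(0) {}
};

struct Copier : Scanner, Printer
{
	// The most-derived class initializes the virtual base; the Device(0)
	// initializers in Scanner and Printer are ignored for a Copier
	explicit Copier(int i) : Device(i) {}
};

// Unambiguous only because Copier contains exactly one Device
inline int DeviceId(const Device & d) { return d.id; }
```

Without the virtual keyword, Copier would hold two Device subobjects and the conversion to Device in DeviceId would be ambiguous.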

Do you have a progression you’d like to add?  Let’s hear it!