Thursday, January 27, 2011

C++ POD Member Handling

I always mess up the initialization of plain old data fields in C++. Always. Maybe by writing it up, I'll finally get it right.

Plain Old Data is essentially anything a regular C compiler could compile. That is:

  • integer and floating point numbers (including bool, though it isn't in C)
  • enums
  • pointers, including pointers to objects and pointers to functions
  • some aggregate data structures (structs, unions, and classes)

A struct, union, or class is treated as plain old data if it has only the implicit, compiler-generated constructor and destructor, has no protected or private member variables, does not inherit from a base class, has no virtual functions, and contains only members that are themselves plain old data. I suspect most C++ programmers have an intuitive feel for when a class behaves like an object and when it's just a collection of data. That intuition is pretty good in this case.
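
One way to check that intuition is to ask the compiler. The sketch below assumes a C++0x (C++11) compiler that provides <type_traits>; Point, WithVirtual, WithCtor, and IsPod are made-up names for illustration. The C++11 definition (trivial plus standard-layout) is slightly looser than the rules above, for example it accepts a class whose data members are all private, so treat this as an approximation of the older rule.

```cpp
#include <type_traits>

// Hypothetical example types, not taken from the post.
struct Point {        // plain collection of data: POD
  int x;
  int y;
};

struct WithVirtual {  // has a virtual function: not POD
  virtual void f() {}
  int x;
};

struct WithCtor {     // has a user-written constructor: not POD
  WithCtor() : x(0) {}
  int x;
};

enum Color { kRed, kGreen };

// In C++11 terms, a POD type is one that is both trivial and standard-layout.
template <typename T>
struct IsPod
    : std::integral_constant<bool, std::is_trivial<T>::value &&
                                       std::is_standard_layout<T>::value> {};

// These checks run at compile time; a wrong guess stops the build.
static_assert(IsPod<int>::value, "numbers are POD");
static_assert(IsPod<Color>::value, "enums are POD");
static_assert(IsPod<Point*>::value, "pointers are POD");
static_assert(IsPod<Point>::value, "simple structs are POD");
static_assert(!IsPod<WithVirtual>::value, "virtual functions rule out POD");
static_assert(!IsPod<WithCtor>::value, "user-written constructors rule out POD");
```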

A constructor that does not mention a plain old data member leaves that member uninitialized. Listing the member in the initializer list with empty parentheses sets it to zero.

Code:
class Foo {
 public:
  Foo() {}
  int a_;
};
Result: a_ is uninitialized.

Code:
class Foo {
 public:
  Foo() : a_() {}
  int a_;
};
Result: a_ is zeroed. Were it a structure, the entire thing would be zero.
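
To see the zeroing rule applied to each kind of POD member, here is a minimal sketch. Pair, Zeroed, and check_zeroed are hypothetical names, not part of the original example. It deliberately avoids reading an uninitialized member, since doing so is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical POD struct used as a member below.
struct Pair {
  int first;
  int second;
};

class Zeroed {
 public:
  // Empty parentheses value-initialize each POD member, which zeroes it.
  Zeroed() : n_(), d_(), ptr_(), pair_() {}
  int n_;
  double d_;
  const char* ptr_;
  Pair pair_;
};

// Asserts that every POD member of a freshly constructed Zeroed is zero.
void check_zeroed() {
  Zeroed z;
  assert(z.n_ == 0);
  assert(z.d_ == 0.0);
  assert(z.ptr_ == NULL);
  assert(z.pair_.first == 0 && z.pair_.second == 0);
}
```

Dropping any one member from the initializer list would leave just that member uninitialized; the others would still be zeroed.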

People are often confused by the first point: member POD fields are left uninitialized unless specifically listed in the initializer list. Member objects behave differently, because their default constructor is called whether or not they are listed. Making this even more confusing, pages that a process gets fresh from the OS are zeroed out to prevent information leakage. So if you look at the first few objects allocated, there is a good chance that all of the member variables will be zero. Unfortunately, once the process has run for a while and dirtied some of its own pages, it will start getting objects where the POD variables contain junk.
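
The effect of dirty memory can be imitated without waiting for the heap to recycle pages. This sketch, which assumes C++0x (C++11) for alignas, fills a buffer with 0xFF bytes and constructs an object on top of it with placement new. Counter, constructed_over_junk, and check_constructed_over_junk are illustrative names. Only the explicitly initialized member is read back; a member left out of the initializer list would hold an indeterminate value, and reading it is undefined behavior.

```cpp
#include <cassert>
#include <cstring>
#include <new>

class Counter {
 public:
  Counter() : count_() {}  // count_ is listed, so it is zeroed
  int count_;
};

// Builds a Counter in storage that already holds junk and returns its count_.
int constructed_over_junk() {
  alignas(Counter) unsigned char storage[sizeof(Counter)];
  std::memset(storage, 0xFF, sizeof(storage));  // simulate a dirtied page

  Counter* c = new (storage) Counter;  // constructor runs over the junk bytes
  int result = c->count_;
  c->~Counter();  // trivial here, but pairs with the placement new
  return result;
}

// The zeroing comes from the constructor, not from the state of the memory.
void check_constructed_over_junk() {
  assert(constructed_over_junk() == 0);
}
```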


Struct initializers

Putting a POD struct in the initializer list with empty parentheses zeroes the entire struct. If the struct needs to contain non-zero data, C++0x adds a useful capability, the brace-enclosed initializer list:

struct bar {
  int y;
  int z;
};

class Foo {
 public:
  Foo() : b_({1, 2}) {}
  struct bar b_;
};

Recent versions of gcc implement this, though a warning is issued unless the -std=c++0x or -std=gnu++0x command-line flag is given.
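
For comparison, here is a small sketch that puts the zeroing form next to a braced form. It assumes a compiler running in C++0x (C++11) mode; Bar, ZeroedBar, FilledBar, and check_struct_initializers are hypothetical names. The member initializer b_{1, 2} is written without the outer parentheses used above, which is an alternative spelling rather than the form shown in the example.

```cpp
#include <cassert>

// Hypothetical POD struct, mirroring the shape of the one above.
struct Bar {
  int y;
  int z;
};

class ZeroedBar {
 public:
  ZeroedBar() : b_() {}  // empty parentheses: every field of b_ is zero
  Bar b_;
};

class FilledBar {
 public:
  FilledBar() : b_{1, 2} {}  // braces: y = 1, z = 2, in declaration order
  Bar b_;
};

// Asserts the values each constructor is expected to produce.
void check_struct_initializers() {
  ZeroedBar zeroed;
  assert(zeroed.b_.y == 0 && zeroed.b_.z == 0);

  FilledBar filled;
  assert(filled.b_.y == 1 && filled.b_.z == 2);
}
```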