C++’s Rule of Zero

Edit: Note that my blog (including this post) has moved to http://www.nirfriedman.com. Feel free to visit there instead for a cleaner look and the latest content!

Some of you probably already know about the Rule of Zero. More of you have probably heard of the Rule of 3 (pre C++ 11) or the rule of 5. Let’s start by reviewing what the Rule of Zero is.

What is the Rule of Zero?

The idea behind it is as follows: classes should not define any of the special functions (copy/move constructors/assignment, and the destructor) unless they are classes dedicated to resource management. It’s described in this blog post. There are a couple of good reasons for this, one more conceptual, and one more practical. The conceptual reason stems from the single responsibility principle, the idea that a class should be responsible for one thing. If you make a class responsible for more than one thing, then you have rather tightly coupled the implementation and interface of two separate things. Your class’ job is to orchestrate the state of its member variables to provide some combined state, and interface to that state. When you write special member functions, you are basically picking up the pieces left behind by members that don’t manage their resource in quite the way you want. The right thing to do is to make sure you are using the right members. This leads us directly to the practical reason: in c++, a class can’t choose for which members it will redefine the special functions. What do I mean? Well, if you don’t write the special functions, the compiler generates them for you by trying to apply the expected operation on a member by member basis. So if you use the default move constructor, the compiler will generate a move constructor that simply attempts to move construct all data members (and base classes). As soon as you write a custom move constructor that does some special handling for one data member, you now have to write code to deal with every other member, even if you just need to change the one. This may not seem like a big deal, after all, what’s one line of code per variable, per custom special function? The real problem isn’t so much having to write it, as it is that the compiler won’t enforce it and it’s easy to forget. Consider this code:

class Example {
 public:
  Example(const Example & other) : m_double(other.m_double) { }

 private:
  double m_double = 0.0;
  bool m_bool = false;
};

You won’t get a warning in your IDE or in any sanitizer for this, because m_bool will deterministically get set to false by virtue of its inline initialization, and the fact that it does not get mentioned in the constructor initialization list. But this is almost certainly a bug, in that whenever you copy an Example object, the new copy will have m_bool = false regardless of the source of the copy. Something like this is very easy to introduce, and could be very hard to dig up afterwards.

Why people break the rule

Of course, people won’t generally write code like the example above, because doubles and booleans are simple, and you are nearly guaranteed to be happy with their default resource semantics. So let’s take a very common real case: pointers. It’s quite common for a class to have pointers. Of course, we’re savvy modern c++11 programmers, so we’ll use a unique_ptr, preventing any possible leaks. The problem is that while unique_ptr provides the correct destructor, unless we want the class owning the unique_ptr to be uncopyable, it doesn’t do what we want for the copy constructor/assignment. Let’s suppose for a moment that Example wants to own a non-polymorphic class Pointee by pointer, we end up with code that looks like this:

class Example {
 public:
  Example(const Example & other)
      : m_pointer(make_unique<Pointee>(*other.m_pointer))
      , m_bool(other.m_bool)
  { }

  // Won't be default generated unless you add this
  Example(Example &&) = default; 

  // similar code for copy/move assignment

 private:
  unique_ptr<Pointee> m_pointer;
  bool m_bool = false;
};

Code along these lines is very common in the wild, and it’s not optimal. In a larger class, every modification that adds or removes variables will be vulnerable to introducing a bug into the copy/move semantics. Can we do better?

How to follow the rule

The conclusion from the above code is that unique_ptr doesn’t actually represent the correct resource management for our class. So rather than making this the issue of the entire Example class, including all its other members, we should fix the problem at the root. Let’s write a class that has the proper resource semantics.

template<class T>
class DeepCopyPointer {
 public:
  DeepCopyPointer(const DeepCopyPointer & other)
      : m_pointer(make_unique<T>(*other.m_pointer))
  { }

  DeepCopyPointer(DeepCopyPointer &&) = default; 

  // similar code for copy/move assignment

 private:
  unique_ptr<T> m_pointer;
};

This looks better, right? Well, there’s a bit of a problem: this class doesn’t actually provide any of the interface that a unique_ptr does. We could make Example a friend and have it access the unique_ptr directly, but this is ugly. Another problem is that this code is pretty specialized, it’s just for unique_ptr, it’s only for non-polymorphic objects: if we are storing a base class pointer and the object’s type is different, this copy won’t work properly. We’d probably want to copy by calling a clone method or similar.

A better solution by breaking another rule

Let’s do something more generic. Let’s write a class that mimics another class, but allows you to change its copy behavior. To mimic that other class, we’re going to inherit from it. Now, we are interested in mimicking unique_ptr, isn’t it bad to inherit from it? People often say that you shouldn’t inherit from types like unique_ptr or vector, because they don’t have virtual functions, and especially destructors. The answer is that it’s not bad to inherit, as long as you never use it polymorphically. With that in mind:

template <class T, class F>
class copyer : public T {
 public:
  copyer(T && t)
      : T(std::forward(t)) { };

  copyer(copyer && other) = default;
  copyer & operator=(copyer && other) = default;

  copyer(const copyer & other)
      : T(F()(other)) { }

  copyer & operator=(const copyer & other) {
    std::swap(*this, copyer(other));
    return *this;
  }
};

T is the class to be mimicked, and F is a functor that will help us redefine the copying behavior. Note that we are going with the defaults for move construction/assignment, we assume that we like T’s defaults for move construction/assignment. I chose to do this because it’s quite a bit trickier to write this code generically if T is not moveable (or not moveable in a way we like). To get the deep copy unique_ptr we want, we just do this:

template <class T>
struct F {
   std::unique_ptr<T> operator()(const std::unique_ptr<T> & other) {
    return std::make_unique<T>(*other);
  }
};

template <class T> using copying_ptr = copyer<std::unique_ptr<T>, F<T>>;

We can now write Example like this:

class Example {

 private:
  copying_ptr<Pointee> m_pointer;
  bool m_bool = false;
};

That’s it, we get all the special functions for free now. And we don’t have to worry about adding bugs to Example each time we add/remove variables. We can also easily reuse copyer, if Example wanted to alter copy semantics in a different way, it could just define a private nested struct and use that in templating copyer instead of F.

In summary

Don’t start writing out big special functions for your complex classes, just because a couple of members don’t have the right default behavior. Use techniques like this one to instead have members that do have the default behavior; your code will be cleaner and less bug prone. Edit: Here’s a brief follow up on exception safety, inspired by some excellent comments by Ross Smith (CaptainCrowbar) on cpp reddit.

Strong Exception Guarantee

There is one place where the default generated special functions fall short. The strong exception guarantee says that if a function fails by throwing, then the state of the program is the same as before the function was called. For constructors, this is automatically upheld provided each member has a proper destructor; an exception thrown in the constructor body means that the partially constructed object will be destroyed. Move assignments are moves, and it’s extremely common and desirable for types to have noexcept move operations. So copy assignment is the odd man out. The default copy assignment will try to copy assign each member in sequence. For example:

class Example {

 private:
  std::vector m_vec1;
  std::vector m_vec2;
};

Example e1;
Example e2;
....
e2 = e1;

What could happen here is that e1.m_vec1 will get copied into e2.m_vec1, then the copy of e1.m_vec2 into e2.m_vec2 throws. Now, e2 is in a half copied state, and the strong exception guarantee is broken. So here’s how we fix it:

class Example {
 public:
  Example(const Example &) = default;
  Example(Example &&) = default;
  Example & operator=(Example &&) = default;

  Example & operator=(const Example & other ) {
    std::swap(*this, Example(other));
    return *this;
  }

 private:
  std::vector m_vec1;
  std::vector m_vec2;
};

I simply use std::swap instead of the using namespace std technique, because I know I won’t write a custom swap; in the vast majority of cases it’s not necessary. Now, if an exception is thrown, it will throw while the temporary is being constructed in Example(other), before the swap is executed, and e2 will be unchanged. Note that std::swap calls the move constructor and move assignment, so the three defaulted special functions are all being used to produce a copy assignment that gives the strong guarantee. So, is this compatible with the spirit of the Rule of Zero? It may seem not, as we explicitly wrote a special function. But I would argue that it is. The main thing I emphasized is the undesirability of letting your class deal with resource management details of its members. Here, this is not happening. Notice that none of the special function code will need to be changed as you add or remove members, provided they are ones that provide the correct semantics. Maybe the take away is this: the real Zero in the rule of Zero is that your special functions should reference exactly zero of your member variables. Whether it stems from not being written, explicitly defaulted, or copied and swapped, all of these approaches are fine.

Advertisements
Posted in c++
18 comments on “C++’s Rule of Zero
  1. sehe says:

    Might be nice to link to the source of this idea: R.Martinho’s blog http://flamingdangerzone.com/cxx11/rule-of-zero/

    Like

  2. There seems to be an unbalanced paren in both the second and third examples.

    Like

  3. Sander says:

    Good point, one that I find myself violating from time to time. Note that you could also use copy-and-swap for assignment. It lets you do away with the move assignment definition (rule of four-and-a-half).

    Like

    • quicknir says:

      Sorry, I don’t quite follow what you mean. In the section I added at the bottom, I do use CAS for copy assignment, and I default move assignment. Are you talking about an assignment operator that takes a parameter by value?

      Like

  4. Marco A. says:

    Good article, I believe there’s a typo in the line ‘What could happen here is that e1.m_vec1 will get copied into e2.m_vec2’, I suppose the latter should be e2.m_vec2

    Like

  5. There appear to be some weird template issues in here:

    * ‘class DeepCopyPointer’ is declared as a class template, but doesn’t have any template parameters. Its m_pointer member variable doesn’t provide a template parameter to the unique_ptr template either.

    * ‘class copyer’ is indeed a class template, with template parameters T and F, but the template parameters are not included in the declaration.

    * ‘struct F’ has the same issues as DeepCopyPointer.

    I’ve viewed the page in both Firefox and Safari so I don’t think this is a browser issue; it looks like something in the blog is *eating* the angle-bracket-enclosed template parameters.

    Like

    • quicknir says:

      Damn, I don’t know how I didn’t see it. Makes me wonder if it happened at some point after I posted, though I don’t know how that would happen. Thanks for pointing this out; I think I corrected them all but if you see something else let me know.

      Like

  6. instagram darmowe like says:

    Tremendous issues here. I’m very glad to see your post.

    Thanks so much and I’m looking forward to contact you.
    Will you please drop me a e-mail?

    Like

  7. […] going to leave this alone. Also, those of you that are attentive and maybe even read my last post about resource management and special functions may notice something that could go wrong in the […]

    Like

  8. abigagli says:

    Coming late to the party, but I just found this and think it’s really interesting.
    I’m quite confused by the copier class implementation though:

    I think the intention was for the “copyer (T &&t)” constructor to be a forwarding constructor, but here T is the class template parameter, and so the “T &&t” parameter specification is not in a deducing context, which implies this is just a constructor taking an rvalue reference.
    In other words, the parameter is not what the standard now calls a “forwarding reference”, which means it only correctly binds to rvalues, which I don’t think is what was intended.
    Am I missing something here?

    Like

    • quicknir says:

      The intention was not to have a forwarding constructor. If you want to have a forwarding constructor, it needs to be varidiadically templated, else it won’t forward very well! The idea is indeed to take an rvalue reference to a T. The idea of the copying class is to change the copying behavior of T but leave everything else untouched. I choose to construct this class by having a sink T parameter. There are other ways to do it; I could have a forwarding constructor for instance. But forwarding constructors require some care, otherwise they can actually take over other operations unexpectedly (like copy construction). This approach is simpler and safer, but less convenient for the user. However, since the only place you’d use the class is as a member variable of another class, I expect the construction operations to be encapsulated, therefore not as inconvenient, thus explaining which I chose this tradeoff. Does that make sense?

      Like

      • abigagli says:

        Yes, I see.
        That’s pretty much what I would have thought, but that “std::forward(t)” (which by the way lacks the template argument inside “”, but I think wordpress has something to do with it) as the argument for T’s constructor made me think there was something else going on…

        As we’re binding to an rvalue reference, wouldn’t it be more idiomatic to use std::move there?

        Like

    • quicknir says:

      Replying here because WP doesn’t handle nested commenting. Also, WP does all kinds of things with swallowing and puking up and &, which probably caused that discrepancy. I’m actually leaving WP for Ghost I think, I’ll leave a final post here if you’re interested in following the new one.

      I agree that std::move is more idiomatic there, thank you for catching that. I’ll make that correction when I move the articles over :-).

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: