void peri()
{
// string::string(const char * other);
string theory("hello");
// string::string(const string & other);
string vest(theory);
}
The question is can theory and vest share the state that represents what after all is a
common value. Clearly they can, the issue is what are the consequences of this sharing.
class string
{
public:
string(const char * other);
string(const string & other);
string & operator=(const string & rhs);
~string();
char & operator[](size_t index);
...
private: // state
char * rep;
};
string::string(const char * other)
: rep(new char[strlen(other) + 1])
{
strcpy(rep, other);
}
string::string(const string & other)
: rep(other.rep)
{
// empty
}
string & string::operator=(const string & rhs)
{
rep = rhs.rep;
return *this;
}
string::~string()
{
delete rep;
}
char & string::operator[](size_t index)
{
return rep[index];
}
The first consequence of naive sharing is that it breaks the Law of Least Astonishment.
If I change theory I do not expect vest to change and when it does I'm more than a
little annoyed.
string theory("hello");
string vest(theory);
theory[0] = 'C'; // theory == "Cello"
cout << vest << endl; // !!! prints Cello
There are various ways to solve this. For example a copy on write pointer, which I'll
look at in another pointer article.
The second consequence of sharing is that it introduces an associative relationship. Or to be blunt a pointer. theory and vest now hold a pointer to, and thus share, the state that represents the value "hello". Whenever we have an associative relationship we are implicitly talking about separate objects with separate lifetimes. We must somehow manage the shared relationship. Custody. The naïve code above is fatally flawed because it utterly fails to manage the shared relationship. Theory and vest share the same scope; the string destructor will be called twice at the end of the peri function and the code will do a double deletion. The alternative of making the destructor empty is no good either. This will just cause a memory leak. One solution to this problem is to introduce extra information which is used to manage the shared relationship. A reference count.
namespace accu
{
class string
{
public:
...
private:
char * rep;
int * count;
};
}
The plain constructor allocates some memory to hold a replica of the literal, and
some more memory to hold the shared reference count. It initializes the count to one
to indicate that it (the object that the constructor is constructing) does not share
the state with any other string objects. Not yet.
namespace accu
{
string::string(const char * other)
: rep(new char[strlen(other) + 1])
, count(new int(1))
{
strcpy(rep, other);
}
}
The copy constructor simply makes two pointer initializations. In other words
shallow copies. Normally a shallow copy is dangerous. However in this case the
idea is to do just that. It's not dangerous because we have a count that is
managing the sharing. The vital line is the count increment.
namespace accu
{
string::string(const string & rhs)
: rep(rhs.rep)
, count(rhs.count)
{
++*count;
}
}
With this copy constructor then after...
string theory("hello");
string vest(theory);
theory.rep and vest.rep both point to the same shared state and,
vitally, theory.count and vest.count both point to the same shared
integer which holds the value two indicating that two objects (theory and vest) are
sharing the state.
The destructor can now use the shared count to determine whether it is the last object referring to the state. If it is it must do the deletions, if not it must decrement the count since the string object that is dying holds one of the references and it is, well, dying.
namespace accu
{
string::~string()
{
if (--*count == 0)
{
delete rep;
delete count;
}
}
}
The copy assignment operator is basically just a combination of the code in the
destructor plus the code in the copy constructor. And a self assignment check.
namespace accu
{
string & string::operator=(const string & rhs)
{
if (this != &rhs)
{
if (--*count == 0)
{
delete rep;
delete count;
}
rep = rhs.rep;
count = rhs.count;
++*count;
}
return *this;
}
}
namespace accu
{
class string
{
public:
string(const char * other);
...
private:
class body;
counted_ptr<body> ptr;
};
}
// accu/counted_ptr.hpp #ifndef ACCU_COUNTED_PTR_INCLUDED #define ACCU_COUNTED_PTR_INCLUDED namespace accu { template<typename type> class counted_ptr { public: // create/copy/destroy counted_ptr(type * other); counted_ptr(const counted_ptr & other); counted_ptr & operator=(const counted_ptr & rhs); ~counted_ptr(); public: // access type * operator->() const; type & operator*() const; type * raw() const throw(); private: // precondition for -> and * void check_not_null_ptr() const; private: // plumbing void increment() const; void decrement() const; private: // state type * ptr; int * count; }; } //include-all compilation model #include "accu/counted_ptr-template.hpp"
// accu/counted_ptr-template.hpp #if !defined ACCU_COUNTED_PTR_INCLUDED \ || defined ACCU_COUNTED_PTR_TEMPLATE_INCLUDED #error "include: " \ "counted_ptr-template.hpp must not be included directly" #endif #define ACCU_COUNTED_PTR_TEMPLATE_INCLUDED namespace accu // counted_ptr - create/copy/destroy { template<typename type> counted_ptr<type>::counted_ptr(type * other) : ptr(other) , count(new int(1)) { // empty } template<typename type> counted_ptr<type>::counted_ptr(const counted_ptr & other) : ptr(other.ptr) , count(other.count) { increment(); } template<typename type> counted_ptr<type> & counted_ptr<type>::operator=(const counted_ptr & rhs) { if (this != &rhs) { decrement(); ptr = rhs.ptr; count = rhs.count; increment(); } return *this; } template<typename type> counted_ptr<type>::~counted_ptr() { decrement(); } } namespace accu // counted_ptr - access { template<typename type> type * counted_ptr<type>::operator->() const { check_not_null_ptr(); return ptr; } template<typename type> type & counted_ptr<type>::operator*() const { check_not_null_ptr(); return *ptr; } template<typename type> type * counted_ptr<type>::raw() const throw() { return ptr; } } namespace accu // counted_ptr - preconditions { template<typename type> void counted_ptr<type>::check_not_null_ptr() const { if (ptr == 0) { throw logic_error("counted_ptr: null pointer"); } } } namespace accu // counted_ptr - plumbing { template<typename type> void counted_ptr<type>::increment() const { ++*count; } template<typename type> void counted_ptr<type>::decrement() const { if (--*count == 0) { delete ptr; delete count; } } }
if (--*count == 0)
{
delete ptr;
delete count;
}
The question is what happens if the destructor called via the delete ptr expression
throws an exception. The answer is that the memory pointed to by count won't be
reclaimed. This is because count is a raw pointer and not a fully constructed
class object. One way to solve this is simply to delete the integer first:
if (--*count == 0)
{
delete count;
delete ptr;
}
However, it might be useful to create a general reference_count class. There is also
an issue concerning the nature of ++ and --. If you are working with multi-threading
you need to be sure that ++ and -- are atomic. So let's create a reference_count class.
// accu/reference_count.hpp namespace accu { class reference_count { public: // create/copy/destroy reference_count(); reference_count(const reference_count & other); reference_count & operator=(const reference_count & rhs); ~reference_count(); public: // query bool is_unique() const; private: // plumbing void increment() const; void decrement() const; private: // state int * count; }; }
// accu/reference_count.cpp namespace accu { reference_count::reference_count() : count(new int(1)) { // empty } reference_count::reference_count(const reference_count & other) : count(other.count) { increment(); } reference_count & reference_count::operator=(const reference_count & rhs) { rhs.increment(); // NOTE decrement(); // this is count = rhs.count; // self-assignment safe return *this; } reference_count::~reference_count() { decrement(); } } namespace accu { bool reference_count::is_unique() const { return *count == 1; } } namespace accu { void reference_count::increment() const { ++*count; } void reference_count::decrement() const { if (--*count == 0) { delete count; } } }With this class counted_ptr simplifies somewhat.
// accu/counted_ptr.hpp ... namespace accu { class reference_count; template<typename type> class counted_ptr { public: counted_ptr(type * other); // default copy constructor okay counted_ptr & operator=(const counted_ptr & rhs); ~counted_ptr(); // ... private: type * ptr; reference_count count; }; }
// accu/counted_ptr-template.hpp ... namespace accu // counted_ptr - create/copy/destroy { template<typename type> counted_ptr<type>::counted_ptr(type * other) : ptr(other) , count() { // empty } template<typename type> counted_ptr<type> & counted_ptr<type>::operator=(const counted_ptr & rhs) { if (this != &rhs) { if (count.is_unique()) { delete ptr; } ptr = rhs.ptr; count = rhs.count; } return *this; } template<typename type> counted_ptr<type>::~counted_ptr() { if (count.is_unique()) { delete ptr; } } }Continuing the theme of exception safety, let's look at the constructor for counted_ptr. A typical use will be something like:
namespace accu
{
string::string(const char * other)
: ptr(new body(other))
{
// empty
}
}
This will call the counted_ptr constructor:
namespace accu
{
template<typename type>
counted_ptr<type>::counted_ptr(type * other)
: ptr(other)
, count()
{
// empty
}
}
Now ptr inside counted_ptr<> is a raw pointer. That means if the reference_count
constructor throws an exception (which it could) we will have a resource leak because
the argument to the counted_ptr constructor was a raw pointer created via the
new body(other) expression. I've thought about this and I can't see an elegant
solution. The best I can come up with is to split off the initialization of the
reference_count into a separate method. Like this:
namespace accu
{
template<typename type>
counted_ptr<type>::counted_ptr(type * other)
: ptr(other)
, count()
{
auto_ptr<type> resource(other);
count.initialize();
resource.reset(0);
}
}
namespace accu
{
class reference_count
{
public:
reference_count() throw();
void initialize();
private:
int * count;
};
}
namespace accu
{
reference_count::reference_count() throw()
: count(0)
{
// empty
}
void reference_count::initialize()
{
count = new int(1);
}
}
If anyone can see a "better" solution I'd appreciate an email.
namespace accu
{
template<typename integer>
class reference_count;
template<typename type, typename integer = int>
class counted_ptr
{
public:
// ...
private:
type * ptr;
reference_count<integer> count;
};
}
namespace accu
{
template<typename integer>
class reference_count
{
public:
// ...
private:
integer * count;
};
}
Remembering the problems hinted at with ++ and -- and multithreading you could
create a reference_count class based on an atomic integer for example:
class atomic_integer
{
public:
// ...
atomic_integer & operator++();
const atomic_integer operator++(int);
atomic_integer & operator--();
const atomic_integer operator--(int);
// ...
};
Or perhaps an integer with a limited range. For example one where an attempt to
decrement zero to minus one would cause an exception.
That's almost it for now. There is one thought I'll leave you with though. Right at the very start I declared operator[](size_t index) inside the string definition. How does this sit with the reference_counting?
string theory("hello");
string vest(theory);
theory[0] = 'C';
cout << vest << endl; // must print "hello" not "cello"
namespace // unnamed
{
template<typename type>
void check_not_null(type * ptr)
{
...
}
}
My motivation for not making this a method of pointer
That's all for now.
Cheers
Jon Jagger
jon@jaggersoft.com
References (also counted 1,2,3 :-)
[1] Ruminations on C++,
Andrew Koenig and Barbara Moo, Addison Wesley,
ISBN 0-201-4233-1,
Chapter 7. Handles: Part 2. Page 67.
[2] Scientific and Engineering C++,
John Barton and Lee Nackman,
Addison Wesley,
ISBN 0-201-53393-6,
Chapter 14 Pointer Classes, page 419
[3] The C++ Programming Language, 3rd edition,
Bjarne Stroustrup,
Addison Wesley,
ISBN 0-201-88954-4,
25.7 Handle Classes, page 782.