Publications
of Jon Jagger
jon@jaggersoft.com
Appeared in Overload 29, December 1998

counted_ptr<type> Revisited

Recap

In Overload 25 I looked at counted_ptr<type>. Specifically I used it to implement a string class which was able to share common state.
// accu/string.hpp
namespace accu
{
    class string
    {
    public:
        string(const char * other = "");
        ...
        char & operator[](size_t index);
        ...
    private:
        struct body;
        counted_ptr<body> ptr;
    };
}
...
Towards the end of that article I hinted at a problem with counted_ptr<type> that remained unresolved:
#include "accu/string.hpp"

using accu::string;

string theory("hello");
string vest(theory);
theory[0] = 'C';
cout << vest << endl; // must print "hello" and not "cello"
The basic problem is that sometimes you have to stop sharing data. What should happen when you attempt to change the shared data? There is no right or wrong answer. It depends on the context. It depends why you’re sharing. You have to decide whether the sharing is an externally visible feature or an internal implementation detail. For my fledgling string class I was sharing state because it gave lazy evaluation. It saved me having to make a deep copy of the state during a copy construction or a copy assignment. In other words it was an internal implementation detail.

const overloading

There are a number of ways to solve this problem. In case you haven't got the previous article (it was a while ago), and because it's not the focus of this article, I'll unwrap the counted pointer. To get things started, the first thing to notice is that we can make use of const overloading:
// accu/string.hpp
namespace accu
{
    class string
    {
    public:

        string(const char * other = "");
        ...	
              char & operator[](size_t index);
        const char & operator[](size_t index) const;
        ...
    private:

        char   * text;
        size_t * count;
    };
}
An alternative version of the const array-subscript operator could return a plain char (by value). There is not much to choose between the two, but you might consider using the one that gives the clearest (least obscure) error message when misused (if they differ). The implementation of the const version is simple and does not need to make a deep copy since the data cannot change. The implementation of the non-const version is not so simple since it may need to make a deep copy.
namespace accu // string
{
    string::string(const char * other)
      : text(new char[strlen(other) + 1])
      , count(new size_t(1))
    {
        strcpy(text, other);
    }

    char & string::operator[](size_t index)
    {
        bounds_check(index);
        unshare_state();
        return text[index];
    }
}
And string::unshare_state() will need to ensure its state is not shared by any other string objects. There are a number of details to remember when writing unshare_state. First there is no need to make a deep copy if the state is already unshared, in other words if the reference count is one. Secondly, if a deep copy is required then a new reference count will also need to be allocated. That's two allocations inside a single function. Ensuring such a function is exception safe can be tricky.
namespace accu
{
    void string::unshare_state()
    {
        if (*count != 1)
        {
            auto_ptr<size_t> new_count(new size_t(1));
            char * new_text = new char[strlen(text) + 1];
            strcpy(new_text, text);
            --*count;
            count = new_count.release();
            text = new_text;
        }
    }
}
With this in place we can now check the following example cases are well-behaved:
string writeable("hello");
string another(writeable);
writeable[0] = 'C';           // 0
cout << writeable << endl;    // 1
cout << another << endl;      // 2
const string readonly("Pat");
readonly[0] = 'C';            // 3
Line 1 should print "Cello" because writeable is modified in line 0. Line 2 should print "hello". Line 3 should give a compiler error. Fine.

But consider this...

writeable[0] = 'C';            // 0
cout << writeable[0] << endl;  // 4
The point to note is that a read access of a modifiable string will cause the method string::unshare_state() to be invoked. That's a pity seeing as the statement on line 4 is not modifying the string. Many of you will have read Jim Coplien's book Advanced C++ Programming Styles and Idioms and will know a way of solving this. The trick is not to return a "real" char reference from the non-const version of string::operator[] but to return something that looks like, acts like, feels like, smells like, and behaves like a char reference. A proxy. I'll call it char_reference [1].
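Before moving on to the proxy, the unwanted copy can be made visible with a small instrumented sketch. This is not the accu::string code itself: the class name cow_string, the deep_copies counter and the str() helper are assumptions made purely for illustration, std::unique_ptr stands in for auto_ptr, and bounds checking and copy assignment are left out.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for accu::string (illustrative names only).
class cow_string
{
public:
    cow_string(const char * other = "")
      : text(nullptr), count(nullptr)
    {
        // Hold the count in a smart pointer so it is freed if the
        // second allocation throws.
        std::unique_ptr<std::size_t> new_count(new std::size_t(1));
        text = new char[std::strlen(other) + 1];
        std::strcpy(text, other);
        count = new_count.release();
    }

    cow_string(const cow_string & other)
      : text(other.text), count(other.count)
    {
        ++*count; // share the state instead of copying it
    }

    cow_string & operator=(const cow_string &) = delete; // omitted for brevity

    ~cow_string()
    {
        if (--*count == 0) { delete[] text; delete count; }
    }

    char & operator[](std::size_t index)
    {
        unshare_state(); // cannot tell a read from a write
        return text[index];
    }

    const char & operator[](std::size_t index) const
    {
        return text[index]; // read only, so no copy is needed
    }

    std::string str() const { return text; }

    static int deep_copies; // instrumentation only, not part of the design

private:
    void unshare_state()
    {
        if (*count != 1)
        {
            std::unique_ptr<std::size_t> new_count(new std::size_t(1));
            char * new_text = new char[std::strlen(text) + 1];
            std::strcpy(new_text, text);
            --*count;
            count = new_count.release();
            text = new_text;
            ++deep_copies;
        }
    }

    char * text;
    std::size_t * count;
};

int cow_string::deep_copies = 0;
```

With this sketch, writing through a shared cow_string bumps deep_copies once, as intended. Merely reading writeable[0] from a shared, non-const cow_string bumps it again, whereas reading through a const reference does not.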

string::char_reference

namespace accu
{
    class string
    {
    public: // types

        class char_reference
        {
        public:
            ...
            void operator=(char new_value); 
            operator char() const;
            ...
        private:
            ...
        };

        char_reference operator[](size_t index);
        const char & operator[](size_t index) const;
        ...
    private:
        ...
    };
}
I've made the char_reference assignment operator a void function for simplicity of exposition. The return type is not the focus of this article. With this sleight of hand in place we can look at lines 0 and 4. Here's line 0 with the peel removed bit by bit:
writeable[0] = 'C';
writeable.operator[](0) = 'C';
writeable.operator[](0).operator=('C');
And here's the expression in line 4 with the peel removed bit by bit:
writeable[0] 
writeable.operator[](0)
writeable.operator[](0).operator char()

Implementing string::char_reference

Fine, but I'm going to explore this a little further: in particular, the implementation of the two char_reference methods. A first attempt might look like this (implemented inline to save space):
namespace accu
{
    class string
    {
    public: // types

        class char_reference
        {
        public:
            char_reference(char & it) : ch(it) {}

            void operator=(char new_value) 
            { 
                ch = new_value; 
            }

            operator char() const 
            { 
                return ch; 
            }
        private:
            char & ch;
        };

        ...
    };
}
However, this is flawed. Once again we need to remember to unshare the string state when it is being modified. Changing an element of a string is something that should be done by a method of string. Here's another attempt:
namespace accu
{
    class string
    {
    public: // types

        class char_reference
        {
        public:
            char_reference(string & s, size_t index);
            void operator=(char new_value); 
            operator char() const;    
        private:
            string & s;
            size_t index;
        };

        ...
    };
}
This leaves the interesting question of what methods of string the methods of char_reference should delegate to. The conversion operator can be implemented like this (in a conforming compiler):
namespace accu
{
    string::char_reference::operator char() const
    {
        const string & ro = s;
        return ro[index];
    }
}
Note that s must be used as a read only string reference, to avoid infinite recursion. But how can the assignment operator be implemented? Not like this, because we're back to infinite recursion again.
namespace accu
{
    void string::char_reference::operator=(char new_value)
    {
        s[index] = new_value; 
    }
}

A new string method: public or private?

One way to solve this is to create a new string method. For example:
namespace accu
{
    void string::assign(size_t index, char new_value)
    {
        bounds_check(index);
        unshare_state();
        text[index] = new_value;
    }
}
The question is whether to make string::assign public or private [2]. There are conflicting forces. On the one hand you might want to make it private, viewing it as an implementation detail. You might also want to make it private so that a string client has only one syntax for assignment. But how does char_reference gain access to this private method? A common solution is to use friendship:
namespace accu
{
    class string
    {
    public: // types

        class char_reference
        {
        public:
            ...
            void operator=(char new_value) 
            {
                s.assign(index, new_value);
            }
            ...
        };
        ...
    private:

        friend class char_reference;
        void assign(size_t index, char new_value);
        ...
    };
}
On the other hand you might consider that the cure is worse than the symptoms. Granting char_reference total friendship when limited friendship (to assign) was all that was required might be seen as something of a large sledge-hammer cracking a small nut. If this is your view, you'd probably make the primitive public, and accept a choice of assignment syntax.
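To see the private-plus-friend variant working end to end, here is a compact sketch. As before it is only an illustration under stated assumptions: cow_string, deep_copies and str() are hypothetical names, std::unique_ptr replaces auto_ptr, and bounds checking and copy assignment are omitted. Under C++11 and later rules a nested class already has member access to its enclosing class, so the friend declaration is kept here mainly to mirror the listing above.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for accu::string (illustrative names only).
class cow_string
{
public:
    class char_reference
    {
    public:
        char_reference(cow_string & str, std::size_t i)
          : s(str), index(i) {}

        void operator=(char new_value)
        {
            s.assign(index, new_value);  // write: may unshare
        }

        operator char() const
        {
            const cow_string & ro = s;   // select the const overload
            return ro[index];            // read: never unshares
        }
    private:
        cow_string & s;
        std::size_t index;
    };

    cow_string(const char * other = "")
      : text(nullptr), count(nullptr)
    {
        std::unique_ptr<std::size_t> new_count(new std::size_t(1));
        text = new char[std::strlen(other) + 1];
        std::strcpy(text, other);
        count = new_count.release();
    }

    cow_string(const cow_string & other)
      : text(other.text), count(other.count)
    {
        ++*count; // share the state
    }

    cow_string & operator=(const cow_string &) = delete; // omitted for brevity

    ~cow_string()
    {
        if (--*count == 0) { delete[] text; delete count; }
    }

    char_reference operator[](std::size_t index)
    {
        return char_reference(*this, index); // no unshare yet
    }

    const char & operator[](std::size_t index) const
    {
        return text[index];
    }

    std::string str() const { return text; }

    static int deep_copies; // instrumentation only

private:
    friend class char_reference;

    void assign(std::size_t index, char new_value)
    {
        unshare_state();
        text[index] = new_value;
    }

    void unshare_state()
    {
        if (*count != 1)
        {
            std::unique_ptr<std::size_t> new_count(new std::size_t(1));
            char * new_text = new char[std::strlen(text) + 1];
            std::strcpy(new_text, text);
            --*count;
            count = new_count.release();
            text = new_text;
            ++deep_copies;
        }
    }

    char * text;
    std::size_t * count;
};

int cow_string::deep_copies = 0;
```

In this sketch a read such as char c = writeable[0]; goes through operator char() and leaves deep_copies untouched, while writeable[0] = 'C'; goes through assign and makes exactly one deep copy when the state is shared.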

A new string method: public but uncallable!

However, there is an alternative. You can remove the friendship, and make string::assign public but uncallable! Bizarre [3]. The trick is to use an opaque type. You wouldn't do this in practice (there is a better solution which I'll cover in the next article) but I'll show it for interest's sake.
// string.hpp
namespace accu
{
    class string
    {
    public: // types

        class char_reference
        {
        public:
            ...
            void operator=(char new_value); 
            ...
        };

        char_reference operator[](size_t index);
        ...
    public: // but uncallable!

        struct position; // HERE
        void assign(position index, char new_value);

    private:
        ...
    };
}
// string.cpp
namespace accu
{
    struct string::position
    {
        size_t index;
    };
    ...
    void 
    string::char_reference::operator=(char new_value)
    {
        position p = { index };	
        s.assign(p, new_value);
    }
    ...
    void string::assign(position pos, char new_value)
    {
        bounds_check(pos.index);
        unshare_state();
        text[pos.index] = new_value;
    }
    ...
}
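The opaque-type trick can be tried out in a single file by putting the "header" part first and the "implementation" part second. The sketch below is only an approximation under stated assumptions: demo_string and str() are hypothetical names, the sharing machinery is stripped out (a std::string holds the text) so that only the access trick remains, and bounds checking is omitted. Because everything sits in one translation unit, the sketch shows that the arrangement compiles and works; it cannot show the "uncallable" part. That relies on clients seeing only the header, where position is an incomplete type they cannot create.

```cpp
#include <cstddef>
#include <string>

// ---- what would live in string.hpp ----
class demo_string
{
public:
    class char_reference
    {
    public:
        char_reference(demo_string & str, std::size_t i)
          : s(str), index(i) {}
        void operator=(char new_value); // defined where position is complete
        operator char() const;
    private:
        demo_string & s;
        std::size_t index;
    };

    demo_string(const char * other = "") : text(other) {}

    char_reference operator[](std::size_t index)
    {
        return char_reference(*this, index);
    }

    const char & operator[](std::size_t index) const
    {
        return text[index];
    }

    std::string str() const { return text; }

public: // but uncallable by clients that only see the header
    struct position;                          // opaque: declared only
    void assign(position pos, char new_value);

private:
    std::string text; // stand-in for the shared state
};

// ---- what would live in string.cpp ----
struct demo_string::position
{
    std::size_t index;
};

void demo_string::char_reference::operator=(char new_value)
{
    position p = { index };  // only code that sees the definition can do this
    s.assign(p, new_value);
}

demo_string::char_reference::operator char() const
{
    const demo_string & ro = s; // select the const overload
    return ro[index];
}

void demo_string::assign(position pos, char new_value)
{
    text[pos.index] = new_value;
}
```

Assignment through the proxy, as in s[0] = 'C';, reaches assign without any friendship, and reads still go through the const operator[].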

That's all for now.
Cheers
Jon Jagger
jon@jaggersoft.com

[1] Note that char_reference will also be valuable when implementing string::iterator::operator*()

[2] A long thread on ACCU.general recently boiled down to essentially this question.

[3] There is also another solution. It is possible to grant limited friendship. Mark Radford showed me how. Perhaps I'll cover that in another article.