// accu/string.hpp
namespace accu
{
class string
{
public:
string(const char * other = "");
...
char & operator[](size_t index);
...
private:
struct body;
counted_ptr<body> ptr;
};
}
...
Towards the end of that article I hinted at a problem with counted_ptr<type> that
remained unresolved:
#include "accu/string.hpp"
using accu::string;
string theory("hello");
string vest(theory);
theory[0] = 'C';
cout << vest << endl; // must print "hello" and not "cello"
The basic problem is that sometimes you have to stop sharing data. What should happen
when you attempt to change the shared data? There is no right or wrong answer.
It depends on the context. It depends why you’re sharing. You have to decide
whether the sharing is an externally visible feature or an internal implementation
detail. For my fledgling string class I was sharing state because it gave lazy
evaluation. It saved me having to make a deep copy of the state during a copy
construction or a copy assignment. In other words it was an internal implementation
detail.
// accu/string.hpp
namespace accu
{
class string
{
public:
string(const char * other = "");
...
char & operator[](size_t index);
const char & operator[](size_t index) const;
...
private:
char * text;
size_t * count;
};
}
An alternative version of the const array-subscript operator could return a
plain char (by value). There is not much to choose between the two, but you might
consider using the one that gives the clearest (least obscure) error message when
misused (if they differ). The implementation of the const version is simple and
does not need to make a deep copy since the data cannot change. The implementation
of the non-const version is not so simple since it may need to make a deep copy.
namespace accu // string
{
string::string(const char * other)
: text(new char[strlen(other) + 1])
, count(new size_t(1))
{
strcpy(text, other);
}
char & string::operator[](size_t index)
{
bounds_check(index);
unshare_state();
return text[index];
}
}
And string::unshare_state() will need to ensure its state is not shared by any other
string objects. There are a number of details to remember when writing unshare_state.
First there is no need to make a deep copy if the state is already unshared, in other
words if the reference count is one. Secondly, if a deep copy is required then a new
reference count will also need to be allocated. That's two allocations inside a
single function. Ensuring such a function is exception safe can be tricky.
namespace accu
{
void string::unshare_state()
{
if (*count != 1)
{
auto_ptr<size_t> new_count(new size_t(1));
char * new_text = new char[strlen(text) + 1];
strcpy(new_text, text);
--*count;
count = new_count.release();
text = new_text;
}
}
}
With this in place we can now check the following example cases are well-behaved:
string writeable("hello");
string another(writeable);
writeable[0] = 'C'; // 0
cout << writeable << endl; // 1
cout << another << endl; // 2
const string readonly("Pat");
readonly[0] = 'C'; // 3
Line 1 should print "Cello" because writeable is modified in line 0.
Line 2 should print "hello".
Line 3 should give a compiler error.
Fine.
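The cases above can be tried against a self-contained sketch that assembles the fragments shown so far. This is only one possible assembly, not the definitive class: the names sketch and cow_string are hypothetical, the copy constructor, copy assignment, destructor, release() and c_str() are assumptions filled in for illustration, and std::unique_ptr stands in for the auto_ptr used above because auto_ptr has since been removed from the language.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <stdexcept>

namespace sketch // hypothetical namespace, not the article's accu
{
    class cow_string // hypothetical name for this sketch
    {
    public:
        cow_string(const char * other = "")
            : text(nullptr), count(nullptr)
        {
            // Allocate the count first and guard it, so that a failure
            // in the second allocation does not leak the first.
            std::unique_ptr<std::size_t> new_count(new std::size_t(1));
            text = new char[std::strlen(other) + 1];
            std::strcpy(text, other);
            count = new_count.release();
        }

        // Copying shares the state and bumps the reference count.
        cow_string(const cow_string & other)
            : text(other.text), count(other.count)
        {
            ++*count;
        }

        cow_string & operator=(const cow_string & other)
        {
            if (count != other.count) // already sharing means nothing to do
            {
                release();
                text = other.text;
                count = other.count;
                ++*count;
            }
            return *this;
        }

        ~cow_string() { release(); }

        // Non-const access may be used to write, so unshare first.
        char & operator[](std::size_t index)
        {
            bounds_check(index);
            unshare_state();
            return text[index];
        }

        // Const access cannot change the data, so no deep copy is needed.
        const char & operator[](std::size_t index) const
        {
            bounds_check(index);
            return text[index];
        }

        const char * c_str() const { return text; } // assumed helper

    private:
        void bounds_check(std::size_t index) const
        {
            if (index >= std::strlen(text))
                throw std::out_of_range("cow_string index");
        }

        // Drop this object's claim on the shared state.
        void release()
        {
            if (--*count == 0)
            {
                delete[] text;
                delete count;
            }
        }

        void unshare_state()
        {
            if (*count != 1)
            {
                std::unique_ptr<std::size_t> new_count(new std::size_t(1));
                char * new_text = new char[std::strlen(text) + 1];
                std::strcpy(new_text, text);
                --*count; // nothing below can throw
                count = new_count.release();
                text = new_text;
            }
        }

        char * text;
        std::size_t * count;
    };
}
```

With this sketch, writing writeable[0] = 'C' after copying writeable into another leaves another holding "hello", and readonly[0] = 'C' on a const cow_string is rejected by the compiler because the const overload returns a const char &.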
But consider this...
writeable[0] = 'C'; // 0
cout << writeable[0] << endl; // 4
The point to note is that a read access of a modifiable string will cause the method
string::unshare_state() to be invoked. That's a pity seeing as the statement on line 4
is not modifying the string. Many of you will have read Jim Coplien's book Advanced
C++ Programming Styles and Idioms and will know a way of solving this. The trick is
not to return a "real" char reference from the non const version of
string::operator[] but to return something that looks like, acts like, feels like,
smells like, and behaves like a char reference. A proxy. I'll call it
char_reference [1].
namespace accu
{
class string
{
public: // types
class char_reference
{
public:
...
void operator=(char new_value);
operator char() const;
...
private:
...
};
char_reference operator[](size_t index);
const char & operator[](size_t index) const;
...
private:
...
};
}
I've made the char_reference assignment operator a void function for simplicity of
exposition. The return type is not the focus of this article. With this sleight of
hand in place we can look at lines 0 and 4. Here's line 0 with the peel removed bit
by bit:
writeable[0] = 'C';
writeable.operator[](0) = 'C';
writeable.operator[](0).operator=('C');
And here's the expression in line 4 with the peel removed bit by bit:
writeable[0]
writeable.operator[](0)
writeable.operator[](0).operator char()
namespace accu
{
class string
{
public: // types
class char_reference
{
public:
char_reference(char & it) : ch(it) {}
void operator=(char new_value)
{
ch = new_value;
}
operator char() const
{
return ch;
}
private:
char & ch;
};
...
};
}
However, this is flawed. Once again we need to remember to unshare the string
state when it is being modified. Changing an element of a string is something
that should be done by a method of string. Here's another attempt:
namespace accu
{
class string
{
public: // types
class char_reference
{
public:
char_reference(string & s, size_t index);
void operator=(char new_value);
operator char() const;
private:
string & s;
size_t index;
};
...
};
}
This leaves the interesting question of what methods of string the methods of
char_reference should delegate to. The conversion operator can be implemented
like this (in a conforming compiler):
namespace accu
{
string::char_reference::operator char() const
{
const string & ro = s;
return ro[index];
}
}
Note that s must be used as a read only string reference, to avoid infinite
recursion. But how can the assignment operator be implemented? Not like this,
because we're back to infinite recursion again.
namespace accu
{
void string::char_reference::operator=(char new_value)
{
s[index] = new_value;
}
}
Instead the assignment operator can delegate to a separate string method that does
the bounds check and the unsharing itself:
namespace accu
{
void string::assign(size_t index, char new_value)
{
bounds_check(index);
unshare_state();
text[index] = new_value;
}
}
The question is whether to make string::assign public or private [2]. There are
conflicting forces. On the one hand you might want to make it private, viewing
it as an implementation detail. You might also want to make it private so that
a string client has only one syntax for assignment. But how does char_reference
gain access to this private method? A common solution is to use friendship:
namespace accu
{
class string
{
public: // types
class char_reference
{
public:
...
void operator=(char new_value)
{
s.assign(index, new_value);
}
...
};
...
private:
friend class char_reference;
void assign(size_t index, char new_value);
...
};
}
On the other hand you might consider that the cure is worse than the symptoms.
Granting char_reference total friendship when limited friendship (to assign) was
all that was required might be seen as something of a large sledge-hammer cracking
a small nut. If this is your view, you'd probably make the primitive public, and
accept a choice of assignment syntax.
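As an illustration of the public-primitive choice, here is a minimal sketch of the proxy delegating to a public assign. It is not the article's class: the names sketch and cow_string are hypothetical, std::shared_ptr<std::string> is swapped in for the hand-written text and count members to keep the example short, and shares_state_with() and str() are helpers invented purely so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

namespace sketch // hypothetical namespace, not the article's accu
{
    class cow_string // hypothetical name for this sketch
    {
    public: // types
        class char_reference
        {
        public:
            char_reference(cow_string & s, std::size_t index)
                : s(s), index(index) {}

            // Writing goes through the string, which does the unsharing.
            void operator=(char new_value)
            {
                s.assign(index, new_value);
            }

            // Reading uses a read-only view, so the const operator[]
            // is chosen and no unsharing (or recursion) takes place.
            operator char() const
            {
                const cow_string & ro = s;
                return ro[index];
            }

        private:
            cow_string & s;
            std::size_t index;
        };

    public:
        cow_string(const char * other = "")
            : state(std::make_shared<std::string>(other)) {}

        char_reference operator[](std::size_t index)
        {
            return char_reference(*this, index);
        }

        const char & operator[](std::size_t index) const
        {
            const std::string & ro = *state;
            return ro.at(index); // at() performs the bounds check
        }

        // The public primitive: check, unshare, then modify.
        void assign(std::size_t index, char new_value)
        {
            if (index >= state->size())
                throw std::out_of_range("cow_string index");
            if (state.use_count() != 1) // shared, so take a private copy
                state = std::make_shared<std::string>(*state);
            (*state)[index] = new_value;
        }

        // Assumed helpers, present only to make the behaviour visible.
        bool shares_state_with(const cow_string & other) const
        {
            return state == other.state;
        }
        const std::string & str() const { return *state; }

    private:
        std::shared_ptr<std::string> state;
    };
}
```

In this sketch a client can write either s[0] = 'C' or s.assign(0, 'C'), which is the choice of assignment syntax mentioned above. Reading s[0] from a modifiable string leaves the state shared; only an assignment triggers the copy.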
// string.hpp
namespace accu
{
class string
{
public: // types
class char_reference
{
public:
...
void operator=(char new_value);
...
};
char_reference operator[](size_t index);
...
public: // but uncallable!
struct position; // HERE
void assign(position index, char new_value);
private:
...
};
}
// string.cpp
namespace accu
{
struct string::position
{
size_t index;
};
...
void
string::char_reference::operator=(char new_value)
{
position p = { index };
s.assign(p, new_value);
}
...
void string::assign(position pos, char new_value)
{
bounds_check(pos.index);
unshare_state();
text[pos.index] = new_value;
}
...
}
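The two listings above can be condensed into a single-file sketch to show how the pieces fit together. As I read it, the idea is that position is only declared in the header, so code that sees only the header cannot create one and therefore cannot call assign, while the source file, which defines position, can. Treat the following as an assumption-laden illustration: the names sketch and cow_string are hypothetical, std::shared_ptr<std::string> replaces the hand-written state, str() is an invented helper, and because everything sits in one translation unit here the "uncallable" property only holds once the two halves are split into separate files.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

// ---- the part that would live in the header ----
namespace sketch // hypothetical namespace, not the article's accu
{
    class cow_string // hypothetical name for this sketch
    {
    public: // types
        class char_reference
        {
        public:
            char_reference(cow_string & s, std::size_t index)
                : s(s), index(index) {}
            void operator=(char new_value); // defined where position is complete
            operator char() const;
        private:
            cow_string & s;
            std::size_t index;
        };

    public:
        cow_string(const char * other = "");
        char_reference operator[](std::size_t index);
        const char & operator[](std::size_t index) const;
        const std::string & str() const; // assumed helper

    public: // but uncallable from code that sees only this part
        struct position; // declared, deliberately not defined here
        void assign(position pos, char new_value);

    private:
        std::shared_ptr<std::string> state;
    };
}

// ---- the part that would live in the source file ----
namespace sketch
{
    struct cow_string::position
    {
        std::size_t index;
    };

    cow_string::cow_string(const char * other)
        : state(std::make_shared<std::string>(other)) {}

    cow_string::char_reference cow_string::operator[](std::size_t index)
    {
        return char_reference(*this, index);
    }

    const char & cow_string::operator[](std::size_t index) const
    {
        const std::string & ro = *state;
        return ro.at(index); // at() performs the bounds check
    }

    const std::string & cow_string::str() const { return *state; }

    void cow_string::char_reference::operator=(char new_value)
    {
        position p = { index }; // only possible where position is defined
        s.assign(p, new_value);
    }

    cow_string::char_reference::operator char() const
    {
        const cow_string & ro = s; // read-only view avoids recursion
        return ro[index];
    }

    void cow_string::assign(position pos, char new_value)
    {
        if (pos.index >= state->size())
            throw std::out_of_range("cow_string index");
        if (state.use_count() != 1) // shared, so take a private copy
            state = std::make_shared<std::string>(*state);
        (*state)[pos.index] = new_value;
    }
}
```

Client code still writes s[1] = 'a' as before; the only syntax available to it is the subscript form, since it has no way to name a complete position.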
That's all for now.
Cheers
Jon Jagger
jon@jaggersoft.com
[1] Note that char_reference will also be valuable when implementing string::iterator::operator*()
[2] There was a long thread on ACCU.general essentially boiling down to this recently
[3] There is also another solution. It is possible to grant limited friendship. Mark Radford showed me how. Perhaps I'll cover that in another article.