Publications
of Jon Jagger
jon@jaggersoft.com
Appeared in Overload 24, February 1998

pointer<type>

The built-in pointer is very powerful. And very dangerous. It's powerful because it can be used for many purposes. It's dangerous for the same reason. For example:
class dodgy { ... };

void very(dodgy * ptr) 
{ 
    ptr++; 
}
Incrementing (or decrementing) a built-in pointer that doesn't point into an array makes no sense. The built-in pointer type is too powerful [1]. In C++ we can rectify this by creating different pointer classes for different pointer uses. I hope to cover specific pointer classes in coming articles but for now I'm just going to get the ball rolling with a general look at a pointer class.

A good place to start is a minimal pointer class. What is the minimal interface for a pointer class? To answer that let's look at a minimal interface for a built-in pointer.

class base {};
base object;

base * ptr = &object;	// initialisation
ptr = &object;		// assignment
*ptr;		        // dereference: *
ptr->method();		// dereference: ->
if (ptr != ptr);	// comparison: !=
if (ptr == ptr);	// comparison: ==
if (ptr);		// comparison: != null ptr, implicit	
if (!ptr);		// comparison: == null ptr, implicit	
Based on this, a first cut might be:
// accu/pointer.hpp
...
namespace accu
{
    template<typename type>
    class pointer
    {
    public: // construct/copy/destroy

        pointer(type * initial_ptr = 0);
        // default copy constructor
        // default copy assignment operator
        // default destructor

    public: // dereference

        type & operator*() const; 
        type * operator->() const;

    public: // conversions

        operator bool () const;	
        bool operator!() const;	

    private: // state

        type * ptr;
    };

    // relational operators

    template<typename type>
    bool operator==(const pointer<type> & lhs, 
                    const pointer<type> & rhs);
 
    template<typename type>
    bool operator!=(const pointer<type> & lhs, 
                    const pointer<type> & rhs);
}	
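To make the first cut concrete, here is a minimal sketch of how its members might be defined and exercised. It is only an illustration under stated assumptions: the namespace `sketch`, the `method()` body and the helper `behaves_like_raw_pointer()` are hypothetical names, the members are defined inline for brevity, and the relational operators are written as in-class friends simply so that they can read the private `ptr`.

```cpp
namespace sketch   // hypothetical namespace, not the article's accu
{
    template<typename type>
    class pointer
    {
    public: // construct/copy/destroy

        pointer(type * initial_ptr = 0) : ptr(initial_ptr) {}

    public: // dereference

        type & operator*() const { return *ptr; }
        type * operator->() const { return ptr; }

    public: // conversions

        operator bool() const { return ptr != 0; }
        bool operator!() const { return ptr == 0; }

    public: // relational operators
        // Assumption: in-class friends, so they can read the private ptr.

        friend bool operator==(const pointer & lhs, const pointer & rhs)
        {
            return lhs.ptr == rhs.ptr;
        }

        friend bool operator!=(const pointer & lhs, const pointer & rhs)
        {
            return lhs.ptr != rhs.ptr;
        }

    private: // state

        type * ptr;
    };

    struct base
    {
        int method() const { return 42; }
    };

    // Walks the same checklist as the raw pointer example.
    bool behaves_like_raw_pointer()
    {
        base object;

        pointer<base> ptr = &object;        // initialisation
        pointer<base> empty;                // null by default
        ptr = &object;                      // assignment
        pointer<base> same = ptr;           // compiler generated copy

        bool ok = (&*ptr == &object);       // dereference: *
        ok = ok && (ptr->method() == 42);   // dereference: ->
        ok = ok && (ptr != empty);          // comparison: !=
        ok = ok && (ptr == same);           // comparison: ==
        ok = ok && ptr;                     // comparison: != null ptr, implicit
        ok = ok && !empty;                  // comparison: == null ptr, implicit
        return ok;
    }
}
```

Note that the sketch offers no `++` or `--`, so the `very(dodgy *)` style of mistake has nothing to call.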
This is almost the minimal interface I have in mind, but not quite. What about public inheritance?
class deriving : public base {};
deriving lesson;

base * raw = &lesson;	// initialisation
raw = &lesson;		// assignment
We need to ensure the pointer object can be initialised/assigned from a pointer<derived> object.
pointer<deriving> source = &lesson;

pointer<base> ptr = source;    // initialisation
ptr = source;                  // assignment
This can be done. It requires two template member functions: a template copy constructor and a template copy assignment operator.
// accu/pointer.hpp
...
namespace accu
{
    template<typename type>
    class pointer
    {
    public:
        ...
        template<class derived>
        pointer(const pointer<derived> & other);
        ...
        template<class derived>
        pointer & operator=(const pointer<derived> & rhs);
        ...
    };
}
There are a couple of minor points of interest. Firstly, I have used <class derived> and not <typename derived>. Secondly, pointer<type> and pointer<derived> are separate types. pointer<type> has no access to the private data of pointer<derived>. For example, the following will not compile as a definition of the template copy constructor:
// accu/pointer-template.hpp
...
namespace accu
{
    ...
    template<typename type>
    template<class derived>
    pointer<type>::pointer(const pointer<derived> & other)
        : ptr(other.ptr)
    {
        // all done
    }
}
I will return to this problem. Before I do, I'd like to cover a subtlety involving the template copy constructor. The C++ standard clearly states that a template constructor is never a copy constructor [2]. In other words, the presence of a template constructor does not suppress the implicit declaration of the copy constructor. A similar rule applies for a template copy assignment operator. Let's take a moment to think about those implicit declarations. There's the copy constructor, the copy assignment operator and the destructor. Be clear what these invisible compiler generated methods are:
// accu/pointer-template.hpp
...
namespace accu 
{
    ...
    template<typename type>
    pointer<type>::pointer(const pointer & other)
        : ptr(other.ptr)
    {
        // all done
    }
    ...
    template<typename type>
    pointer<type> & pointer<type>::operator=(const pointer & rhs)
    {
        ptr = rhs.ptr;
        return *this;
    }
    ...
    template<typename type>
    pointer<type>::~pointer()
    {
        // all done
    }
    ...
}
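The rule that a template constructor is never a copy constructor can be observed directly. The following is a small sketch, not part of the accu class: the flag `via_template`, the accessor `made_by_template()` and the two helper functions are hypothetical instrumentation, and a `get()` accessor is assumed so that the template constructor can read the other object's pointer.

```cpp
namespace sketch   // hypothetical namespace, not the article's accu
{
    template<typename type>
    class pointer
    {
    public:
        pointer(type * initial_ptr = 0)
            : ptr(initial_ptr), via_template(false)
        {
        }

        // Template constructor: it does NOT replace the implicit
        // copy constructor for pointer<type> itself.
        template<class derived>
        pointer(const pointer<derived> & other)
            : ptr(other.get()), via_template(true)
        {
        }

        type * get() const { return ptr; }
        bool made_by_template() const { return via_template; }

    private:
        type * ptr;
        bool via_template;   // instrumentation for this sketch only
    };

    struct base {};
    struct deriving : base {};

    // Copying a pointer<base> from a pointer<base> uses the implicit
    // copy constructor, which copies the flag (false) from the source.
    bool same_type_copy_uses_template()
    {
        base object;
        pointer<base> original(&object);
        pointer<base> copy(original);
        return copy.made_by_template();
    }

    // Copying a pointer<base> from a pointer<deriving> has to use
    // the template constructor, which sets the flag to true.
    bool derived_copy_uses_template()
    {
        deriving lesson;
        pointer<deriving> original(&lesson);
        pointer<base> converted(original);
        return converted.made_by_template();
    }
}
```

In this sketch the first helper reports false and the second reports true: the same-type copy never goes near the template.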
There are two things about these compiler generated implicit methods you might question. Firstly, because they are implicit they're not, well, explicit. There is something to be said for having them in hard, visible ink in the interface. Especially in a teaching environment. Or if you want to single step while debugging. Secondly, they may not be quite what you want. It is impossible for any of these three to generate an exception (just as it is impossible in the corresponding raw pointer expressions) yet they do not have a throw() specification. For me these two factors tip the balance. Here's the revised class definition:
// accu/pointer.hpp
#ifndef ACCU_POINTER_INCLUDED
#define ACCU_POINTER_INCLUDED

namespace accu
{
    template<typename type>
    class pointer
    {
    public: // 'tors

        pointer(type * initial_ptr = 0) throw();
        pointer(const pointer & other) throw();
       ~pointer() throw();

        template<class derived>
            pointer(const pointer<derived> & other) throw();
        
    public: // assignment
            
        pointer & operator=(const pointer & rhs) throw();
        
        template<class derived>
            pointer & operator=(const pointer<derived> & rhs) throw();

    public: // dereference

        type & operator*() const; 
        type * operator->() const;

    public: // conversions

        operator bool () const throw();
        bool operator!() const throw();

    private: // state

        type * ptr;
    };
    ...
}

namespace accu // relational operators
{
    template<typename type>
    bool operator==(const pointer<type> & lhs, 
                    const pointer<type> & rhs) throw();

    template<typename type>
    bool operator!=(const pointer<type> & lhs, 
                    const pointer<type> & rhs) throw();
}

#include "accu/pointer-template.hpp"

#endif	
I have left the operator* and operator-> declarations without a throw() specification. The bodies of these operators are ideal places to check for a null pointer and throw an appropriate exception. However, what is an appropriate exception? The C++ standard basically gives a choice of two: logic_error and runtime_error. A logic_error is an error that the user could (at least in theory) avoid. Dereferencing a null pointer<type> is avoidable since the user can make the check themselves, for example via the bool conversion operator. A reasonable exception is therefore a logic_error.

One way to implement this would be to create a private method called check_not_null() which operator* and operator-> could then call. However, check_not_null() would then appear in the interface. Private but still visible. Really it is part of the implementation. I prefer my interface files to be as clean as possible. Also, there is still the problem of how to implement the template copy constructor, the template assignment operator and the global comparison operators. One solution is to provide a simple auto_ptr-like accessor called get(). It might be important to allow easy access to the underlying raw pointer anyway (to use dynamic_cast for example).
// accu/pointer-template.hpp
#if !defined(ACCU_POINTER_INCLUDED) \
    || defined(ACCU_POINTER_TEMPLATE_INCLUDED)
#error include "accu/pointer.hpp" : \
    accu/pointer-template.hpp must not be included directly
#endif
...
#define ACCU_POINTER_TEMPLATE_INCLUDED
...
#include <stdexcept>
...
namespace 
{
    // Ooops. Putting this template function in
    // an anonymous namespace breaks the
    // One Definition Rule. It's best to make it
    // a private static method after all

    template<typename type>
    void check_not_null(type * ptr)
    {
        if (!ptr) 
        {	
            throw std::logic_error("pointer: null");
        }
    } 
}

namespace accu // construct/copy/destroy
{
    ...
    template<typename type>
      template<class derived>
    pointer<type>::pointer(const pointer<derived> & other) throw()
        : ptr(other.get())
    {
        // all done    
    }
    ...
    template<typename type>
      template<class derived>
    pointer<type> & 
    pointer<type>::operator=(const pointer<derived> & rhs) throw()
    {
        ptr = rhs.get();
        return *this;
    }
    ...
}

namespace accu // dereference
{
    template<typename type>
    type & pointer<type>::operator*() const
    {
        check_not_null(ptr);
        return *ptr;
    }

    template<typename type>
    type * pointer<type>::operator->() const
    {
        check_not_null(ptr);
        return ptr;
    }

    template<typename type>
    type * pointer<type>::get() const	
    {
        return ptr;
    }
}

namespace accu // comparison
{
    template<typename type>
    bool operator==(const pointer<type> & lhs, 
                    const pointer<type> & rhs) throw()
    {
        return lhs.get() == rhs.get();
    }

    template<typename type>
    bool operator!=(const pointer<type> & lhs, 
                    const pointer<type> & rhs) throw()
    {
        return lhs.get() != rhs.get();
    }
}
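Pulling the pieces together, the following is a compact sketch of how the class might look with `get()` declared and with `check_not_null()` as a private static method, as the comment in the listing suggests. It is an illustration under assumptions rather than the definitive accu header: it lives in a hypothetical namespace `sketch`, defines members inline, and uses `noexcept` and `override` (so it assumes a C++11 or later compiler) where the article uses `throw()`. The classes `base` and `deriving` and the two helper functions are made up for the demonstration.

```cpp
#include <stdexcept>

namespace sketch   // hypothetical namespace, not the article's accu
{
    template<typename type>
    class pointer
    {
    public: // 'tors
        pointer(type * initial_ptr = 0) noexcept : ptr(initial_ptr) {}

        template<class derived>
        pointer(const pointer<derived> & other) noexcept
            : ptr(other.get()) {}

    public: // assignment
        template<class derived>
        pointer & operator=(const pointer<derived> & rhs) noexcept
        {
            ptr = rhs.get();
            return *this;
        }

    public: // dereference
        type & operator*() const { check_not_null(ptr); return *ptr; }
        type * operator->() const { check_not_null(ptr); return ptr; }
        type * get() const noexcept { return ptr; }

    public: // conversions
        operator bool() const noexcept { return ptr != 0; }
        bool operator!() const noexcept { return ptr == 0; }

    private:
        static void check_not_null(type * p)
        {
            if (!p)
            {
                throw std::logic_error("pointer: null");
            }
        }

        type * ptr;
    };

    template<typename type>
    bool operator==(const pointer<type> & lhs,
                    const pointer<type> & rhs) noexcept
    {
        return lhs.get() == rhs.get();
    }

    template<typename type>
    bool operator!=(const pointer<type> & lhs,
                    const pointer<type> & rhs) noexcept
    {
        return lhs.get() != rhs.get();
    }

    struct base
    {
        virtual ~base() {}
        virtual int method() const { return 1; }
    };

    struct deriving : base
    {
        int method() const override { return 2; }
    };

    // Initialise and assign a pointer<base> from a pointer<deriving>,
    // then call through it.
    int call_through_base()
    {
        deriving lesson;
        pointer<deriving> narrow(&lesson);
        pointer<base> wide(narrow);     // template copy constructor
        pointer<base> other;
        other = narrow;                 // template copy assignment
        return (wide == other) ? wide->method() : -1;
    }

    // Dereferencing a null pointer<base> reports a logic_error.
    bool null_dereference_throws()
    {
        pointer<base> empty;
        try
        {
            (void)*empty;
        }
        catch (const std::logic_error &)
        {
            return true;
        }
        return false;
    }
}
```

Because both template members go through `get()`, neither needs access to the private data of `pointer<derived>`.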
One issue that still remains unmentioned is whether the constructor should be explicit or not. Consider the consequences if the constructor were made explicit:
void oops(const pointer<base> & lhs)
{
    pointer<base> local(lhs);    // WORKS, 1
    if (!lhs) ...                // WORKS, 2
    if (lhs) ...                 // WORKS, 3

    pointer<base> local = lhs;   // FAILS, have to use 1
    if (0 == lhs) ...            // FAILS, have to use 2
    if (lhs == 0) ...            // FAILS, have to use 2
    if (0 != lhs) ...            // FAILS, have to use 3
    if (lhs != 0) ...            // FAILS, have to use 3
}
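The core effect of `explicit` can be checked at compile time. This sketch assumes a C++11 or later compiler (for `<type_traits>`) and assumes that `explicit` is applied to the raw pointer constructor only. The two class names and the four constants are hypothetical, introduced purely for the comparison.

```cpp
#include <type_traits>

namespace sketch   // hypothetical namespace, not the article's accu
{
    template<typename type>
    class implicit_pointer
    {
    public:
        implicit_pointer(type * initial_ptr = 0) : ptr(initial_ptr) {}
        type * get() const { return ptr; }
    private:
        type * ptr;
    };

    template<typename type>
    class explicit_pointer
    {
    public:
        explicit explicit_pointer(type * initial_ptr = 0) : ptr(initial_ptr) {}
        type * get() const { return ptr; }
    private:
        type * ptr;
    };

    struct base {};

    // Direct initialisation from a raw pointer, as in p(raw),
    // is available either way.
    const bool implicit_direct =
        std::is_constructible<implicit_pointer<base>, base *>::value;
    const bool explicit_direct =
        std::is_constructible<explicit_pointer<base>, base *>::value;

    // Implicit conversion from a raw pointer, as needed by p = raw
    // or by passing raw where a pointer object is expected,
    // is lost once the constructor is explicit.
    const bool implicit_converts =
        std::is_convertible<base *, implicit_pointer<base> >::value;
    const bool explicit_converts =
        std::is_convertible<base *, explicit_pointer<base> >::value;
}
```

Only `explicit_converts` is false here: an explicit constructor still allows direct initialisation but removes the implicit conversion from a raw pointer.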
Is this better? It's perhaps a matter of personal preference. But there is a difference. Which is more explicit?
if (lhs) ...		
or
if (lhs != 0) ...
I think the answer largely depends on the level you're viewing from. You might argue that the latter is more explicit because it's explicitly comparing lhs to the null pointer. But is it? Zero is not the null pointer. It's zero! By the same token you might argue that the former is more explicit because it's not explicitly comparing lhs to zero. At a higher level you can read if (lhs) as "if lhs is true" or "if lhs is valid". Whatever you feel, ultimately even if the constructor is explicit, you can make all versions of the comparisons work. You just have to provide global operators. For example:
namespace accu
{
    template<typename type>
    bool operator==(const pointer<type> & lhs, 
                    const type * rhs) throw()
    {
        return lhs.get() == rhs;
    }

    template<typename type>
    bool operator!=(const pointer<type> & lhs, 
                    const type * rhs) throw()
    {
        return lhs.get() != rhs;
    }

    template<typename type>
    bool operator==(const type * lhs, 
                    const pointer<type> & rhs) throw()
    {
        return lhs == rhs.get();
    }

    template<typename type>
    bool operator!=(const type * lhs, 
                    const pointer<type> & rhs) throw()
    {
        return lhs != rhs.get();
    }
}
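As a usage sketch, the four mixed operators can be exercised against a cut-down class with an explicit constructor. Everything here beyond the operators themselves is an assumption for illustration: the namespace `sketch`, the reduced class (only a constructor and `get()`), and the helper `compares_with_raw_pointers()`. The raw pointers compared against are typed variables, from which `type` can be deduced.

```cpp
namespace sketch   // hypothetical namespace, not the article's accu
{
    // Cut-down class for this sketch: explicit constructor and get() only.
    template<typename type>
    class pointer
    {
    public:
        explicit pointer(type * initial_ptr = 0) : ptr(initial_ptr) {}
        type * get() const { return ptr; }
    private:
        type * ptr;
    };

    template<typename type>
    bool operator==(const pointer<type> & lhs, const type * rhs)
    {
        return lhs.get() == rhs;
    }

    template<typename type>
    bool operator!=(const pointer<type> & lhs, const type * rhs)
    {
        return lhs.get() != rhs;
    }

    template<typename type>
    bool operator==(const type * lhs, const pointer<type> & rhs)
    {
        return lhs == rhs.get();
    }

    template<typename type>
    bool operator!=(const type * lhs, const pointer<type> & rhs)
    {
        return lhs != rhs.get();
    }

    struct base {};

    // Compares pointer objects with raw pointers on either side.
    bool compares_with_raw_pointers()
    {
        base object;
        base * raw = &object;
        base * null_raw = 0;

        pointer<base> ptr(raw);
        pointer<base> empty;

        return ptr == raw && raw == ptr
            && ptr != null_raw && null_raw != ptr
            && empty == null_raw && null_raw == empty;
    }
}
```

No conversion from `base *` to `pointer<base>` takes place in any of these comparisons, so the explicit constructor is not an obstacle.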
I'll just leave you with one final thought. What you don't implement (e.g. ++ in pointer) can be as important as what you do.
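That absence can even be stated in code. The sketch below assumes a C++11 or later compiler and uses a hypothetical detection trait, `is_incrementable`, which is not part of the article's class or of the standard library. It reports whether `++x` compiles for an lvalue of a given type. The `pointer` shown is a cut-down stand-in with no bool conversion.

```cpp
#include <type_traits>
#include <utility>

namespace sketch   // hypothetical namespace, not the article's accu
{
    // Cut-down stand-in: deliberately no operator++ or operator--.
    template<typename type>
    class pointer
    {
    public:
        pointer(type * initial_ptr = 0) : ptr(initial_ptr) {}
        type & operator*() const { return *ptr; }
        type * get() const { return ptr; }
    private:
        type * ptr;
    };

    // Hypothetical trait: true when ++x is well formed for an lvalue x of T.
    template<typename T, typename = void>
    struct is_incrementable : std::false_type {};

    template<typename T>
    struct is_incrementable<T, decltype(void(++std::declval<T &>()))>
        : std::true_type {};

    struct dodgy {};
}
```

With these definitions `is_incrementable<dodgy *>::value` is true while `is_incrementable<pointer<dodgy> >::value` is false, so the `ptr++` in `very()` would be rejected at compile time if it took a `pointer<dodgy>`.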

That's all for now.
Cheers
Jon Jagger
jon@jaggersoft.com

[1] Scientific and Engineering C++, John J. Barton & Lee R. Nackman, Addison-Wesley, ISBN 0-201-53393-6, Chapter 14 Pointer Classes, page 419

[2] C++ Draft Standard, CD2 12.8 Copying class objects Footnotes 104 107