
Cachemaps


Contents

Abstract
Solution 1 (Single Mutex)
Solution 2 (Weak References)
Solution 3 (Best of Both Worlds)
Performance Measurements


Abstract

I quite often have the need for a construct I will call a cachemap. A cachemap is like an ordinary map (or hashmap), in that it maps from keys (normally strings) to objects, but it also controls the lifetime of the mapped objects. If a client requests an object that is not yet instantiated, a (refcounted) instance is created, inserted into the cachemap, and returned to the client. If the client no longer needs the object (the refcount of the object drops to zero), it is automatically removed from the cachemap. And if a client requests an object that is already present in the cachemap (because it was requested by a client before, and its refcount is still positive), the present instance is returned. Skeleton C++ class definitions for the cachemap and the objects it manages could look like the following:


class ObjectContainer // the cachemap
{
public:
    rtl::Reference< Object > get(rtl::OUString const & rKey);
};

class Object // the objects managed by the cachemap
{
public:
    void acquire() SAL_THROW(());

    void release() SAL_THROW(());
};
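
For illustration, client code using such a cachemap might look like the following sketch (ObjectContainer and Object are the skeleton classes from above; the behaviour described in the comments is what the pattern is supposed to provide, not something the skeletons themselves implement):


#include "osl/diagnose.h"
#include "rtl/ref.hxx"
#include "rtl/ustring.hxx"

void client(ObjectContainer & rContainer)
{
    {
        rtl::Reference< Object > xFirst(
            rContainer.get(rtl::OUString::createFromAscii("some key")));
            // not yet present: a new instance is created and cached

        rtl::Reference< Object > xSecond(
            rContainer.get(rtl::OUString::createFromAscii("some key")));
            // already present: the same instance is returned

        OSL_ASSERT(xFirst.get() == xSecond.get());
    }
    // both references are gone, the refcount has dropped to zero, and the
    // object has automatically been removed from the cachemap again
}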

The problems in implementing this are that (1) the cachemap must not acquire() the objects it currently has in its map (otherwise, the refcounts of those objects could never drop to zero, and they would never be removed from the map), and (2) in a multi-threaded environment there is a problem when one thread does a final release() of an object (refcount drops to zero and object is removed from the map) and at the same time another thread happens to call get() to request that very object.

Below, I describe three different implementations that solve these problems. The first solution was my initial idea, but it tends to be slow. The second is based on weak references, but it turned out to be slow, too. The third combines ideas from the other two solutions, and it turned out to be fast. So, if you are interested in boilerplate code for a fast implementation of the cachemap pattern, turn to the third solution directly.

The three solutions all follow the same conventions. They implement an ObjectContainerN (the cachemap, mapping from strings to objects) and a corresponding ObjectN (the refcounted objects managed by the cachemap). The source code is also available within the ucb project (ucb/workben/cachemap); it is reproduced here without the (somewhat longish) copyright headers.


Solution 1 (Single Mutex)

The combined header file for both ObjectContainer1 and Object1 is cachemapobject1.hxx:


#ifndef INCLUDED_UCB_CACHEMAPOBJECT1_HXX
#define INCLUDED_UCB_CACHEMAPOBJECT1_HXX

#ifndef _OSL_INTERLOCK_H_
#include "osl/interlck.h"
#endif
#ifndef _OSL_MUTEX_HXX_
#include "osl/mutex.hxx"
#endif
#ifndef _RTL_REF_HXX_
#include "rtl/ref.hxx"
#endif
#ifndef _SAL_TYPES_H_
#include "sal/types.h"
#endif
#ifndef _SALHELPER_SIMPLEREFERENCEOBJECT_HXX_
#include "salhelper/simplereferenceobject.hxx"
#endif

#ifndef INCLUDED_MAP
#include <map>
#define INCLUDED_MAP
#endif
#ifndef INCLUDED_MEMORY
#include <memory>
#define INCLUDED_MEMORY
#endif

namespace rtl { class OUString; }
namespace ucb { namespace cachemap { class Object1; } }

namespace ucb { namespace cachemap {

class ObjectContainer1: public salhelper::SimpleReferenceObject
{
public:
    ObjectContainer1();

    virtual ~ObjectContainer1() SAL_THROW(());

    rtl::Reference< Object1 > get(rtl::OUString const & rKey);

private:
    typedef std::map< rtl::OUString, Object1 * > Map;

    Map m_aMap;
    osl::Mutex m_aMutex;

    void releaseElement(Object1 * pElement) SAL_THROW(());

    friend class Object1; // to access Map, releaseElement()
};

class Object1
{
public:
    inline void acquire() SAL_THROW(())
    { osl_incrementInterlockedCount(&m_nRefCount); }

    inline void release() SAL_THROW(())
    { m_xContainer->releaseElement(this); }

private:
    rtl::Reference< ObjectContainer1 > m_xContainer;
    ObjectContainer1::Map::iterator m_aContainerIt;
    oslInterlockedCount m_nRefCount;

    inline Object1(rtl::Reference< ObjectContainer1 > const & rContainer);

    inline ~Object1() SAL_THROW(());

    Object1(Object1 &); // not implemented
    void operator =(Object1); // not implemented

    friend class ObjectContainer1;
        // to access m_aContainerIt, m_nRefCount, Object1(), ~Object1()
#if defined WNT
    friend struct std::auto_ptr< Object1 >; // to access ~Object1()
        // work around compiler bug...
#else // WNT
    friend class std::auto_ptr< Object1 >; // to access ~Object1()
#endif // WNT
};

} }

#endif // INCLUDED_UCB_CACHEMAPOBJECT1_HXX

A few notes:

  • It is a good idea to use valid C++ identifiers as include guards for header files. That is, do not use identifiers that start with an underscore followed by an upper case letter (e.g., _UCB_CACHEMAPOBJECT1_HXX_) or that contain two underscores in a row, because those identifiers are reserved for internal use by the compiler and standard library implementation. (Lakos, section 2.4, suggests the INCLUDED_XXX style.)

  • Use the convention of (redundant, but compile-time reducing) external include guards for standard header files, too. For this to work, everybody must of course use the same identifiers for the various standard header files. (See Lakos, section 2.5.)

  • ObjectContainer1 turns out to be a refcounted object. This was not required in the introduction of the cachemap pattern; rather, it is a consequence of the chosen implementation (and it should not hurt any client, either).

  • In ObjectContainer1, a std::map is used instead of a (de facto standard?) std::hash_map that might be a bit faster. The implementation of ObjectContainer1 relies on the fact that iterators into a std::map are not invalidated when calling insert() or erase() (see the small demonstration after these notes). There seems to be no corresponding guarantee for std::hash_map.

  • ObjectContainer1 and Object1 need to be mutual friends. If one entity is a friend of another, the two are strongly coupled, and it always seems best to keep them together in a single header file (thereby breaking the rule of having a separate header file for each class).

  • Logically, only ObjectContainer1 is allowed to create and destroy instances of Object1, hence the design decision to make the constructor and destructor of Object1 private and to make ObjectContainer1 a friend of Object1. But a std::auto_ptr< Object1 > is needed at one place in the implementation, and that std::auto_ptr also needs access to the destructor of Object1. It seemed best to simply make std::auto_ptr< Object1 > a friend of Object1, too (though too many friends can be an indication of bad design). Alas, the Microsoft compiler used on the Windows x86 platform (the WNT define) needs to see struct (or nothing) in this friend declaration instead of the correct class, so some platform-dependent code is needed here; in retrospect, it might have been better to make the destructor of Object1 public and get rid of all this.
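
The following small, self-contained demonstration (standard C++ only, independent of the cachemap code) shows the std::map iterator guarantee mentioned in the notes above:


#include <cassert>
#include <map>
#include <string>

int main()
{
    std::map< std::string, int > aMap;
    aMap["b"] = 2;
    std::map< std::string, int >::iterator aIt(aMap.find("b"));

    aMap["a"] = 1;   // inserting other elements...
    aMap.erase("a"); // ...and erasing them...

    assert(aIt->first == "b" && aIt->second == 2);
        // ...does not invalidate aIt; only erasing the very element it
        // points to would do that
    return 0;
}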

The corresponding implementation file is cachemapobject1.cxx:


#ifndef INCLUDED_UCB_CACHEMAPOBJECT1_HXX
#include "cachemapobject1.hxx"
#endif

#ifndef _OSL_DIAGNOSE_H_
#include "osl/diagnose.h"
#endif
#ifndef _OSL_INTERLOCK_H_
#include "osl/interlck.h"
#endif
#ifndef _OSL_MUTEX_HXX_
#include "osl/mutex.hxx"
#endif
#ifndef _RTL_REF_HXX_
#include "rtl/ref.hxx"
#endif
#ifndef _RTL_USTRING_HXX_
#include "rtl/ustring.hxx"
#endif

#ifndef INCLUDED_MEMORY
#include <memory>
#define INCLUDED_MEMORY
#endif

using ucb::cachemap::Object1;
using ucb::cachemap::ObjectContainer1;

inline
Object1::Object1(rtl::Reference< ObjectContainer1 > const & rContainer):
    m_xContainer(rContainer),
    m_nRefCount(0)
{
    OSL_ASSERT(m_xContainer.is());
}

inline Object1::~Object1() SAL_THROW(())
{}

void ObjectContainer1::releaseElement(Object1 * pElement) SAL_THROW(())
{
    OSL_ASSERT(pElement);
    bool bDelete = false;
    {
        osl::MutexGuard aGuard(m_aMutex);
        if (osl_decrementInterlockedCount(&pElement->m_nRefCount) == 0)
        {
            m_aMap.erase(pElement->m_aContainerIt);
            bDelete = true;
        }
    }
    if (bDelete)
        delete pElement;
}

ObjectContainer1::ObjectContainer1()
{}

ObjectContainer1::~ObjectContainer1() SAL_THROW(())
{}

rtl::Reference< Object1 > ObjectContainer1::get(rtl::OUString const & rKey)
{
    osl::MutexGuard aGuard(m_aMutex);
    Map::iterator aIt(m_aMap.find(rKey));
    if (aIt == m_aMap.end())
    {
        std::auto_ptr< Object1 > xElement(new Object1(this));
        aIt = m_aMap.insert(Map::value_type(rKey, xElement.get())).first;
        aIt->second->m_aContainerIt = aIt;
        xElement.release();
    }
    return aIt->second;
}

  • It is always a good idea to include the header file that belongs to an implementation file first in that file. That way, there is an automatic check that each header file is self-contained (for almost every header, there is a corresponding implementation file, and compilation of that file fails if the header is not self-contained). (See Lakos, section 3.2.)

  • Note how releaseElement() deletes the element only after the mutex has been released. It is good practice to do as little as possible while a mutex is locked.

  • Note how a std::auto_ptr< Object1 > is used to ensure the newly created Object1 is properly deleted in case the insert() happens to throw an exception.

The problems mentioned in the introduction are solved here as follows: ObjectContainer1 holds only plain pointers to instances of Object1, not refcounting references, and Object1::release() simply forwards to ObjectContainer1::releaseElement(). The releaseElement() method checks whether the refcount of the object drops to zero and, if so, removes it from the map. The methods ObjectContainer1::releaseElement() and ObjectContainer1::get() share a mutex, so one thread can never try to get an object that another thread is about to delete. The disadvantage of this solution is that every call to Object1::release() needs to acquire a mutex, which is quite expensive on typical platforms.


Solution 2 (Weak References)

The UNO concept of weak references offers itself as another solution to the problem: ObjectContainer2 holds weak references to instances of Object2, and the mechanism used in the ObjectContainer2::get() method to translate a weak reference to an Object2 into a hard reference (that may then turn out to be null) solves all the problems for us.

With this solution, only ObjectContainer2 needs to know about Object2, and not the other way around (unlike the first solution, where the two classes had to be mutual friends). Therefore, there are individual header files for Object2 and ObjectContainer2.

The header file cachemapobject2.hxx is nothing more than a skeleton (Object2 is so bare-bones it does not even need an implementation file):


#ifndef INCLUDED_UCB_CACHEMAPOBJECT2_HXX
#define INCLUDED_UCB_CACHEMAPOBJECT2_HXX

#ifndef _CPPUHELPER_WEAK_HXX_
#include "cppuhelper/weak.hxx"
#endif

namespace ucb { namespace cachemap {

class Object2: public cppu::OWeakObject
{};

} }

#endif // INCLUDED_UCB_CACHEMAPOBJECT2_HXX

The header file cachemapobjectcontainer2.hxx is hardly more interesting:


#ifndef INCLUDED_UCB_CACHEMAPOBJECTCONTAINER2_HXX
#define INCLUDED_UCB_CACHEMAPOBJECTCONTAINER2_HXX

#ifndef _CPPUHELPER_WEAKREF_HXX_
#include "cppuhelper/weakref.hxx"
#endif
#ifndef _OSL_MUTEX_HXX_
#include "osl/mutex.hxx"
#endif
#ifndef _RTL_REF_HXX_
#include "rtl/ref.hxx"
#endif
#ifndef _SAL_TYPES_H_
#include "sal/types.h"
#endif

#ifndef INCLUDED_HASH_MAP
#include <hash_map>
#define INCLUDED_HASH_MAP
#endif

namespace rtl {
    class OUString;
    struct OUStringHash;
}
namespace ucb { namespace cachemap { class Object2; } }

namespace ucb { namespace cachemap {

class ObjectContainer2
{
public:
    ObjectContainer2();

    ~ObjectContainer2() SAL_THROW(());

    rtl::Reference< Object2 > get(rtl::OUString const & rKey);

private:
    typedef std::hash_map< rtl::OUString,
                           com::sun::star::uno::WeakReference< Object2 >,
                           rtl::OUStringHash >
    Map;

    ObjectContainer2(ObjectContainer2 &); // not implemented
    void operator =(ObjectContainer2); // not implemented

    Map m_aMap;
    osl::Mutex m_aMutex;
};

} }

#endif // INCLUDED_UCB_CACHEMAPOBJECTCONTAINER2_HXX

  • Here, a (de facto standard?) std::hash_map is used instead of the std::map used for ObjectContainer1: we do not need any guarantees about iterators, so there is no reason not to use the (potentially faster) std::hash_map.

The solution to the implementation problems is hidden in the implementation of cppu::OWeakObject and com::sun::star::uno::WeakReference, so the implementation file cachemapobjectcontainer2.cxx is quite simple, too:


#ifndef INCLUDED_UCB_CACHEMAPOBJECTCONTAINER2_HXX
#include "cachemapobjectcontainer2.hxx"
#endif

#ifndef INCLUDED_UCB_CACHEMAPOBJECT2_HXX
#include "cachemapobject2.hxx"
#endif

#ifndef _COM_SUN_STAR_UNO_REFERENCE_HXX_
#include "com/sun/star/uno/Reference.hxx"
#endif
#ifndef _COM_SUN_STAR_UNO_XWEAK_HPP_
#include "com/sun/star/uno/XWeak.hpp"
#endif
#ifndef _CPPUHELPER_WEAKREF_HXX_
#include "cppuhelper/weakref.hxx"
#endif
#ifndef _OSL_MUTEX_HXX_
#include "osl/mutex.hxx"
#endif
#ifndef _RTL_REF_HXX_
#include "rtl/ref.hxx"
#endif
#ifndef _RTL_USTRING_HXX_
#include "rtl/ustring.hxx"
#endif

using namespace com::sun;
using ucb::cachemap::Object2;
using ucb::cachemap::ObjectContainer2;

ObjectContainer2::ObjectContainer2()
{}

ObjectContainer2::~ObjectContainer2() SAL_THROW(())
{}

rtl::Reference< Object2 > ObjectContainer2::get(rtl::OUString const & rKey)
{
    rtl::Reference< Object2 > xElement;
    {
        osl::MutexGuard aGuard(m_aMutex);
        Map::iterator aIt(m_aMap.find(rKey));
        if (aIt != m_aMap.end())
            xElement = static_cast< Object2 * >(
                           star::uno::Reference< star::uno::XWeak >(
                                   aIt->second.get(), star::uno::UNO_QUERY).
                               get());
        if (!xElement.is())
        {
            xElement = new Object2;
            m_aMap[rKey]
                = star::uno::WeakReference< Object2 >(xElement.get());
        }
    }
    return xElement;
}

The main trick (hack?) here is to translate from a com::sun::star::uno::WeakReference< Object2 > via a com::sun::star::uno::Reference< com::sun::star::uno::XWeak > to an rtl::Reference< Object2 > (in general, you must go from a com::sun::star::uno::WeakReference< T > to a com::sun::star::uno::Reference< T >, but it is not possible to have a com::sun::star::uno::Reference< Object2 >, so we must go via the base class XWeak here).
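
For comparison, here is a small sketch of that general pattern, using plain cppu::OWeakObject and XInterface; it is purely illustrative and not part of the cachemap code:


#include "com/sun/star/uno/Reference.hxx"
#include "com/sun/star/uno/XInterface.hpp"
#include "cppuhelper/weak.hxx"
#include "cppuhelper/weakref.hxx"
#include "osl/diagnose.h"

using namespace com::sun::star;

void example()
{
    uno::Reference< uno::XInterface > xStrong(new cppu::OWeakObject);
    uno::WeakReference< uno::XInterface > xWeak(xStrong);

    uno::Reference< uno::XInterface > xAgain(xWeak);
        // non-null, because xStrong still keeps the object alive
    OSL_ASSERT(xAgain.is());

    xAgain.clear();
    xStrong.clear();
        // the last hard reference is gone, the object is destroyed

    uno::Reference< uno::XInterface > xGone(xWeak);
        // the weak reference now yields a null reference
    OSL_ASSERT(!xGone.is());
}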

The drawback of this solution is that it is not faster than the initial one (it is even slower on some platforms). The overhead of using the (generic) mechanism of weak references when requesting objects outweighs the benefit of the faster refcounting.

Another potential drawback of this solution is that a weak reference to a destroyed object is not removed from the map; it is replaced with a weak reference to a new, living object the next time ObjectContainer2::get() requests that key, but it is never erased from the map. Therefore, the map itself keeps growing over the lifetime of the ObjectContainer2 instance.


Solution 3 (Best of Both Worlds)

Since exploiting the generic facilities in the second solution did not give any performance advantage, let us try to become faster with a hand-crafted version (i.e., combining the hand-crafted framework of the first solution with the implementation details of the second one).

The header file cachemapobject3.hxx is almost identical to the header file from the first solution (the differences are small: apart from the class names, release() is now only declared in the header instead of being defined inline):


#ifndef INCLUDED_UCB_CACHEMAPOBJECT3_HXX
#define INCLUDED_UCB_CACHEMAPOBJECT3_HXX

#ifndef _OSL_INTERLOCK_H_
#include "osl/interlck.h"
#endif
#ifndef _OSL_MUTEX_HXX_
#include "osl/mutex.hxx"
#endif
#ifndef _RTL_REF_HXX_
#include "rtl/ref.hxx"
#endif
#ifndef _SAL_TYPES_H_
#include "sal/types.h"
#endif
#ifndef _SALHELPER_SIMPLEREFERENCEOBJECT_HXX_
#include "salhelper/simplereferenceobject.hxx"
#endif

#ifndef INCLUDED_MAP
#include <map>
#define INCLUDED_MAP
#endif
#ifndef INCLUDED_MEMORY
#include <memory>
#define INCLUDED_MEMORY
#endif

namespace rtl { class OUString; }
namespace ucb { namespace cachemap { class Object3; } }

namespace ucb { namespace cachemap {

class ObjectContainer3: public salhelper::SimpleReferenceObject
{
public:
    ObjectContainer3();

    virtual ~ObjectContainer3() SAL_THROW(());

    rtl::Reference< Object3 > get(rtl::OUString const & rKey);

private:
    typedef std::map< rtl::OUString, Object3 * > Map;

    Map m_aMap;
    osl::Mutex m_aMutex;

    void releaseElement(Object3 * pElement) SAL_THROW(());

    friend class Object3; // to access Map, releaseElement()
};

class Object3
{
public:
    inline void acquire() SAL_THROW(())
    { osl_incrementInterlockedCount(&m_nRefCount); }

    void release() SAL_THROW(());

private:
    rtl::Reference< ObjectContainer3 > m_xContainer;
    ObjectContainer3::Map::iterator m_aContainerIt;
    oslInterlockedCount m_nRefCount;

    inline Object3(rtl::Reference< ObjectContainer3 > const & rContainer);

    inline ~Object3() SAL_THROW(());

    Object3(Object3 &); // not implemented
    void operator =(Object3); // not implemented

    friend class ObjectContainer3;
        // to access m_aContainerIt, m_nRefCount, Object3(), ~Object3()
#if defined WNT
    friend struct std::auto_ptr< Object3 >; // to access ~Object3()
#else // WNT
    friend class std::auto_ptr< Object3 >; // to access ~Object3()
#endif // WNT
};

} }

#endif // INCLUDED_UCB_CACHEMAPOBJECT3_HXX

Also, the implementation file cachemapobject3.cxx strongly resembles the implementation file from the first solution (the differences are in Object3::release(), ObjectContainer3::releaseElement(), and ObjectContainer3::get()):


#ifndef INCLUDED_UCB_CACHEMAPOBJECT3_HXX
#include "cachemapobject3.hxx"
#endif

#ifndef _OSL_DIAGNOSE_H_
#include "osl/diagnose.h"
#endif
#ifndef _OSL_INTERLOCK_H_
#include "osl/interlck.h"
#endif
#ifndef _OSL_MUTEX_HXX_
#include "osl/mutex.hxx"
#endif
#ifndef _RTL_REF_HXX_
#include "rtl/ref.hxx"
#endif
#ifndef _RTL_USTRING_HXX_
#include "rtl/ustring.hxx"
#endif

#ifndef INCLUDED_MEMORY
#include <memory>
#define INCLUDED_MEMORY
#endif

using ucb::cachemap::Object3;
using ucb::cachemap::ObjectContainer3;

inline
Object3::Object3(rtl::Reference< ObjectContainer3 > const & rContainer):
    m_xContainer(rContainer),
    m_nRefCount(0)
{
    OSL_ASSERT(m_xContainer.is());
}

inline Object3::~Object3() SAL_THROW(())
{}

void Object3::release() SAL_THROW(())
{
    if (osl_decrementInterlockedCount(&m_nRefCount) == 0)
    {
        m_xContainer->releaseElement(this);
        delete this;
    }
}

void ObjectContainer3::releaseElement(Object3 * pElement) SAL_THROW(())
{
    OSL_ASSERT(pElement);
    osl::MutexGuard aGuard(m_aMutex);
    if (pElement->m_aContainerIt != m_aMap.end())
        m_aMap.erase(pElement->m_aContainerIt);
}

ObjectContainer3::ObjectContainer3()
{}

ObjectContainer3::~ObjectContainer3() SAL_THROW(())
{}

rtl::Reference< Object3 > ObjectContainer3::get(rtl::OUString const & rKey)
{
    osl::MutexGuard aGuard(m_aMutex);
    Map::iterator aIt(m_aMap.find(rKey));
    if (aIt == m_aMap.end())
    {
        std::auto_ptr< Object3 > xElement(new Object3(this));
        aIt = m_aMap.insert(Map::value_type(rKey, xElement.get())).first;
        aIt->second->m_aContainerIt = aIt;
        xElement.release();
        return aIt->second;
    }
    else if (osl_incrementInterlockedCount(&aIt->second->m_nRefCount) > 1)
    {
        rtl::Reference< Object3 > xElement(aIt->second);
        osl_decrementInterlockedCount(&aIt->second->m_nRefCount);
        return xElement;
    }
    else
    {
        osl_decrementInterlockedCount(&aIt->second->m_nRefCount);
        aIt->second->m_aContainerIt = m_aMap.end();
        aIt->second = new Object3(this);
        aIt->second->m_aContainerIt = aIt;
        return aIt->second;
    }
}

This implementation does not need to lock a mutex in every call to Object3::release(). Only when the refcount drops to zero is the mutex of the ObjectContainer3 locked and the object removed from the map. It is possible for an ObjectContainer3::get() to get in the way between the refcount dropping to zero and the mutex being locked. But the get() method notices this (it goes through a cycle of incrementing and decrementing the refcount itself, and if the refcount did not rise above 1, the object is about to be deleted), leaves the to-be-deleted object alone, and inserts a fresh instance into the map instead.

Note that this implementation also does not have the drawback of an ever-growing map, because each object is removed from the map as soon as its refcount drops to zero.


Performance Measurements

A simple test file (cachemaptest.cxx) makes 200000 get() requests, half of which find objects already present in the map. The refcount of each requested object is incremented and decremented 50 times in a row. Only 100 different identifiers are used for the objects, so that the second solution does not suffer from its ever-growing-map problem. The measurements for two of our production platforms (Solaris Sparc with the Forte compiler and Windows x86 with the Microsoft compiler) are as follows (all times in milliseconds; because different machines were used, only comparisons within each column are meaningful):


              unxsols3.pro    wntmsci7.pro
Solution 1        9137            3846
Solution 2        8634            5598
Solution 3        3166            2704
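
For orientation, the kind of test loop described above might look roughly like the following sketch (this is not the actual cachemaptest.cxx from the ucb workbench; it merely follows the description, and details such as how the refcounting is exercised are assumptions):


#include "cachemapobject3.hxx"

#include "rtl/ref.hxx"
#include "rtl/ustring.hxx"
#include "sal/types.h"

using ucb::cachemap::Object3;
using ucb::cachemap::ObjectContainer3;

void test(rtl::Reference< ObjectContainer3 > const & rContainer)
{
    for (int i = 0; i < 100000; ++i)
    {
        // only 100 distinct keys
        rtl::OUString aKey(
            rtl::OUString::valueOf(static_cast< sal_Int32 >(i % 100)));

        rtl::Reference< Object3 > xFirst(rContainer->get(aKey));
            // first request: has to create the object
        rtl::Reference< Object3 > xSecond(rContainer->get(aKey));
            // second request: finds the object already present
            // (together, 200000 requests, half of which hit the map)

        // exercise the refcount 50 times in a row
        for (int j = 0; j < 50; ++j)
        {
            rtl::Reference< Object3 > xExtra(xFirst);
        }
    }
}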



Author: Stephan Bergmann ($Date: 2001/06/14 14:49:37 $)
Copyright 2001 OpenOffice.org Foundation. All Rights Reserved.


