Elephant's early architecture was tightly coupled to the Berkeley DB API. Over time we've moved towards a more modular architecture to support easy upgrading, repository migration, shared functionality between data stores and general hygiene.
The architecture has been carefully modularized.
To get a feeling for what is happening inside elephant, it is probably best to walk through the various major protocols to see how these components participate in implementing them.
When the main elephant open-store function is called with a
specification, it calls get-controller, which first checks to see if a
controller already exists for that spec.
If there is no controller, it calls build-controller to
construct one. If the data store code base is not present,
load-data-store is called to ensure that any ASDF dependencies
are satisfied. The associations between spec types and ASDF
dependencies are statically maintained in
*elephant-data-stores* for each data store type
supported by elephant.
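The lookup-and-construct flow above can be sketched as follows; *controller-table* and my-get-controller are hypothetical stand-ins for Elephant's internal bookkeeping, not its actual code:

```lisp
;; Hypothetical sketch of get-controller's dispatch; *controller-table*
;; is an assumed name for the spec -> controller table.
(defun my-get-controller (spec)
  (or (gethash spec *controller-table*)   ; reuse an already-open controller
      (progn
        ;; Load the data store's code (and its ASDF dependencies) on demand.
        (unless (lookup-data-store-con-init (first spec))
          (load-data-store (first spec)))
        ;; Dispatch to the constructor registered for this spec type.
        (funcall (lookup-data-store-con-init (first spec)) spec))))
```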
While being loaded, the data store is responsible for calling
register-data-store-con-init to register a data store
initialization function for its spec type (i.e. :BDB or :CLSQL).
For example, from bdb-controller.lisp:

(eval-when (:compile-toplevel :load-toplevel)
  (register-data-store-con-init :bdb 'bdb-test-and-construct))
This mapping between spec types and initialization functions is
accessed via lookup-data-store-con-init from within
build-controller. The function returned by
lookup-data-store-con-init is passed the full specification and
returns a store-controller subclass instance for the specified
data store. The new controller is stored in a controller
table, associating the object with its specification. Elephant
then calls open-controller to actually establish a connection to
or create the files of the data store.
Finally, if the default store controller *store-controller* is
nil, it will be initialized with the new store controller; otherwise
the original value is left in *store-controller* until that
store controller is closed with close-store.
The data store implementor has access to various utilities to aid initialization.
get-user-configuration-parameter - Access symbol tags in my-config.sexp for data store specific user configuration. You can also add special variables to variables.lisp and add a tag-variable pair to *user-configurable-parameters* in variables.lisp to automatically initialize it when the store controller is opened.

get-con - Determines behavior when the store is closed or lost.

database-version - A store controller implements this in order to tell Elephant which serializer to use. Currently, version 0.6.0 databases use serializer version 1 and all later databases use serializer version 2. This ensures that a given version of the Elephant code can open databases from prior versions in order to properly upgrade them to the new code base.
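As a concrete illustration of the last item, a hypothetical data store might report its version like this (my-store-controller is an assumed class name, and the version value is illustrative):

```lisp
;; Sketch: report the database version so the Elephant core can select
;; the matching serializer (0.6.0 -> serializer 1, later -> serializer 2).
(defmethod database-version ((sc my-store-controller))
  '(0 6 1))
```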
At this point, all operations referencing the store controller should be able to proceed.
At the end of a session, close-store shuts the controller down via the
data store's close-controller method, releasing any connections or
resources it holds.
The only thing that a data store has to do to support new object
creation, other than implement the slot protocol, is implement the
method next-oid to return the next unique object id for the
persistent object being created.
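A minimal sketch, assuming a hypothetical controller class that keeps its oid counter in a last-oid slot (a real data store must persist the counter so oids remain unique across sessions):

```lisp
(defmethod next-oid ((sc my-store-controller))
  ;; Monotonically increasing object ids; last-oid is an assumed slot.
  (incf (slot-value sc 'last-oid)))
```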
Existing objects are created during deserialization of object references. The serializer subsystem is built into the core of elephant and can be used by data stores. The serializer is abstracted so that multiple serializers can be co-resident and the data store can choose the appropriate one. The abstraction boundary between the serializer, the data store, and the core Elephant system is not perfect, so be aware and refer to existing data store implementations if in doubt.
A serializer takes as arguments the store-controller, a lisp object and
a buffer-stream from the memory utility library, and returns the
buffer-stream containing the binary serialized object. The deserializer
reverses this process. For all lisp objects except persistent
classes, this means reallocating the storage space for the object and
recreating all its contents. Deserializing a standard object results
in a new standard object of the same class with the same slot values.
Persistent classes are dealt with specially. When a persistent object
is serialized, its oid and class are stored in the
buffer-stream. On deserialization, the oid is used to check the
store-controller's cache for an existing placeholder object. If
the cache misses, then a new placeholder object is created using the
class and oid, as described in Persistent Classes and Objects.
The store controller contains a cache instance that is automatically
initialized by the core Elephant object protocol.
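The round trip through the serializer has roughly the following shape. The serialize and deserialize calls follow the argument order described above, and with-buffer-streams is the memory utility library's scratch-buffer macro; exact signatures vary by serializer version, so treat this as a sketch:

```lisp
;; Sketch of the serializer contract: write an object into a
;; buffer-stream, then read an equivalent object back out.
(with-buffer-streams (bs)                    ; memutil-provided scratch buffer
  (serialize '(1 2 "three") bs *store-controller*)
  (deserialize bs *store-controller*))       ; a fresh copy of the list
```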
Currently the serializer is selected by the core Elephant code based on the store controller's database version. See the reference section for details on implementing the store-controller database-version method. It is a relatively small change to have the data store choose its own serializer; however, we will have to tighten up and document the contracts between the Elephant core code, the serializer and the data store.
The core protocol that the data store needs to support is the slot access protocol. During object initialization, these functions are called to initialize the slots of the object. The four functions are:

persistent-slot-reader
persistent-slot-writer
persistent-slot-boundp
persistent-slot-makunbound
More details can be found in the data store API reference section. In
short, these functions specialize on the specific store-controller
subclass of the data store and take instances, values and slotnames as
appropriate.
Typically the oid will be extracted from the instance and used to update a table or record where the oid and slotname identify the value. A slot is typically unbound when no value exists (as opposed to nil).
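For example, a toy data store could back the slot protocol with an in-memory table keyed on (oid . slotname); my-store-controller and slot-table are assumptions for illustration, and a real store would read and write its backing database instead:

```lisp
(defmethod persistent-slot-writer ((sc my-store-controller) value instance name)
  (setf (gethash (cons (oid instance) name) (slot-table sc)) value))

(defmethod persistent-slot-reader ((sc my-store-controller) instance name)
  (multiple-value-bind (value present-p)
      (gethash (cons (oid instance) name) (slot-table sc))
    (if present-p
        value
        ;; No stored value means the slot is unbound, not nil.
        (slot-unbound (class-of instance) instance name))))
```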
The BTree protocol is the most extensive interface that data stores must
implement. Data store implementations are required to subclass the
BTree classes and implement their complete APIs. Each class type is constructed
by Elephant using a method specialized on the
store-controller that builds them (e.g. build-btree). The
get-value interface is similar to the persistent
slot reader and writer, but instead of using oid and slotname to set
values, it uses the btree oid and a key value as a unique identifier
for a value.
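A sketch of get-value under these conventions, where my-btree and lookup-record are hypothetical names for the data store's btree subclass and its backend lookup:

```lisp
(defmethod get-value (key (bt my-btree))
  ;; The btree's own oid plus the key uniquely identify the value.
  (lookup-record (get-con bt) (oid bt) key))
```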
The BTree protocol almost requires an actual BTree implementation to be at all efficient. Keys and values need to be accessible via the cursor API, which means they need to be walked linearly in the sort order of the keys (described in Persistent BTrees).
An indexed BTree automatically maintains a hash table of the indices defined on it so that users can access them by mapping or lookup-by-name. The data store also has access to this interface.
A BTree index must also maintain a connection to its parent BTree so
that an index value can be used as a primary tree key to retrieve the
primary BTree value as part of the cursor-pnext and
cursor-pprev family of methods.
The contract of
remove-kv is that the storage in the data store
is actually freed for reuse.
Persistent set implementation is optional; a default BTree-based implementation is provided.
One of the most important pieces of functionality remaining to discuss
is implementing transactions. In existing data stores, transactions
are merely extensions of the underlying start, commit and abort
methods of the 3rd party library or server being used. The Elephant
user interfaces to these functions in two ways: calls to
with-transaction or to ensure-transaction. Both result in
a call to the data store's
execute-transaction. This function
has a rich contract. It accepts as arguments the store controller, a
closure that executes the transaction body and a set of keywords.
Keywords required to be supported by the method (or ignored without
loss of semantics) include :transaction and :retries.
The semantics of with-transaction are that a new transaction
will always be requested of the data store. If a transaction exists,
ensure-transaction will merely call the transaction closure;
if not, it will function as a call to with-transaction. The contract of
execute-transaction is that it must ensure that the transaction
closure is executed within a dynamic context that ensures the ACID
properties of any database operations (e.g.
persistent slot operations). If there is a non-local exit during this
execution, the transaction should be aborted. If it returns normally,
the transaction is committed. The integer in the :retries
argument dictates how many times execute-transaction should
retry the transaction before failing.
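Putting that contract together, a data store's execute-transaction might look roughly like this. The names begin-backend-txn, commit-backend-txn, abort-backend-txn and backend-deadlock-error stand in for whatever the underlying library actually provides:

```lisp
(defmethod execute-transaction ((sc my-store-controller) txn-fn
                                &key (retries 100) &allow-other-keys)
  (dotimes (attempt (1+ retries)
            (error "Transaction failed after ~D retries" retries))
    (let ((txn (begin-backend-txn sc))
          (committed nil))
      (handler-case
          (unwind-protect
               (let ((results (multiple-value-list (funcall txn-fn))))
                 (commit-backend-txn sc txn)   ; normal return: commit
                 (setf committed t)
                 (return-from execute-transaction (values-list results)))
            ;; Any non-local exit reaches here uncommitted: abort.
            (unless committed (abort-backend-txn sc txn)))
        ;; A retryable backend error (e.g. deadlock): loop and try again.
        (backend-deadlock-error () nil)))))
```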
Elephant provides some bookkeeping to the data store to help with
nested transactions by using the *current-transaction*
variable. In the dynamic context of the transaction closure, another
call to execute-transaction may occur with the transaction
argument defaulting to the value of *current-transaction*. The
data store has to decide how to handle these cases. To support this,
the first call to execute-transaction can create a dynamic binding for
*current-transaction* using the make-transaction-record
call. This creates a transaction object that records the store
controller that started the transaction and any data store-specific
transaction data.
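That bookkeeping amounts to a dynamic binding established around the closure; in this sketch, backend-txn stands for whatever handle the underlying library returned:

```lisp
;; Bind *current-transaction* so nested execute-transaction calls see
;; the parent; make-transaction-record pairs the controller with the
;; backend's transaction handle.
(let ((*current-transaction* (make-transaction-record sc backend-txn)))
  (funcall txn-fn))
```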
The current policy is that the body of a transaction is executed with
the *store-controller* variable bound to the store-controller
object creating the transaction. This is important for default
arguments and generally helps more than it hurts, so it is an
implementation requirement placed on data stores.
If two nested calls to with-transaction are made successively
in a dynamic context, the data store can create true nested
transactions. The first transaction is passed to the transaction
argument of the second. The second can choose to just continue the
current transaction (the CLSQL data store policy) or to nest the
transaction (the BDB data store policy).
Finally, some provision is made for the case where two store controllers have concurrently active transactions in the same thread. This feature was created to allow for migration, where a read from one database happens in one transaction and, while that transaction is active, writes to another data store must happen under their own valid transaction.
The trick is that with-transaction checks to see whether the
store controller recorded in the current transaction object is the
same as the one passed to the :store-controller argument. If not,
a fresh transaction is started.
Currently no provision is made for more than two levels of multi-store nesting, as we do not implement a full transaction stack (to avoid walking the stack on each call to handle this rare case). If a third transaction is started by the store controller that started the first transaction, it will have no access to the parent transaction, which may be a significant source of problems for the underlying database.