Elephant's early architecture was tightly coupled to the Berkeley DB API. Over time we've moved towards a more modular architecture to support easy upgrading, repository migration, shared functionality between data stores, and general hygiene.
The architecture has been carefully modularized.
To get a feeling for what is happening inside elephant, it is probably best to walk through the various major protocols to see how these components participate in implementing them.
When the main elephant open-store function is called with a specification, it calls get-controller, which first checks to see if a controller already exists for that spec. If there is no controller, it calls build-controller to construct one. If the data store code base is not present, load-data-store is called to ensure that any asdf dependencies are satisfied. The associations for asdf dependencies are statically configured in *elephant-data-stores* for each data store type supported by Elephant.
While being loaded, the data store is responsible for calling register-data-store-con-init to register a data store initialization function for its spec type (i.e. :BDB or :CLSQL). For example, from bdb-controller.lisp:
(eval-when (:compile-toplevel :load-toplevel)
  (register-data-store-con-init :bdb 'bdb-test-and-construct))
This mapping between spec types and initialization functions is accessed by lookup-data-store-con-init from within build-controller. The function returned by lookup-data-store-con-init is passed the full specification and returns a store-controller subclass instance for the specified data store.
The new controller is stored in the *dbconnection-spec* hash table, associating the object with its specification. Elephant then calls open-controller to actually establish a connection to, or create the files of, the data store.
Finally, if the default store controller *store-controller* is nil, it will be initialized with the new store controller; otherwise the original value is left in *store-controller* until that store controller is closed using close-store.
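The lookup-and-dispatch flow above can be sketched as follows. This is a simplified illustration, not Elephant's actual source: the function names follow the text, but the table representation and bodies are assumptions.

```lisp
;; Simplified sketch of the controller lookup/construction flow described
;; above; bodies are illustrative assumptions, not Elephant's source.
(defvar *dbconnection-spec* (make-hash-table :test 'equal))

(defun get-controller (spec)
  ;; Reuse an existing controller for this spec, or build a fresh one.
  (or (gethash spec *dbconnection-spec*)
      (build-controller spec)))

(defun build-controller (spec)
  ;; The head of the spec names the data store type, e.g. (:BDB "/path/to/db").
  (let* ((init-fn (lookup-data-store-con-init (first spec)))
         (controller (funcall init-fn spec))) ; a store-controller subclass
    (setf (gethash spec *dbconnection-spec*) controller)
    (open-controller controller)              ; connect, or create the files
    controller))
```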
The data store implementor has access to various utilities to aid initialization.
get-user-configuration-parameter - Accesses symbol tags in my-config.sexp to read data store-specific user configuration. You can also add special variables to variables.lisp and add a tag-variable pair to *user-configurable-parameters* in variables.lisp to automatically initialize it when the store controller is opened.
get-con - Controls the behavior when the store is closed or the connection has been lost.
database-version - A store controller implements this in order to tell Elephant which serializer to use. Currently, version 0.6.0 databases use serializer version 1 and all later databases use serializer version 2. This ensures that a given version of the Elephant code can open databases from prior versions in order to properly upgrade to the new code base.
At this point, all operations referencing the store controller should be able to proceed.
At the end of a session, close-store is called to shut down the store controller and release its resources.
The only thing that a data store has to do to support new object creation, other than implement the slot protocol, is implement the method next-oid to return the next unique object id for the persistent object being created.
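A minimal next-oid method might look like the following sketch. The controller class and its in-memory counter are assumptions for illustration; a real data store must persist and synchronize its oid counter so ids remain unique across sessions.

```lisp
;; Hypothetical store controller with an in-memory oid counter.
(defclass my-store-controller (store-controller)
  ((oid-counter :initform 0 :accessor oid-counter)))

(defmethod next-oid ((sc my-store-controller))
  ;; Contract: return an object id never before issued by this store.
  (incf (oid-counter sc)))
```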
Existing objects are created during deserialization of object references. The serializer subsystem is built into the core of Elephant and can be used by data stores. The serializer is abstracted so that multiple serializers can be co-resident and the data store can choose the appropriate one. The abstraction boundary between the serializer, the data store, and the core Elephant system is not perfect, so be aware and refer to existing data store implementations if in doubt.
A serializer takes as arguments the store-controller, a lisp object, and a buffer-stream from the memory utility library, and returns the buffer-stream with the binary serialized object. The deserializer reverses this process. For all lisp objects except persistent classes, this means reallocating the storage space for the object and recreating all its contents. Deserializing a standard object results in a new standard object of the same class with the same slot values.
Persistent classes are dealt with specially. When a persistent object is serialized, its oid and class are stored in the buffer-stream. On deserialization, the oid is used to check the store-controller's cache for an existing placeholder object. If the cache misses, a new placeholder object is created using the class and oid, as described in Persistent Classes and Objects.
The store controller contains a cache instance that is automatically
initialized by the core Elephant object protocol.
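The cache check during deserialization can be sketched as below; the helper names are assumptions for illustration, not Elephant's actual internal functions.

```lisp
;; Sketch of deserializing a persistent-object reference. Only the oid and
;; class were read from the buffer-stream; slot values load lazily later
;; through the slot protocol. Helper names are hypothetical.
(defun deserialize-persistent-reference (class-name oid sc)
  (or (lookup-cached-instance sc oid)            ; cache hit: reuse placeholder
      (let ((obj (make-placeholder-instance class-name oid sc)))
        (cache-instance sc oid obj)              ; future lookups hit the cache
        obj)))
```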
Currently the serializer is selected by the core Elephant code based on the store controller's database version. See the reference section for details on implementing the store-controller database version method. It is a relatively small change to have the data store choose its own serializer, however we will have to tighten up and document the contracts between the Elephant core code, serializer and data store.
The core protocol that the data store needs to support is the slot access protocol. During object initialization, these functions are called to initialize the slots of the object. The four functions are:
persistent-slot-reader
persistent-slot-writer
persistent-slot-boundp
persistent-slot-makunbound
More details can be found in the data store api reference section. In short, these functions specialize on the specific store-controller of the data store and take instances, values, and slotnames as appropriate.
Typically the oid will be extracted from the instance and be used to update a table or record where the oid and slotname identifies the value. A slot is typically unbound when no value exists (as opposed to nil).
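As a sketch, a hash-table-backed data store might implement the four functions as below. The argument orders and the slot-table accessor are assumptions; consult the data store API reference for the exact signatures.

```lisp
;; Illustrative slot protocol for a hypothetical in-memory data store;
;; each slot value is keyed by the pair (oid . slotname).
(defmethod persistent-slot-writer ((sc my-store-controller) value instance name)
  (setf (gethash (cons (oid instance) name) (slot-table sc)) value))

(defmethod persistent-slot-reader ((sc my-store-controller) instance name)
  (multiple-value-bind (value foundp)
      (gethash (cons (oid instance) name) (slot-table sc))
    (if foundp
        value
        ;; No stored value means the slot is unbound, not nil.
        (slot-unbound (class-of instance) instance name))))

(defmethod persistent-slot-boundp ((sc my-store-controller) instance name)
  (nth-value 1 (gethash (cons (oid instance) name) (slot-table sc))))

(defmethod persistent-slot-makunbound ((sc my-store-controller) instance name)
  (remhash (cons (oid instance) name) (slot-table sc))
  instance)
```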
The BTree protocol is the most extensive interface that data stores must implement. Data store implementations are required to subclass the abstract classes btree, indexed-btree, and index and implement their complete APIs. Each class type is constructed by Elephant using a store-controller that builds them. These methods are build-btree, build-indexed-btree, and build-index.
The get-value interface is similar to the persistent slot reader and writer, but instead of using an oid and slotname to set values, it uses the btree oid and a key value as a unique identifier for a value.
The BTree protocol almost requires an actual BTree implementation to be at all efficient. Keys and values need to be accessible via the cursor API, which means they need to be walked linearly in the sort order of the keys (described in Persistent BTrees).
An indexed BTree automatically maintains a hash table of the indices defined on it so that users can access them by mapping or lookup-by-name. The data store also has access to this interface.
A BTree index must also maintain a connection to its parent BTree so that an index value can be used as a primary tree key to retrieve the primary BTree value as part of the cursor-pnext and cursor-pprev family of methods.
The contract of remove-kv is that the storage in the data store is actually freed for reuse.
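Putting these pieces together, a toy back end could identify values by the pair of btree oid and serialized key, as sketched below. The method signatures and helpers are assumptions; note also that a flat hash table cannot support the sorted cursor traversal described above, so a real implementation needs an ordered structure.

```lisp
;; Toy value storage keyed by (btree-oid . serialized-key); illustrative only.
(defmethod get-value (key (bt my-btree))
  (gethash (cons (oid bt) (serialize-key key))
           (btree-table (controller bt))))

(defmethod (setf get-value) (value key (bt my-btree))
  (setf (gethash (cons (oid bt) (serialize-key key))
                 (btree-table (controller bt)))
        value))

(defmethod remove-kv (key (bt my-btree))
  ;; Contract: the underlying storage must actually be freed for reuse.
  (remhash (cons (oid bt) (serialize-key key))
           (btree-table (controller bt))))
```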
Persistent set implementation is optional; a default BTree-based implementation is provided.
One of the most important pieces of functionality remaining to discuss is implementing transactions. In existing data stores, transactions are merely extensions of the underlying start, commit, and abort methods of the 3rd-party library or server being used. The Elephant user interfaces to these functions in two ways: a call to execute-transaction, or explicit calls to controller-start-transaction, controller-commit-transaction, and controller-abort-transaction.
The macros with-transaction and ensure-transaction wrap access to the data store's execute-transaction. This function has a rich contract. It accepts as arguments the store controller, a closure that executes the transaction body, and a set of keywords. Keywords required to be supported by the method (or ignored without loss of semantics) are :parent and :retries.
The semantics of with-transaction are that a new transaction will always be requested of the data store. If a transaction already exists, ensure-transaction will merely call the transaction closure; if not, it will function as a call to with-transaction.
The contract of execute-transaction is that it must ensure that the transaction closure is executed within a dynamic context that ensures the ACID properties of any database operations (pset, btree, or persistent slot operations). If there is a non-local exit during this execution, the transaction should be aborted; if it returns normally, the transaction is committed. The integer in the :retries argument dictates how many times execute-transaction should retry the transaction before failing.
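This contract can be sketched as follows. The backend-begin/commit/abort calls and the backend-deadlock-error condition are hypothetical stand-ins for whatever the underlying library provides; only contention errors are retried, while other conditions propagate after the abort.

```lisp
;; Sketch of an execute-transaction method; backend-* names are hypothetical.
(defmethod execute-transaction ((sc my-store-controller) txn-fn
                                &key parent (retries 200) &allow-other-keys)
  (loop repeat retries do
    (handler-case
        (let ((txn (backend-begin sc :parent parent))
              (committed nil))
          (unwind-protect
               (let ((*current-transaction* (make-transaction-record sc txn))
                     (*store-controller* sc)) ; required binding, per policy
                 (return-from execute-transaction
                   (multiple-value-prog1 (funcall txn-fn)
                     (backend-commit sc txn)
                     (setf committed t))))
            ;; Reached uncommitted only on a non-local exit: abort the txn.
            (unless committed (backend-abort sc txn))))
      ;; Retryable contention errors fall through and loop again; any other
      ;; condition propagates to the caller after the abort above.
      (backend-deadlock-error () nil)))
  (error "Transaction failed after ~D retries." retries))
```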
Elephant provides some bookkeeping to the data store to help with nested transactions by using the *current-transaction* dynamic variable. In the dynamic context of the transaction closure, another call to execute-transaction may occur with the transaction argument defaulting to the value of *current-transaction*. The data store has to decide how to handle these cases. To support this, the first call to execute-transaction can create a dynamic binding for *current-transaction* using the make-transaction-record call. This creates a transaction object that records the store controller that started the transaction and any data store-specific transaction data.
The current policy is that the body of a transaction is executed with the *store-controller* variable bound to the store-controller object creating the transaction. This is important for default arguments and generally helps more than it hurts, so it is an implementation requirement placed on execute-transaction.
If two nested calls to with-transaction are made successively in a dynamic context, the data store can create true nested transactions. The first transaction is passed to the :parent argument of the second. The second can choose either to continue the current transaction (the CLSQL data store policy) or to nest the transaction (the BDB data store policy).
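From the user's side, the nesting described above looks like the following (illustrative; *friends-btree* is a hypothetical btree):

```lisp
;; Nested with-transaction calls; the inner transaction receives the outer
;; one through :parent. Under BDB this truly nests (an inner abort undoes
;; only the inner body); under CLSQL the outer transaction simply continues.
(with-transaction (:store-controller *store-controller*)
  (setf (get-value 1 *friends-btree*) "alice")
  (with-transaction (:store-controller *store-controller*)
    (setf (get-value 2 *friends-btree*) "bob")))
```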
Finally, some provision is made for the case where two store controllers have concurrently active transactions in the same thread. This feature was created to allow for migration, where a read from one database happens in one transaction and, while that transaction is active, a write to another data store happens in its own valid transaction.
The trick is that with-transaction checks to see whether the current transaction object was created by the store-controller object passed to its :store-controller argument. If not, a fresh transaction is started.
Currently no provision is made for more than two levels of multi-store nesting, as we do not implement a full transaction stack (to avoid walking the stack on each call to handle this rare case). If a third transaction is started by the store controller that started the first transaction, it will have no access to the parent transaction, which may be a significant source of problems for the underlying database.