This section briefly describes special facilities of the Berkeley DB data store and explains how persistent objects map onto it. Elephant was originally written targeting only Berkeley DB. As such, the design of Elephant was heavily influenced by the Berkeley DB architecture.
Berkeley DB is a C library that very efficiently implements a database by allowing the application to directly manipulate the memory pools and disk storage without requiring communication through a server as in many relational database applications. The library supports multi-threaded and multi-process transactions through a shared memory region that provides for shared buffer pools, shared locks, etc. Each process in a multi-process application is independently linked to the library, but shares the memory pool and disk storage.
The following subsections discuss places where Berkeley DB provides additional facilities to the Elephant interfaces described above.
The Berkeley DB data store (indicated by :BDB in the data store specification) supports the Elephant protocols using Berkeley DB as a backend. The primary features of the BDB library that are used are BTree databases, the transactional subsystem, a shared buffer pool, and unique ID sequences.
All data written to the data store ends up in a BTree slot using a transaction. There are two databases, one for persistent slot values and one for btrees. The mapping of Elephant objects is quite simple.
Persistent slots are written to a btree using a unique key paired with the serialized slot value. The key is the oid of the persistent object concatenated with the serialized name of the slot being written. This ordering groups the slots of an object together on disk.
When opening a store there are several special options you can invoke:
:recover
tells Berkeley DB to run recovery on the
underlying database. This is reasonably cheap if recovery is not
actually needed, but can take a very long time if you let your log
files grow too large. This option must be run single-threaded,
before any other threads or processes access the same database.
:recover-fatal
runs Berkeley DB catastrophic recovery (see BDB documentation).
:thread
set this to nil if you want to run single-threaded;
it avoids locking overhead on the environment. The default is
to run free-threaded.
:deadlock-detect
launches a background process via
the run-shell commands of Lisp. This background process connects to the Berkeley
DB database and periodically checks for deadlocks, freeing locks as appropriate
when it finds them. This avoids a set of annoying crashes in Berkeley DB,
the very crashes that, in part, motivated Franz to abandon AllegroStore and write
the pure-Lisp AllegroCache.
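The options above are passed as keyword arguments when opening the store. A minimal sketch; the store path "/var/db/elephant/" is a hypothetical example location:

```lisp
;; Run recovery single-threaded, before any other thread or process
;; touches the database.  The directory path is a hypothetical example.
(defvar *store*
  (elephant:open-store '(:BDB "/var/db/elephant/") :recover t))
```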
Berkeley DB transactions accept a number of additional keyword parameters that can help you tune performance or change transaction semantics in Berkeley DB applications. They are summarized briefly here; see the BDB docs for detailed information:
:degree-2
This option provides cursor stability: whatever
object the cursor currently points at will not change, though values
read earlier may change. This can significantly improve performance if
you frequently map over a btree, since it locks only the current
element rather than the entire btree. All transactions running
concurrently over the btree can commit without restarting. The global
parameter *map-using-degree2* determines the default behavior of this
option. It is set to true by default so that mapping over a btree has
semantics similar to mapping over a list. Depending on how it is used,
this weakens Isolation.
:read-uncommitted
Allows reading data that has been written but not yet committed by
other transactions. This avoids the current thread blocking on a read
access (for example, when you are merely dumping a btree for
inspection), so long as you don't care whether the data you read
subsequently changes. Depending on how it is used, this weakens
Isolation.
:txn-nosync
Do not flush the log when this transaction completes. This means
that you lose the Durability of a transaction, but gain performance by avoiding the expensive
sync operation.
:txn-nowait
If a lock is unavailable, have the underlying database return a
deadlock message immediately, rather than blocking, so that the transaction restarts.
:txn-sync
This is the default behavior and specifies that the transaction log
of the current transaction is flushed to disk before the transaction commit routine returns. This
provides full ACID compliance.
:transaction
This argument is for advanced use. It tells
the Berkeley DB transaction subsystem which transaction to use rather
than creating a new one. The :parent argument provides a parent
transaction, which can result in a true nested transaction.
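The transaction keywords above are passed to with-transaction. A sketch, assuming *friends-btree* names an existing btree in the store:

```lisp
;; Map over a btree with cursor stability (:degree-2) and skip the
;; commit-time log flush (:txn-nosync), trading Durability for speed.
(elephant:with-transaction (:degree-2 t :txn-nosync t)
  (elephant:map-btree
   (lambda (key value)
     (format t "~A -> ~A~%" key value))
   *friends-btree*))
```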
The Berkeley DB data store exports some special facilities that are not currently supported by other data stores.
optimize-layout
This function provides an interface that tells
Berkeley DB to try to return freed storage to the file
system. It is of limited utility, as it can only shrink the database
by the number of empty pages at the end of the file. Depending on what
storage you have deleted, this can end up being only a handful of
pages, or even zero. It works well if you recently ran an experiment
that created a bunch of new data and then deleted it all, and you want
to reclaim the space (i.e. you had a runaway loop that was creating
endless objects).
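For example, after deleting a large batch of recently created objects, you might ask Berkeley DB to return the trailing empty pages. A sketch, assuming *store* is the open store controller (check your Elephant version for the exact package and arguments):

```lisp
;; Attempt to shrink the database files; only empty pages at the end
;; of the file can actually be returned to the file system.
(elephant:optimize-layout *store*)
```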
db-bdb:checkpoint
This internal function forces
the transaction log to be flushed and all active data to be written to
the database so that the logs and database are in sync. It is good
to run when you want to delete old log files and back up your database
files as a coherent, recoverable set. Run checkpoint, close the
database, and then manually run “db_archive -d” on the database to
remove old logs. Finally, copy the resulting data to stable storage.
Read the Berkeley DB docs for more details on backing up and
checkpointing.
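The backup procedure above can be sketched as follows, assuming *store* is the open store controller and that db-bdb:checkpoint takes the controller as its argument in your Elephant version:

```lisp
;; 1. Flush the log and synchronize the database files.
(db-bdb:checkpoint *store*)
;; 2. Close the store so no further writes occur.
(elephant:close-store *store*)
;; 3. From the shell, in the database directory, remove stale logs:
;;      db_archive -d
;; 4. Copy the database and remaining log files to stable storage.
```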
Performance tuning for Berkeley DB is a complex topic and we will not cover it here. You need to understand the Berkeley DB data store architecture, the transaction architecture, the serializer, and related parameters. The primary performance-related parameters are described in config.sexp. They are:
:berkeley-db-map-degree2
- Improves the efficiency of cursor traversals
in the various mapping functions. Defaults to true, meaning a value
you just read while mapping may change before the traversal is done.
If you operate only on the current cursor location, however, you are
guaranteed that its value is stable.
:berkeley-db-cachesize
- Change the size of the buffer cache
for Berkeley DB to match your working set. The default is 10MB, or
about 20,000 indexed class objects, or 50,000 standard persistent
objects. You can save memory by reducing this value.
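In config.sexp these parameters appear as entries in an association list. A sketch; the exact file layout may differ between Elephant versions, and the values here are illustrative:

```lisp
;; Illustrative excerpt from config.sexp: degree-2 mapping enabled,
;; buffer cache raised to 20MB (20 * 1024 * 1024 bytes).
((:berkeley-db-map-degree2 . t)
 (:berkeley-db-cachesize . 20971520))
```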