Update: Release 0.9 is now available!
Introduction
Elephant is a persistent object protocol and database for Common Lisp.
The persistent protocol component of Elephant overrides class creation
and standard slot accesses using the Meta Object Protocol (MOP) to render slot
values persistent. Database functionality includes the ability to
persistently index and retrieve ordered sets of class instances and
ordinary Lisp values.
Values are stored persistently on disk by one or more 'data stores',
which currently consist of Berkeley DB and SQL servers via CL-SQL. A
pure-Lisp data store is under development. Elephant inherits the ACID
properties of these stores and is by design transactional,
multi-threaded, and, in the case of Berkeley DB, process-safe.
The core Elephant code base is available under the LLGPL
license. Each data store comes with its own, separate license.
Key Features
- Persistent Metaprotocol
  - Create persistent classes with persistent and transient slots
  - Each slot value is stored independently and persistently on disk
  - Persistent slots are inherited by subclasses
  - Instances are lightweight, storing only a unique ID and a parent store reference
- Class indexing and traversal
  - Class objects can be declared :indexed and automatically persisted
  - All instances of a class are retrievable
  - Slot option ':index t' automatically creates an inverted index on slot values
  - Set and Mapping interfaces to indexed class instances
  - User API to add/remove indices from a class and to create 'derived' indices, which populate an inverted index with the values returned by an arbitrary Lisp function of an instance
- Persistent Sets
  - Lightweight persistent set
  - Persistent sets are also persistent objects
  - Supports insert, remove, map, as-list operations
  - Assign sets as slot values in a class for easy 1:n mappings
- Persistent BTrees
  - get-value and (setf get-value) to perform key/value stores
  - BTrees are also persistent objects
  - Indexes provide alternative orderings of a parent BTree
  - Rich cursor API for iterating over BTrees with side effects
  - Efficient map operators for indices: map-btree and map-index
- Storage of Lisp Objects and Values
  - Store most Lisp types in persistent slots and indices
    - All numeric quantities
    - Strings, including Unicode
    - Symbols and pathnames
    - Lists and trees of cons cells
    - Standard classes
    - Structs
    - All array types
    - Hash tables
  - Efficient binary serializer implementation
- Transactional Architecture
  - Transactions ensure that a set of operations for a given store are performed atomically
  - Each primitive Elephant API operation not in a transaction is automatically atomic, so the state of the store is always consistent even if an interruption happens during the API call
  - Transactions significantly enhance performance (all reads and writes are in-memory until commit) as well as providing ACID features
  - A simple macro, with-transaction, ensures transactionality of the body
  - Nested transactions, including intermingling multiple stores (tricky, but supported)
- Multi-store Architecture
  - There are currently two different data stores: one using Berkeley DB and one using CL-SQL with PostgreSQL or SQLite3
  - Each store has a separate physical file
  - Multiple stores can be open and in use concurrently
  - Migration interface to copy or move data to different data stores of any kind
  - Upgrade facility to open and migrate data from older versions of Elephant
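Several of the features listed above can be seen together in a short session. The following is a minimal sketch rather than an authoritative tutorial: it assumes Elephant 0.9 is loadable through ASDF with the Berkeley DB data store configured, and that the forms are evaluated one at a time at the REPL. The package name elephant-sketch, the friend class and its slots, the "scores" root key, and the directory /tmp/elephant-sketch/ are hypothetical names chosen for illustration.

```lisp
;; Sketch only: assumes Elephant 0.9 and its Berkeley DB data store are installed.
(asdf:operate 'asdf:load-op :elephant)

(defpackage :elephant-sketch
  (:use :common-lisp :elephant))
(in-package :elephant-sketch)

;; Hypothetical store location; the directory is assumed to exist already.
(open-store '(:BDB "/tmp/elephant-sketch/"))

;; A persistent class with an indexed slot, a plain persistent slot,
;; and a transient slot that is kept in memory only.
(defpclass friend ()
  ((name     :accessor name     :initarg :name :index t)
   (birthday :accessor birthday :initarg :birthday)
   (cache    :accessor cache    :initform nil :transient t))
  (:index t))                     ; class index: all instances are retrievable

;; Group several writes so that they commit atomically.
(with-transaction ()
  (make-instance 'friend :name "Carlos"  :birthday "1972-11-01")
  (make-instance 'friend :name "Adriana" :birthday "1980-01-02"))

;; Retrieve instances through the class index and through the slot index.
(get-instances-by-class 'friend)
(get-instances-by-value 'friend 'name "Carlos")

;; A BTree used as a persistent key/value table, reachable from the store root.
(let ((scores (make-btree)))
  (add-to-root "scores" scores)
  (with-transaction ()
    (setf (get-value "alice" scores) 10)
    (setf (get-value "bob" scores) 7))
  ;; map-btree calls the function with each key and value in key order.
  (map-btree (lambda (key value)
               (format t "~a => ~a~%" key value))
             scores))

(close-store)
```

Wrapping the writes in with-transaction is optional for correctness, since each primitive operation is atomic on its own, but grouping them is what provides the performance benefit described above.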
Limitations
Elephant is not a complete persistence solution, although it provides
most of the features of other alternatives. Functions, closures, and
class objects cannot be stored. Garbage collection is only supported
via an offline migrate interface, which compacts the database by
copying only reachable instances. Explicit deletion of data is also
possible, and some data stores will reuse freed space to reduce
total disk utilization.
Perhaps the most frustrating limitation, and the one requiring the
most effort to work around, is that aggregate objects (arrays, lists,
hash tables) do not support persistent operations. That is, if you
store a list, reload it, and use (setf (cdr list) new-list), the
in-memory list will no longer match the one on disk. Any list that is
loaded from a slot or index is a freshly consed list. Elephant
addresses this problem by introducing two primitive data structures,
persistent sets and BTrees.
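The aggregate limitation and its two usual workarounds can be sketched as follows. This is an illustrative sketch only: it assumes a store is already open and the Elephant package is in use, and the person class, its slots, and the *p* variable are hypothetical names.

```lisp
;; Sketch only: assumes an open store and (use-package :elephant).
(defpclass person ()
  ((nicknames :accessor nicknames :initarg :nicknames :initform nil)
   ;; A persistent set stored as a slot value gives a 1:n mapping.
   (friends   :accessor friends   :initform (make-pset))))

(defvar *p* (make-instance 'person :nicknames (list "Al" "Bert")))

;; Pitfall: reading the slot yields a freshly consed list, so this
;; destructive update changes only the in-memory copy, not the stored value.
(let ((names (nicknames *p*)))
  (setf (cdr names) (list "Albie")))

;; Workaround 1: write the whole aggregate back through the slot accessor.
(setf (nicknames *p*) (list "Al" "Albie"))

;; Workaround 2: keep the collection in a persistent set, whose
;; insert and remove operations act on the store directly.
(insert-item "Bea" (friends *p*))
(insert-item "Cy"  (friends *p*))
(remove-item "Cy"  (friends *p*))

;; Materialize the set as an ordinary list when one is needed.
(pset-list (friends *p*))
```

Writing the whole value back is simplest for small aggregates, while a persistent set or BTree avoids reserializing the entire collection on every change.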
Elephant's database facilities are limited to explicit storage in
BTrees via persistent class and slot indexing. The Elephant roadmap
includes some significant feature enhancements for querying the
database that should make Elephant simpler to use and useful across a
wide variety of applications. See the Trac Site for more information.
Caveats aside, Elephant is used in active websites and other projects
that require significant reliability and has held up well under some
heavy usage patterns. The 0.9 release has patched many of the
problems identified in these uses. See the download page for more details.