Class indices are a very convenient way of gaining the efficiency that BTree indices provide. If a given object is most often sought by the value of one of its slots, which is of course quite common, it is convenient to define a class index on that slot, although the same functionality can be gained in a more complicated way through the use of secondary indices.
The file tests/testindexing.lisp provides many useful examples of both declaring class indexes and using the API to seek objects using them.
The following code from that file in the test “indexing-range” demonstrates the convenience of a class indexes and the function “get-instances-by-range”. Note in the definition of the “slot1” the keyword “:index” is used to specify that this slot should be indexed.
(defclass idx-four () ((slot1 :initarg :slot1 :initform 1 :accessor slot1 :index t)) (:metaclass persistent-metaclass)) (defun make-idx-four (val) (make-instance 'idx-four :slot1 val)) (with-transaction () (mapc #'make-idx-four '(1 1 1 2 2 4 5 5 5 6 10))) (let ((x1 (get-instances-by-range 'idx-four 'slot1 2 6)) (x2 (get-instances-by-range 'idx-four 'slot1 0 2)) (x3 (get-instances-by-range 'idx-four 'slot1 6 15)) ) (format t " x1 = ~A~%" (mapcar #'slot1 x1)) (format t " x2 = ~A~%" (mapcar #'slot1 x2)) (format t " x3 = ~A~%" (mapcar #'slot1 x3))
Additionally, the test
Can be used to judge the performance of indexing a medium sized dataset.
The file src/elephant/classindex.lisp provides the source code and some crisp documentation of the class indexing system.
Note that for retrieving items, the API is provided by three functions:
(defgeneric get-instances-by-class (persistent-metaclass)) (defgeneric get-instances-by-value (persistent-metaclass slot-name value)) (defgeneric get-instances-by-range (persistent-metaclass slot-name start end))
By using these functions, any class that is a subclass of persistent-metaclass can also be thought of as a container of all of its instances, which are persistent in the database between lisp invocations. Moreover an individual object can be looked up on O(log n) time via a value on which it is indexed.
At the top of this same file, you will find the a description of the API which can be used to dynamically add and remove indexes. (Adding and removing indexes can also be performed by a re-execution of the “defclass” macro with different values.)
You can enable/disable class indexing for an entire class. When you disable
indexing all references to instances of that class are lost. If you re-enable
class indexing only newly created classes will be stored in the class index.
You can manually restore them by using
find-class-index to get the
clas index BTree if you have an alternate in-memory index.
You can add/remove a secondary index for a slot. So long as the class index remains, this can be done multiple times without losing any data.
There is also a facility for defining 'derived slots'. These can be non-slot parameters which are a function of the class's persistent slot values. For example you can use an index to keep an alternate representation available for fast indexing. If an object has an x,y coordinate, you could define a derived index for r,theta which stored references in polar coordinates. These would be ordered so you could iterate over a class-index to get objects in order of increasing radius from the origin or over a range of theta.
Beware, however, that derived indices have to compute their result every time you update any persistent instance's slot. This is because there is no way to know which persistent slots the derived index value(s) depends on. Thus there is a fairly significant computational cost to objects with frequent updates having derived indices. The storage cost, however, may be less as all that is added is the index value and an OID reference into the class index. To add a slot value you add a serialized OID+class-ref+slotname to index value which can be much larger if you use long slotnames and package names and unicode.
Thus, the question of if and how a given class should be indexed is very flexible and dynamic, and does not need to be determined at the beginning of your development. This represents the ability to “late bind” the decision of what to index.
In general, there is always a tradeoff: an indexed slot increases storage associated with that slot and slows down write operations. Reads however remain as fast as for unindexed persistent slots. The Elephant system makes it simple to choose where and when one wants to utilize this tradeoff.
Finally, that file src/elephant/classindex-utils.lisp documents tools for handling class redefinitions and the policy that should be used for synchronizing the classes with the database. This process is somewhat user customizable; documentation for this exists in the source file referenced above.