eXist-db provides a basic document versioning extension. This extension tracks all changes to a document by storing the differences between the revisions. Older versions can be restored on the fly and even queried in memory. There's also basic support to detect and intercept conflicting writes.
The versioning extension was created with human editors in mind. These will typically change documents through an editor or some form-based front-end. It should work well with documents up to several megabytes in size.
eXist-db has no control over the client. It does not know where a document update comes from and cannot directly communicate with the user. The versioning extension should therefore be seen more like a toolbox than a complete solution. Advanced functionality (merging, conflict resolution, etc.) will require support from the end-user applications.
The versioning extension does not track machine-generated node-level edits made through XUpdate or the XQuery update extensions.
The versioning extension has the following components:
A trigger (to be registered with a collection) that implements the core versioning functionality
A serialization filter which adds special version attributes to every document. These attributes are used to detect conflicting writes.
An XQuery module which provides a function library for end-user applications, including functions like v:doc (restore a given revision on the fly).
Versioning can be enabled for separate collections in the collection hierarchy. To
enable versioning, a trigger must be registered with the top-level collection. This is
done through the same collection configuration files that are used for defining indexes.
Register the versioning trigger
To enable versioning for a collection, you have to edit the collection's
collection.xconf configuration file. This file must be stored below the
/db/system/config collection. As described in the Configuring Indexes document, the
/db/system/config collection mirrors the hierarchical structure
of the main collection tree.
In collection.xconf, you must register the trigger class
org.exist.versioning.VersioningTrigger for the create, delete, update, copy and move events:
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index/>
    <triggers>
        <trigger event="create,delete,update,copy,move"
                 class="org.exist.versioning.VersioningTrigger">
            <parameter name="overwrite" value="yes"/>
        </trigger>
    </triggers>
</collection>
If you store the above document as
/db/system/config/db/collection.xconf, it will enable versioning
for the entire database.
A collection.xconf at a lower level in the hierarchy will
override any configuration on higher levels, including
these trigger definitions. Triggers are not inherited from ancestor
configurations. If the new configuration doesn't define a trigger, the trigger
map will be empty.
When working with nested collection configurations, you need to make sure that
the trigger definitions are present in all of them.
The VersioningTrigger accepts one parameter,
overwrite. If it is set to
no, the trigger will
check for potential write conflicts. For example, if two users have opened the same
document and are both editing it, the first user may save changes
without the second user noticing. If eXist then allowed the second user
to store their version, it would simply overwrite the modifications
already committed by the first user.
The overwrite="no" setting prevents this. However, eXist has no
control over the client. It does not know where the conflicting document came from.
All it can do is reject the write attempt and raise an error. The error should then
be handled by the client. Right now there are no clients that support this. More work
will be required in this area. However, clients can already use the supplied XQuery
functions to check for write conflicts (see below).
Enabling the serialization filter
In order to detect conflicting writes, the versioning extension needs to keep track of the base revision to which changes were applied. It does this by inserting special metadata attributes into a document when it is retrieved from the database. For this purpose, a custom filter has to be registered with eXist's serializer.
This is done in the
<serializer> section in the main configuration file,
conf.xml. Add a
<custom-filter> child tag to the
<serializer> element and set its
class attribute to
eXist must be restarted for the versioning filter to become active.
Accessing the versioning information
The versioning extension uses the collection /db/system/versions to
store base revisions and differences. The collection hierarchy below
/db/system/versions mirrors the main collection tree. For each
versioned resource, you'll find a document with suffix
contains the base revision (the first version of the document). Each revision is stored
in a document which starts with the original document name and ends with the revision
number, for instance
eXist provides an XQuery module to access the revision history or restore a given revision. For example, to view the history of a resource:
import module namespace v="http://exist-db.org/versioning";

v:history(doc("/db/shakespeare/plays/hamlet.xml"))
This returns an XML fragment like this:
<v:history>
    <v:document>/db/shakespeare/plays/hamlet.xml</v:document>
    <v:revisions>
        <v:revision rev="35">
            <v:date>2009-08-22T22:19:33.777+02:00</v:date>
            <v:user>admin</v:user>
        </v:revision>
        <v:revision rev="36">
            <v:date>2009-08-22T22:38:41.629+02:00</v:date>
            <v:user>admin</v:user>
        </v:revision>
    </v:revisions>
</v:history>
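Applications often need only the most recent revision number rather than the whole history. The following is a minimal sketch of how that could be derived from the result of v:history. It assumes the result has the structure shown in the fragment above, with the elements in the versioning namespace and the revision number in the rev attribute.

```xquery
import module namespace v="http://exist-db.org/versioning";

let $history := v:history(doc("/db/shakespeare/plays/hamlet.xml"))

(: Collect the rev attribute of every v:revision element as an integer. :)
let $revisions :=
    for $rev in $history//v:revision/@rev
    return xs:integer($rev)

(: max() yields the empty sequence if no revisions are listed. :)
return max($revisions)
```

For a history like the one shown above, with revisions 35 and 36, the expression should evaluate to 36.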
The most important function is
v:doc, which is used to restore an
arbitrary revision of a document on the fly. You can use this function much like
fn:doc to query the revision. For example:
import module namespace v="http://exist-db.org/versioning";

v:doc(doc("/db/shakespeare/plays/hamlet.xml"), 35)//SPEECH[SPEAKER="HAMLET"]
This will restore revision 35 of
hamlet.xml and then find all
<SPEECH> elements with a
<SPEAKER> called "HAMLET". No indexes are
available to the query engine when processing a restored document.
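Because a restored revision can be queried like any other document, it can also be compared with the current version in a single query. The sketch below counts the speeches by HAMLET in revision 35 and in the current document. The <speech-count> wrapper and its child elements are made up for illustration and are not part of the versioning module.

```xquery
import module namespace v="http://exist-db.org/versioning";

let $current := doc("/db/shakespeare/plays/hamlet.xml")

(: Restore revision 35 in memory; no indexes are available for it. :)
let $old := v:doc($current, 35)

return
    <speech-count speaker="HAMLET">
        <revision rev="35">{ count($old//SPEECH[SPEAKER = "HAMLET"]) }</revision>
        <current>{ count($current//SPEECH[SPEAKER = "HAMLET"]) }</current>
    </speech-count>
```

Since the restored revision is processed without indexes, a query like this may be noticeably slower on the restored side than on the stored document.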
Detecting write conflicts
To avoid a user overwriting the changes made by another user, eXist needs to know upon which revision the user's changes are based. To make this possible, the versioning filter adds a number of metadata attributes to the root element of a document when it is serialized (for instance when opening it in an editor). The inserted metadata attributes are all in a separate versioning namespace and will never be stored in the database. The following fragment shows the added attributes:
<PLAY xmlns:v="http://exist-db.org/versioning"
      v:revision="36"
      v:key="12343e4940b24"
      v:path="/db/shakespeare/plays/hamlet.xml">
    ...
</PLAY>
When eXist detects a potential write conflict, it cannot do more than reject the update and raise an error. However, there's an XQuery function to check if newer revisions exist. You pass it the revision number and the unique key as given in the versioning attributes of the document root element. If the function returns the empty sequence, no newer revisions exist in the database. Otherwise, the function returns the version documents for each newer revision.
import module namespace v="http://exist-db.org/versioning";

v:find-newer-revision(doc("/db/shakespeare/plays/hamlet.xml"), 36, "12343e4940b24")
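An application will usually want to turn the result of this call into a decision. The following sketch wraps the call in a conditional, relying only on the behaviour described above: an empty sequence means no newer revisions, anything else means a potential conflict. The <check> element and its attributes are hypothetical, and the revision number and key are the sample values from the fragment above.

```xquery
import module namespace v="http://exist-db.org/versioning";

(: Values as read from the v:revision and v:key attributes of the
   document the client is about to save (sample values). :)
let $base-revision := 36
let $key := "12343e4940b24"
let $doc := doc("/db/shakespeare/plays/hamlet.xml")

(: Empty sequence: no newer revisions. Otherwise: one version
   document per newer revision. :)
let $newer := v:find-newer-revision($doc, $base-revision, $key)

return
    if (empty($newer)) then
        <check status="ok"/>
    else
        <check status="conflict" newer-revisions="{ count($newer) }"/>
```

A front-end could call such a query before saving and, on a conflict, ask the user whether to overwrite or to review the newer revisions first.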
Once you have made sure that you really want to store the document and overwrite any newer revisions, simply remove the version attributes from the root element.
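If the edited document passes through an XQuery before it is stored, the version attributes could be removed there. The following is a minimal sketch under that assumption. The function local:strip-version-attributes is a hypothetical helper, not part of the versioning module, and the inline <PLAY> element stands in for a document submitted by a client.

```xquery
declare namespace v="http://exist-db.org/versioning";

(: Hypothetical helper: rebuilds the root element, keeping all child
   nodes and every attribute that is NOT in the versioning namespace. :)
declare function local:strip-version-attributes($root as element()) as element() {
    element { node-name($root) } {
        $root/@*[namespace-uri(.) ne "http://exist-db.org/versioning"],
        $root/node()
    }
};

(: Stand-in for a document as submitted by a client (sample values). :)
let $submitted :=
    <PLAY xmlns:v="http://exist-db.org/versioning"
          v:revision="36"
          v:key="12343e4940b24"
          v:path="/db/shakespeare/plays/hamlet.xml">...</PLAY>

return local:strip-version-attributes($submitted)
```

The helper only touches attributes on the root element, which is where the versioning filter adds them. A namespace declaration for the versioning prefix may still appear in the serialized result; it is a declaration rather than an attribute and carries no revision information.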