org.exist.indexing
Interface IndexWorker

All Known Implementing Classes:
NGramIndexWorker

public interface IndexWorker

Provides concurrent access to the index structure and implements the core operations on the index. The methods of this interface are used in a multi-threaded environment. Every thread accessing the database has exactly one IndexWorker for every index, so Index.getWorker() should return a new IndexWorker whenever it is called. Implementations of IndexWorker must synchronize access to shared resources.
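
The threading contract above can be illustrated with a minimal sketch. SketchIndex and SketchWorker are hypothetical stand-ins, not the org.exist.indexing types: the index hands out a fresh worker on every call, each worker keeps its own unshared state, and only the shared structure is guarded by a lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the org.exist.indexing types.
class SketchIndex {
    // Shared between all workers, so every access goes through a synchronized method.
    private final Map<String, Integer> counts = new HashMap<>();

    // Returns a new worker on each call, mirroring what is asked of Index.getWorker().
    SketchWorker getWorker() {
        return new SketchWorker(this);
    }

    synchronized void add(String key, int delta) {
        counts.merge(key, delta, Integer::sum);
    }

    synchronized int count(String key) {
        return counts.getOrDefault(key, 0);
    }
}

class SketchWorker {
    private final SketchIndex index; // shared resource
    private int seen;                // per-worker state, only touched by the owning thread

    SketchWorker(SketchIndex index) {
        this.index = index;
    }

    void index(String key) {
        seen++;
        index.add(key, 1);
    }

    int seen() {
        return seen;
    }
}

class WorkerPerThreadDemo {
    // Starts the given number of threads, each with its own worker, and returns the shared count.
    static int run(int threads, int perThread) {
        SketchIndex index = new SketchIndex();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                SketchWorker worker = index.getWorker(); // one worker per thread
                for (int j = 0; j < perThread; j++) {
                    worker.index("term");
                }
            });
            pool[i].start();
        }
        try {
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return index.count("term");
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // prints 4000
    }
}
```

Because the per-worker counter is never shared, it needs no locking; only the map inside SketchIndex does.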


Method Summary
 Object configure(IndexController controller, NodeList configNodes, Map namespaces)
          Read an index configuration from a collection.xconf configuration document.
 void flush()
          Flush the index.
 String getIndexId()
          Returns an ID which uniquely identifies this index.
 String getIndexName()
          Returns a name which uniquely identifies this index.
 StreamListener getListener(int mode, DocumentImpl document)
          Return a stream listener to index the specified document in the specified mode.
 MatchListener getMatchListener(NodeProxy proxy)
          Returns a MatchListener, which can be used to filter (and manipulate) the XML output generated by the serializer when serializing query results.
 StoredNode getReindexRoot(StoredNode node, NodePath path, boolean includeSelf)
          When adding or removing nodes to or from the document tree, it might become necessary to reindex some parts of the tree, in particular if indexes are defined on mixed content nodes.
 void removeCollection(Collection collection)
          Remove all indexes for the given collection, its subcollections and all resources.
 Occurrences[] scanIndex(DocumentSet docs)
           
 void setDocument(DocumentImpl doc, int mode)
          Notify this worker to operate on the specified document, using the mode given.
 

Method Detail

getIndexId

String getIndexId()
Returns an ID which uniquely identifies this index. This will usually be the class name.

Returns:
a unique ID identifying this index.

getIndexName

String getIndexName()
Returns a name which uniquely identifies this index.

Returns:
a unique name identifying this index.

configure

Object configure(IndexController controller,
                 NodeList configNodes,
                 Map namespaces)
                 throws DatabaseConfigurationException
Read an index configuration from a collection.xconf configuration document. This method is called by the CollectionConfiguration while reading the collection.xconf configuration file for a given collection. The configNodes parameter lists all top-level child nodes below the <index> element in the collection.xconf. The IndexWorker should scan this list and handle those elements it understands. The returned Object will be stored in the collection configuration structure associated with each collection. It can later be retrieved from the collection configuration, e.g. to check whether a given node should be indexed.

Parameters:
configNodes - lists the top-level child nodes below the <index> element in collection.xconf
namespaces - the active prefix/namespace map
Returns:
an arbitrary configuration object to be kept for this index in the collection configuration
Throws:
DatabaseConfigurationException - if a configuration error occurs
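
The scanning step described for configure can be sketched with the standard org.w3c.dom API alone. The element name "ngram", the attribute name "qname", the helper childrenOfIndex and the choice of a List<String> as configuration object are assumptions made for this illustration; a real worker would also receive the IndexController and the namespace map.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ConfigScanSketch {
    // Hypothetical element and attribute names, chosen only for this sketch.
    static final String ELEMENT = "ngram";
    static final String ATTRIBUTE = "qname";

    // Walks the top-level nodes, keeps the elements this worker understands
    // and silently skips everything else (other elements, whitespace text).
    static List<String> configure(NodeList configNodes) {
        List<String> qnames = new ArrayList<>();
        for (int i = 0; i < configNodes.getLength(); i++) {
            Node node = configNodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && ELEMENT.equals(node.getNodeName())) {
                qnames.add(((Element) node).getAttribute(ATTRIBUTE));
            }
        }
        return qnames; // the object the caller would keep for this index
    }

    // Hypothetical helper: parses a configuration string and returns the
    // child nodes of its root element.
    static NodeList childrenOfIndex(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getChildNodes();
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid configuration", e);
        }
    }

    public static void main(String[] args) {
        // "other" stands for an element that belongs to a different index.
        String xml = "<index><other/><ngram qname=\"title\"/>"
                + "<ngram qname=\"author\"/></index>";
        System.out.println(configure(childrenOfIndex(xml))); // prints [title, author]
    }
}
```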

flush

void flush()
Flush the index. This method will be called when indexing a document. The implementation should immediately process all data it has buffered (if there is any), release as many memory resources as it can and prepare for being reused for a different job.
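
A minimal sketch of the flush behaviour described above, assuming a worker that buffers entries in memory before writing them to a shared store. BufferingWorkerSketch and its List-based store are hypothetical and only illustrate the pattern: process what is buffered, release the memory, and leave the worker ready for the next job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical worker; the shared list stands in for the real index storage.
class BufferingWorkerSketch {
    private final List<String> shared;
    private final List<String> pending = new ArrayList<>();

    BufferingWorkerSketch(List<String> shared) {
        this.shared = shared;
    }

    // Buffers a term locally; nothing is written to the shared store yet.
    void add(String term) {
        pending.add(term);
    }

    // Writes all buffered terms, then drops the buffer so the worker can be reused.
    void flush() {
        if (pending.isEmpty()) {
            return; // nothing buffered
        }
        synchronized (shared) { // the store may be used by other workers
            shared.addAll(pending);
        }
        pending.clear();
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        BufferingWorkerSketch worker = new BufferingWorkerSketch(store);
        worker.add("alpha");
        worker.add("beta");
        worker.flush();
        System.out.println(store + " " + worker.pendingCount()); // prints [alpha, beta] 0
    }
}
```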


setDocument

void setDocument(DocumentImpl doc,
                 int mode)
Notify this worker to operate on the specified document, using the mode given. mode will be one of StreamListener.STORE, StreamListener.REMOVE_NODES or StreamListener.REMOVE_ALL_NODES.

Parameters:
doc - the document which is processed
mode - the current operation mode

getListener

StreamListener getListener(int mode,
                           DocumentImpl document)
Return a stream listener to index the specified document in the specified mode. There will never be more than one StreamListener being used per thread, so it is safe for the implementation to reuse a single StreamListener. Parameter mode specifies the type of the current operation.

Parameters:
mode - one of StreamListener.STORE, StreamListener.REMOVE_NODES or StreamListener.REMOVE_ALL_NODES.
document - the document to be indexed.
Returns:
a StreamListener
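
Since at most one StreamListener is in use per thread, a worker can keep a single listener and reconfigure it on each call. The sketch below shows that reuse with hypothetical stand-in types; the numeric mode values and the use of a document name in place of DocumentImpl are assumptions, and the real constants are those defined on StreamListener.

```java
// Hypothetical stand-ins; not the org.exist.indexing types.
class ListenerReuseSketch {
    // Assumed values for illustration only.
    static final int STORE = 0;
    static final int REMOVE_NODES = 1;
    static final int REMOVE_ALL_NODES = 2;

    static class SketchListener {
        int mode;
        String documentName;
    }

    // One listener per worker, created once and reused for every request.
    private final SketchListener listener = new SketchListener();

    // Reconfigures the single listener for the current operation and returns it.
    SketchListener getListener(int mode, String documentName) {
        listener.mode = mode;
        listener.documentName = documentName;
        return listener;
    }

    public static void main(String[] args) {
        ListenerReuseSketch worker = new ListenerReuseSketch();
        SketchListener first = worker.getListener(STORE, "a.xml");
        SketchListener second = worker.getListener(REMOVE_ALL_NODES, "b.xml");
        System.out.println(first == second); // prints true
    }
}
```

Reuse is only safe because each worker belongs to one thread; a listener shared across workers would need its own synchronization.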

getMatchListener

MatchListener getMatchListener(NodeProxy proxy)
Returns a MatchListener, which can be used to filter (and manipulate) the XML output generated by the serializer when serializing query results. The method should return null if the implementation is not interested in receiving serialization events.

Parameters:
proxy - the NodeProxy which is being serialized
Returns:
a MatchListener or null if the implementation does not want to receive serialization events

removeCollection

void removeCollection(Collection collection)
Remove all indexes for the given collection, its subcollections and all resources.

Parameters:
collection - the collection whose indexes are to be removed

scanIndex

Occurrences[] scanIndex(DocumentSet docs)

getReindexRoot

StoredNode getReindexRoot(StoredNode node,
                          NodePath path,
                          boolean includeSelf)
When adding or removing nodes to or from the document tree, it might become necessary to reindex some parts of the tree, in particular if indexes are defined on mixed content nodes. This method will call getReindexRoot(org.exist.dom.StoredNode, org.exist.storage.NodePath, boolean) on each configured index. It will then return the top-most root.

Parameters:
node - the node to be modified.
path - the NodePath of the node
includeSelf - if set to true, the current node itself will be included in the check
Returns:
the top-most root node to be reindexed
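
One way a single index might determine such a root is to walk from the modified node towards the document root and remember the highest ancestor on which an index is defined. The sketch below uses a hypothetical TreeNode class and a set of indexed path strings instead of StoredNode and NodePath; returning null when no ancestor is indexed is likewise an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ReindexRootSketch {
    // Hypothetical tree node; stands in for StoredNode.
    static class TreeNode {
        final String name;
        final TreeNode parent;

        TreeNode(String name, TreeNode parent) {
            this.name = name;
            this.parent = parent;
        }

        // Slash-separated path from the document root, e.g. /article/section.
        String path() {
            return parent == null ? "/" + name : parent.path() + "/" + name;
        }
    }

    // Returns the top-most node (optionally including the node itself) whose
    // path is indexed, or null if none of the candidates is indexed.
    static TreeNode getReindexRoot(TreeNode node, Set<String> indexedPaths,
                                   boolean includeSelf) {
        TreeNode top = null;
        TreeNode current = includeSelf ? node : node.parent;
        while (current != null) {
            if (indexedPaths.contains(current.path())) {
                top = current; // keep climbing: a higher match replaces this one
            }
            current = current.parent;
        }
        return top;
    }

    public static void main(String[] args) {
        TreeNode article = new TreeNode("article", null);
        TreeNode section = new TreeNode("section", article);
        TreeNode para = new TreeNode("para", section);
        TreeNode em = new TreeNode("em", para);
        Set<String> indexed = new HashSet<>(
                Arrays.asList("/article/section", "/article/section/para"));
        System.out.println(getReindexRoot(em, indexed, false).path()); // prints /article/section
    }
}
```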


Copyright (C) Wolfgang Meier. All rights reserved.