Monday, November 22, 2010

Using JOTM as a TransactionManager in Neo4j

If you follow the neo4j mailing list you may have noticed that lately I have been working on providing support in Neo for plugging in JTA compliant Transaction Manager implementations. In this post I will try to explain how this was done, how you can enable JOTM support that is provided mainly as an example and what further improvements are possible after completing this project.

Flashback

During my series of posts for the internals of Neo, I had touched upon the whole XA business and how Neo comes with (complete, as it turned out) support for fitting into an XA environment. There I had described the various classes that expose the database store as an XAResource and how the TxManager class enlists it as needed in top level transactions, along with other XAResources, typically ones from the Index store. I will not explain more here, but it is a good idea to have in mind the general framework. The idea was that, since the TxManager implements the javax.transaction.TransactionManager interface, what keeps us from substituting it with a different, third party implementation? More importantly, how can it be done and how robust will the result be? If you want to see how this turned out and like to hear good news, keep reading. Note that the next paragraph is story telling, so if you want the main course, skip it.

Story time

One of the most used JTA compatible implementations of transaction managers out there is ObjectWeb's JOTM package. I preferred it over Atomikos' solution mainly because, being a newbie in the field, I wanted something that provided source code so it would help me with learning and debugging. A fine choice, as it turned out.
I had some teething problems, having to do with initializing the JOTM instance, enabling recovery, searching through versions for resolved bugs, but I came out victorious. After figuring all this out, I tore out the TxManager class from the TxModule in the Neo kernel and substituted it with a Jotm instance. No problems there, it worked first try. Enlisting the resources for recovery was a bit more trouble but I managed it, in a way that will be explained below. After that, I had to make sure that everything worked properly for recovery in all disaster scenarios. After getting bored of using the debugger to breakpoint/kill the JVM on commit()/rollback()/prepare(), and following a suggestion by Tobias Ivarsson on IRC, I detoured a bit and hacked up a solution with JDI. After some (little) automation I had tested many scenarios with satisfactory precision and, as it turned out, all the hard work was already done by the Neo team - I did not have to touch even one line of code to make things work out.
Now, of course, the kernel had a dependency on the JOTM package, which is unacceptable. I looked at the way Lucene is enlisted as a service automatically and using this framework I added a hook in the kernel to discover and register services that provide TransactionManager instances. Now one can create a .jar file, add it and its dependencies in the classpath and the kernel will pick it up and make it available for use. Neat! To verify the usability and stability of this, I branched the recovery-robustness suite that the dev team uses to check that crashes are not catastrophic when JOTM is used. For the time being, all works well.

Getting technical: The hooks

If you want to add support for a custom tx manager implementation, the first things to look at are the interface TransactionManagerImpl and the class TransactionManagerService. The first is there out of necessity - when the TxModule is started, the recovery process must be initiated. Each tx manager does it in its own way, so the init() method is there to abstract this concept. The XaDataSourceManager is passed to provide all XAResources that are registered with the kernel and which may have pending transactions. The stop() method is of course used to notify the tx manager of shutdown so that resources can be released and cleanup performed. The native implementations of Neo (TxManager and ReadOnlyTxManager) have been altered to implement this interface. In addition, the interface extends javax.transaction.TransactionManager so that it can be used "vanilla" after initialization.
The TransactionManagerService is a convenience class. It extends Service so that it can be discovered and loaded and implements the TransactionManagerImpl interface so that it can be used without a cast in the TxModule. The only thing defined is a constructor with a String argument, which is the name under which the Service will be registered - your implementation will be known by this name. To add a TransactionManager implementation as a service, you must extend this class, as will be demonstrated by the JOTMServiceImpl.
The Config object has seen the addition of the constant TXMANAGER_IMPLEMENTATION, with a value of "tx_manager_impl". If you want your custom implementation of tx-manager-as-a-service to be used, you must start the EmbeddedGraphDatabase with a custom configuration that will contain at least this key with a value of the name of your service.
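
To make the selection step concrete, here is a minimal sketch of picking a tx manager service by its registered name. Only the key "tx_manager_impl" and the names "native" and "jotm" come from this post; every class and method below is a hypothetical stand-in, not the kernel's actual code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-ins only; none of these types are the real Neo4j kernel classes.
class TxManagerSelectionSketch {
    // Key and default value as given in the text above.
    static final String TXMANAGER_IMPLEMENTATION = "tx_manager_impl";
    static final String NATIVE = "native";

    /** Hypothetical base class: a service is known by the name passed to its constructor. */
    static abstract class NamedTxManagerService {
        final String name;
        NamedTxManagerService(String name) { this.name = name; }
        abstract void init();   // a real service would start recovery here
        abstract void stop();   // a real service would release its resources here
    }

    static class NativeService extends NamedTxManagerService {
        NativeService() { super(NATIVE); }
        @Override void init() { }
        @Override void stop() { }
    }

    static class JotmLikeService extends NamedTxManagerService {
        JotmLikeService() { super("jotm"); }
        @Override void init() { }
        @Override void stop() { }
    }

    /** Returns the service whose name matches the configured value; "native" when the key is absent. */
    static NamedTxManagerService select(Map<String, String> config,
                                        List<NamedTxManagerService> discovered) {
        String wanted = config.getOrDefault(TXMANAGER_IMPLEMENTATION, NATIVE);
        for (NamedTxManagerService service : discovered) {
            if (service.name.equals(wanted)) {
                return service;
            }
        }
        throw new IllegalArgumentException("No tx manager service named " + wanted);
    }

    public static void main(String[] args) {
        List<NamedTxManagerService> discovered =
                Arrays.asList(new NativeService(), new JotmLikeService());
        Map<String, String> config = new HashMap<>();
        config.put(TXMANAGER_IMPLEMENTATION, "jotm");
        System.out.println("Selected: " + select(config, discovered).name);
    }
}
```

Failing loudly on an unknown name is a choice of this sketch; how the kernel reacts to an unknown name is not covered here.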

The sample implementation

Here you can see the project that holds the JOTM implementation of the above. It consists of only one class that extends TransactionManagerService, has a name of "jotm", on init() constructs a local Jotm instance and registers with it all XAResources reported by XaDataSourceManager, requests recovery and is done. All TransactionManager interface methods are passed to this local instance. stop() simply stops the Jotm instance. The annotation @Service.Implementation is from the org.neo4j.helpers package and is there to ensure that this class is registered. For the magic to happen, however, a META-INF/services/org.neo4j.kernel.impl.transaction.TransactionManagerService resource must exist, containing a single line with the fully qualified name of the service class. Bundle all that up in a jar, put it in your classpath, change your configuration parameters and bam! nothing will work. It is obvious that you must provide the bundles that make up the JOTM implementation and also configure it right. So let's do that.
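
The shape of such a service is plain delegation, and can be sketched as follows. The interface and the "third party" manager here are invented stand-ins (the real JTA interface has more methods and the real Jotm API is not reproduced); only the init / delegate / stop lifecycle mirrors the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types only: this is neither the JOTM API nor the kernel's service class.
class DelegatingTxManagerSketch {
    /** Hypothetical, reduced tx manager contract. */
    interface MiniTxManager {
        void begin();
        void commit();
    }

    /** Hypothetical third-party manager that records what it was asked to do. */
    static class ThirdPartyManager implements MiniTxManager {
        final List<String> calls = new ArrayList<>();
        @Override public void begin() { calls.add("begin"); }
        @Override public void commit() { calls.add("commit"); }
        void registerForRecovery(String resourceName) { calls.add("register:" + resourceName); }
        void recover() { calls.add("recover"); }
        void shutdown() { calls.add("shutdown"); }
    }

    /** The wrapper: builds the delegate on init(), forwards tx calls, stops it on stop(). */
    static class WrapperService implements MiniTxManager {
        private ThirdPartyManager delegate;

        void init(List<String> resourceNames) {
            delegate = new ThirdPartyManager();
            for (String name : resourceNames) {
                delegate.registerForRecovery(name);  // every known resource takes part in recovery
            }
            delegate.recover();
        }

        @Override public void begin() { delegate.begin(); }
        @Override public void commit() { delegate.commit(); }

        void stop() { delegate.shutdown(); }

        List<String> calls() { return delegate.calls; }
    }

    public static void main(String[] args) {
        WrapperService service = new WrapperService();
        service.init(Arrays.asList("store", "index"));
        service.begin();
        service.commit();
        service.stop();
        System.out.println(service.calls());
    }
}
```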

Making it play

The version I have used in all my tests is 2.1.9, I suggest you do the same. The jars needed are:

jotm-core-2.1.9.jar
which is the JOTM core impl and requires

avalon-framework-4.1.5.jar
commons-collections-3.2.jar
commons-logging-1.1.jar
commons-logging-api-1.1.jar
servlet-api-2.3.jar
log4j-1.2.16.jar
logkit-1.0.1.jar
jacorb-2.2.3-jonas-patch-20071018.jar
jacorb-idl-2.2.3-jonas-patch-20071018.jar
howl-1.0.1-1.jar
carol-3.0.6.jar
carol-interceptors-1.0.1.jar
irmi-1.1.2.jar
ow2-connector-1.5-spec-1.0-M1.jar
ow2-jta-1.1-spec-1.0-M1.jar


If you are a maven user, the dependencies to add are

<dependency>
    <groupId>org.ow2.jotm</groupId>
    <artifactId>jotm-core</artifactId>
    <version>2.1.9</version>
</dependency>
<dependency>
    <groupId>org.ow2.spec.ee</groupId>
    <artifactId>ow2-connector-1.5-spec</artifactId>
    <version>1.0-M1</version>
</dependency>


Also, you will need to provide a configuration for JOTM to use, necessary for recovery which is turned off by default. The location is passed via a JVM argument as

-Djotm.home=/path/to/directory


which must contain a jotm.properties file in which at least the property jotm.recovery.Enabled must be set to true. Here is a sample. Another property in there is howl.log.FileDirectory which is a path where the transaction WAL will be kept.
I confess to not having studied CAROL, so if you have any problems I suggest you use the whole configuration directory as provided here, which works.
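
As a rough illustration, the two properties named above can be written into a jotm.properties file with plain java.util.Properties. The class and method names are made up for this sketch, and a real JOTM setup will most likely need more settings (CAROL among them) than these two.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Sketch only: writes just the two properties mentioned in the text.
class JotmPropertiesSketch {
    /** Creates a scratch directory to play the role of jotm.home. */
    static Path tempDir() {
        try {
            return Files.createTempDirectory("jotm-home");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes jotm.properties into jotmHome and returns the path of the file. */
    static Path writeMinimalConfig(Path jotmHome, Path walDirectory) {
        Properties props = new Properties();
        props.setProperty("jotm.recovery.Enabled", "true");                    // recovery is off by default
        props.setProperty("howl.log.FileDirectory", walDirectory.toString());  // where the tx WAL is kept
        Path file = jotmHome.resolve("jotm.properties");
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "Minimal settings for recovery (sketch)");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }

    /** Reads the file back, e.g. to check what will be picked up. */
    static Properties load(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Path home = tempDir();
        Path file = writeMinimalConfig(home, home.resolve("howl"));
        System.out.println("Wrote " + file + "; start the JVM with -Djotm.home=" + home);
    }
}
```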

After that, adding the jotm-connector-service bundle as described above, jta kernel with correct parameters and all, you should get a working environment. The howl logs must appear and you must not have tm_tx_log.* files in your store directory, since the native TxManager is not used. If you want to use it instead, an empty configuration or a "tx_manager_impl" parameter with a value of "native" will do the trick. There you go, you used an alternative TransactionManager implementation, as promised. Go ahead, implement a long operation that involves indexing and kill it on commit(). When restarted, the store will be brought up to a consistent state. You can also alter the sample service project to bind the Jotm instance to a JNDI location and enlist resources from a different store altogether.

For maven users

Apart from the dependencies for jotm above, you can download and install the jta kernel and the JOTMService project. Add their coordinates to your pom.xml as in

<dependency>
   <groupId>org.neo4j.jta</groupId>
   <artifactId>jotm-service-provider</artifactId>
   <version>0.0.1</version>
</dependency>
<dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-kernel</artifactId>
   <version>1.2-jta</version>
</dependency>


substituting the main kernel for the jta one. Things should work.

From here

This whole thing lacks some safety measures, mainly the fact that the database is allowed to come up with a different tx manager than the one that was in use when it crashed. I think a check for this should be added and the best location I can think of is the neostore top level store file. I will have to get back to you on this.
I now know enough to go to the second phase of my work: make Neo work in an application server via JCA, providing container managed transactions in the same way JDBC resources do it. This way you can use your favorite graph database from an EJB side by side with a relational store and rest assured that the 2PC will work while you are safe from the details. How cool would that be? Until then, happy hacking.

Thursday, November 4, 2010

Neo4j Internals: Transactions (Part 3) as a complete run and a conclusion

This is the last post in the series, at least the core one. Here I will try to follow a path from the initialization of the db engine and through the begin() of a transaction and creation of a Node to the commit and shutdown. It will require knowledge of all the material so far covered but it shouldn't be hard to follow, since everything needed has been covered almost completely. So, without further ado...

Creating the EmbeddedGraphDatabase

The first thing to do to start working with a Neo instance is to create a new EmbeddedGraphDatabase. This is a thin shell over an EmbeddedGraphDbImpl, so we instantiate that. We pass it the user-provided parameter map (or an empty one) and the default implementations for factories for LockManager that will create LockManager instances which will use a provided TransactionManager, IdGenerator for assigning ids to created entities by delegating to the store managers, RelationshipTypeCreator for creating RelationshipTypes, TxIdGenerator for assigning ids to txs that delegates to the TransactionManager, TxFinishHook for creating tx synchronization hooks on tx finish that does nothing and LastCommittedTxIdSetter for storing the last successfully committed tx id that also does nothing. EmbeddedGraphDbImpl in turn creates a new TxModule to hold a TransactionManager and a XaDataSourceManager, a new LockManager, a new Config, and a GraphDbInstance (as of late there is also an Indexer but we conveniently ignore that).

The Config object has a long story ahead of it. It accepts and stores most of the instances created by the factories at EmbeddedGraphDbImpl and also creates a new IdGeneratorModule, a new PersistenceModule, an AdaptiveCacheManager and a GraphDbModule. That last one parses the parameters map and decides on what type of Cache to ask NodeManager to create and whether the user has requested a full r/w database or a read-only one, creating as a result - wait for it - yes, the proper NodeManager instance, though that work will actually happen later on, in GraphDbInstance. References to all these are stored and the whole GraphDbModule is kept in the Config.

GraphDbInstance is where the database store is started. On start(), it uses an AutoConfigurator, which in its very brief lifetime computes some sensible defaults for the sizes of the in-memory images of the various stores, regardless of whether they will be buffered or memory mapped. These sizes are also placed in the Config object. Next comes what we all have been waiting for - that's right, the Store instantiation. The TxModule is retrieved from the Config and is asked to registerDataSource() as the DEFAULT_DATA_SOURCE_NAME the NIO_NEO_DB_CLASS which currently is NeoStoreXaDataSource. The XaDataSourceManager in TxModule is passed the parameters and instantiates the class via reflection, assuming that there is a constructor accepting a Map argument (which represents the configuration as a parameter map) and stores the result in a Map, keyed by the resource's name. As we have seen previously, NeoStoreXaDataSource creates the actual store via instantiating a NeoStore and create()ing a XaContainer, possibly triggering a recovery or else "simply" instantiating the various datafiles and id stores. This is the major performance hiccup in the startup sequence, since all the above run in the same thread, a necessary measure to ensure a successful log recovery if it proves necessary. Obviously, if the database was created, from now on you can see the files in the db directory.

Going back to GraphDbInstance, a NioNeoDbPersistenceSource is created and stored in the Config and also provided to the IdGeneratorModule, where it is used as the actual source of entity ids. Note that the actual association of the PersistenceSource with the DataSource is made when, a few lines below, GraphDbInstance calls start on the PersistenceSource passing as an argument the XaDataSourceManager. After that, init() is called on the modules in the Config (which currently do nothing) and then they are start()ed. This makes the TxModule bind the XaDataSourceManager to the TransactionManager, the PersistenceModule create the PersistenceManager, the PersistenceSource acquire the DataSource, the IdGenerator get the PersistenceSource and the GraphDbModule create and start() the NodeManager. Starting the NodeManager causes the parameter map to be parsed to discover the sizes and type of the caches, which are then registered with the CacheManager.
So there we are. The store is initialized, as well as the DataSource over it, the TxModule is up and running with its TransactionManager, the NodeManager has built all that it needs and the LockReleaser and LockManager are in place. This is pretty much what is needed to start working, so it is about time we did that.

Beginning a transaction

Explicitly beginning a tx is necessary only if you need to perform write operations. For read-only scenarios, where no locks are acquired, you can get away with simply asking the db for the element. This is not the typical (or interesting for that matter) scenario, so let's create something. This requires a tx so let's start that. Calling beginTx() on an EmbeddedGraphDbImpl asks the referenced GraphDbInstance for the TransactionManager (stored in the TxModule in the Config) and then asks it to begin() one. No reference needs to be stored, recall that txs are thread bound, so as long as we are in the same thread we know which is our tx. However, an API must be provided for the demarcation of transactional code, so a TopLevelTransaction object is created, wrapped around the TxManager and returned to the user. This object is a simple wrapper around the TxManager, forwarding all calls of the Transaction interface to it, relying on the thread-to-tx mapping stored in the TxManager for the operation success. That is the object you receive on beginTx() so that you can work your magic.
We have already seen in some detail the workings of the TxManager class (which is the implementation of the TransactionManager interface) but let's follow the code. Calling begin() retrieves the currentThread() and maps a new TransactionImpl to it. Creating the TxImpl also assigns it a global Tx identifier via the Xid mechanism. Note that for now, no resources are enlisted, no log entries have been written and the state of the tx is STATUS_ACTIVE. To see in action the full mechanism we have to create a Node.
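
The thread-bound bookkeeping described above can be sketched like this. All names are stand-ins chosen for the illustration, the statuses are reduced to three, and rejecting a second begin() on the same thread is a simplification of this sketch rather than a statement about the kernel.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in sketch of a thread-to-tx mapping; not the kernel's TxManager.
class ThreadBoundTxSketch {
    enum Status { ACTIVE, COMMITTED, ROLLED_BACK }

    static final class Tx {
        Status status = Status.ACTIVE;
    }

    /** Maps the calling thread to its transaction, so callers never pass a tx around. */
    static final class Manager {
        private final Map<Thread, Tx> txThreadMap = new ConcurrentHashMap<>();

        void begin() {
            Thread thread = Thread.currentThread();
            if (txThreadMap.containsKey(thread)) {
                throw new IllegalStateException("This sketch does not support nested transactions");
            }
            txThreadMap.put(thread, new Tx());
        }

        /** The tx bound to the calling thread, or null if there is none. */
        Tx current() {
            return txThreadMap.get(Thread.currentThread());
        }

        void commit() { unbind().status = Status.COMMITTED; }

        void rollback() { unbind().status = Status.ROLLED_BACK; }

        private Tx unbind() {
            Tx tx = txThreadMap.remove(Thread.currentThread());
            if (tx == null) {
                throw new IllegalStateException("No transaction bound to this thread");
            }
            return tx;
        }
    }

    /** What the user holds: only a reference to the manager, never to the tx itself. */
    static final class TopLevel {
        private final Manager manager;
        private boolean success;

        TopLevel(Manager manager) { this.manager = manager; }

        void success() { success = true; }

        void finish() {
            if (success) manager.commit(); else manager.rollback();
        }
    }

    static TopLevel beginTx(Manager manager) {
        manager.begin();
        return new TopLevel(manager);
    }

    public static void main(String[] args) {
        Manager manager = new Manager();
        TopLevel tx = beginTx(manager);
        Tx bound = manager.current();
        tx.success();
        tx.finish();
        System.out.println("Final status: " + bound.status);
    }
}
```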

Creating the Node

To create a Node we call createNode() on EmbeddedGraphDatabase which forwards it to EmbeddedGraphDbImpl which sends it to NodeManager.createNode(). There the IdGenerator is asked for the nextId() for Node objects, which hands it off to the PersistenceSource. The implementing class is NioNeoDbPersistenceSource, which forwards it to the NeoStoreXaDataSource, which finally retrieves the NodeStore and returns the nextId(). You must have noticed here that in fact there is no Resource enlisted in the current tx yet. Now that we have the id, we can create a new NodeImpl and acquire a WRITE lock on it, creating also a NodeProxy object to return to the user.
Now comes the fun part. Still in NodeManager.createNode(), we ask the PersistenceManager to nodeCreate() the brand new Node for this id. The PersistenceManager has no idea how to do that, so it calls getResource() to get a ResourceConnection to do it. Of course the ResourceConnection is returned by the referenced PersistenceSource instance (which in our case is a NioNeoDbPersistenceSource returning NioNeoDbResourceConnections) and it indeed has a reference to an XAResource (that is an inner class that simply holds an Object as an id). So, after retrieving the current tx it asks it to enlist the XAResource, leading to all the magic described here. Also, a TxCommitHook is added to the tx that releases the connection and makes the resources used reclaimable by the garbage collector at the end of the connection's usable life. Note that the XaResource is registered with the XaResourceManager and mapped to a WriteTransaction when the resource is start()ed in the TransactionImpl.
After we write to the TxLog and setup the XAResource to the tx, we still have an operation to do. Recall that the Connection holds the EventConsumers which forward to the corresponding store. The related consumer in this case is the NeoStoreXaConnection.NodeEventConsumerImpl, which for createNode() events (and all others for that matter) retrieves the WriteTransaction and tells it to nodeCreate() for the id and the WriteTransaction creates the NodeRecord object and stores it.
Let's make a check here: Nothing is written on the disk, including the Resource tx. The store is untouched, not even the id store has been tampered with. The only thing in permanent storage is the global tx record marking it as STARTed and the branchId for the Resource. The fact that a record has been created is in memory only. If we crash here, there is an implicit rollback guarantee.
We are not done yet, since the user has nothing to work with. Before returning the NodeProxy from the NodeManager, first we cache it and then we ask the LockReleaser (via NodeManager.releaseLock()) to release the Lock for this Node, an action that eventually results in keeping in memory the fact that this Node is locked and ask the current tx to add a Synchronization that will release the locks after completion. Now we can return the NodeProxy to the user.

Committing the transaction

So, the time has come to commit() the tx and make our valuable Node permanent. Calling success() on the TopLevelTransaction marks it simply as successful, meaning that on finish() it will be commit()ted. So, let's call finish(). The TransactionManager is asked for the current tx and commit() is called on it. This calls back the TransactionManager, which gets the current tx and does all those nice things that we discussed some time ago, such as writing to the TxLog and calling commit hooks. In a nutshell, the tx gets the single enlisted resource (since we are not using but one) and decides that, since this is a 1PC we just tell the resource to commit(). In our case, this leads to the NeoStoreXaConnection.NeoStoreXaResource to call the XaResourceManager to commit, which means that the WriteTransaction is asked to prepare(), compiling our new NodeRecord to a NodeCommand and writing that out to the XaLogicalLog, an action after which we can rest assured that our change will be performed no matter what. If this succeeds, XaResourceManager calls commit() on the WriteTransaction, meaning the Node creation command is executed, asking the NodeStore to write out that Record. We are now done, meaning the XaLogicalLog adds a DONE entry, the Transaction is removed from the txThreadMap in the TxManager, the TxLog marks the Transaction also as TX_DONE and the Transaction status is set to STATUS_NO_TRANSACTION. Now everything is on disk and both the Resource and the global txs are marked as successful.
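
The ordering that matters here (log first, store second, DONE last) can be sketched with in-memory stand-ins. The log entry strings and all class names are invented for the illustration and do not reflect the actual XaLogicalLog format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch of the one-phase commit path; not the kernel's classes or log format.
class OnePhaseCommitSketch {
    /** In-memory stand-in for the logical log. */
    static final class Log {
        final List<String> entries = new ArrayList<>();
    }

    /** In-memory stand-in for the node store: node id -> in-use flag. */
    static final class Store {
        final Map<Long, Boolean> nodes = new HashMap<>();
    }

    static final class WriteTx {
        final int id;
        private final Log log;
        private final Store store;
        private final List<Long> createdNodes = new ArrayList<>(); // records live in memory only
        private boolean prepared;

        WriteTx(int id, Log log, Store store) {
            this.id = id;
            this.log = log;
            this.store = store;
        }

        void nodeCreate(long nodeId) {
            createdNodes.add(nodeId); // nothing reaches the log or the store yet
        }

        /** Turns the in-memory records into commands and appends them to the log. */
        void prepare() {
            for (long nodeId : createdNodes) {
                log.entries.add("tx " + id + " COMMAND create-node " + nodeId);
            }
            prepared = true;
        }

        /** Executes the commands against the store; only legal after prepare(). */
        void commit() {
            if (!prepared) {
                throw new IllegalStateException("prepare() must run before commit()");
            }
            for (long nodeId : createdNodes) {
                store.nodes.put(nodeId, true);
            }
        }
    }

    /** With a single resource there is no voting round: prepare, commit, mark as done. */
    static void commitOnePhase(WriteTx tx, Log log) {
        tx.prepare();
        tx.commit();
        log.entries.add("tx " + tx.id + " DONE");
    }

    public static void main(String[] args) {
        Log log = new Log();
        Store store = new Store();
        WriteTx tx = new WriteTx(1, log, store);
        tx.nodeCreate(7L);
        commitOnePhase(tx, log);
        System.out.println(log.entries + " " + store.nodes);
    }
}
```

Because the command reaches the log before the store is touched, a crash between the two steps leaves enough information to redo the change on recovery.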

Closing the database

We wrote our target 9 bytes on the disk (you do remember that a Node record is 9 bytes, right?) and we are ready to close the database. So we go ahead and call shutdown() on the EmbeddedGraphDatabase, which ends up calling GraphDbInstance.shutdown(). The config is asked for the various modules and tells them to stop(). The GraphDbModule tells the NodeManager to clearPropertyIndexes() and clearCache() and then stop(), operations that do nothing fancy, they just null out various references. The IdGeneratorModule and PersistenceModule have no-op stop() methods. The TxModule.stop() asks the TxManager to close() the TxLog that in turn close()es the log file channel. The most interesting part is in PersistenceSource.stop(). This forwards to NeoStoreXaDataSource which calls flushAll() on NeoStore, leading eventually to every store force()ing all buffers out to disk. This ensures a nice, clean shutdown of the disk store. The XaLogicalLog must also be closed, an operation already described in detail some time before. Then we call close() on NeoStore, which in essence closes the file channels used for persistence operations. This ends the file channel closing cycle, leaving the GraphDbImpl to call destroy() on the same modules, which currently are all no-ops. We can now exit.

From here

This post concludes a rough description of the core operations in Neo. I estimate around 2/3 of the code in the kernel component has been covered, leaving out aspects such as kernel component registrations, the indexing framework, traversals and a host of other extremely interesting components. I do not know to what extent I could have fitted them here, but I think that by understanding what I have discussed so far, one can navigate the code and understand the remaining pieces easily.
Truth be told, I have reached a point where I no longer want to write about Neo but instead I want to start hacking it. If the need arises and I find it interesting, I may write again about some other operation, but first I want to get a better feeling for the code. If you have specific requests/questions regarding my articles, I suggest you send a mail to the Neo mailing list and we can discuss it there, or hang around the #neo4j channel at freenode.

I hope that by reading my work you got at least part of the knowledge I got out of writing it.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Tuesday, November 2, 2010

Neo4j Internals: Interlude - Xa roundup and consistency

The time has come for the most boring of these posts (imagine that!). There are some details that haven't been referenced yet, mainly regarding the interactions between the various classes that lead from the persistence store up to the NodeManager and from there to you, the user. There are many classes that have to be explained, with a lot of cross-references, but if you have followed my work thus far, then it shouldn't be that difficult to digest. Possibly, this post should have come before the talk about transactions, but hey!, I am currently myself on the path to enlightenment concerning the internals of Neo, so I understood things somewhat out of order. This is the reason I label this article as interlude, since practically no tx talk will take place. Prerequisites are not that demanding in this post, although if this is your first contact with Neo4j internals you will in all probability be overwhelmed. So, we begin.

DataSources and XaConnections over a persistence store

A XaConnection encapsulates a XAResource towards a XA compatible persistence store. It is not part of the XA specification but is a useful abstraction provided by Neo that couples a XaTransaction with a XAResource. The concrete implementation is NeoStoreXaConnection that holds a NeoStoreXaResource as the implementation of the XAResource and a WriteTransaction as the implementation of the XaTransaction. The XA related interface exposed by NeoStoreXaConnection is getXaResource() : XAResource that returns an instance of the inner class NeoStoreXaResource, which forwards all XAResource operations to a XaResourceManager implementation and defines the isSameRM() XA-required method for XAResources by equality comparison on the filename of the supporting store. Finally, NeoStoreXaConnection returns event consumers for operations on Neo primitives such as NodeEventConsumer and PropertyIndexEventConsumer that forward the requested operations to the WriteTransaction encapsulated by the parent NeoStoreXaConnection. These event consumers are used by NioNeoDbPersistenceSource to implement ResourceConnections, but that is discussed in detail later in this post.
XaDataSource is an abstract class that defines the means of obtaining XaConnections from a data source and some facilities for recovering txs. The idea is that classes extending XaDataSource will encapsulate a transactional resource, capable of supporting XAResources so that they can fit in a XA environment. This is obvious from the LogBackedXaDataSource extending class, where all tx recovery operations are forwarded to an underlying XaLogicalLog. Neo extends this with NeoStoreXaDataSource which, apart from creating XaConnections, is pretty busy: on instantiation it is responsible for creating the NeoStore that is the on-disk storage of the graph, creates a XaContainer to help with housekeeping (more on that next) and even creates the IdGenerators for the various Stores. It also provides implementations (as inner classes) for a XaCommandFactory and a XaTransactionFactory that it passes to the XaContainer for recovery purposes. This gives it the role of a Facade over the details of a lot of things I have described previously, summing up XaLogicalLogs, XaResourceManagers, Stores and their paraphernalia into a data source practically ready for fitting into a XA environment.
Before we leave NeoStoreXaDataSource, a note on its instantiation. Instead of the usual new call for creating an instance, there is a more roundabout way for getting a Neo DataSource up and running. When the database starts, the TxModule object held by the Config is asked to register DataSources, as it goes around the various components (the Indexer service is another example of a user of DataSources). For the Neo kernel, when GraphDbInstance is start()ed, the TxModule in the Config object is asked to register a DataSource with an implementing class of NeoStoreXaDataSource and there it is passed to the DataSourceManager which instantiates it via reflection. DataSourceManager keeps a mapping from identifying Strings to XaDataSource instances, maintaining this way a single instance for every data source. The identifying String is kept in the Config as DEFAULT_DATA_SOURCE_NAME.
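
The roundabout instantiation amounts to a name-keyed registry that builds instances reflectively through a constructor taking a Map. A minimal sketch, with made-up class names standing in for the manager and the data source:

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch of reflective registration; not the kernel's XaDataSourceManager.
class ReflectiveDataSourceRegistrySketch {
    /** Hypothetical base type for data sources. */
    static abstract class DataSource {
        final Map<?, ?> params;
        DataSource(Map<?, ?> params) { this.params = params; }
    }

    /** Hypothetical data source exposing the expected public Map constructor. */
    static class FileBackedDataSource extends DataSource {
        public FileBackedDataSource(Map<?, ?> params) { super(params); }
    }

    static final class Registry {
        private final Map<String, DataSource> dataSources = new HashMap<>();

        /** Instantiates className through its Map constructor and stores it under name. */
        DataSource register(String name, String className, Map<?, ?> params) {
            if (dataSources.containsKey(name)) {
                throw new IllegalStateException("Data source already registered: " + name);
            }
            try {
                Class<?> clazz = Class.forName(className);
                Constructor<?> constructor = clazz.getConstructor(Map.class);
                DataSource dataSource = (DataSource) constructor.newInstance(params);
                dataSources.put(name, dataSource);
                return dataSource;
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Could not instantiate " + className, e);
            }
        }

        /** The single instance kept for this name, or null if none was registered. */
        DataSource get(String name) {
            return dataSources.get(name);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Map<String, String> params = new HashMap<>();
        params.put("store_dir", "/tmp/example-db"); // hypothetical parameter name
        registry.register("example", FileBackedDataSource.class.getName(), params);
        System.out.println(registry.get("example").getClass().getSimpleName());
    }
}
```

Passing the class by name is what lets other components, such as an indexing service, register their own data sources without the registry knowing their types at compile time.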

Management of XaResources

The mapping of a XAResource to a XaTransaction represented by a XaConnection is realized in the XaResourceManager. This class mainly keeps a Map<XAResource,Xid> and a Map<Xid,XidStatus>, XidStatus being an inner class that, with the help of another inner class, TransactionStatus, holds the current status of the tx and the tx itself identified by its xid and mapped by an XAResource. Essentially, from this mapping, all tx operations on an XAResource are forwarded to the related tx. This helps the decoupling of tx operations that XaResources are asked to perform from any implementation details of the XaLogicalLog or the backing store, leaving XaResources, XaTransactions and XaConnections as thin shells that can be useful in higher layers.
XaResourceManager also participates in failure recovery in conjunction with the XaLogicalLog, accepting via the methods the recreated txs as the logical log reads them and then completing them. In a sense, a XaResourceManager coupled with a XaLogicalLog are the equivalent of the TxManager+TxLog as we saw them last time but with the addition of a backing persistence store, in the form of a DataSource.
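
A stripped-down picture of the two maps, assuming plain Objects for resources and Strings for xids (the real class works with XAResource and Xid and tracks more states than the two shown):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch of resource -> xid -> tx bookkeeping; not the kernel's XaResourceManager.
class ResourceToTxMappingSketch {
    enum TxStatus { ACTIVE, ENDED }

    /** Holds a transaction (here just a label) together with its current status. */
    static final class XidStatus {
        final String tx;
        TxStatus status = TxStatus.ACTIVE;
        XidStatus(String tx) { this.tx = tx; }
    }

    static final class Manager {
        private final Map<Object, String> xaResourceMap = new HashMap<>(); // resource -> xid
        private final Map<String, XidStatus> xidMap = new HashMap<>();     // xid -> status and tx

        /** Associates the resource with the xid, creating the tx on first use. */
        void start(Object resource, String xid) {
            xidMap.computeIfAbsent(xid, id -> new XidStatus("tx-for-" + id));
            xaResourceMap.put(resource, xid);
        }

        /** Operations arriving on a resource are routed to the tx found here. */
        String txFor(Object resource) {
            String xid = xaResourceMap.get(resource);
            if (xid == null) {
                throw new IllegalStateException("Resource is not associated with a transaction");
            }
            return xidMap.get(xid).tx;
        }

        /** Detaches the resource; the tx stays known under its xid until it completes. */
        void end(Object resource) {
            String xid = xaResourceMap.remove(resource);
            if (xid == null) {
                throw new IllegalStateException("Resource is not associated with a transaction");
            }
            xidMap.get(xid).status = TxStatus.ENDED;
        }

        void commit(String xid) {
            XidStatus status = xidMap.get(xid);
            if (status == null || status.status != TxStatus.ENDED) {
                throw new IllegalStateException("Transaction is unknown or still active: " + xid);
            }
            xidMap.remove(xid);
        }

        boolean knows(String xid) {
            return xidMap.containsKey(xid);
        }
    }

    public static void main(String[] args) {
        Manager manager = new Manager();
        Object resource = new Object();
        manager.start(resource, "xid-1");
        System.out.println(manager.txFor(resource));
        manager.end(resource);
        manager.commit("xid-1");
    }
}
```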

Binding related things together

The various components that help out a XaDataSource must be instantiated in a specific order and it is a nice idea to keep them together since they are closely coupled. This is a job for XaContainer, which keeps a XaDataSource, a XaCommandFactory, a XaTransactionFactory, a XaLogicalLog and a XaResourceManager. The idea is that the XaResourceManager and the XaLogicalLog must have access to a txFactory and a commandFactory before they start and additionally the log needs a XaResourceManager before being open()d, otherwise recovery cannot proceed. This leads to a specific serialization of the instantiation/initialization operations and this is done by the XaContainer. XaContainer in turn is created by NeoStoreXaDataSource when it is instantiated, which passes its internal implementations of CommandFactory and TransactionFactory to the create() method of XaContainer, leaving to it the creation of instances of XaLogicalLog and XaResourceManager. To open the log (and possibly trigger a recovery) you must call XaContainer.open() after initializing it, ensuring that everything is in place.

An intermediate interface: The ResourceConnection

PersistenceSource defines an interface that exposes the expected functionality for every persistence store that Neo can use to store its data on disk. The operations themselves are abstracted as ResourceConnections that are returned by a PersistenceSource. For that reason, NioNeoDbPersistenceSource implements this as an inner class, NioNeoDbResourceConnection, that accepts a NeoStoreXaDataSource, extracts from it the XaConnection and from there the various primitive event consumers, dispatching to them the operations each is supposed to handle. This 2-level indirection is a purely engineering construct, having no other impact on the logic of any subsystem.

Addressing problems with store-memory consistency: The LockReleaser

There is an issue I haven't touched upon yet. We have seen how the various records are updated in the store and kept locked for the duration of a tx, ensuring their isolation guarantees. However, there remains to be seen how the modifications upon a Primitive are kept in memory for reading within a tx and how overlapping creation/deletions/changes of properties are managed. This is the task assigned to LockReleaser, with the more general responsibility of locking the entities that are to be modified and releasing the locks upon commit. The core idea is that, per transaction, we keep a set of changes for every element and its properties. The set of changes in the properties of a primitive are kept as instances of the inner classes CowNodeElement or CowRelElement for Nodes and Relationships respectively and the set of those elements (one for each corresponding primitive) are kept as instances of the inner class PrimitiveElement. The cowMap field is a Map<Transaction,PrimitiveElement> that keeps the mapping of the changes for the current tx. The easy part is deletion, where calling delete() on a primitive passes the call to NodeManager, which forwards the call first to LockReleaser, marking the corresponding CowElement as deleted via a boolean field and then to the PersistenceManager which updates the XaTransaction (WriteTransaction in the case of NioNeoDbPersistenceSource). The great management overhead and the bulk of the code is the addition, deletion and changing of properties for Nodes and Relationships. Two sets are kept for each one, a propertyAddMap and a propertyRemoveMap. When a property is added, it is appended in the propertyAddMap for the primitive, while removals are appended in the propertyRemoveMap. 
Asking a primitive for a property passes from the Proxy (which implements the Node or Relationship interface and is the user-visible class) to the NodeManager, which retrieves the corresponding NodeImpl or RelationshipImpl. There, propertyAddMap and propertyRemoveMap are consolidated into the actual changeset and the requested property is retrieved, if present. To make this clear, let's see an example.
Say you have a Node and you add a property via setProperty("foo","bar"). Initially, the NodeProxy simply forwards the call to the corresponding (based on id) NodeImpl. There it is locked (in the Primitive.setProperty() method) for WRITE by the LockReleaser. The full propertyMap is brought into memory if not already there (NodeManager.loadProperties()) and the propertyAddMap and propertyRemoveMap for this primitive are obtained. Note that there are now 3 sets of properties in memory for this primitive: the ones loaded from the store (the propertyMap), the ones added so far in this tx (the propertyAddMap) and the ones removed so far (the propertyRemoveMap). These have to be aggregated into a currently consistent set so that we can decide whether to create a new property or to change an existing one. There are three stages for this. First, we check the currently known property indexes that are resident in memory. If the index is not there, we make sure we bring all property indexes into memory and we check those. If it is still not there, we create it. Meanwhile, if the property value was found in either the stored property set or in the add map (it was added previously in this tx), we retrieve it and remove it from the propertyRemoveMap. Its value is changed, it is added to the propertyAddMap and the WRITE lock is released. Similar paths are followed in all other operations, including additions and removals of Relationships for Nodes.
Finally, before commit(), the propertyAddMap, propertyRemoveMap and propertyMap are consolidated into the final version at Primitive.commitPropertyMaps(), which adds all properties in propertyAddMap to propertyMap and then removes from it all properties in propertyRemoveMap. This brings the in-memory copy of this Primitive back to a state consistent with the now updated in-file version, getting rid of all the versions in the temporary add and remove maps.
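The bookkeeping described above can be sketched in a few lines. This is a deliberately stripped-down illustration and not the actual Neo code: it assumes a single transaction, uses plain String keys instead of property indexes, leaves out locking entirely, and the class and method names (apart from commitPropertyMaps()) are my own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-tx sketch of the copy-on-write property maps.
class CowPropertySketch {
    // Properties as loaded from the store (the committed view).
    private final Map<String, Object> propertyMap = new HashMap<>();
    // Properties added or changed so far in the current tx.
    private final Map<String, Object> propertyAddMap = new HashMap<>();
    // Properties removed so far in the current tx.
    private final Map<String, Object> propertyRemoveMap = new HashMap<>();

    CowPropertySketch(Map<String, Object> stored) {
        propertyMap.putAll(stored);
    }

    void setProperty(String key, Object value) {
        // A set undoes any earlier removal of the same key in this tx.
        propertyRemoveMap.remove(key);
        propertyAddMap.put(key, value);
    }

    Object removeProperty(String key) {
        Object old = getProperty(key);
        if (old != null) {
            propertyAddMap.remove(key);
            propertyRemoveMap.put(key, old);
        }
        return old;
    }

    // Reads consolidate the three maps: removals win, then additions,
    // then whatever was loaded from the store.
    Object getProperty(String key) {
        if (propertyRemoveMap.containsKey(key)) {
            return null;
        }
        if (propertyAddMap.containsKey(key)) {
            return propertyAddMap.get(key);
        }
        return propertyMap.get(key);
    }

    // Add everything from the add map, then drop everything in the remove map.
    void commitPropertyMaps() {
        propertyMap.putAll(propertyAddMap);
        propertyMap.keySet().removeAll(propertyRemoveMap.keySet());
        propertyAddMap.clear();
        propertyRemoveMap.clear();
    }

    // Copy of the committed view, i.e. what readers outside the tx would see.
    Map<String, Object> committedView() {
        return new HashMap<>(propertyMap);
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new HashMap<>();
        stored.put("name", "neo");
        CowPropertySketch node = new CowPropertySketch(stored);
        node.setProperty("foo", "bar");
        node.removeProperty("name");
        System.out.println("in tx:  foo=" + node.getProperty("foo") + ", name=" + node.getProperty("name"));
        System.out.println("before commit: " + node.committedView());
        node.commitPropertyMaps();
        System.out.println("after commit:  " + node.committedView());
    }
}
```

Until commitPropertyMaps() runs, the committed view is untouched, so the changes are visible only through the consolidated read path. In the real thing the add and remove maps live in the LockReleaser, keyed by transaction.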
LockReleaser is also used by WriteTransaction to invalidate cached entries. The various removeFromCache() calls are there to ensure that, after a tx that deletes a primitive is committed, the corresponding entry in the cache is removed so that it cannot be referenced again. WriteTransaction makes the matching removeFromCache() call after a delete command, a rollback of a creation command or the execution of a recovered command. The call goes to the LockReleaser, which forwards it to the keeper of the cache, the omnipotent NodeManager.

Managing locks: Again, the LockReleaser (with help from the NodeManager)

LockManager and RagManager were described in a previous post as the core mechanism that provides isolation between txs. Now we will see where lock acquisition and release are done. First, WRITE locks are acquired on Node and Relationship creation events from NodeManager (in createNode() and createRelationship()) and from Primitives (via NodeManager.acquireLock()) on removeProperty, setProperty and delete events. Releasing them is not done right away, although the call to NodeManager.releaseLock() is made at the end of every method that acquires a lock (i.e., the aforementioned ones). Obviously, we cannot release a WRITE lock before the tx completes, since that would lead to other txs reading uncommitted changes (Neo currently guarantees only the SERIALIZABLE isolation level). So, we must postpone the release of the lock until commit. This is done in LockReleaser.addLockToTransaction(), which adds a LockElement to a List<LockElement> that is mapped by the current tx (kept in lockMap, a Map<Transaction, List<LockElement>>) and also adds a Synchronization hook to the current tx, whose afterCompletion() releases all locks held by this tx.
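Here is a rough sketch of that deferred release. To keep it self-contained, everything in it is a stand-in of my own: CompletionHook plays the role of javax.transaction.Synchronization (only the afterCompletion callback), Tx stands in for a Transaction, the LockManager just tracks a set of locked resources, and the addLockToTransaction() signature is hypothetical. Only the lockMap idea and the release-on-completion hook come from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of postponing lock release until the tx completes.
class DeferredLockReleaseSketch {

    // Stand-in for javax.transaction.Synchronization (afterCompletion only).
    interface CompletionHook {
        void afterCompletion();
    }

    // Stand-in for a transaction that runs its hooks on commit.
    static class Tx {
        private final List<CompletionHook> hooks = new ArrayList<>();

        void registerSynchronization(CompletionHook hook) { hooks.add(hook); }

        void commit() {
            for (CompletionHook hook : hooks) {
                hook.afterCompletion();
            }
        }
    }

    // Stand-in for the LockManager: it only remembers what is write-locked.
    static class LockManager {
        final Set<Object> writeLocked = new HashSet<>();
        void getWriteLock(Object resource) { writeLocked.add(resource); }
        void releaseWriteLock(Object resource) { writeLocked.remove(resource); }
    }

    static class LockReleaser {
        private final LockManager lockManager;
        // Locks to release, per transaction.
        private final Map<Tx, List<Object>> lockMap = new HashMap<>();

        LockReleaser(LockManager lockManager) { this.lockManager = lockManager; }

        // Called where the lock would otherwise be released: remember the
        // lock and hook into the tx so it is released on completion.
        void addLockToTransaction(Tx tx, Object resource) {
            List<Object> locks = lockMap.get(tx);
            if (locks == null) {
                locks = new ArrayList<>();
                lockMap.put(tx, locks);
                tx.registerSynchronization(() -> releaseLocks(tx));
            }
            locks.add(resource);
        }

        private void releaseLocks(Tx tx) {
            List<Object> locks = lockMap.remove(tx);
            if (locks != null) {
                for (Object resource : locks) {
                    lockManager.releaseWriteLock(resource);
                }
            }
        }
    }

    public static void main(String[] args) {
        LockManager lockManager = new LockManager();
        LockReleaser releaser = new LockReleaser(lockManager);
        Tx tx = new Tx();

        lockManager.getWriteLock("node-1");
        releaser.addLockToTransaction(tx, "node-1");
        System.out.println("locked before commit: " + lockManager.writeLocked.contains("node-1"));
        tx.commit();
        System.out.println("locked after commit:  " + lockManager.writeLocked.contains("node-1"));
    }
}
```

In this sketch the hook is registered once per tx, on the first lock; I am not claiming that the real LockReleaser does it exactly this way.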

Almost there

This post concludes the description of the core classes that provide the performance and ACID guarantees of Neo. What remains to be seen is a walkthrough of the path from creating a new EmbeddedGraphDatabase to its shutdown. This will be the topic of the next post.



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.