Chapter 3. Design overview

The HeliDB Database interface is the API used by database clients. There are three different Database implementations, each with different properties.

To store and retrieve data in the persistent data storage (a database file, for instance), the Database uses an implementation of the DatabaseBackend interface. Backend implementations differ in their characteristics: one may have fast inserts but slow searches, another slower inserts but faster searches. By combining a Database implementation with a DatabaseBackend implementation, the client can tailor the database characteristics to fit its needs. The performance test results are a useful guide to which implementations to choose.

Figure 3.1. Database components

Database

The Database interface extends the Map interface. Some of the extra methods in Database add functionality, such as searching for the closest matching keys. The other extra methods do the same jobs as methods that already exist in the Map interface, but are designed so that they can be implemented more efficiently^[1] than their Map counterparts. Any Database can therefore be used as a Map, but a client that is able to choose should prefer the methods defined in the Database interface over those from the Map interface.

Keys and values must be immutable objects!

The client should use immutable objects for keys and values in the Database, especially if the database somehow caches objects. If the key or value objects are not immutable, the client should treat them as if they were immutable anyway. The behavior of the database if a cached object is modified is unspecified, and most likely also unpleasant.
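The effect is easy to reproduce with a plain java.util.HashMap, used here as a stand-in since Database extends Map. The following is a minimal sketch, not HeliDB code, and the class name MutableKeyPitfall is made up for the illustration. It modifies a key object after insertion and then tries to look the entry up again.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in demonstration with a java.util.HashMap. Not HeliDB code.
class MutableKeyPitfall
{
  // Returns true if the entry can still be found after its key object has
  // been modified.
  static boolean foundAfterMutation()
  {
    Map<List<String>, String> map = new HashMap<List<String>, String>();

    // An ArrayList is mutable and its hash code depends on its contents.
    List<String> key = new ArrayList<String>();
    key.add("a");
    map.put(key, "value");

    // Modify the key object while the map holds it. The entry is still
    // filed under the hash code that the key had when it was inserted.
    key.add("b");

    return map.containsKey(key);
  }

  public static void main(String[] args)
  {
    System.out.println("Found after mutation: " + foundAfterMutation());
  }
}
```

The lookup fails even though the map still holds the entry. A database that caches key or value objects can be expected to misbehave in comparable ways.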

Table 3.1. Database implementations

Implementation	Description
SimpleDatabase	The simplest Database implementation. Does not support transactions.
LoggingTransactionalDatabase	Transactional database that keeps a rollback log for all operations in a transaction.
ShadowCopyTransactionalDatabase	Transactional database that writes updates to a shadow copy of the entire database and replaces the database with the shadow copy when the transaction is committed.
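To make the rollback log idea concrete, here is a minimal in-memory sketch. It is not how LoggingTransactionalDatabase is implemented; the class RollbackLogSketch and all of its methods are hypothetical, and the sketch assumes that values are never null. Before every update it records the previous value for the key, and a rollback replays those records in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a rollback log. Not HeliDB code.
class RollbackLogSketch
{
  private final Map<String, String> data = new HashMap<String, String>();

  // Undo entries, most recent first. Each entry is { key, previous value },
  // where a null previous value means that the key did not exist.
  private final Deque<String[]> undoLog = new ArrayDeque<String[]>();

  void put(String key, String value)
  {
    // Log the old state before changing anything.
    undoLog.push(new String[] { key, data.get(key) });
    data.put(key, value);
  }

  String get(String key)
  {
    return data.get(key);
  }

  // Committing keeps the changes, so the undo entries are no longer needed.
  void commit()
  {
    undoLog.clear();
  }

  // Undo the logged operations, newest first.
  void rollback()
  {
    while (!undoLog.isEmpty())
    {
      String[] entry = undoLog.pop();
      if (entry[1] == null)
      {
        data.remove(entry[0]);
      }
      else
      {
        data.put(entry[0], entry[1]);
      }
    }
  }
}
```

A shadow copy implementation takes the opposite approach: it leaves the original data untouched until commit, so a rollback only has to discard the copy.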

Example 3.1. Creating a database and inserting two records

// Create a new SimpleDatabase that uses a HeapBackend.
// The database will store String keys and values.
//  - f is a File where the database data will be stored.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

Database<String, String> db = 
  new SimpleDatabase<String, String, Long>(
    // Use a HeapBackendBuilder object to build the HeapBackend.
    // Objects in HeliDB with many configurable parameters often have builder 
    // objects to make it easier to construct them.
    new HeapBackendBuilder<String, String>().
      setKeySerializer(StringSerializer.INSTANCE).
      setValueSerializer(StringSerializer.INSTANCE).
      // Adapt the File to a ReadWritableFile.
      create(new ReadWritableFileAdapter(f)),
    lah);

// Insert a couple of values
db.insert("key 1", "value 1");
db.insert("key 2", "value 2");

DatabaseBackend

The DatabaseBackend is responsible for storing data in and retrieving data from the persistent storage, commonly one or more database files. To write data and to interpret written data, it uses one Serializer instance for keys and one for values. Serializers are described below.

The database uses the position of a record to locate it in the database backend. Exactly what a position is depends on the backend implementation. The client only needs to care about how positions are represented when it creates the database object.

Table 3.2. Database backend implementations

Implementation	Description
HeapBackend	A basic backend implementation that puts all data on a heap.
ConstantRecordSizeHeapBackend	A variant of the heap backend that is slightly more efficient since its data records are of a constant size.
ConstantRecordSizeBPlusTreeBackend	Stores data records in a B+ Tree. The records must be of a constant size.
MapBackend	Puts all data in a Map. Perhaps most useful for testing.
BPlusTreeIndexBackend	Keeps a key index for another backend in a B+ Tree. Used to speed up searches in the database.
LruCacheBackend	Caches some of the database records stored in another backend in an in-memory Least Recently Used (LRU) cache.
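For readers unfamiliar with LRU caching, the eviction policy can be sketched with java.util.LinkedHashMap in access order mode. This only illustrates the policy; it says nothing about how LruCacheBackend is implemented, and the class name LruCacheSketch is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of Least Recently Used eviction. Not HeliDB code.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V>
{
  private static final long serialVersionUID = 1L;

  private final int maxSize;

  LruCacheSketch(int maxSize)
  {
    // The third argument selects access order instead of insertion order,
    // so that get and put move an entry to the "most recently used" end.
    super(16, 0.75f, true);
    this.maxSize = maxSize;
  }

  // Called by LinkedHashMap after each insertion. Returning true evicts the
  // least recently used entry.
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest)
  {
    return size() > maxSize;
  }
}
```

With a maximum size of two, inserting "a" and "b", reading "a" and then inserting "c" evicts "b", since "a" was used more recently.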

Serializers

Serializer objects are used to serialize database keys and values to the bytes that are written to the persistent storage, and then to interpret those bytes into keys and values again. There are several Serializer implementations for primitive datatypes. See the API documentation for details.

It is also fairly easy to implement custom Serializers. See Appendix A, Custom Serializer implementation for an example.
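The round trip that a serializer performs can be sketched as follows. The interface SketchSerializer and its method names are assumptions made for this illustration only; the real HeliDB Serializer interface is defined in the API documentation and may look different.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface for illustration. This is not the HeliDB
// Serializer interface.
interface SketchSerializer<T>
{
  // Turn an object into the bytes that would be written to storage.
  byte[] serialize(T obj);

  // Turn stored bytes back into an object.
  T interpret(byte[] bytes);
}

// Serializes Strings as UTF-8 encoded bytes.
class Utf8StringSketchSerializer implements SketchSerializer<String>
{
  public byte[] serialize(String s)
  {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public String interpret(byte[] bytes)
  {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

The important property is that interpreting the serialized bytes gives back an object equal to the original. Note that a String serialized like this has a variable size, which matters for backends that require constant size records.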

Transactions

A database transaction makes it possible for an execution thread to keep its updates of one or several databases ACID (Atomic, Consistent, Isolated and Durable) with regard to other execution threads in the program. For more information on transaction theory, see, for instance, the Wikipedia article on ACID. A HeliDB transaction may be either read only or read/write.

Transactions are coordinated by the Transaction class. It contains static methods for starting a new transaction and for getting the current transaction, and methods for committing and rolling back a transaction.

Databases that support transactions implement the TransactionalDatabase interface. It extends Database with a method for manually joining a transaction. If the Database has not already joined the current transaction, it does so automatically when any reading or updating operation is called on it. If a reading operation is called, the Database joins the current Transaction in read only mode. If an updating operation is called, the Database joins the transaction in read/write mode.

A Database may join a read/write transaction in either read/write mode or read only mode. If it joins in read/write mode, the entire database is locked for exclusive access by the thread that owns the transaction. If it joins in read only mode, the database is locked with a read lock that may be shared with other read only transactions.
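The difference between the two lock modes can be illustrated with java.util.concurrent.locks.ReentrantReadWriteLock. This is a sketch of shared versus exclusive locking in general, under the assumption that the database locks behave like a conventional read/write lock; it is not HeliDB code, and the class LockModeSketch is made up.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustration of shared read locks and exclusive write locks.
// Not HeliDB code.
class LockModeSketch
{
  // Tries to take the lock from a new thread without waiting, and releases
  // it again if it was acquired. Returns true if the lock could be taken.
  static boolean otherThreadCanLock(final Lock lock)
  {
    final boolean[] acquired = new boolean[1];
    Thread t = new Thread(() -> {
      acquired[0] = lock.tryLock();
      if (acquired[0])
      {
        lock.unlock();
      }
    });
    t.start();
    try
    {
      t.join();
    }
    catch (InterruptedException e)
    {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return acquired[0];
  }

  // Returns, in order: reader admitted while a reader holds the lock,
  // writer admitted while a reader holds the lock, reader admitted while a
  // writer holds the lock, writer admitted while a writer holds the lock.
  static boolean[] demo()
  {
    ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    boolean[] result = new boolean[4];

    // The calling thread holds the shared read lock.
    rw.readLock().lock();
    result[0] = otherThreadCanLock(rw.readLock());
    result[1] = otherThreadCanLock(rw.writeLock());
    rw.readLock().unlock();

    // The calling thread holds the exclusive write lock.
    rw.writeLock().lock();
    result[2] = otherThreadCanLock(rw.readLock());
    result[3] = otherThreadCanLock(rw.writeLock());
    rw.writeLock().unlock();

    return result;
  }
}
```

Only the first combination, a reader alongside another reader, is admitted. This is why joining in read only mode whenever possible allows more concurrency.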

Example 3.2. Running a transaction over two databases

// Run a read/write transaction over two 
// TransactionalDatabase<String, String> objects, db1 and db2.
boolean successful = false;
Transaction txn = Transaction.startTransaction(false);
try
{
  // db1 automatically joins txn in read/write mode since we're doing an
  // updating operation on it.
  db1.insertOrUpdate("lastUpdate", "" + System.currentTimeMillis());
  
  // We have to manually join db2 in the transaction in read/write mode. 
  // Otherwise the get operation would have joined db2 in read only mode.
  db2.joinTransaction(false);
  
  String s = db2.get("years");
  db2.update("years", s + ", 2008"); 
  
  txn.commit();
  successful = true;
}
finally
{
  if (!successful)
  {
    // The transaction failed somehow. Roll back the changes.
    txn.rollback();
  }
}

Thread safety

Transactional Database implementations are thread safe; SimpleDatabase is not. If a SimpleDatabase is to be used concurrently from several threads, the client must synchronize access to it.
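One way for a client to do this is to funnel every access through a single lock. The sketch below uses a java.util.HashMap as a stand-in for the SimpleDatabase, since HashMap is not thread safe either. The class SynchronizedAccessSketch and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Client side synchronization around a store that is not thread safe.
// The HashMap stands in for a SimpleDatabase. Not HeliDB code.
class SynchronizedAccessSketch
{
  private final Map<String, String> db = new HashMap<String, String>();
  private final Object lock = new Object();

  void insert(String key, String value)
  {
    synchronized (lock)
    {
      db.put(key, value);
    }
  }

  int size()
  {
    synchronized (lock)
    {
      return db.size();
    }
  }

  // Lets two threads insert perThread distinct records each and returns the
  // resulting number of records.
  static int runTwoWriters(final int perThread)
  {
    final SynchronizedAccessSketch sketch = new SynchronizedAccessSketch();
    Thread[] writers = new Thread[2];
    for (int i = 0; i < writers.length; i++)
    {
      final String prefix = "writer " + i + ", key ";
      writers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++)
        {
          sketch.insert(prefix + j, "value " + j);
        }
      });
      writers[i].start();
    }
    try
    {
      for (Thread writer : writers)
      {
        writer.join();
      }
    }
    catch (InterruptedException e)
    {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return sketch.size();
  }
}
```

Every operation, reads included, has to take the same lock. Guarding only the updates is not enough.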

Builder objects

Since Database and DatabaseBackend implementations often have many configurable properties, they can be created using builder objects such as LoggingTransactionalDatabaseBuilder and BPlusTreeIndexBackendBuilder. Using builders makes client code more readable. See the examples in the next chapters.

EntityFS

HeliDB database backends persist data to EntityFS ReadWritableFile objects. This means that data can be stored in any EntityFS file system (the RAM file system, for instance), or in plain, old Java File objects.

Database and backend classes log debug and trace output to an EntityFS LogAdapter.

^[1]Efficient here means few requests to the database backend.
