HeliDB programmer's guide

Karl Gustafsson

Revision History
Revision 1.1	2009-07-24

List of Figures

3.1. Database components

List of Tables

3.1. Database implementations
3.2. Database backend implementations

List of Examples

3.1. Creating a database and inserting two records
3.2. Running a transaction over two databases
4.1. Creating a logging transactional database on a heap backend
4.2. Creating a shadow copy transactional database on a heap backend
5.1. Creating a database using a constant record size heap backend
5.2. Creating a database using a constant record size B+ Tree backend
5.3. Creating a B+ Tree index for a heap backend
5.4. Creating a LRU cache and a B+ Tree index for a heap backend
6.1. Using a cursor to find records to process

Chapter 1. Introduction

Table of Contents

Please help!
License

HeliDB is a simple, lightweight database for storing data records consisting of a key and a value. A data record is identified by its key, which is required to be unique in the database. The Database interface defines the API that database clients use. It extends Map with methods for more efficient data management. There are three Database implementations, each with its own unique properties such as transaction handling.

This book is the Programmer's guide. It is written for programmers who want to use HeliDB in their applications. It gives an overview of HeliDB concepts and features, with examples and pointers to the API documentation which serves as the reference documentation.

Please help!

Feedback and contributions from users is essential for making HeliDB a better tool. Please share your thoughts and opinions on HeliDB with the rest of the community through the mailing lists.

License

HeliDB is licensed under the Gnu Lesser General Public License, version 3. If you want other licensing terms, contact Holocene Software – www.holocene.se.

Chapter 2. Getting started

Table of Contents

Requirements
Downloading HeliDB
Installing HeliDB
Running unit tests (optional)
Running performance tests (optional)

Requirements

HeliDB requires Java 5 or a newer Java version to run.

Downloading HeliDB

Get HeliDB from http://www.helidb.org. There are two different HeliDB distributions. The binary archive contains everything necessary for using HeliDB. The source archive contains everything that the binary archive contains as well as the complete source code and an Eclipse workspace for developing HeliDB.

Installing HeliDB

Unzip the distribution into a directory.

HeliDB Jar files are in the lib sub directory of the distribution. The helidb-full-1.1.jar file contains all HeliDB classes, except for the unit tests. The other Jar files contain subsets of the HeliDB classes. Together all subset Jars contain all of the classes in the "full" Jar. See the README.txt file in the lib directory for a listing of what the different Jar files contain.

Running unit tests (optional)

The HeliDB source distribution comes with a Schmant script for running all unit tests. To run it, open a command window (terminal, cmd) and change directory to the distribution's build directory. Set the JAVA_HOME environment variable to point to a Java 6 installation.

On Unix, run:

$ cd [HeliDB install path]/build
$ # For instance
$ export JAVA_HOME=/opt/java6
$ schmant.sh run_unit_tests.js

On Windows, run:

> cd [HeliDB install path]\build
> rem For instance. Note the absence of quotes in JAVA_HOME
> set JAVA_HOME=c:\Program Files\Java\jdk1.6.0_07
> schmant run_unit_tests.js

To run the tests using another JVM than the one used for running Schmant, set the Schmant variable javaCmd to point to the java command to use. For example, on Unix:

$ cd [HeliDB install path]/build
$ # For instance
$ export JAVA_HOME=/opt/java6
$ schmant.sh -p javaCmd=/opt/java5/bin/java run_unit_tests.js

Or on Windows:

> cd [HeliDB install path]\build
> rem For instance. Note the absence of quotes in JAVA_HOME
> set JAVA_HOME=c:\Program Files\Java\jdk1.6.0_07
> rem Note how the javaCmd argument is quoted
> schmant -p "javaCmd=c:\Program Files\Java\jdk1.5.0_16\bin\java.exe" run_unit_tests.js

See the header of the run_unit_tests.js script for more ways to configure how the units are test run.

On Windows, anti-virus software may interfere with the test. Run the tests with the anti-virus software disabled.

Running performance tests (optional)

Run the performance test suite by opening a command window and change directory to the distribution's root directory.

On Unix, run:

java -server -Xincgc -Xmx1024m \
  -cp lib/helidb-full-1.1.jar:lib/entityfs-core-[VERSION].jar:lib/helidb-junit4-1.1.jar:build/lib/helidb_perftest.jar \
  org.helidb.resources.perf.PerformanceTestRunner

Or, on Windows:

java -server -Xincgc -Xmx1024m \
  -cp lib\helidb-full-1.1.jar;lib\entityfs-core-[VERSION].jar;lib\helidb-junit4-1.1.jar;build\lib\helidb_perftest.jar \
  org.helidb.resources.perf.PerformanceTestRunner

To create graphs from the test results, add the argument -g last. The graphs are put in the platform's temporary directory (java.io.tmpdir).

All performance tests take approximately 50 hours to run…

Chapter 3. Design overview

Table of Contents

Database
DatabaseBackend
Serializers
Transactions
Builder objects
EntityFS

The HeliDB Database interface is the API used by database clients. There are three different Database implementations, each with different properties.

To store and retrieve data in the persistent data storage (a database file, for instance), the Database uses an implementation of the DatabaseBackend interface. Different backend implementations have different characteristics and properties. One may have fast inserts but slow searches, another slower inserts but faster searches. By combining a Database implementation with a DatabaseBackend implementation, the client can tailor the database characteristics to fit its needs. The performance test results is a useful guide to which implementations to choose.

Figure 3.1. Database components

Database

The Database interface extends the Map interface. Some of the extra methods in Database add extra functionality such as searching for closest matching keys. The other extra methods have similar functions as other methods that already exist in the Map interface, but they are designed to be possible to implement more efficiently^[1] than their Map counterparts. This means that any Database can be used as a Map, but if the client is able to choose, it should preferably use the methods defined in the Database interface over those from the Map interface.

Keys and values must be immutable objects!

The client should use immutable objects for keys and values in the Database, especially if the database somehow caches objects. If the key or value objects are not immutable, the client should treat them as if they were immutable anyway. The behavior of the database if a cached object is modified is unspecified, and most likely also unpleasant.

Table 3.1. Database implementations

Implementation	Description
SimpleDatabase	The simplest Database implementation. Does not support transactions.
LoggingTransactionalDatabase	Transactional database that keeps a rollback log for all operations in a transaction.
ShadowCopyTransactionalDatabase	Transactional database that writes updates to a shadow copy of the entire database and replaces the database with the shadow copy when the transaction is committed.

Example 3.1. Creating a database and inserting two records

// Create a new SimpleDatabase that uses a HeapBackend.
// The database will store String keys and values.
//  - f is a File where the database data will be stored.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

Database<String, String> db = 
  new SimpleDatabase<String, String, Long>(
    // Use a HeapBackendBuilder object to build the HeapBackend.
    // Objects in HeliDB with many configurable parameters often have builder 
    // objects to make it easier to construct them.
    new HeapBackendBuilder<String, String>().
      setKeySerializer(StringSerializer.INSTANCE).
      setValueSerializer(StringSerializer.INSTANCE).
      // Adapt the File to a ReadWritableFile.
      create(new ReadWritableFileAdapter(f)),
    lah);

// Insert a couple of values
db.insert("key 1", "value 1");
db.insert("key 2", "value 2");

DatabaseBackend

The DatabaseBackend is responsible for storing and retrieving data from the persistent storage, which commonly is one or several database files. To write data and interpret written data, it uses one Serializer instance for keys and one for values. What a Serializer is is explained below.

The database uses the position of a record to locate it in the database backend. Exactly what the position is depends on the backend implementation. The client only has to bother with how the positions are represented when creating the database object.

Table 3.2. Database backend implementations

Implementation	Description
HeapBackend	A basic backend implementation that puts all data on a heap.
ConstantRecordSizeHeapBackend	A variant of the heap backend that is slightly more efficient since its data records are of a constant size.
ConstantRecordSizeBPlusTreeBackend	Uses a B+ Tree to store data records in. The records must be of a constant size.
MapBackend	Puts all data in a Map. Perhaps most useful for testing.
BPlusTreeIndexBackend	Keeps a key index for another backend in a B+ Tree. Used to speed up searches in the database.
LruCacheBackend	Caches some of the database records stored in another backend in an in-memory Least Recently Used (LRU) cache.

Serializers

Serializer objects are used to serialize database keys and values to the bytes that are written to the persistent storage, and then to interpret those bytes into keys and values again. There are several Serializer implementations for primitive datatypes. See the API documentation for details.

It is also fairly easy to implement custom Serializer:s. See Appendix A, Custom Serializer implementation for an example.

Transactions

A database transaction makes it possible for an execution thread to keep its updates of one or several databases ACID, Atomic, Consistent, Isolated and Durable, with regards to other execution threads in the program. For more information on transaction theory, see for instance this Wikipedia article on ACID:ity. A HeliDB transaction may be either read only or read/write.

Transactions are coordinated by the Transaction class. It contains static methods for starting a new transaction and for getting the current transaction, and methods for committing and rolling back a transaction.

Database:s that support transactions implement the TransactionalDatabase interface. It extends Database with a method for manually joining a transaction. If the Database has not already joined the current transaction, it automatically does so when any reading or updating operation is called on it. If a reading operation is called, the Database joins the current Transaction in read/only mode. If an updating operation is called, the Database joins the transaction in read/write mode.

A Database may join a read/write transaction in either read/write mode or read only mode. If it joins it in read/write mode, the entire database is locked for exclusive access by the thread having the transaction. If it joins it in read only mode, the database is locked with a read lock that may be shared with other read only transactions.

Example 3.2. Running a transaction over two databases

// Run a read/write transaction over two 
// TransactionalDatabase<String, String>:s, db1 and db2.
boolean successful = false;
Transaction txn = Transaction.startTransaction(false);
try
{
  // db1 automatically joins txn in read/write mode since we're doing an
  // updating operation on it.
  db1.insertOrUpdate("lastUpdate", "" + System.currentTimeMillis());
  
  // We have to manually join db2 in the transaction in read/write mode. 
  // Otherwise the get operation would have joined db2 in read only mode.
  db2.joinTransaction(false);
  
  String s = db2.get("years");
  db2.update("years", s + ", 2008"); 
  
  txn.commit();
  successful = true;
}
finally
{
  if (!successful)
  {
    // The transaction failed somehow. Roll back the changes.
    txn.rollback();
  }
}

Builder objects

Since Database and DatabaseBackend implementations often are configurable with a lot of properties, they can be created using builder objects such as LoggingTransactionalDatabaseBuilder and BPlusTreeIndexBackendBuilder. Using builders gives client programs more readable code. See the examples in the next chapters.

EntityFS

HeliDB database backends use EntityFS ReadWritableFile:s to persist data to. This means that data can be stored in any EntityFS file system (the RAM file system, for instance), or in plain, old Java File:s.

Database and backend classes log debug and trace output to an EntityFS LogAdapter.

^[1]Efficient here means few requests to the database backend.

Chapter 4. Database implementations

Table of Contents

SimpleDatabase
LoggingTransactionalDatabase
ShadowCopyTransactionalDatabase

Simple database

The SimpleDatabase is the simplest Database implementation. It does not have any special capabilities.

See Example 3.1, “Creating a database and inserting two records” for an example of how a SimpleDatabase can be created.

Transactional database with rollback log

The LoggingTransactionalDatabase is a transactional Database that keeps a rollback log during read/write transactions. If the transaction is rolled back, the rollback log is replayed to restore the database contents to what it was before the transaction. If the transaction is not completed before the JVM terminates, the rollback log is replayed when a new database object is created with the same rollback file. This ensures that the database can be restored to a consistent state in the event of a program crash.

Example 4.1. Creating a logging transactional database on a heap backend

// Create a LoggingTransactionalDatabase using a 
// LoggingTransactionalDatabaseBuilder object.
// Keep the database data in the File f and the rollback log in the File lf.
//
// If the rollback log file is not empty, the rollback log there is replayed to
// restore the database contents.
TransactionalDatabase<String, String> db =
  new LoggingTransactionalDatabaseBuilder<String, String>().
    // The logging transactional database needs serializers for writing data to
    // the rollback log.
    setKeySerializer(StringSerializer.INSTANCE).
    setValueSerializer(StringSerializer.INSTANCE).
    create(
      new HeapBackendBuilder<String, String>().
        setKeySerializer(StringSerializer.INSTANCE).
        setValueSerializer(StringSerializer.INSTANCE).
        // Adapt the File database file to a ReadWritableFile.
        create(new ReadWritableFileAdapter(f)),
      // Adapt the File log file to a ReadWritableFile.
      new ReadWritableFileAdapter(lf));

Transactional database using shadow copies

The ShadowCopyTransactionalDatabase uses a shadow copy of the database, i.e. a copy of the entire database, that it records modifications to during read/write transactions. If the transaction is committed, the original database is replaced with the shadow copy. If it is rolled back, the shadow copy is discarded. For large databases, this is really inefficient, but for small databases the performance is comparable to logging transactions.

Since the ShadowCopyTransactionalDatabase needs to be able to create a new DatabaseBackend object for each shadow copy, it is given a DatabaseBackendFactory for the backend when it is created instead of just one DatabaseBackend object.

Example 4.2. Creating a shadow copy transactional database on a heap backend

// Create a ShadowCopyTransactionalDatabase on the database EFile f
// and that keeps its shadow copy files in the Directory tmpDir.
//
// f or tmpDir must not be in a locking FileSystem. 
//
// The ShadowCopyTxnDatabaseFileManager is an interface that lets the
// database manipulate the backend's files without knowing exactly how many they
// are or how they are structured.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

ShadowCopyTxnDatabaseFileManager fm =
  // There is an implementation tailored for the HeapBackend.
  new HeapBackendFileManager(f, tmpDir);

// The shadow copy transactional database uses a backend factory instead of a
// single DatabaseBackend object since it must be able to create new
// backend objects for its shadow copy files.
DatabaseBackendFactory<String, String, Long> dbf=
  new HeapBackendFactory<String, String>(
    // Key and value Serializer:s.
    StringSerializer.INSTANCE,
    StringSerializer.INSTANCE,
    lah);
    
TransactionalDatabase<String, String> db =
  new ShadowCopyTransactionalDatabase<String, String, Long>(fm, dbf, false, lah);

Chapter 5. Database backend implementations

Table of Contents

HeapBackend
ConstantRecordSizeHeapBackend
ConstantRecordSizeBPlusTreeBackend
MapBackend
BPlusTreeIndexBackend
LruCacheBackend

Heap backend

The HeapBackend stores data in a single database file. New records are added by appending them to the end of the file, and records are deleted by marking them as deleted in the file. After database records have been deleted, the file contains holes where the deleted records were stored. The holes are purged permanently if the backend is defragmented by calling the Database's compact() method.

When the client wants to search for a record in the heap backend, the backend iterates through the entire database file from the beginning until the requested record is found. The makes searches slow for large databases. Searches in large databases can be speeded up considerably by adding an index. See below.

See the preceding chapter for examples of how a HeapBackend can be created.

Constant record size heap backend

The ConstantRecordSizeHeapBackend is a variant of the HeapBackend for data records that have a constant size. Record insertion and searches works just like for the heap database – records are appended to the end of the file and searches involve scanning through the database file until the requested record is found. The difference between the two backends is in record deletion. This backend deletes a record by copying the last record in the database file to the place of the record to delete and then truncating the database file with one record. Hence, the database file never becomes defragmented.

Example 5.1. Creating a database using a constant record size heap backend

// Create a SimpleDatabase that uses a ConstantRecordSizeHeapBackend on the
// database File f. The database contains Integer keys and Long values.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

Database<Integer, Long> db = 
  new SimpleDatabase<Integer, Long, Long>(
    // Use a ConstantRecordSizeHeapBackendBuilder object to build the backend.
    new ConstantRecordSizeHeapBackendBuilder<Integer, Long>().
      setKeySerializer(IntegerSerializer.INSTANCE).
      setValueSerializer(LongSerializer.INSTANCE).
      setLogAdapterHolder(lah).
      // Adapt the File to a ReadWritableFile.
      create(new ReadWritableFileAdapter(f)),
    lah);
  
// Insert a few values
db.insert(1, 1L);
db.insert(2, 1L);
db.insert(3, 2L);
db.insert(4, 3L);
db.insert(5, 5L);
db.insert(6, 8L);
db.insert(7, 13L);

Constant record size B+ tree backend

The ConstantRecordSizeBPlusTreeBackend stores its records, which must be of a constant size, in a B+ Tree (using the BPlusTree class). B+ Trees provide fast inserts, updates and searches for a small overhead in metadata. Read more about B+ Tree properties in the Wikipedia article on B+ Trees.

Example 5.2. Creating a database using a constant record size B+ Tree backend

// Create a SimpleDatabase that uses a ConstantRecordSizeBPlusTreeBackend on the
// database File f. The database contains Integer keys and String values. The
// strings may be up to nine characters long. Longer strings are truncated to
// nine characters.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

// A SimpleDatabase that have Integer keys and String values.
Database<Integer, String> db = 
  new SimpleDatabase<Integer, String, KeyAndValue<Integer, String>>(
    new ConstantRecordSizeBPlusTreeBackend<Integer, String>(
      // The NodeRepository is used by the BPlusTree
      // Use a caching node repository.
      new LruCacheNodeRepository<Integer, String>(
        // Use a FileBackedNodeRepositoryBuilder to build the file-backed
        // node repository.
        // The Integer keys are Comparable, so we don't have to use a Comparator.
        new FileBackedNodeRepositoryBuilder<Integer, String>().
          // Use a node size of 4096 bytes for the B+ Tree. For optimal 
          // performance, this should be the same as the allocation size of the
          // underlying file system.
          setNodeSizeStrategy(new FixedSizeNodeSizeStrategy(4096)).
          // Have to use a null-aware serializer for the B+ Tree keys.
          setKeySerializer(IntegerNullSerializer.INSTANCE).
          // This value serializer will truncate text longer than 9 characters.
          setValueSerializer(
            new ConstantSizeStringSerializer(
              Charset.forName("ISO-8859-1"), 9, true)).
          // This pointer size sets the maximum size of the B+ Tree to
          // 2^(2*8) = 65536 bytes.
          setInternalPointerSize(2).
          setLogAdapterHolder(lah).
          // Adapt the File to a ReadWritableFile.
          create(new ReadWritableFileAdapter(f), false),
        // Set the maximum LRU cache size to 16 nodes.
        16),
      false, lah),
    lah);
  
// Insert a few values
db.insert(1, "apple");
db.insert(2, "banana");
// This value will be truncated to grapefrui when written to disk. As long as 
// its node remains in the LRU cache, it will be grapefruit, though. Be careful.
db.insert(3, "grapefruit");
db.insert(4, "orange");
db.insert(5, "tangerine");
db.insert(6, "pear");

Map backend

The MapBackend stores its data in a Map. It is perhaps most useful for testing.

B+ Tree index backend

The BPlusTreeIndexBackend creates a B+ Tree index for the positions of keys in another, proxied DatabaseBackend object. It can be used to speed up searches in backends that are slow to search through, such as the heap backend. The index is kept in a separate file.

The BPlusTree object requires that B+ Tree values should be of a constant size. To be able to accommodate proxied backend with both constant and variable size keys, the BPlusTreeIndexBackend uses a Hasher to hash the keys into a constant size value. If the proxied backend uses constant-size keys, a value-preserving hasher such as the LongToLongHasher or the ConstantSizeValueHasher can be used.

Example 5.3. Creating a B+ Tree index for a heap backend

// Create a new SimpleDatabase that uses a HeapBackend
// with a BPlusTreeIndexBackend to speed up searches in the database.
// The database will store String keys and values.
//  - f is a File where the database data will be stored.
//  - indf is a File where the B+ Tree index data will be stored.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

Database<String, String> db = 
  new SimpleDatabase<String, String, Long>(
    // Use a BPlusTreeIndexBackendBuilder object to build the 
    // BPlusTreeIndexBackend.
    new BPlusTreeIndexBackendBuilder<String, Long>().
      // Hash the keys to a long value (eight bytes).
      // Beware of the birthday paradox! Use sufficiently large hash values.
      setKeyHasher(StringToLongHasher.INSTANCE).
      create(
        // Use a HeapBackendBuilder.
        new HeapBackendBuilder<String, String>().
          setKeySerializer(StringSerializer.INSTANCE).
          setValueSerializer(StringSerializer.INSTANCE).
          // Adapt f to a ReadWritableFile.
          create(new ReadWritableFileAdapter(f)),
        // The BPlusTree is <Long, Long> since it has Long keys (the key
        // hashes) and Long values (the positions in the HeapBackend).
        new BPlusTree<Long,Long>(
          // Use a LRU caching NodeRepository.
          new LruCacheNodeRepository<Long, Long>(
            // Use a FileBackedNodeRepositoryBuilder to build the
            // node repository.
            new FileBackedNodeRepositoryBuilder<Long, Long>().
              // Use a fixed node size of 4096 bytes. This is a common file
              // system allocation size.
              setNodeSizeStrategy(new FixedSizeNodeSizeStrategy(4096)).
              // Have to use a null-aware serializer for the keys.
              // This particular serializer uses Long.MIN_VALUE to represent
              // null, which means that that key value cannot be used at all in
              // the database.
              setKeySerializer(LongNullSerializer.INSTANCE).
              setValueSerializer(LongSerializer.INSTANCE).
              // An internal pointer size of 3 bytes sets the maximum size of
              // the B+ Tree to 2^(3*8) = 16777216 bytes.
              setInternalPointerSize(3).
              setLogAdapterHolder(lah).
              create(new ReadWritableFileAdapter(indf), false),
            // Use a cache size of 16 tree nodes.
            16),
          // We don't have to use a key Comparator (null) since the tree's keys
          // are Comparable themselves (Long:s). 
          null, lah)),
    lah);

// Insert a record.
db.insert("apples", "oranges");

LRU cache backend

The LruCacheBackend stores results from searches and updates to a proxied DatabaseBackend object in a LRU (Least Recently Used) cache with a configurable maximum size. If the cache is full when the backend tries to add a new object to it, the least recently object is purged from the cache to make room for the new object.

The backend also keeps a cache for negative search results; searches that did not find the key that the client was looking for.

Example 5.4. Creating a LRU cache and a B+ Tree index for a heap backend

// Create a new SimpleDatabase that uses a HeapBackend
// with a BPlusTreeIndexBackend and a LruCacheBackend
// to speed up searches in the database.
// The database will store String keys and values.
//  - f is a File where the database data will be stored.
//  - indf is a File where the B+ Tree index data will be stored.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

Database<String, String> db = 
  new SimpleDatabase<String, String, Long>(
    new LruCacheBackend<String, String, Long>(
      // Use a BPlusTreeIndexBackendBuilder object to build the 
      // BPlusTreeIndexBackend.
      new BPlusTreeIndexBackendBuilder<String, BigInteger>().
        // Hash the strings to a twelve-byte value, with the most significant 
        // bit in the most significant byte removed. The most significant bit is
        // used to represent a null value by the FixedSizeBigIntegerNullSerializer
        // used below instead.
        setKeyHasher(new StringToBigIntegerHasher(12, 7)).
        create(
          // Use a HeapBackendBuilder.
          new HeapBackendBuilder<String, String>().
            setKeySerializer(StringSerializer.INSTANCE).
            setValueSerializer(StringSerializer.INSTANCE).
            // Adapt f to a ReadWritableFile.
            create(new ReadWritableFileAdapter(f)),
          // The BPlusTree is <Long, BigInteger> since it has BigInteger keys
          // (the key hashes) and Long values (the records' positions in the
          // HeapBackend).
          new BPlusTree<BigInteger, Long>(
            // Use a LRU caching NodeRepository.
            new LruCacheNodeRepository<BigInteger, Long>(
              // Use a FileBackedNodeRepositoryBuilder to build the
              // node repository.
              new FileBackedNodeRepositoryBuilder<BigInteger, Long>().
                // Use a fixed node size of 4096 bytes.
                setNodeSizeStrategy(new FixedSizeNodeSizeStrategy(4096)).
                // Have to use a null-aware serializer for the keys.
                //
                // This serializer uses a null value that has the most 
                // significant bit in the most significant byte set. We chose to
                // not use that particular bit for our hashes when we created
                // the StringToBigIntegerHasher above. By doing that we
                // have made sure that null won't collide with any real value.
                //
                // Set the total data size of the serializer to 13 bytes. The
                // first byte of the serialized data is the length of the
                // BigInteger data.
                setKeySerializer(new FixedSizeBigIntegerNullSerializer(13)).
                setValueSerializer(LongSerializer.INSTANCE).
                // An internal pointer size of 3 bytes sets the maximum size of
                // the B+ Tree to 2^(3*8) = 16777216 bytes.
                setInternalPointerSize(3).
                setLogAdapterHolder(lah).
                create(new ReadWritableFileAdapter(indf), false),
              // Use a cache size of 16 tree nodes.
              16),
            // We don't have to use a key Comparator (null) since the tree's keys
            // are Comparable themselves (BigInteger:s). 
            null, lah)),
      // The cache size and the negative cache size
      false, 256, 32),
    lah);

// Insert a record.
db.insert("Fiat", "Ritmo");

Chapter 6. Searching in a database

Table of Contents

Search modes
Cursors

The Database interface has several methods for searching:

get(key): Inherited from the Map interface. This method returns the value stored for a specific key in the database.
find(key): Returns a Cursor positioned at the database record with the specified key.
find(key, SearchMode): Returns a Cursor positioned at the database record matching the key and the SearchMode. The natural ordering of the database keys is used to find closest matches. This requires that the keys in the database are Comparable.
find(key, SearchMode, Comparator): Returns a Cursor positioned at the database record matching the key and the SearchMode. The supplied Comparator is used to find closest matches.
firstRecord(): Returns a Cursor positioned at the "first" database record. What "first" means depends on the search order of the database's DatabaseBackend.
lastRecord(): Returns a Cursor positioned at the "last" database record. What "last" means depends on the search order of the database's DatabaseBackend. This method is only supported by DatabaseBackend:s that support backwards iteration.

Search modes

There are four standard search modes that all databases and database backends support:

SearchMode.EXACT_MATCH: The find method will only return a Cursor if the key supplied to it exists in the database.
SearchMode.CLOSEST_ABOVE: Return a Cursor pointing to the database record with the closest larger key value than the supplied search key if no exact key match is found.
SearchMode.CLOSEST_BELOW: Return a Cursor pointing to the database record with the closest smaller key value than the supplied search key if no exact key match is found.
SearchMode.CLOSEST_MATCH: Return a Cursor pointing to the database record with the smallest absolute difference in key value compared with the supplied search key if no exact key match is found. If there are two records with the same key distance from the search key, any of those may be returned.

A database backend may also define its own, custom search modes.

By default, when used with any of the CLOSEST_X search modes, the find method iterates through the entire backend to find the closest match. However, if the backend has sorted keys (such as the ConstantRecordSizeBPlusTreeBackend) or has an index that sorts the keys (such as the BPlusTreeIndexBackend), find may use the sort order to find matches faster if it is called with the same Comparator that is used by the database backend (or with no Comparator if the backend uses the keys' natural ordering).

Cursors

A Cursor is a pointer to a record in the database. It can be used to navigate through the database records^[2]. In the example below, a Cursor is used to find all records with key values between 40 and 50 in a database. For this to work, the database backend must be one with sorted keys^[3], for instance a heap backend with a B+ Tree index.

Example 6.1. Using a cursor to find records to process

// db is a Database<Integer, String>
//
// Find the record with the key that has the closest larger value than 40
Cursor<Integer,String> c = db.find(40, SearchMode.CLOSEST_ABOVE);

// Maybe there is no such key in the database
if (c == null)
{
  System.out.println("Nothing found");
  return;
}

while(c.getKey() < 50)
{
  // Print the record
  System.out.println(c.getKey() + ": " + c.getValue());

  // Move to the next record
  c.next();
}

^[2]Unlike most other databases where cursors are used to navigate through result sets from queries, HeliDB Cursor:s are used for navigating through the entire database.

^[3]If the backend was not sorted, the Cursor.next() call could return any record from the database.

Appendix A. Custom Serializer implementation

This appendix contains an example of how a custom Serializer can be implemented and then used in a Database. The custom serializer, the MoneySerializer is used for serializing Money objects.

The Money object is an immutable Java bean containing an amount and a Currency.

package org.helidb.javabank;

public class Money
{
  private final long m_amount;
  private final Currency m_currency;
  
  public Money(long amount, Currency currency)
  {
    m_amount = amount;
    m_currency = currency;
  }
  
  public long getAmount()
  {
    return m_amount;
  }
  
  public Currency getCurrency()
  {
    return m_currency;
  }
}

Currency is an Enum.

package org.helidb.javabank;

// Persisting enums can be frustrating. There is a discussion of the topic in
// this Java Specialist's newsletter:
// http://www.javaspecialists.co.za/archive/Issue113.html
//
// This enum implementation uses a fairly naïve approach.  
public enum Currency
{
  KRONA(0), EURO(1), DOLLAR(2), POUND(3);
  
  // An unique index for each currency.
  private final int m_index;
  
  private Currency(int index)
  {
    m_index = index;
  }
  
  // Get this currency's unique index
  public int getIndex()
  {
    return m_index;
  }
  
  // Get the currency corresponding to the index value
  public static Currency valueOf(int index)
  {
    switch(index)
    {
      case 0: return KRONA;
      case 1: return EURO;
      case 2: return DOLLAR;
      case 3: return POUND;
      default: throw new IllegalArgumentException("Unknown index: " + index);
    }
  }
}

The MoneySerializer implementation uses serializers for primitive types to serialize the Money object's amount and Currency.

package org.helidb.javabank;

// This object serializes Money objects. It represents the currency with an
// unsigned byte. This limits the number of currencies to 256.
public class MoneySerializer implements Serializer<Money>
{
  // This object contains no mutable internal state, so this singleton instance
  // may be used instead of instantiating it.
  public static final MoneySerializer INSTANCE = new MoneySerializer();
  
  // The size of a long + an unsigned byte
  public static final int DATA_SIZE =
    LongSerializer.DATA_SIZE + UnsignedByteSerializer.DATA_SIZE; 

  public int getSerializedSize()
  {
    return DATA_SIZE;
  }  

  public boolean isNullValuesPermitted()
  {
    return false;
  }

  private Money interpretInternal(byte[] barr, int offset)
  {
    long amount = LongSerializer.INSTANCE.interpret(
      barr, 
      offset, 
      LongSerializer.DATA_SIZE);

    offset += LongSerializer.DATA_SIZE;
    
    Currency c = Currency.valueOf(
      UnsignedByteSerializer.INSTANCE.interpret(
        barr,
        offset,
        UnsignedByteSerializer.DATA_SIZE));

    return new Money(amount, c); 
  }
  
  public Money interpret(byte[] barr)
  {
    return interpretInternal(barr, 0);
  }
  
  public Money interpret(byte[] barr, int offset, int length)
  {
    if (length != DATA_SIZE)
    {
      throw new SerializationException("Invalid length " + length);
    }
    return interpretInternal(barr, offset);
  }

  public int serialize(Money m, byte[] barr, int offset)
  {
    LongSerializer.INSTANCE.serialize(
      m.getAmount(), 
      barr, 
      offset);

    offset += LongSerializer.DATA_SIZE;

    UnsignedByteSerializer.INSTANCE.serialize(
      (short) m.getCurrency().getIndex(),
      barr,
      offset);

    return DATA_SIZE;
  }
  
  public byte[] serialize(Money m)
  {
    byte[] res = new byte[DATA_SIZE];
    serialize(m, res, 0);
    return res;
  }
  
  private void validateSize(int size)
  {
    if (size != DATA_SIZE)
    {
      throw new SerializationException("Invalid data size " + size);
    }
  }
  
  public Money read(RandomAccess ra, int size)
  {
    validateSize(size);
    byte[] barr = new byte[DATA_SIZE];
    int noRead = ra.read(barr);
    if (noRead != DATA_SIZE)
    {
      throw new NotEnoughDataException(DATA_SIZE, noRead);
    }
    return interpret(barr);
  }
  
  public Money read(InputStream is, int size)
  {
    validateSize(size);
    byte[] barr = new byte[DATA_SIZE];
    try
    {
      int noRead = is.read(barr);
      if (noRead != DATA_SIZE)
      {
        throw new NotEnoughDataException(DATA_SIZE, noRead);
      }
    }
    catch (IOException e)
    {
      // Rethrow as an unchecked exception
      throw new WrappedIOException(e);
    }
    return interpret(barr);
  }  
}

This is how the MoneySerializer may be used in a database.

// Create a SimpleDatabase that uses a ConstantRecordSizeBPlusTreeBackend on the
// database File f. The database contains Long keys (account numbers?) and 
// Money values.

// A LogAdapter that logs to stdout and stderr.
LogAdapterHolder lah = 
  new LogAdapterHolder(
    new StdOutLogAdapter());

// A SimpleDatabase that have Long keys and Money values.
Database<Long, Money> db = 
  new SimpleDatabase<Long, Money, KeyAndValue<Long, Money>>(
    new ConstantRecordSizeBPlusTreeBackend<Long, Money>(
      // The NodeRepository is used by the BPlusTree
      // Use a caching node repository.
      new LruCacheNodeRepository<Long, Money>(
        // Use a FileBackedNodeRepositoryBuilder to build the file-backed
        // node repository.
        // The Long keys are Comparable, so we don't have to use a Comparator.
        new FileBackedNodeRepositoryBuilder<Long, Money>().
          // Use a node size of 4096 bytes for the B+ Tree. This is a common
          // file system allocation size.
          setNodeSizeStrategy(new FixedSizeNodeSizeStrategy(4096)).
          // Have to use a null-aware serializer for the B+ Tree keys.
          setKeySerializer(LongNullSerializer.INSTANCE).
          // Here is our MoneySerializer
          setValueSerializer(new MoneySerializer()).
          // This pointer size sets the maximum size of the B+ Tree to
          // 2^(2*8) = 65536 bytes.
          setInternalPointerSize(2).
          setLogAdapterHolder(lah).
          // Adapt the File to a ReadWritableFile.
          create(new ReadWritableFileAdapter(f), false),
        // Set the maximum LRU cache size to 16 nodes.
        16),
      false, lah),
    lah);
  
// Insert a value
db.insert(9019506L, new Money(1000000, Currency.KRONA));

Bibliography

[performance test results] Performance test results.

[Wikipedia article on ACID:ity] Wikipedia article on ACID:ity.

[Wikipedia article on B+ Trees] Wikipedia article on B+ Trees.