org.helidb.util.bplus
Class FileBackedNodeRepository<K,V>

java.lang.Object
  extended by org.helidb.util.bplus.FileBackedNodeRepository<K,V>
Type Parameters:
K - The type of keys in the tree.
V - The type of values in the tree.
All Implemented Interfaces:
NodeRepository<K>

public class FileBackedNodeRepository<K,V>
extends Object
implements NodeRepository<K>

This NodeRepository stores data in a file. It uses a key and a value Serializer to serialize values and interpret serialized values. Keys and values must serialize to byte arrays of a constant size.

The size allocated for internal pointers in the tree is configurable, between one and eight bytes. Shorter pointers means that less space is allocated for keeping them, but also makes the maximum size of the tree smaller. For pointer size n, the maximum size of the tree is 2^(n*8) bytes, up to a maximum size of Long.MAX_VALUE bytes. (The last bit of the last byte is wasted since long is signed.)

Leaf nodes may be configured to hold pointers to their adjacent leaf nodes. In that case the B+ Tree using this node repository is Iterable.

Since:
1.0
Author:
Karl Gustafsson
See Also:
BPlusTree, FileBackedNodeRepositoryBuilder
In_jar:
helidb-core

Field Summary
static int DEFAULT_BUFFER_SIZE
          The default size of in-memory buffers used by writeContentsTo(OutputStream) is 65535 bytes.
 
Constructor Summary
FileBackedNodeRepository(ReadWritableFile f, boolean readOnly, long startPosOfData, NodeSizeStrategy nss, boolean leafNodeHasPointersToAdjacentLeafNodes, Serializer<K> keySerializer, Serializer<V> valueSerializer, int internalPointerSize, int bufSize, Comparator<? super K> keyComparator, LogAdapterHolder lah)
          Constructor.
 
Method Summary
 long bookPositionForLeafNode()
          Book a position for a future leaf node.
 long bookPositionForNonLeafNode()
          Book a position for a future non-leaf node.
 void clear()
          Remove all contents from the node repository.
 void close()
          Close the node repository and release all its allocated resources.
 boolean compact()
          Defragment the node repository to recover wasted disk space.
 void deleteNode(long pos, boolean leafNode)
          Delete the node at the supplied position.
protected  void finalize()
          Close the file if it is still open before discarding the object.
 int getMaxNumberOfRecordsForLeafNode(long pos)
          Get the maximum number of records for a leaf node in the tree at the supplied position.
 int getMaxNumberOfRecordsForNonLeafNode(long pos)
          Get the maximum number of records for a non-leaf node in the tree at the supplied position.
 long getPositionOfRootNode()
          Get the root node's position in the node repository.
 boolean leafNodeHasPointersToAdjacentLeafNodes()
          Does leaf nodes have pointers to their adjacent leaf nodes? If so, the tree supports iteration.
 BPlusTreeNode<K,?> readNode(long pos, K keyForFirstPointer)
          Read the node at the supplied position.
 void replaceContentsWith(InputStream is, long dataLen)
          Replace the contents of the node repository with data read from the stream.
 void returnBookedLeafNodePosition(long p)
          Return a previously booked position for a leaf node.
 void returnBookedNonLeafNodePosition(long p)
          Return a previously booked position for a non-leaf node.
 void verifyRepositoryIntegrity()
          Verify the node repository's integrity.
 long writeContentsTo(OutputStream os)
          Write the contents of the node repository to the stream.
 K writeNode(BPlusTreeNode<K,?> node)
          Write the node.
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_BUFFER_SIZE

public static final int DEFAULT_BUFFER_SIZE
The default size of in-memory buffers used by writeContentsTo(OutputStream) is 65535 bytes.

See Also:
Constant Field Values
Constructor Detail

FileBackedNodeRepository

public FileBackedNodeRepository(ReadWritableFile f,
                                boolean readOnly,
                                long startPosOfData,
                                NodeSizeStrategy nss,
                                boolean leafNodeHasPointersToAdjacentLeafNodes,
                                Serializer<K> keySerializer,
                                Serializer<V> valueSerializer,
                                int internalPointerSize,
                                int bufSize,
                                Comparator<? super K> keyComparator,
                                LogAdapterHolder lah)
                         throws IllegalArgumentException
Constructor. The FileBackedNodeRepositoryBuilder is much more convenient to use than this constructor.

Parameters:
f - The file where tree data is stored. If the file is empty, this constructor will write the root node to it, even if the repository is read only.
readOnly - Should the repository be read only?
startPosOfData - The position in the file where the data starts. All data before this position is ignored by this node repository.
nss - The strategy that is used to determine the size of the tree's nodes.
leafNodeHasPointersToAdjacentLeafNodes - Should leaf nodes have pointers to their adjacent leaf nodes? If so, the B+ Tree using this node repository will be Iterable.
keySerializer - The Serializer used for serializing the tree's keys. This must permit null values since null is used internally by the repository.
valueSerializer - The Serializer used for serializing the tree's values. Does only have to permit null values if the tree should support them.
internalPointerSize - The size of internal pointers, in bytes.
bufSize - The size of in-memory buffers used by writeContentsTo(OutputStream). 65535 is probably a good value.
keyComparator - Comparator used for comparing key values. If this is set, the Comparator is used to determine the locations of the keys in the tree. If it is not set, the keys themselves must be Comparable.
lah - A log adapter holder.
Throws:
IllegalArgumentException - If the key or value serializers do not produce constant size values, if the internal pointer size does not fall in the range 1 <= internalPointerSize <= 8, if the temporary buffer size is less than 1, or the node size as determined by the NodeSizeStrategy does not accommodate two records per node.
Method Detail

getMaxNumberOfRecordsForLeafNode

public int getMaxNumberOfRecordsForLeafNode(long pos)
Description copied from interface: NodeRepository
Get the maximum number of records for a leaf node in the tree at the supplied position. This number is determined by the NodeRepository's NodeSizeStrategy, and by what extra metadata the NodeRepository stores for each leaf nodes.

Specified by:
getMaxNumberOfRecordsForLeafNode in interface NodeRepository<K>
Parameters:
pos - The position of the leaf node.
Returns:
The maximum number of records for the leaf node at the supplied position

getMaxNumberOfRecordsForNonLeafNode

public int getMaxNumberOfRecordsForNonLeafNode(long pos)
Description copied from interface: NodeRepository
Get the maximum number of records for a non-leaf node in the tree at the supplied position. This number is determined by the NodeRepository's NodeSizeStrategy, and by what extra metadata the NodeRepository stores for each non-leaf nodes.

Specified by:
getMaxNumberOfRecordsForNonLeafNode in interface NodeRepository<K>
Parameters:
pos - The position of the non-leaf node.
Returns:
The maximum number of records for the non-leaf node at the supplied position

getPositionOfRootNode

public long getPositionOfRootNode()
Description copied from interface: NodeRepository
Get the root node's position in the node repository. The root node is the beginning of all searches, etc.

Specified by:
getPositionOfRootNode in interface NodeRepository<K>
Returns:
The position of the root node in the node repository.

leafNodeHasPointersToAdjacentLeafNodes

public boolean leafNodeHasPointersToAdjacentLeafNodes()
Description copied from interface: NodeRepository
Does leaf nodes have pointers to their adjacent leaf nodes? If so, the tree supports iteration.

Specified by:
leafNodeHasPointersToAdjacentLeafNodes in interface NodeRepository<K>
Returns:
true if leaf nodes have pointers to their adjacent leaf nodes.

bookPositionForLeafNode

public long bookPositionForLeafNode()
Description copied from interface: NodeRepository
Book a position for a future leaf node. The position must be used for writing a new node, or returned by calling NodeRepository.returnBookedLeafNodePosition(long).

Specified by:
bookPositionForLeafNode in interface NodeRepository<K>
Returns:
The position of the booked leaf node.
See Also:
NodeRepository.bookPositionForNonLeafNode(), NodeRepository.returnBookedLeafNodePosition(long)

returnBookedLeafNodePosition

public void returnBookedLeafNodePosition(long p)
Description copied from interface: NodeRepository
Return a previously booked position for a leaf node. This is used when recovering from an error that prevented us to write a new leaf node.

Specified by:
returnBookedLeafNodePosition in interface NodeRepository<K>
Parameters:
p - The position to return.
See Also:
NodeRepository.bookPositionForLeafNode(), NodeRepository.returnBookedNonLeafNodePosition(long)

bookPositionForNonLeafNode

public long bookPositionForNonLeafNode()
Description copied from interface: NodeRepository
Book a position for a future non-leaf node. The position must be used for writing a new node, or returned by calling NodeRepository.returnBookedNonLeafNodePosition(long).

Specified by:
bookPositionForNonLeafNode in interface NodeRepository<K>
Returns:
The position of the booked node.
See Also:
NodeRepository.bookPositionForLeafNode(), NodeRepository.returnBookedNonLeafNodePosition(long)

returnBookedNonLeafNodePosition

public void returnBookedNonLeafNodePosition(long p)
Description copied from interface: NodeRepository
Return a previously booked position for a non-leaf node. This is used when recovering from an error that prevented us to write a new non-leaf node.

Specified by:
returnBookedNonLeafNodePosition in interface NodeRepository<K>
Parameters:
p - The position to return.
See Also:
NodeRepository.bookPositionForNonLeafNode(), NodeRepository.returnBookedLeafNodePosition(long)

readNode

public BPlusTreeNode<K,?> readNode(long pos,
                                   K keyForFirstPointer)
Description copied from interface: NodeRepository
Read the node at the supplied position.

Specified by:
readNode in interface NodeRepository<K>
Parameters:
pos - The position to read the node at. The root node has position 0.
keyForFirstPointer - The key for the node's first pointer, if this is a non-leaf node. The key for the root node's first pointer is null.
Returns:
The node, or null if the node position is beyond the end of the file.

writeNode

public K writeNode(BPlusTreeNode<K,?> node)
Write the node.

Specified by:
writeNode in interface NodeRepository<K>
Parameters:
node - The node to write.
Returns:
The first key in the node. null for the root node if it is empty.

deleteNode

public void deleteNode(long pos,
                       boolean leafNode)
Description copied from interface: NodeRepository
Delete the node at the supplied position.

Specified by:
deleteNode in interface NodeRepository<K>
Parameters:
pos - The position of the node to delete.
leafNode - Is the node to delete a leaf node?

clear

public void clear()
Description copied from interface: NodeRepository
Remove all contents from the node repository.

Specified by:
clear in interface NodeRepository<K>

compact

public boolean compact()
Description copied from interface: NodeRepository
Defragment the node repository to recover wasted disk space.

Specified by:
compact in interface NodeRepository<K>
Returns:
true if the node repository was defragmented, or false if there was nothing to defragment right now.

writeContentsTo

public long writeContentsTo(OutputStream os)
Description copied from interface: NodeRepository
Write the contents of the node repository to the stream.

Specified by:
writeContentsTo in interface NodeRepository<K>
Parameters:
os - The stream to write to. The stream is not closed after writing.
Returns:
The number of bytes written.

replaceContentsWith

public void replaceContentsWith(InputStream is,
                                long dataLen)
Description copied from interface: NodeRepository
Replace the contents of the node repository with data read from the stream.

Specified by:
replaceContentsWith in interface NodeRepository<K>
Parameters:
is - The stream to read from. The stream is not closed after reading.
dataLen - The total number of bytes to read from the stream.

close

public void close()
Description copied from interface: NodeRepository
Close the node repository and release all its allocated resources.

Specified by:
close in interface NodeRepository<K>

finalize

protected void finalize()
                 throws Throwable
Close the file if it is still open before discarding the object.

Overrides:
finalize in class Object
Throws:
Throwable

verifyRepositoryIntegrity

public void verifyRepositoryIntegrity()
Description copied from interface: NodeRepository
Verify the node repository's integrity. This should only be used for testing purposes.

Specified by:
verifyRepositoryIntegrity in interface NodeRepository<K>