org.helidb.lang.hasher
Interface Hasher<V,H extends Comparable<H>>

Type Parameters:
V - The type of values hashed.
H - The type of the hash values.
All Known Implementing Classes:
AbstractMessageDigestStringHasher, CharacterToCharacterHasher, ConstantSizeValueHasher, IntegerToIntegerHasher, LongToLongHasher, ShortToShortHasher, StringToBigIntegerHasher, StringToLongHasher

public interface Hasher<V,H extends Comparable<H>>

A Hasher converts values into hash values. The hash type has one required property:

Hash objects must be Comparable.

Some Hasher implementations support an optional bit mask for the most significant byte (the zeroth byte) of the hash. This can be used to mask away the most significant bit of that byte in order to reserve a value range for special values such as null. Most null-capable serializers have a default null value with the most significant byte set. See the documentation for individual Serializer implementations for details.
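As a rough illustration of the masking idea, the sketch below applies a bit mask to byte 0 of a serialized hash. The class and method names are hypothetical and are not part of HeliDB; how a concrete Hasher accepts its mask is described in that implementation's own documentation.

```java
import java.util.Arrays;

class MsbMaskSketch {
    /**
     * Returns a copy of hashBytes with the mask applied to byte 0,
     * the most significant byte. Hypothetical helper, not a HeliDB API.
     */
    static byte[] maskMostSignificantByte(byte[] hashBytes, int mask) {
        byte[] copy = Arrays.copyOf(hashBytes, hashBytes.length);
        copy[0] = (byte) (copy[0] & mask);
        return copy;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFF, 0x12, 0x34, 0x56};
        // 0x7F clears the most significant bit, so no masked hash
        // can start with a byte in the range 0x80..0xFF.
        byte[] masked = maskMostSignificantByte(raw, 0x7F);
        System.out.printf("%02x%n", masked[0] & 0xFF); // prints 7f
    }
}
```

With the mask 0x7F, every value whose first byte has the top bit set stays free for special markers such as a null value.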

Hasher implementations may or may not support hashing null values.

When hashes must be unique, be sure to select a long enough hash value. Even for hash functions that spread their values well, the risk of a collision is surprisingly high when the value set is large (the birthday paradox). See, for instance, the Wikipedia article on the birthday problem.
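To get a feeling for the numbers, the following sketch evaluates the standard birthday approximation p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)) for a few hash lengths. It assumes uniformly distributed hash values. The class name is made up for this example.

```java
class BirthdayBoundSketch {
    /**
     * Approximate probability that at least two of n uniformly distributed
     * hash values collide in a hash space of 2^hashBits values.
     */
    static double collisionProbability(double n, int hashBits) {
        double space = Math.pow(2.0, hashBits);
        // expm1 keeps precision when the probability is very small.
        return -Math.expm1(-n * (n - 1.0) / (2.0 * space));
    }

    public static void main(String[] args) {
        double n = 1_000_000;
        for (int bits : new int[] {32, 64, 128}) {
            System.out.printf("%d-bit hash, %.0f values: p = %.3e%n",
                    bits, n, collisionProbability(n, bits));
        }
    }
}
```

Under this approximation, a 32-bit hash reaches roughly even odds of a collision at about 77,000 values, while a 64-bit hash keeps the probability for one million values well below one in a million.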

Implementations of this interface should not contain any mutable internal state. They may be used concurrently by several threads and should preferably not have to resort to any synchronization to be able to be used in that way.
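One way to meet this requirement is to keep all working state in local variables. The sketch below is a hypothetical, stand-alone hashing function built on the JDK's MessageDigest. It is not taken from HeliDB and does not claim to show how AbstractMessageDigestStringHasher is implemented.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StatelessDigestSketch {
    // No fields: each call works only on its argument and local variables,
    // so concurrent callers need no synchronization.
    static BigInteger hash(String value) {
        try {
            // MessageDigest objects are stateful, so a fresh instance is
            // created per call instead of sharing one between threads.
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            // Signum 1 interprets the digest as a non-negative number.
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Creating a digest instance per call costs some allocation, but it avoids both shared mutable state and locking.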

Since:
1.0
Author:
Karl Gustafsson
In_jar:
helidb-core

Method Summary
 H fromBytes(byte[] barr)
          Interpret the serialized hash value.
 int getHashLength()
          Get the size of the hash in bytes.
 H hash(V o)
          Hash the supplied value.
 boolean isPreservingValues()
          Does the hashing operation leave the hashed value unmodified? This is true if v.equals(hasher.hash(v)) for all values v accepted by the hasher.
 byte[] toBytes(H hash)
          Serialize the hash to a byte array.
 

Method Detail

hash

H hash(V o)
       throws HashException
Hash the supplied value.

Hashing may be a fairly expensive operation, depending on the implementation. Clients should try to reuse the result from this method as much as possible.

Parameters:
o - The value to hash.
Returns:
The hash value.
Throws:
HashException - If an error occurs when hashing.
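A caller can follow the advice on reusing results by hashing a value once and keeping the hash in a local variable. In the sketch below, expensiveHash is a hypothetical stand-in for a costly Hasher.hash call, and the call counter exists only to make the reuse visible.

```java
import java.util.TreeMap;

class ReuseHashSketch {
    // Demo-only counter. A real Hasher should not keep mutable state.
    static int hashCalls = 0;

    // Hypothetical stand-in for an expensive Hasher.hash(V) call.
    static Long expensiveHash(String value) {
        hashCalls++;
        return (long) value.hashCode();
    }

    /** Stores the value under its hash and reads it back, hashing only once. */
    static String putAndGet(TreeMap<Long, String> index, String value) {
        Long h = expensiveHash(value); // computed once ...
        index.put(h, value);           // ... used for the insert ...
        return index.get(h);           // ... and reused for the lookup
    }
}
```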

toBytes

byte[] toBytes(H hash)
               throws NullPointerException
Serialize the hash to a byte array.

For a given Hasher object, the byte arrays returned by this method must always be of the same size.

Parameters:
hash - The hash to serialize.
Returns:
The serialized hash.
Throws:
NullPointerException - If the hash is null and this Hasher implementation does not support null values.

fromBytes

H fromBytes(byte[] barr)
            throws IllegalArgumentException
Interpret the serialized hash value.

Parameters:
barr - The serialized hash value.
Returns:
The hash value.
Throws:
IllegalArgumentException - If the serialized hash value could not be interpreted.

getHashLength

int getHashLength()
Get the size of the hash in bytes. This is the size of the byte array created by toBytes(Comparable) and accepted by fromBytes(byte[]).

Returns:
The size of the hash, in number of bytes.
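The contract between toBytes, fromBytes and getHashLength can be sketched for a Long hash as follows. This is a stand-alone illustration with static methods, not the code of LongToLongHasher. The big-endian layout is an assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

class LongHashBytesSketch {
    // Every serialized hash has exactly this many bytes.
    static int getHashLength() {
        return Long.BYTES;
    }

    static byte[] toBytes(Long hash) {
        // This sketch does not support null hashes.
        Objects.requireNonNull(hash, "hash");
        // ByteBuffer writes big-endian by default (assumed layout).
        return ByteBuffer.allocate(getHashLength()).putLong(hash).array();
    }

    static Long fromBytes(byte[] barr) {
        if (barr == null || barr.length != getHashLength()) {
            throw new IllegalArgumentException(
                    "Expected " + getHashLength() + " bytes");
        }
        return ByteBuffer.wrap(barr).getLong();
    }
}
```

For any non-null hash h, fromBytes(toBytes(h)) returns a value equal to h, and toBytes(h).length always equals getHashLength().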

isPreservingValues

boolean isPreservingValues()
Does the hashing operation leave the hashed value unmodified? This is true if v.equals(hasher.hash(v)) for all values v accepted by the hasher.

Returns:
true if this hasher preserves the values that it hashes.
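The difference between a value-preserving hasher and one that is not can be sketched with a local stand-in interface. SketchHasher mirrors only two of the documented methods and omits HashException. It is not the HeliDB type, and both implementations are made up for this example.

```java
class PreservingValuesSketch {
    /** Local stand-in for two of the documented Hasher methods. */
    interface SketchHasher<V, H extends Comparable<H>> {
        H hash(V o);
        boolean isPreservingValues();
    }

    /** The hash is the value itself, so v.equals(hash(v)) always holds. */
    static final SketchHasher<Long, Long> IDENTITY = new SketchHasher<Long, Long>() {
        public Long hash(Long o) { return o; }
        public boolean isPreservingValues() { return true; }
    };

    /** The hash is a different object of a different type than the value. */
    static final SketchHasher<String, Integer> STRING_HASH = new SketchHasher<String, Integer>() {
        public Integer hash(String o) { return o.hashCode(); }
        public boolean isPreservingValues() { return false; }
    };

    public static void main(String[] args) {
        Long v = 42L;
        System.out.println(v.equals(IDENTITY.hash(v)));        // prints true
        System.out.println("a".equals(STRING_HASH.hash("a"))); // prints false
    }
}
```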