gov.nist.nlpir.irf.index.braf
Class PersistentIrfHashtable

java.lang.Object
  |
  +--gov.nist.nlpir.irf.index.braf.PersistentIrfHashtable

public class PersistentIrfHashtable
extends java.lang.Object
implements java.io.Serializable

Persistent version of IrfHashtable. This hashtable is designed to stay only partly in memory: the hash space is divided into slices (blocks), each of which may or may not be in memory at a given time. Blocks are swapped between disk and a fixed set of in-memory slots using an LFU-like algorithm.

Version:
$Revision: 1.3 $
Author:
This software was produced by NIST, an agency of the U.S. government, and by statute is not subject to copyright in the United States. Recipients of this software assume all responsibilities associated with its operation, modification and maintenance.
See Also:
DualKeyContainer, Hashtable, Serialized Form

Field Summary
private  int accesses
          Statistics on the Hashtable efficiency.
private  int blockInitialCapacity
          Blocks will be created with this parameter.
private  float blockLoadFactor
          Blocks will be created with this parameter.
private  int[] blockMap
          For each block number, gives the slot number where this block is, or -1 if the block isn't in memory.
private  HashBlock[] blocks
          This array is composed of slots.
private  int blockSize
          Size of each block, computed from the number of blocks and the largest hash code that can be represented (Integer.MAX_VALUE).
private  int[] blocksUsed
          For each slot, gives the number of times it has been used since it was filled with a block.
private  int count
          Total number of elements stored in the blocks.
private  java.lang.String filePath
          The blocks will be stored on disk in files in the directory fully specified by filePath
private  java.lang.String filePrefix
          The blocks will be stored on disk in files named filePrefixOO where OO is a number from 0 to (numberOfBlocks - 1).
private  int hits
           
private  int[] inversedBlockMap
          For each slot number, gives the number of the block this slot contains, or -1 if none.
(package private)  int numberOfBlocks
          Total number of blocks composing the hashtable.
private  int numberOfSlots
          Size of the blocks array, i.e. the number of slots.
(package private) static long serialVersionUID
          Serial version UID, declared explicitly so that Java does not generate one that may change between revisions and make it impossible to deserialize objects serialized by earlier versions.
 
Constructor Summary
PersistentIrfHashtable(java.lang.String filePath, java.lang.String filePrefix, int numberOfSlots, int numberOfBlocks, int blockInitialCapacity, float blockLoadFactor)
          Builds a PersistentIrfHashtable with the given characteristics:
 
Method Summary
 java.util.Enumeration elements()
          Exact same result as values().
 java.lang.Object get(java.lang.Object key)
          The comment for put is also valid here.
 java.lang.Object getActualKey(java.lang.Object key)
          With special hashCode() and equals() methods, you may at some point want to know which key you are really working with when you call get(thisKey) or put(thisKey, something).
(package private)  HashBlock getBlock(int blockNr)
          This method is only used by the PersistentIrfHashtable itself and by its enumerator.
 java.util.Enumeration keys()
          Returns the keys available in the hashtable for the get method.
 int length()
          Exact same result as size().
private  int load(int blockNr)
          This method uses an LFU-like algorithm to decide where to load a block, loads the block, and returns the number of the slot it was loaded into.
 java.lang.Object put(java.lang.Object key, java.lang.Object value)
          This put is the equivalent of the one in Hashtable.
private  HashBlock readBlock(int blockNr)
          This method is the counterpart of writeBlock(): it reads a HashBlock back from disk.
private  void readObject(java.io.ObjectInputStream in)
          The serializable interface method for materialization.
 void showStatistics()
          Displays statistics about the table and its usage in the following format: PersistentIrfHashtable tableName: x hits for y accesses, ie z% hits.
 int size()
          Number of values stored in the persistent hashtable.
 java.util.Enumeration values()
          Returns the values stored in the hashtable, exactly like the elements() method.
private  void writeBlock(HashBlock toWrite, int blockNr)
          As the name suggests, this method writes a block to disk, using a tuned serialization scheme with a BufferedRandomAccessFile.
private  void writeObject(java.io.ObjectOutputStream out)
          The serializable interface method for storage.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

serialVersionUID

static final long serialVersionUID
Serial version UID, declared explicitly so that Java does not generate one that may change between revisions and make it impossible to deserialize objects serialized by earlier versions.

blocks

private transient HashBlock[] blocks
This array is composed of slots. Each of them may contain a block.

numberOfSlots

private int numberOfSlots
Size of the blocks array, i.e. the number of slots. The more slots there are, the less swapping occurs, but the more memory is used.

numberOfBlocks

int numberOfBlocks
Total number of blocks composing the hashtable.

blockSize

private int blockSize
Size of each block, computed from the number of blocks and the largest hash code that can be represented (Integer.MAX_VALUE).
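
The exact formula is not given on this page. As a rough sketch only, assuming each block covers an equal contiguous range of the non-negative hash codes, the block size and the block number for a key could be computed as follows. The class BlockMath and its method names are hypothetical and are not part of this package.

```java
/** Hypothetical helper, not part of the IRF package: maps hash codes to block numbers. */
final class BlockMath {
    private BlockMath() {}

    /** Width of the hash-code range covered by one block (assumed formula). */
    static int blockSize(int numberOfBlocks) {
        if (numberOfBlocks < 1) {
            throw new IllegalArgumentException("numberOfBlocks must be >= 1");
        }
        return Integer.MAX_VALUE / numberOfBlocks;
    }

    /** Block number in the range 0..numberOfBlocks-1 for the given key. */
    static int blockFor(Object key, int numberOfBlocks) {
        // Clear the sign bit so the hash lies in 0..Integer.MAX_VALUE.
        int hash = key.hashCode() & 0x7FFFFFFF;
        // Integer division can overshoot by one at the very top of the
        // range, so clamp to the last block.
        return Math.min(hash / blockSize(numberOfBlocks), numberOfBlocks - 1);
    }
}
```

With four blocks, for example, blockFor(Integer.valueOf(0), 4) falls in block 0 and a key whose masked hash is Integer.MAX_VALUE falls in block 3.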

blockMap

private transient int[] blockMap
For each block number, gives the slot number where this block is, or -1 if the block isn't in memory.

inversedBlockMap

private transient int[] inversedBlockMap
For each slot number, gives the number of the block this slot contains, or -1 if none.

blocksUsed

private transient int[] blocksUsed
For each slot, gives the number of times it has been used since it was filled with a block. Used to implement a kind of LFU algorithm.

filePath

private java.lang.String filePath
The blocks will be stored on disk in files in the directory fully specified by filePath

filePrefix

private java.lang.String filePrefix
The blocks will be stored on disk in files named filePrefixOO where OO is a number from 0 to (numberOfBlocks - 1).
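
As an illustration of the naming scheme described by filePath and filePrefix, a block's file could be located as sketched below. This assumes the block number is appended without zero padding, which this page does not specify; BlockFiles is a hypothetical helper.

```java
import java.io.File;

/** Hypothetical helper, not part of the IRF package: locates the file for one block. */
final class BlockFiles {
    private BlockFiles() {}

    /** Returns filePath/filePrefix followed by blockNr, with no padding (assumption). */
    static File blockFile(String filePath, String filePrefix, int blockNr) {
        return new File(filePath, filePrefix + blockNr);
    }
}
```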

count

private int count
Total number of elements stored in the blocks.

blockInitialCapacity

private int blockInitialCapacity
Blocks will be created with this parameter.

blockLoadFactor

private float blockLoadFactor
Blocks will be created with this parameter.

accesses

private transient int accesses
Statistics on the Hashtable efficiency.

hits

private transient int hits
Constructor Detail

PersistentIrfHashtable

public PersistentIrfHashtable(java.lang.String filePath,
                              java.lang.String filePrefix,
                              int numberOfSlots,
                              int numberOfBlocks,
                              int blockInitialCapacity,
                              float blockLoadFactor)
Builds a PersistentIrfHashtable with the given characteristics:
Parameters:
filePath - the path from root to the file(s)
filePrefix - the prefix that will be used for the names of the files where the different blocks of the hashtable will be stored.
numberOfSlots - the more slots there are, the less swapping should occur, but the more memory is used.
numberOfBlocks - number of chunks the hashtable is divided into. The more blocks there are, the smaller each one is and the faster it loads, but more swapping may be needed.
blockInitialCapacity - as in java.util.Hashtable
blockLoadFactor - as in java.util.Hashtable
Method Detail

put

public java.lang.Object put(java.lang.Object key,
                            java.lang.Object value)
This put is the equivalent of the one in Hashtable. The only restriction is that, because it has been hardwired for IRF purposes, only ProxyFeatureLists can be stored as values in this persistent version of the hashtable. Keys can be of any type; just remember to provide suitable hashCode() and equals() methods if the key type needs them.
See Also:
Hashtable.put(java.lang.Object, java.lang.Object), ProxyFeatureList

get

public java.lang.Object get(java.lang.Object key)
The comment for put is also valid here.
See Also:
Hashtable.get(java.lang.Object)

getBlock

HashBlock getBlock(int blockNr)
This method is only used by the PersistentIrfHashtable itself and by its enumerator; it would be private if that were possible. It answers the question "Where is the block of the hashtable that I need?".

load

private int load(int blockNr)
This method uses an LFU-like algorithm to decide where to load a block, loads the block, and returns the number of the slot it was loaded into.
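
The selection rule itself is not spelled out here. The sketch below shows one plausible LFU-like policy built on the three bookkeeping arrays described under Field Detail: prefer an empty slot, otherwise evict the slot with the lowest use count. SlotTable is a hypothetical stand-in that tracks only the mappings; the real class would also write the evicted block to disk and read the requested one.

```java
import java.util.Arrays;

/** Hypothetical stand-in, not the IRF class: slot bookkeeping for an LFU-like policy. */
final class SlotTable {
    final int[] blockMap;         // block number -> slot number, -1 if not in memory
    final int[] inversedBlockMap; // slot number -> block number, -1 if the slot is empty
    final int[] blocksUsed;       // slot number -> uses since the slot was last filled

    SlotTable(int numberOfSlots, int numberOfBlocks) {
        blockMap = new int[numberOfBlocks];
        inversedBlockMap = new int[numberOfSlots];
        blocksUsed = new int[numberOfSlots];
        Arrays.fill(blockMap, -1);
        Arrays.fill(inversedBlockMap, -1);
    }

    /** Returns the slot holding blockNr, loading the block first if necessary. */
    int slotFor(int blockNr) {
        int slot = blockMap[blockNr];
        if (slot == -1) {
            slot = load(blockNr);
        }
        blocksUsed[slot]++;
        return slot;
    }

    /** Chooses an empty slot if there is one, else the least-used slot, and assigns blockNr to it. */
    int load(int blockNr) {
        int victim = 0;
        for (int s = 0; s < inversedBlockMap.length; s++) {
            if (inversedBlockMap[s] == -1) {
                victim = s; // an empty slot always wins
                break;
            }
            if (blocksUsed[s] < blocksUsed[victim]) {
                victim = s;
            }
        }
        int evicted = inversedBlockMap[victim];
        if (evicted != -1) {
            blockMap[evicted] = -1; // a real table would write the evicted block to disk here
        }
        inversedBlockMap[victim] = blockNr;
        blockMap[blockNr] = victim;
        blocksUsed[victim] = 0; // the count restarts when the slot is refilled
        return victim;
    }
}
```

With two slots, accessing block 0 twice and block 1 once leaves slot 1 as the least used, so a subsequent request for block 2 evicts block 1 and keeps block 0 in memory.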

writeBlock

private void writeBlock(HashBlock toWrite,
                        int blockNr)
As the name suggests, this method writes a block to disk, using a tuned serialization scheme with a BufferedRandomAccessFile.
Parameters:
toWrite - the block that is to be written.
blockNr - the actual BLOCK number, NOT the SLOT number. Be careful.

readBlock

private HashBlock readBlock(int blockNr)
This method is the counterpart of writeBlock(): it reads a HashBlock back from disk. If the file that should contain the block is not found, this method assumes the block needs to be created and returns a new, empty HashBlock. Like writeBlock(), it uses a tuned serialization mechanism with a BufferedRandomAccessFile.
Parameters:
blockNr - the number of the block to be loaded.
Returns:
the block read from disk, or a new one if the block did not already exist. Never null unless an IOException has occurred.
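
The read-or-create behavior can be sketched as follows. This is not the tuned scheme mentioned above: it uses standard Java object serialization and a HashMap in place of HashBlock, and BlockIo is a hypothetical name.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

/** Hypothetical sketch, not the IRF implementation: read-or-create block I/O. */
final class BlockIo {
    private BlockIo() {}

    /** Writes a block (a HashMap stands in for HashBlock) to the given file. */
    static void writeBlock(HashMap<String, String> toWrite, File file) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(toWrite);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a block back, or returns a new empty one if its file does not exist yet. */
    @SuppressWarnings("unchecked")
    static HashMap<String, String> readBlock(File file) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (HashMap<String, String>) in.readObject();
        } catch (FileNotFoundException e) {
            return new HashMap<>(); // no file yet: treat the block as new and empty
        } catch (IOException | ClassNotFoundException e) {
            return null; // mirrors "never null unless an IOException has occurred"
        }
    }
}
```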

keys

public final java.util.Enumeration keys()
Returns the keys available in the hashtable for the get method.
See Also:
Hashtable.keys()

values

public final java.util.Enumeration values()
Returns the values stored in the hashtable, exactly like the elements() method.
See Also:
Hashtable.elements()

size

public final int size()
Number of values stored in the persistent hashtable.

length

public final int length()
Exact same result as size().

getActualKey

public java.lang.Object getActualKey(java.lang.Object key)
With special hashCode() and equals() methods, you may at some point want to know which key you are really working with when you call get(thisKey) or put(thisKey, something). This method returns the actual object that will be used as the key if you pass thisKey to either of those methods.
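
To see why this matters, consider a key type whose equals() treats several distinct objects as the same key. The sketch below uses a plain HashMap rather than PersistentIrfHashtable; CaseInsensitiveKey and ActualKeyDemo are hypothetical names, and returning null for an unknown key is an assumption of the sketch, not documented behavior.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical key type: two keys are equal if their text matches ignoring case. */
final class CaseInsensitiveKey {
    final String text;
    private final String folded;

    CaseInsensitiveKey(String text) {
        this.text = text;
        this.folded = text.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CaseInsensitiveKey
                && ((CaseInsensitiveKey) o).folded.equals(folded);
    }

    @Override
    public int hashCode() {
        return folded.hashCode();
    }
}

/** Hypothetical stand-in that remembers the first key object stored for each equal-key group. */
final class ActualKeyDemo {
    private final Map<Object, Object> canonical = new HashMap<>();

    void put(Object key) {
        canonical.putIfAbsent(key, key);
    }

    /** Returns the stored key object equal to the given key, or null if there is none. */
    Object getActualKey(Object key) {
        return canonical.get(key);
    }
}
```

After put(new CaseInsensitiveKey("NIST")), looking up new CaseInsensitiveKey("nist") returns the original key object, whose text is still "NIST".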

elements

public final java.util.Enumeration elements()
Exact same result as values().
See Also:
values()

writeObject

private void writeObject(java.io.ObjectOutputStream out)
                  throws java.io.IOException
The serializable interface method for storage.

readObject

private void readObject(java.io.ObjectInputStream in)
                 throws java.io.IOException,
                        java.lang.ClassNotFoundException
The serializable interface method for materialization.
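
Several fields of this class are declared transient (blocks, blockMap, inversedBlockMap, blocksUsed, accesses, hits), so they are not restored by default deserialization and have to be rebuilt. How the class does this is not documented here. The following minimal sketch shows the general pattern with a hypothetical SlotState class that re-creates its transient array in readObject.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

/** Hypothetical class, not the IRF implementation: rebuilds transient state on deserialization. */
final class SlotState implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int numberOfBlocks;
    private transient int[] blockMap; // not serialized: no block is in memory after a reload

    SlotState(int numberOfBlocks) {
        this.numberOfBlocks = numberOfBlocks;
        initTransients();
    }

    private void initTransients() {
        blockMap = new int[numberOfBlocks];
        Arrays.fill(blockMap, -1);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores the non-transient fields
        initTransients();       // transient arrays would otherwise be null
    }

    int slotOf(int blockNr) {
        return blockMap[blockNr];
    }

    /** Serializes and deserializes the given state in memory. */
    static SlotState roundTrip(SlotState state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(state);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SlotState) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```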

showStatistics

public void showStatistics()
Displays statistics about the table and its usage in the following format:
 PersistentIrfHashtable tableName: x hits for y accesses, ie z% hits.
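
A line in this format could be produced as sketched below. The HitStats helper is hypothetical, and the one-decimal rounding and the treatment of zero accesses are assumptions, since the page does not specify them.

```java
import java.util.Locale;

/** Hypothetical helper, not part of the IRF package: formats the hit statistics line. */
final class HitStats {
    private HitStats() {}

    static String format(String tableName, int hits, int accesses) {
        // Avoid dividing by zero when the table has not been accessed yet.
        double percent = accesses == 0 ? 0.0 : 100.0 * hits / accesses;
        return String.format(Locale.ROOT,
                "PersistentIrfHashtable %s: %d hits for %d accesses, ie %.1f%% hits.",
                tableName, hits, accesses, percent);
    }
}
```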