Things to do

Incorporate a persistent storage engine (PSE) into the broker hierarchy

IRF is designed to minimize the effort required to support persistence through a variety of mechanisms. A file-based set of brokers has been implemented. Implement an additional set that uses a PSE (e.g., ObjectStore PSE) to test the effort required. See the section Examples of persistence mechanism evolution for how far we got in changing IRF to use a PSE.
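As a rough illustration of the extension point, a PSE-backed broker would subclass the existing broker hierarchy and route storage calls to the engine. In the sketch below, the class PseBroker, the store()/retrieve() signatures, and the Hashtable standing in for a real PSE-managed persistent root are all hypothetical, since they depend on IRF's actual broker interface and on the vendor's API:

    import java.util.Hashtable;

    // Hypothetical sketch: a broker that delegates storage to a PSE.
    // In IRF this would extend the real broker superclass; the Hashtable
    // stands in for a PSE database root that the engine keeps on disk.
    public class PseBroker {
        private Hashtable pseRoot = new Hashtable();

        // Store an object under its persistent identifier.
        public void store(Object oid, Object realObject) {
            pseRoot.put(oid, realObject); // a real PSE would do this transactionally
        }

        // Retrieve the object; a real PSE would fault it in from disk.
        public Object retrieve(Object oid) {
            return pseRoot.get(oid);
        }
    }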

Making both the framework and the sample application more dynamic

A few brokers have references to classes so that they can create instances of those classes. Such information shouldn't be spread all over the place; it should be centralized in a registering class that one could easily extend, for example with a Vector containing the names of the classes to be stored. An extending registering class would then only need to add to this Vector in its own constructor.
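A minimal sketch of such a registry, assuming nothing about IRF's actual API (the class and the stored names below are hypothetical):

    import java.util.Vector;

    // Central registry of the classes that brokers may need to instantiate.
    public class ClassRegistry {
        protected Vector classNames = new Vector();

        public ClassRegistry() {
            classNames.addElement("irf.Document");       // hypothetical names
            classNames.addElement("irf.DocCollection");
        }

        public Vector getClassNames() {
            return classNames;
        }
    }

    // An application extends the registry just by adding to the Vector.
    class MyRegistry extends ClassRegistry {
        public MyRegistry() {
            classNames.addElement("myapp.MyDocument");
        }
    }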
Right now, the method getBroker() is overridden in every proxy. It should exist only in VirtualProxy, implemented generically by means of reflection, querying the IrfManager to learn which broker it should use. Switching the persistence scheme would then be very easy: IrfManager would hold a Hashtable mapping object types to the brokers they should use, and an application wishing to use a different type of broker, even for only one type of object, would simply update this Hashtable. Note that different mechanisms are compatible, and that a particular persistence scheme may require only one broker.
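A minimal sketch of this arrangement, assuming IrfManager exposes a class-to-broker Hashtable (all field and method names below are hypothetical):

    import java.util.Hashtable;

    // Hypothetical class-to-broker table held by IrfManager.
    public class IrfManager {
        private static Hashtable brokers = new Hashtable();

        public static void setBroker(Class cls, Object broker) {
            brokers.put(cls, broker);
        }

        public static Object getBrokerFor(Class cls) {
            return brokers.get(cls);
        }
    }

    // The single generic implementation in VirtualProxy; realObjectClass()
    // is a hypothetical accessor for the class of the proxied object.
    abstract class VirtualProxy {
        protected abstract Class realObjectClass();

        protected Object getBroker() {
            return IrfManager.getBrokerFor(realObjectClass());
        }
    }

Switching persistence schemes would then reduce to a few setBroker() calls at application startup, even if the change concerns only one type of object.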

A Broker for PersistentIrfHashtable

As a class working closely with the disk, PersistentIrfHashtable needs specialized features in its underlying components (HashBlock, IoAddrIntern, and DeIntern) so that they can write themselves to disk and read themselves back. Right now this behavior is hardwired into those components, but it should change so that PersistentIrfHashtable can be stored in different ways. The class currently has no broker because it has no proxy, though either of those facts may change.
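One way to remove the hardwiring would be a small interface that HashBlock, IoAddrIntern, and DeIntern implement, so that PersistentIrfHashtable is no longer tied to one storage layout. The interface and its method names below are hypothetical:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    // Hypothetical interface: a component that can write itself to disk
    // and restore itself later, independently of any particular broker.
    public interface DiskStorable {
        void writeTo(DataOutput out) throws IOException;
        void readFrom(DataInput in) throws IOException;
    }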

Soft References

See the section Using Soft References for how far we got in those tests.
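For context, the idea under test is the standard java.lang.ref.SoftReference pattern: a cached object may be reclaimed by the garbage collector under memory pressure and re-fetched on demand. A minimal sketch, in which loadFromBroker() is a hypothetical stand-in for the real broker call:

    import java.lang.ref.SoftReference;

    public class SoftCacheEntry {
        private SoftReference ref;

        public Object get() {
            Object value = (ref == null) ? null : ref.get();
            if (value == null) {              // cleared by the GC or never loaded
                value = loadFromBroker();     // hypothetical: re-read from disk
                ref = new SoftReference(value);
            }
            return value;
        }

        private Object loadFromBroker() {
            return new Object();              // stand-in for the real broker call
        }
    }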

A recursive makePersistent for VirtualProxy

makePersistent() is not currently recursive: every proxy has to override it if the proxy references an object that itself contains proxies. We could provide a default recursive makePersistent() method using reflection. If the existing name is kept, the cost is the overhead of the reflective part; if a new name is introduced, the problem is that callers must choose which method to invoke. In either case, the method would start with a reflective inquiry looking for fields of the real object that are proxies, and would then check each of them for a tuned makePersistent() method; if one exists, call it, otherwise fall back on the default reflective one. The main drawback is that reflection would somewhat slow down the application. Providing such a method could still be worthwhile: it reduces the time needed to model something, and specific methods can later be optimized by overriding the defaults (see the writeObject()/readObject() principle of BufferedRandomAccessFileBroker, or the AWT in general).
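A sketch of the default reflective method, assuming VirtualProxy is the common proxy superclass and that getRealObject() (a hypothetical accessor) returns the object the proxy stands in for:

    import java.lang.reflect.Field;

    public abstract class VirtualProxy {
        protected abstract Object getRealObject(); // hypothetical accessor

        public void makePersistent() throws Exception {
            Object real = getRealObject();
            Field[] fields = real.getClass().getDeclaredFields();
            for (int i = 0; i < fields.length; i++) {
                if (!VirtualProxy.class.isAssignableFrom(fields[i].getType()))
                    continue;                    // recurse only into proxy fields
                fields[i].setAccessible(true);
                VirtualProxy child = (VirtualProxy) fields[i].get(real);
                if (child != null)
                    child.makePersistent();
            }
            // ... then write this proxy's own real object to its broker
        }
    }

Note that ordinary dynamic dispatch already selects a tuned makePersistent() override when one exists, so the explicit reflective check for an overriding method may not be needed at all.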

Flexible indexing

The type of index to be created is hard-coded in DocCollection.setDefaultIndexingModalities(). For IRF to be more readily extensible, support for runtime configuration of the index type should be added. It might also be desirable to allow each modality to be indexed by different means (e.g., KeyWord for title/author and IDF for the body of the text).
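A sketch of what runtime configuration could look like: a Hashtable mapping each modality name to the index class to instantiate. All names below, both the modality strings and the index class names, are hypothetical:

    import java.util.Hashtable;

    public class IndexingConfig {
        private Hashtable indexClassByModality = new Hashtable();

        public void setIndexClass(String modality, String indexClassName) {
            indexClassByModality.put(modality, indexClassName);
        }

        // Instantiate the configured index type for a modality via reflection.
        public Object createIndex(String modality) throws Exception {
            String name = (String) indexClassByModality.get(modality);
            return Class.forName(name).newInstance();
        }
    }

An application would then call, say, setIndexClass("title", "irf.KeyWordIndex") and setIndexClass("body", "irf.IdfIndex") instead of relying on the hard-coded default.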

One-level hashing

Right now, a mechanism is still in place that would allow a possible switch to two-level hashing, with a hashtable only partly in memory. Because of an efficiency problem with that mechanism, only one table is actually used, yet the machinery remains. Thus the overhead of the two-level hash (computing the block number, the extra call to the inner hash function, the gathering of statistics) is paid without any benefit. Going back to a purely single-level hashing mechanism, as it was before, could save a little time.
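To make the overhead concrete, here is a schematic comparison of the two lookup paths (the real PersistentIrfHashtable code differs, and the statistics gathering is omitted):

    // Schematic sketch of the two lookup paths, not the actual IRF code.
    public class HashLookupSketch {
        static final int BLOCKS = 64;

        // Two-level: pick a block, then hash again within the block.
        static int twoLevelSlot(Object key, int slotsPerBlock) {
            int h = key.hashCode() & 0x7fffffff;
            int block = h % BLOCKS;               // extra step per lookup
            int inner = h % slotsPerBlock;        // extra inner hash per lookup
            return block * slotsPerBlock + inner;
        }

        // One-level: a single hash into a single table.
        static int oneLevelSlot(Object key, int tableSize) {
            return (key.hashCode() & 0x7fffffff) % tableSize;
        }
    }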

Object Manager

Allow the easy redefinition of the buildX() methods by splitting each of them into a readCode() step followed by a buildXfromCode() step.
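The split follows the template-method pattern: buildX() becomes a fixed skeleton, and applications redefine only the construction step. A sketch with placeholder types, where X stands for whatever kind of object is being built and the stream-based signatures are hypothetical:

    import java.io.DataInput;
    import java.io.IOException;

    public abstract class ObjectManager {
        // Fixed skeleton: read the type code, then dispatch to the
        // overridable construction step.
        public final Object buildX(DataInput in) throws IOException {
            int code = readCode(in);
            return buildXfromCode(code, in);
        }

        // Shared step: how the code is read stays in one place.
        protected int readCode(DataInput in) throws IOException {
            return in.readInt();
        }

        // Applications redefine only this step.
        protected abstract Object buildXfromCode(int code, DataInput in)
            throws IOException;
    }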