jshashtable 2.1
Contents
Introduction
jshashtable is a JavaScript implementation of a hash table. It associates objects ("keys") with other objects ("values"). Each key is associated with precisely one value.
Version 2.0 of jshashtable features significant performance
improvements over version 1.0 (typically 200% faster for put
and get
operations) and adds a few new methods: clone()
, putAll()
,
entries()
and each()
. Version 2.1 is minor upgrade that allows use of ActiveX
objects as keys in IE.
Set-up
-
Download the code
Download jshashtable. You can download a compressed or uncompressed version of
jshashtable.js
which are functionally identical or a zip containing this document and both JavaScript files. -
Include jshashtable in your page
Include
jshashtable.js
in a script tag in your page. This file creates one object in the global scope, which isHashtable
. -
Create your hash table
Create your hash table, as in the example below. Any non-null JavaScript object can be used as a key or a value.
<script type="text/javascript" src="jshashtable.js"></script> <script type="text/javascript"> var typesHash = new Hashtable(); typesHash.put("A string", "string"); typesHash.put(1, "number"); var o = new Object(); typesHash.put(o, "object"); alert( typesHash.get(o) ); // "object" </script>
Usage
jshashtable is a JavaScript implementation of a hash table. It associates objects ("keys") with other objects ("values"). Each key is associated with precisely one value.
The code is contained in one JavaScript file, which creates a single constructor function called
Hashtable
in the global scope.
Doesn't JavaScript already do this?
No, although JavaScript's built-in objects do provide this functionality using the square brackets notation for properties, provided your keys are strings or numbers:
var key = "A key"; var o = new Object(); o[key] = 1;
However, it is often desirable to use other kinds of objects as keys. JavaScript doesn't complain if you try to use objects other than strings and numbers inside square brackets; indeed, it superficially looks like this kind of association works for any object:
var key = new Object(); var o = new Object(); o[key] = "First"; alert( o[key] ); // Alerts "First"
It doesn't take much effort to discover the flaw in this approach:
var key1 = new Object(); var key2 = new Object(); var o = new Object(); o[key1] = "First"; o[key2] = "Second"; alert( o[key1] ); // Alerts "Second", not "First"
The reason for this is that JavaScript silently calls toString()
on the object
being passed in as the key and uses the result as the property name. In the example above,
key1
and key2
both become "[object Object]", hence the second
assignment simply replaces the original value of the property "[object Object]".
With jshashtable, any object can be used as the key:
var key1 = new Object(); var key2 = new Object(); var h = new Hashtable(); h.put(key1, "First"); h.put(key2, "Second"); alert( h.get(key1) ); // Alerts "First" alert( h.get(key2) ); // Alerts "Second"
The above examples shows use of put()
to add a key/value pair to
the hash table and get()
to retrieve a value. If a value already
exists in the hash table for the key used, it is replaced with the new value.
Equality
This is all very well but there is a drawback. Imagine you have a hash table of colour values at
various different positions on the screen. Each position on the screen is represented by a
Point
object, which has properties x
and y
, representing
the x and y coordinates of the point on the screen.
function Point(x, y) { this.x = x; this.y = y; } var coloursForPoints = new Hashtable(); function getColourAt(x, y) { var point = new Point(x, y); return coloursForPoints.get(point); } coloursForPoints.put( new Point(1, 2), "green" ); alert( getColourAt(1, 2) ); // Alerts null
Why do we get null
? Because the Point
object that gets created in
getColourAt
is not the self same object as the original Point
that got
used as key in the coloursForPoints
hash table. This is clearly not the ideal
behaviour - any two Point
objects with the same x and y values are to all intents
and purposes the same thing. What we need is a way of defining when two objects are equal. By
default, jshashtable uses the strict equals (===
) operator
in JavaScript. However, if one of the objects being compared has an equals
method,
it will use that instead. We can implement an equals
method in the above example to
get the behaviour we want:
function Point(x, y) { this.x = x; this.y = y; } Point.prototype.equals = function(obj) { return (obj instanceof Point) && (obj.x === this.x) && (obj.y === this.y); }; var coloursForPoints = new Hashtable(); function getColourAt(x, y) { var point = new Point(x, y); return coloursForPoints.get(point); } coloursForPoints.put( new Point(1, 2), "green" ); alert( getColourAt(1, 2) ); // Alerts "green"
Hash codes
This works but is still not quite ideal. Internally, jshashtable stores
key/values pairs in arrays, called buckets. When put()
is called, jshashtable converts the key into a hash code, and
stores the key/value pair in the bucket for that particular hash code. A hash code in
jshashtable is a string so that the buckets themselves can be
associated with hash codes using an object and JavaScript's built-in string property names. When
get()
is called, jshashtable finds
the correct bucket for the key it's looking for and then searches the contents of that bucket
for that key. Since the process of locating the correct bucket is massively faster than
searching through a bucket's contents, it is most efficient to have as many buckets as possible
containing the least possible number of items (ideally one). My tests have shown that for a hash
table with 1000 elements, it is around 70 times faster to replace or retrieve a
value if each key has a unique hash code than if each key had the same hash code (tested using
objects with a very simple hashCode
method). For 10000 elements, it's closer to 300
times faster. So, generating meaningful hash codes for keys makes the hash table much more
efficient.
jshashtable generates a hash code for an object by checking to see if
it has a hashCode()
method and using that if it exists. Otherwise it calls
toString()
on the object, like JavaScript's built-in square bracket property
accessor does. In the above example, Point
has no hashCode
method, so
every point placed in coloursForPoints
goes into the "[object Object]" bucket,
which is very inefficient, particularly with a large number of points. What would be better
would be to implement a hashCode()
method on Point
that returns a
different value for every distinct point but the same value for equal points (this is very
important: objects that are considered equal according to their equals
method must
return the same hash code). Something like the following would be good:
Point.prototype.hashCode = function(obj) { return "Point:" + this.x + "," + this.y; };
So for our example point, this returns "Point:1,2", for any Point
object with x
and y coordinates of 1 and 2 respectively. Every point will therefore go in its own bucket and
the hash table is efficient.
The "Point" at the start of the hash code is there to distinguish points from any other similar object that may have x and y coordinates and potentially have the same hash code. Note that even if a non-Point object did return a hash code of "Point:1,2", the hash table would still work fine, since the non-Point object would fail the equality test when searching through the "Point:1,2" bucket.
If you don't want to violate the keys of your hash table by giving them extra methods like
equals
and hashCode
, you can instead pass in functions into the
Hashtable
constructor to generate hash codes and test key equality. The hash code
generator function is passed an object and should return a hash code for that object, while the
equality function is passed two objects and should return a boolean value representing whether
the two objects are equal. This also has the advantage of improving performance, since the hash
table does not have to check for the existence of equals
and hashCode
methods on every key object.
For a hash table where you knew in advance that all the keys would be Point
objects, the previous example could be rewritten to be more efficient as follows:
function Point(x, y) { this.x = x; this.y = y; } function hashPoint(p) { return "Point:" + p.x + "," + p.y; } function pointsEqual(p1, p2) { return p1.x === p2.x && p1.y === p2.y; } var coloursForPoints = new Hashtable(hashPoint, pointsEqual); function getColourAt(x, y) { var point = new Point(x, y); return coloursForPoints.get(point); } coloursForPoints.put( new Point(1, 2), "green" ); alert( getColourAt(1, 2) ); // Alerts green
Public API
Constructors
-
Hashtable()
Creates a new, empty hash table.
-
Hashtable(Function hashingFunction, Function equalityFunction)
Parameters:-
hashingFunction
A function that provides hash codes for keys placed in the hash table. It is passed the object to be hashed as its only parameter. If not provided, the hash table checks whether the object has ahashCode
method, and if not, callstoString
on the object. -
equalityFunction
A function that checks for equality between two keys with the same hash code. Two keys that are considered equal will map to the same value in the hash table. This function is passed the two objects to be compared as its parameters. If not provided, the hash table checks whether either object being compared has anequals
method, and if not, compares the objects using the===
operator.
-
Methods
-
void put(Object key, Object value)
Updated in version 2.0: now returns previous value associated with the key
Sets the value associated with the key supplied. If the hash table already contains the key then the old value is overwritten and the old value is returned, otherwise
null
is returned. -
void putAll(Hashtable hashtable[, Function conflictCallback])
New in version 2.0
Adds all entries from the supplied hash table to this hash table. For any key in the supplied hash table for which an entry already exists in this hash table, the optional callback function
conflictCallback
is called to resolve the conflict. This function should accept three parameters:key
: the key for the conflicting entry;thisValue
: the current value for this key in the current hash table;value
: the value for this key in the hash table supplied.
The value returned by the callback function will be used as the value for the new entry in the current hash table. If no callback function is supplied, the existing value in the current hash table will be overwritten by the value in the hash table supplied.
-
void get(Object key)
Returns the value associated with the key supplied, or
null
if no value is found for that key. -
Boolean containsKey(Object key)
Returns whether the hash table contains the specified key.
-
Boolean containsValue(Object value)
Returns whether the hash table contains the specified value.
-
void clear()
Removes all entries from the hash table.
-
Boolean isEmpty()
Returns true if the hash table contains no key/value pairs.
-
Array keys()
Returns an array containing all the keys contained in the hash table.
-
Array values()
Returns an array containing all the values contained in the hash table.
-
Array entries()
New in version 2.0
Returns an array containing all the entries contained in the hash table. Each entry is a two element array containing the key and value respectively for that entry.
-
void remove(Object key)
Updated in version 2.0: now returns the value associated with the removed key
Removes the key and its corresponding value from the hash table.
-
Number size()
Returns the number of key/value pairs contained in the hash table.
-
Hashtable clone()
New in version 2.0
Creates and returns a shallow copy of the hash table. If hashing and equality functions were provided to the hash table when it was constructed, they are passed into the new hash table.
-
void each(Function callback)
New in version 2.0
Iterates over all the entries in the hash table, calling the
callback
function for each entry. This function is passed two parameters:key
: the key for the current entry;value
: the value for the current entry.