I think I know what a hash table is, and I think it would be very useful in an application that I'm writing. Unfortunatley there is very little documentation (reference material). Any help would be useful, Thanks!
I am keen to find out about hash tables. The problem I have is that the JAVA (1.2)api documentation says enough about hashtables to make me think they would be usefull to what I am doing but I don't actually know what a hashtable in general is so I don't understand the jargon. A general description would be much apreciated.
Imagine that you have a number spelled out like "sixteen" and you need to figure out what the actual integer is. You could make a long list of "if else if" statements comparing your string to "one", "two", "three", etc. This would be terribly slow and bulky. Another option might be to store the strings and integers in a big array and then perform a binary search on the array. Faster and not as bulky, but for most cases there is a better way: A hash table. First, suppose you have an algorithm that when you pass it a string, it will return a unique number from 0 to 4 billion. The smallest change in the string returns a very different number. Let us also suppose that this algorithm is very fast. Let's say that we have 200 strings we need to work with. I make an array to hold 500 strings and integers. For each string, I run the string through the algorithm and perform a modulo 500 on the result. I now have a value from 0 to 499 which I will use to be the index of the array to store my string and integer combination. The important thing here is that when I need to look up "sixteen", a computation tells me exactly where in the array I can find the value I need: 16. Very fast. There will be collisions when two different strings will access the same part of the array - but don't worry. The Java library takes care of that for you.
Joined: Jan 07, 1999
Sorry I wasn't more specific. But I'm in the same boat as Simon. Now in your explination, you made an array to hold 500 string and int definitions when there were only 200. Do you always need to make more than what you need, and why?
OK, I'll have a go. While Paul's description is correct as far as it goes, I think it misses a few points, which might be important. The first thing to keep in mind is that a Hashtable can be implemented internally in any way the original author likes, it's the external interface which is important - so don't get too hung up on whether it's a big or small array internally, or a linked-list, tree or whatever. If you are still interested. The key thing that makes Hashtables actually useful is that as a user, you don't need to worry about how many spaces ther are in the internal "array". If there is a clash (where two items happen to evaluate to the same slot number) then the Hashtable takes care of it, either by recalculating another slot to put it in, or by keeping a list (or tree etc.) of items which "hash" to the same slot, and searching that list when you attempt a "get". Hashtables are generally faster if there are plenty of free slots (although this can still go wrong if a large proportion of the items "hash" to the same slot anyway) but a hashtable with more slots obviously takes up more memory space. In Java, Hashtable is a very useful class, and most of the projects I have written have ended up with at least one, somewhere. It makes sense to use a Hashtable any time you want to store and retrieve more than one or two named data items. Hashtable might not be a good choice in certain circumstances where the data is very structured, the names are very similar, or the data is fixed at compile time. I've never found it necessary to tweak the settings of a Hashtable (just like I very rarely specify a size for a Vector). The default implementation seems to work fine for the vast majority of uses. Also keep in mind that if both the names and the values you wish to store are Strings, Properties is often a better choice than a raw Hashtable. It has methods which assume Strings and so don't need class casting and (best of all) it can be saved and loaded in a human-readable, human editable form. Frank.
Excellent post Frank! Matt - I think Frank answered your second question. Having 500 elements to store 200 things makes the hashtable work more efficiently. And in Java, you don't really need to worry about this. If you start adding lots of stuff, the object will make a bigger array and redistribute everything.
Joined: Jan 07, 1999
Paul and Frank, between both of your explinations and a little more research, I think I've got it! THANK YOU VERY MUCH! This UBB is GREAT! Matt
While hash tables are great for many things, note that they have a few disadvantages under certain circumstances. One is that the number of "probes" that are used to locate a particular entry usually is very small, one or two. But, if the hashing algorithm is poor (i.e. lots of hashing to the same location), then more probes have to be made. Thus for most lookups, the response time is very short, but for a few it might be much longer. Usually this is not an issue, but if you were designing a system where the time could never exceed some number, then things like balanced B-Trees are better, because given the number of entries in the table, the maximum number of searches goes like the log base 2 of the number. Although this is usually longer than a hash table lookup, its maximum can be determined. Again, hash tables are great for most things, but always check that the requirements don't preclude them! ------------------ Member WWISA