
Hash tables

 
Flo Powers
Ranch Hand
Posts: 57
Hi, I'm wondering if there are any common uses of hash tables that are not too complex. Are hash tables mainly useful when dealing with large amounts of data where you do lookups often, but don't want the overhead of either searching sequentially or keeping an array sorted for binary search?

Also, what are the most common ways of dealing with collisions (e.g. an overflow area searched sequentially at the end of the array, or chaining entries a la a linked list, with each item containing a "pointer" to subsequent entries with the same hash key)?

I get the basic concept but can't quite work out when to use hash tables. Thanks!
[ August 15, 2005: Message edited by: Flo Powers ]
 
David Ulicny
Ranch Hand
Posts: 724
Use Hashtable or HashMap when you need to store and access data that come in [key, value] pairs.
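A minimal sketch of what that looks like with java.util.HashMap (the names and extension numbers here are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class PhoneBook {
    // Build a small map of names to phone extensions (made-up data)
    static Map<String, Integer> buildExtensions() {
        Map<String, Integer> extensions = new HashMap<String, Integer>();
        extensions.put("Bob", 101);
        extensions.put("Alice", 202);
        return extensions;
    }

    public static void main(String[] args) {
        Map<String, Integer> extensions = buildExtensions();
        // Lookup is by key ("Alice"), not by position in an array
        System.out.println(extensions.get("Alice")); // prints 202
    }
}
```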
 
Rick O'Shay
Ranch Hand
Posts: 531
The most common use of a hash table is an associative array. Instead of getting element 1 or 99, you get element "Bob" or "Alice", for example. It returns the mapping. A hash map is used because it performs those lookups quickly. Specifically, log(n). The search time scales on a log scale, so it takes twice as long to search 1000 items versus 100 items, not 900 times longer.
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24211
Originally posted by Rick O'Shay:
twice as long to search 1000 items versus 100 items, not 900 times longer.


Twice the time is correct, but the linear alternative is 10 times longer, not 900 times.
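To make the correction concrete, here's a quick worst-case count of comparisons for a sequential scan (a hypothetical helper, not from the thread): scanning n items takes up to n comparisons, so 1000 items is 10 times the work of 100 items.

```java
public class SearchSteps {
    // Count comparisons made by a sequential scan; if the target
    // is absent, we touch every element (the worst case)
    static int linearSearchSteps(int[] a, int target) {
        int steps = 0;
        for (int x : a) {
            steps++;
            if (x == target) break;
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] small = new int[100];
        int[] big = new int[1000];
        // -1 is never present, so both scans run to the end
        System.out.println(linearSearchSteps(big, -1) / linearSearchSteps(small, -1)); // prints 10
    }
}
```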
 
Rick O'Shay
Ranch Hand
Posts: 531
Thanks for the correction. I should have just said: if it takes T units of time for one item, it takes 1000 * T with a linear search and log(1000) * T in a hash table. That assumes no duplicates, and the algorithm itself may exceed log(n).

BTW, anybody reading that who is concerned about dupes, don't be. Hash tables always manage that transparently for you. You only need to be aware that when there are a lot of duplicates (keys that hash to the same value), you start to approach linear search time, since the hash table just stores the dupes together in an array.

In this "pictorial" example, the C and E buckets have dupes.

A0
B0
C0 C1 C2
D0
E0 E1 E2 E3 E4
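The bucket picture above corresponds to the chaining scheme from the original question: each array slot holds a chain of entries whose keys hash to that slot. Here is a toy sketch of that idea (class and method names are made up, not a real API, and this is nothing like production code):

```java
import java.util.LinkedList;

// Toy separate-chaining hash table: each bucket is a linked list
// of entries whose keys hash to the same slot
public class ChainedTable {
    static class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            buckets[i] = new LinkedList<Entry>();
        }
    }

    private int indexFor(String key) {
        // Map the key's hash code onto a bucket index
        return Math.abs(key.hashCode() % buckets.length);
    }

    void put(String key, String value) {
        for (Entry e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) { e.value = value; return; } // replace existing
        }
        buckets[indexFor(key)].add(new Entry(key, value));
    }

    String get(String key) {
        // Worst case we walk the whole chain -- the near-linear
        // behavior described above when one bucket fills up
        for (Entry e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```

java.util.HashMap resolves collisions with a chaining strategy along these lines internally, just with resizing and a better bucket layout.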
 