
Hash tables

 
Flo Powers
Ranch Hand
Posts: 57
Hi, I'm wondering if there are any common uses of hash tables that are not too complex. Are hash tables mainly useful when dealing with large amounts of data where you do lookups often, but don't want the overhead of either searching sequentially or keeping an array sorted for binary search?

Also, what are the most common ways of dealing with collisions (e.g. an overflow area searched sequentially at the end of the array, or chaining entries a la a linked list, with each item containing a "pointer" to subsequent entries with the same hash key)?

I get the basic concept but can't quite work out when to use hash tables. Thanks!
[ August 15, 2005: Message edited by: Flo Powers ]
 
David Ulicny
Ranch Hand
Posts: 724
Use Hashtable or HashMap when you need to store and access data that come in [key, value] pairs.
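A minimal sketch of what that looks like with java.util.HashMap (the names and extension numbers here are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class PhoneBook {
    // Build a small map of names to phone extensions (made-up data)
    static Map<String, Integer> buildExtensions() {
        Map<String, Integer> extensions = new HashMap<String, Integer>();
        extensions.put("Bob", 101);
        extensions.put("Alice", 202);
        return extensions;
    }

    public static void main(String[] args) {
        Map<String, Integer> extensions = buildExtensions();
        // Lookup is by key ("Alice"), not by position in an array
        System.out.println(extensions.get("Alice")); // prints 202
    }
}
```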
 
Rick O'Shay
Ranch Hand
Posts: 531
The most common use of a hash table is an associative array. Instead of getting element 1 or 99, you get element "Bob" or "Alice", for example. It returns the mapping. A hash map is used because it performs those lookups quickly. Specifically, log(n). The search time scales on a log scale, so it takes twice as long to search 1000 items versus 100 items, not 900 times longer.
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24211
Originally posted by Rick O'Shay:
twice as long to search 1000 items versus 100 items, not 900 times longer.


Twice the time is correct, but the linear alternative is 10 times longer, not 900 times.
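To make the correction concrete, here's a quick worst-case count of comparisons for a sequential scan (a hypothetical helper, not from the thread): scanning n items takes up to n comparisons, so 1000 items is 10 times the work of 100 items.

```java
public class SearchSteps {
    // Count comparisons made by a sequential scan; if the target
    // is absent, we touch every element (the worst case)
    static int linearSearchSteps(int[] a, int target) {
        int steps = 0;
        for (int x : a) {
            steps++;
            if (x == target) break;
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] small = new int[100];
        int[] big = new int[1000];
        // -1 is never present, so both scans run to the end
        System.out.println(linearSearchSteps(big, -1) / linearSearchSteps(small, -1)); // prints 10
    }
}
```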
 
Rick O'Shay
Ranch Hand
Posts: 531
Thanks for the correction. I should have just said: if it takes T units of time for one item, it takes 1000 * T with a linear search and log(1000) * T in a hash table. That assumes no duplicates, and the algorithm itself may exceed log(n).

BTW, anybody reading that who is concerned about dupes, don't be. Hash tables always manage that transparently for you. You only need to be aware that when there are a lot of duplicates (keys that hash to the same value), you start to approach linear search time, since the hash table just stores the dupes together in an array.

In this "pictorial" example, the C and E buckets have dupes.

A0
B0
C0 C1 C2
D0
E0 E1 E2 E3 E4
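The bucket picture above corresponds to the chaining scheme from the original question: each array slot holds a chain of entries whose keys hash to that slot. Here is a toy sketch of that idea (class and method names are made up, not a real API, and this is nothing like production code):

```java
import java.util.LinkedList;

// Toy separate-chaining hash table: each bucket is a linked list
// of entries whose keys hash to the same slot
public class ChainedTable {
    static class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            buckets[i] = new LinkedList<Entry>();
        }
    }

    private int indexFor(String key) {
        // Map the key's hash code onto a bucket index
        return Math.abs(key.hashCode() % buckets.length);
    }

    void put(String key, String value) {
        for (Entry e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) { e.value = value; return; } // replace existing
        }
        buckets[indexFor(key)].add(new Entry(key, value));
    }

    String get(String key) {
        // Worst case we walk the whole chain -- the near-linear
        // behavior described above when one bucket fills up
        for (Entry e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```

java.util.HashMap resolves collisions with a chaining strategy along these lines internally, just with resizing and a better bucket layout.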
 