Google Answers Logo
View Question
 
Q: C++ hash table help ( Answered 4 out of 5 stars,   0 Comments )
Question  
Subject: C++ hash table help
Category: Computers > Programming
Asked by: lighthousedweller-ga
List Price: $10.00
Posted: 01 May 2003 19:32 PDT
Expires: 31 May 2003 19:32 PDT
Question ID: 198221
hi there-

i'm trying to provide fast data access using a disc file hash table in C/C++.
this should be done using something like a dictionary input file. so i
need some help finding a basic input dictionary file and also with
writing the program to do this. also i would really appreciate
commented code to help me understand how it works better. thanks for the help.
Answer  
Subject: Re: C++ hash table help
Answered By: dogbite-ga on 01 May 2003 20:31 PDT
Rated:4 out of 5 stars
 
Hello lighthousedweller-ga,

  Okay, I have put code with a simple
  hash table and a simple data file here:

http://nms.lcs.mit.edu/~gch/google/hashtable.tar.gz

  Please let me know how it works for you.

         dogbite-ga

Clarification of Answer by dogbite-ga on 01 May 2003 20:34 PDT
Hi again lighthousedweller-ga,

  I thought I should mention that I explained
  how to compile and run the program at the
  top of "hashtable.cpp."  Also, there are
  comments in the code to help you understand it.

           dogbite-ga

Request for Answer Clarification by lighthousedweller-ga on 02 May 2003 10:27 PDT
hi there-

thanks for the quick response.

i use metrowerks codewarrior for my programs, but also have ms visual
c++. how can i compile this program with these two compilers?

also what is the hashtable file without any extension for? its in the
tar file, just named 'hashtable'.

thanks for the help!

Clarification of Answer by dogbite-ga on 02 May 2003 10:33 PDT
Hello lighthousedweller-ga,

  The "hashtable" file without an extention
  is an executable that I created on my
  Linux machine.  Since you're not on Linux
  (unless you have a Mac) you can ignore it.

  I am not very familiar with either codwarior
  or visual c++.  But in either case, I imagine
  you open the hashtable.cpp source code file.
  Then select "compile" or "execute" or something
  like that from the menu.

  Does that sound okay to you?

                dogbite-ga

Request for Answer Clarification by lighthousedweller-ga on 03 May 2003 02:37 PDT
OK got it to run. Usually I do compile/execute, but in this case i
just needed to compile it to an exe and run it from dos with the
command line arguments.

a few questions.

1) in the output it says hashed 28 into 28, is it not hashing them
into different values?

2) also what happens if there are two or more with the same id? how
would you access it then?

3) what is the memcpy function doing? and why is id always referred to
as a reference?

4) do you think you could convert the ? : statment to an if/else? i'm
always confused by that syntax.

thanks for the help

Clarification of Answer by dogbite-ga on 03 May 2003 10:12 PDT
Hi lighthousedweller-ga,

  I'm glad you were able to compile
  and run the program.  Here are
  answers to your questions:

) in the output it says hashed 28 into 28, is it not hashing them
into different values?

  the hashtable is simple -- it holds a fixed 
  number of elements.  it's set at 2003 elements,
  but you could change that by changing the
  MAX define.  also, the hash function is simple --
  it just takes the id modulo the size to produce
  the hash code.  so anything with an id under 2003
  will have a hash code equal to its id.

  when you have a collision we use linear probing
  to find a spot for the item.  that means you go
  down the list looking for an open spot.
 
2) also what happens if there are two or more with the same id? how
would you access it then?

   id's are assumed to be unique.  think of
   them as a person's social security number.

   note that two elements with different ids
   can have identical hash codes.  having unique
   ids is essential so we can tell them apart.
 
3) what is the memcpy function doing? and why is id always referred to
as a reference?
 
    memcpy copys two chunks of memory.  so it is
    copying all the information from the given
    student structure into the hashtable.

    memcpy needs needs pointers to its arguments.
    so we take a reference of those structures so
    that memcpy can do its work.

4) do you think you could convert the ? : statment to an if/else? i'm
always confused by that syntax.
 
     sure thing -- i updated the hashtable.tar.gz
     file with an if/else block.

                dogbite-ga

Request for Answer Clarification by lighthousedweller-ga on 03 May 2003 17:02 PDT
thanks for those clarifications...
heres a few more small questions

1) you said that having unique ids is essential so we can tell them
apart and this makes sense, but then if they are two ids that are the
same you also say we do linear probing to find a place for it. instead
of doing linear probing, should it be an error? cause whats the point
of finding a place for it, if its not going to be able to be reached
by it's id number?

2) also could you suggest a way to change the hash function if i only
had a list of words that i wanted to put into the hash table, cause i
dont believe %max would word if it was a string i was hashing...

thanks alot for the help!

Clarification of Answer by dogbite-ga on 03 May 2003 18:58 PDT
Hi lighthousedweller-ga,

  Sure, I can answer those two questions:

1) you said that having unique ids is essential so we can tell them
apart and this makes sense, but then if they are two ids that are the
same you also say we do linear probing to find a place for it. instead
of doing linear probing, should it be an error? cause whats the point
of finding a place for it, if its not going to be able to be reached
by it's id number?

    Sorry for not being clear.  Two items cannot
    have the same ID.  Two items can have ids that
    map to the same hashcode though.      
 
2) also could you suggest a way to change the hash function if i only
had a list of words that i wanted to put into the hash table, cause i
dont believe %max would word if it was a string i was hashing.... . .

    For strings, the easiest thing to do would be:

      1) write a function that returns a unique
         id (a number) for any string.  you can
         use something like:

int stringtonum(char *s, int len) {
  int i;
  int total = 0;

  for (i = 0; i < len; i++) {
    total += 3*i*((int)index(s,i));
  }
  return total;
}

      2) use the student structure again, but
         put your string in the "first name"
         field.

  Sound good?

            dogbite-ga

Request for Answer Clarification by lighthousedweller-ga on 03 May 2003 23:28 PDT
hi again-

i'm having alot of problems converting the code from an int hash to a
string hash. is there a reason to use char* instead of just using
strings?

i'm trying to convert it to something that hashes a word and then
stores the word and its definition in the hash table. so i only two
inputs instead of three.

then you're able to retrieve whether or not the word exists, and what
its definition is, using a function similar to getStudent. can you
help me rework this code using strings instead of char?

also in your previous post you said you can have "Two items can have
ids that map to the same hashcode though." OK, so if they did how
would you be able to find the correct one you're looking for when
using the function getStudent? For example, say the words run and ruin
mapped to the same hashcode. ruin would need to be put in a place
using linear probing. now if you did something like getStudent(ruin)
wouldn't it return the info for run? if you could also clarify this a
bit for me i would appreciate it.

thanks alot once again.

Clarification of Answer by dogbite-ga on 04 May 2003 08:25 PDT
Hi lighthousedweller-ga,

  Those are two good questions.

  First, if two items have different ids but 
  the same hash code, you find the correct item
  using linear probing.  You start at the
  hash code's location in the table and then
  move down, checking the id of each item
  until it matches.  Does that make sense?

  Regarding changing the table to handle
  strings, I feel this is too far beyond
  the scope of your origional question to
  answer (for $10).  I suggest that you
  submit it as a follow-up question.  If 
  you want me to answer it, just indicate
  that in the subject line.

               dogbite-ga

Request for Answer Clarification by lighthousedweller-ga on 04 May 2003 14:15 PDT
OK, I will post a followup question and I really do appreciate the
help. But also notice that your answer also wasnt really what i
requested in the intial post. I mean it works, but it requires some
modification, ie working with hashing strings instead of hasing ints,
which is what I was looking for. If you re-read the intial post you'll
probably agree.

I do realize you have put in alot of work on this problem regardless
so I will post a followup. Thanks again for the help.

Clarification of Answer by dogbite-ga on 04 May 2003 15:43 PDT
Hi lighthousedweller-ga,

  I appologize -- I did not properly interpret 
  "dictionary" in your question.  Thank you
  for reposting your question.

           dogbite-ga
lighthousedweller-ga rated this answer:4 out of 5 stars
Question wasn't completely answered, but still did a great job and was
a very helpful guy.

Comments  
There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy