Google Answers: C++ hash table help

View Question

Q: C++ hash table help ( Answered 4 out of 5 stars

, 0 Comments )

Question

Subject: C++ hash table help
Category: Computers > Programming
Asked by: lighthousedweller-ga
List Price: $10.00

Posted: 01 May 2003 19:32 PDT
Expires: 31 May 2003 19:32 PDT
Question ID: 198221

hi there-

i'm trying to provide fast data access using a disc file hash table in C/C++.
this should be done using something like a dictionary input file. so i
need some help finding a basic input dictionary file and also with
writing the program to do this. also i would really appreciate
commented code to help me understand how it works better. thanks for the help.

Answer

Subject: Re: C++ hash table help
Answered By: dogbite-ga on 01 May 2003 20:31 PDT
Rated: 4 out of 5 stars

Hello lighthousedweller-ga, Okay, I have put code with a simple hash table and a simple data file here: http://nms.lcs.mit.edu/~gch/google/hashtable.tar.gz Please let me know how it works for you. dogbite-ga
Clarification of Answer by dogbite-ga on 01 May 2003 20:34 PDT Hi again lighthousedweller-ga, I thought I should mention that I explained how to compile and run the program at the top of "hashtable.cpp." Also, there are comments in the code to help you understand it. dogbite-ga
Request for Answer Clarification by lighthousedweller-ga on 02 May 2003 10:27 PDT hi there- thanks for the quick response. i use metrowerks codewarrior for my programs, but also have ms visual c++. how can i compile this program with these two compilers? also what is the hashtable file without any extension for? its in the tar file, just named 'hashtable'. thanks for the help!
Clarification of Answer by dogbite-ga on 02 May 2003 10:33 PDT Hello lighthousedweller-ga, The "hashtable" file without an extention is an executable that I created on my Linux machine. Since you're not on Linux (unless you have a Mac) you can ignore it. I am not very familiar with either codwarior or visual c++. But in either case, I imagine you open the hashtable.cpp source code file. Then select "compile" or "execute" or something like that from the menu. Does that sound okay to you? dogbite-ga
Request for Answer Clarification by lighthousedweller-ga on 03 May 2003 02:37 PDT OK got it to run. Usually I do compile/execute, but in this case i just needed to compile it to an exe and run it from dos with the command line arguments. a few questions. 1) in the output it says hashed 28 into 28, is it not hashing them into different values? 2) also what happens if there are two or more with the same id? how would you access it then? 3) what is the memcpy function doing? and why is id always referred to as a reference? 4) do you think you could convert the ? : statment to an if/else? i'm always confused by that syntax. thanks for the help
Clarification of Answer by dogbite-ga on 03 May 2003 10:12 PDT Hi lighthousedweller-ga, I'm glad you were able to compile and run the program. Here are answers to your questions: ) in the output it says hashed 28 into 28, is it not hashing them into different values? the hashtable is simple -- it holds a fixed number of elements. it's set at 2003 elements, but you could change that by changing the MAX define. also, the hash function is simple -- it just takes the id modulo the size to produce the hash code. so anything with an id under 2003 will have a hash code equal to its id. when you have a collision we use linear probing to find a spot for the item. that means you go down the list looking for an open spot. 2) also what happens if there are two or more with the same id? how would you access it then? id's are assumed to be unique. think of them as a person's social security number. note that two elements with different ids can have identical hash codes. having unique ids is essential so we can tell them apart. 3) what is the memcpy function doing? and why is id always referred to as a reference? memcpy copys two chunks of memory. so it is copying all the information from the given student structure into the hashtable. memcpy needs needs pointers to its arguments. so we take a reference of those structures so that memcpy can do its work. 4) do you think you could convert the ? : statment to an if/else? i'm always confused by that syntax. sure thing -- i updated the hashtable.tar.gz file with an if/else block. dogbite-ga
Request for Answer Clarification by lighthousedweller-ga on 03 May 2003 17:02 PDT thanks for those clarifications... heres a few more small questions 1) you said that having unique ids is essential so we can tell them apart and this makes sense, but then if they are two ids that are the same you also say we do linear probing to find a place for it. instead of doing linear probing, should it be an error? cause whats the point of finding a place for it, if its not going to be able to be reached by it's id number? 2) also could you suggest a way to change the hash function if i only had a list of words that i wanted to put into the hash table, cause i dont believe %max would word if it was a string i was hashing... thanks alot for the help!
Clarification of Answer by dogbite-ga on 03 May 2003 18:58 PDT Hi lighthousedweller-ga, Sure, I can answer those two questions: 1) you said that having unique ids is essential so we can tell them apart and this makes sense, but then if they are two ids that are the same you also say we do linear probing to find a place for it. instead of doing linear probing, should it be an error? cause whats the point of finding a place for it, if its not going to be able to be reached by it's id number? Sorry for not being clear. Two items cannot have the same ID. Two items can have ids that map to the same hashcode though. 2) also could you suggest a way to change the hash function if i only had a list of words that i wanted to put into the hash table, cause i dont believe %max would word if it was a string i was hashing.... . . For strings, the easiest thing to do would be: 1) write a function that returns a unique id (a number) for any string. you can use something like: int stringtonum(char s, int len) { int i; int total = 0; for (i = 0; i < len; i++) { total += 3i*((int)index(s,i)); } return total; } 2) use the student structure again, but put your string in the "first name" field. Sound good? dogbite-ga
Request for Answer Clarification by lighthousedweller-ga on 03 May 2003 23:28 PDT hi again- i'm having alot of problems converting the code from an int hash to a string hash. is there a reason to use char* instead of just using strings? i'm trying to convert it to something that hashes a word and then stores the word and its definition in the hash table. so i only two inputs instead of three. then you're able to retrieve whether or not the word exists, and what its definition is, using a function similar to getStudent. can you help me rework this code using strings instead of char? also in your previous post you said you can have "Two items can have ids that map to the same hashcode though." OK, so if they did how would you be able to find the correct one you're looking for when using the function getStudent? For example, say the words run and ruin mapped to the same hashcode. ruin would need to be put in a place using linear probing. now if you did something like getStudent(ruin) wouldn't it return the info for run? if you could also clarify this a bit for me i would appreciate it. thanks alot once again.
Clarification of Answer by dogbite-ga on 04 May 2003 08:25 PDT Hi lighthousedweller-ga, Those are two good questions. First, if two items have different ids but the same hash code, you find the correct item using linear probing. You start at the hash code's location in the table and then move down, checking the id of each item until it matches. Does that make sense? Regarding changing the table to handle strings, I feel this is too far beyond the scope of your origional question to answer (for $10). I suggest that you submit it as a follow-up question. If you want me to answer it, just indicate that in the subject line. dogbite-ga
Request for Answer Clarification by lighthousedweller-ga on 04 May 2003 14:15 PDT OK, I will post a followup question and I really do appreciate the help. But also notice that your answer also wasnt really what i requested in the intial post. I mean it works, but it requires some modification, ie working with hashing strings instead of hasing ints, which is what I was looking for. If you re-read the intial post you'll probably agree. I do realize you have put in alot of work on this problem regardless so I will post a followup. Thanks again for the help.
Clarification of Answer by dogbite-ga on 04 May 2003 15:43 PDT Hi lighthousedweller-ga, I appologize -- I did not properly interpret "dictionary" in your question. Thank you for reposting your question. dogbite-ga

lighthousedweller-ga rated this answer: 4 out of 5 stars

Question wasn't completely answered, but still did a great job and was
a very helpful guy.

Comments

There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.

Search Google Answers for

Google Home - Answers FAQ - Terms of Service - Privacy Policy