View Question
Q: Mapping possibility ids to possibilities ( Answered ,   3 Comments )
 Question
 Subject: Mapping possibility ids to possibilities Category: Computers > Algorithms Asked by: carbon-ga List Price: \$4.00 Posted: 27 Jul 2003 15:22 PDT Expires: 26 Aug 2003 15:22 PDT Question ID: 235739
 ```Suppose I want to generate all possible 'words' from a character set like 'abc'. The desired word sequence would be 0 a 1 b 2 c 3 aa 4 ab and so forth. The question is, given sequence number 3 and character set 'abc', how can we determine 'aa'? Hopefully it is clear that I mean the question in the most general sense. That is, given a sequence number and any character set, can we determine what 'word' we're on?```
 Subject: Re: Mapping possibility ids to possibilities Answered By: mathtalk-ga on 28 Jul 2003 12:42 PDT Rated:
 ```Hi, carbon-ga: Let's start with the direction that goes from "aa" to 3. Then I think the reverse direction will make more sense to explain. The numbering you give starts with zero, an important point to keep in mind. Also the empty sequence (length zero) is not listed. Given an "alphabet" of M symbols (M = 3 in the illustration) and a sequence of length N (N = 2 in the illustration), we can first ask how many sequences of length less than N (but more than length zero) will precede it. Note that there are M sequences of length 1, M^2 sequences of length 2, etc. Within the sequences of a given length, the ordering is "dictionary" (lexicographic) and thus we can determine the relative position by converting the sequence as a radix N number. Treat the symbols of the "alphabet" as zero-index values in the assigned order. Here: a = 0 b = 1 c = 2 Therefore base-3 we have "aa" equivalent to zero. Add this to the number of sequences of shorter length (here, of length 1, there are 3), and we get the index of the sequence on the list: 3. aa Now let's consider the reverse direction, in which we are given (say) K = 3 and we wish to "locate" the corresponding sequence. We begin by comparing K to the number of one symbol sequences M. If K >= M, then we subtract K - M and continue, next comparing K - M to the number M^2 of two symbol sequences. Eventually we find the length into which the sequence falls in this manner. In the case illustrated, K - M = 0 < M^2 = 9, so we have a sequence of length 2. More generally, the result might be of the form: 0 <= K - M - M^2 - ... - M^(L-1) < M^L indicating a sequence of length L. Let: K' = K - M - M^2 - ... - M^(L-1) be the corresponding "reduction" of K, and convert this number into base M using the M symbols of the "alphabet" as if they were values 0,1,...,M-1. In this way K = 3 would (for M = 3) map to the length two sequence: K' = 3 - 3 = 0 00 is then "aa" after replacing the digits with symbols. regards, mathtalk-ga regards, mathtalk-ga``` Request for Answer Clarification by carbon-ga on 28 Jul 2003 15:41 PDT ```I understand your explanation of how to determine the 'word' length from the sequence number. I don't understand how to determine what symbols make up the sequence. 'aa' seems to work alright, maybe. 3 - 3 = 0, and we already determined that K = 3 had sequence length 2, so 00 or charset[0] twice. But sequence 11 maps to 'cc'. 11 - 3 = 8. We can't do 8 - 9. Please baby-talk how I arrive at charset indices 2,2 ('cc') from 11. Thank you.``` Clarification of Answer by mathtalk-ga on 28 Jul 2003 19:18 PDT ```Hi, carbon-ga: I think you are on track here, just missing a bit at the end. If the index K = 11, then subtracting K - M = 11 - 3 = 8, but we cannot subtract M^2 = 9 from that. So, two things we know: 1) The length of the sequence will be 2 (because we are somewhere beyond the range of the sequences of length 1 but not over into the sequences of length 3). 2) Among sequences of length 2, ours will correspond to K' = K - M = 8. That is, we convert 8 into a base 3 expression (base M, in general). This is a matter of dividing by 3 (repeatedly) and writing down the remainders: 8 = 2*3 + 2 (first remainder is 2) 2 = 0*3 + 2 (last remainder is 2) That is, 8 (base 10) = 22 (base 3). Now replace the "digits" (remainders) by the symbols, using this correspondance or map: 0 = a 1 = b 2 = c Since our "digits" base 3 are 22, the respective sequence is "cc". I hope that clears things up for you. For future reference it is probably best to wait until after you gotten any clarification needed before doing the rating. regards, mathtalk-ga```
 carbon-ga rated this answer: `I think mathtalk has nailed it.`

 ```Hi, I’m assuming you want only words that are subsets of the given set – so, for instance, ‘aa’ can’t be a word because the set ‘abc’ has only one ‘a’. I’ll put my words in order using the following algorithm: Words with less characters come before words with more characters. Words with the same number of characters are ordered by the first differing character’s place in the original character set. I’ll show my answer in two ways: first with a math/coding algorithm, then with an example. The math/coding algorithm first: Some definitions: Let S be the original character set. Let X be the sequence number you are looking for. Let s be the word you are looking for. Let N be the original length of S. Let n be the final length of s. I’ll assume you know about permutations. If not, look at http://mathworld.wolfram.com/Permutation.html Step 1: Find n. Find the smallest n’ such that X - SIGMA( N P n’) <= 0, n’ = 1 to N. Then, n = n’. Let X = X - SIGMA(N P n’), n’ = 1 to n – 1. Step 1 time analysis: O(n) time. Step 2: Find character i of s. Find the smallest m such that [(N-i) P (n-i)] * m – x <= 0. Then, element i in s is element m in S. Set X to X - [(N-i) P (n-i)] * (m-1). Remove element m from S. Step 2 analysis: O(n) time for each character i (actually, a little less). And now, an example. Our character set S is ‘abcd’. Let’s look for element 33. Step 1: Find n. (4 P 1) + (4 P 2) = 4 + 12 = 16 33 – 16 > 0 (4 P 1) + (4 P 2) + (4 P 3) = 4 + 12 + 24 = 40 33 – 40 <= 0 So: Our word has length 3 (n = 3). X = 33 – [(4 P 1) + (4 P 2)] = 33 – (4 + 12) = 17 Step 2: Find all characters of s. 2a: Find character 1 of s. (3 P 2) * 1 = 6, 17 – 6 > 0 (3 P 2) * 2 = 12, 17 – 12 > 0 (3 P 2) * 3 = 18, 17 – 18 <= 0, so Element 1 in s is element 3 in S (element 3 is ‘c’) X = 17 – (3 P 2)*2 = 5 Remove ‘c’ from S: new S is ‘abd’ 2b: Find character 2 of s. (2 P 1) * 1 = 2, 5 – 2 > 0 (2 P 1) * 2 = 4, 5 – 4 > 0 (2 P 1) * 3 = 6, 5 – 6 <= 0 Element 2 in s is element 3 in S (element 3, notice, is now ‘d’) X = 5 – (2 P 1)*2 = 1 Remove ‘d’ from S: new S is ‘ab’ 2c: Find character 3 of s. (1 P 0) * 1 = 1, 1 – 1 <=0 Element 3 in s is element 1 in S (element 1 in S is ‘a’) Remove ‘a’ from S: new S is ‘b’ There’s element 33: ‘cda’ Why does it work? Step 1: (n P k) returns the number of words of length k that can be made with n letters. Subtracting the sum from the original sequence number until we find a negative number gives us an upper bound on the length of our word. Step 2: When we get to our word, let’s think about it in two different parts: its first character, and the rest of the word. How many words of length n start with this character? The answer is (N-i) P (n-i) – that is, a permutation of all the possible characters, *except for those already used*. We can ‘go down the list’ of S by increasing m until we find the correct first character. Here's the list of possible words from 'abcd': 1. a 2. b 3. c 4. d 5. ab 6. ac 7. ad 8. ba 9. bc 10. bd 11. ca 12. cb 13. cd 14. da 15. db 16. dc 17. abc 18. abd 19. acb 20. acd 21. adb 22. adc 23. bac 24. bad 25. bca 26. bcd 27. bda 28. bdc 29. cab 30. cad 31. cba 32. cbd 33. cda 34. cdb 35. dab 36. dac 37. dba 38. dbc 39. dca 40. dcb 41. abcd 42. abdc 43. acbd 44. acdb 45. adbc 46. adcb 47. bacd 48. badc 49. bcad 50. bcda 51. bdac 52. bdca 53. cabd 54. cadb 55. cbad 56. cbda 57. cdab 58. cdba 59. dabc 60. dacb 61. dbac 62. dbca 63. dcab 64. dcba Good luck. If this helps, consider making a donation to Howard Dean's Presidency: http://www.blogforamerica.com```
 ```I'm sorry to add a comment in the wrong place, but I see no other alternatives, and I just wanted to set the record straight should anyone check this thread. After requesting clarification on mathtalk's answer, Answers told me that I should rate the answer. I thought that was strange, but I did so anyway, giving a rating of 4/5, because while the answer was good, I felt that it wasn't complete. But it would have been much better for me to have rated the answer after receiving the clarification. If I had done so, I would have given it a 5/5. I'm afraid I've done mathtalk a disservice due to inexperience using Answers. Sorry.```
 ```I think all he wanted was given a charachter set of size "x" find the "nth" string. I belive this simply means converting "n" in "decimal" form to "n" in "xecimal" form and substituting the characters. a=0,b=1,c=2, => (x=3) 0 1 2 10 11 decimal number 4 in base 10 would be 11 in base 3 ans 4 = bb How to convert a number n from base 10 to base x int digits array[size]; int i=0; while (n div x >= x) { digits[i] = n mod x; n = n div x; i++ } digits[i]=n; for(int j=i; j>=0; j--) { print(digits[j]); } I think that should do it unless my code is buggy....```