Google Answers Logo
View Question
 
Q: Mapping possibility ids to possibilities ( Answered 4 out of 5 stars,   3 Comments )
Question  
Subject: Mapping possibility ids to possibilities
Category: Computers > Algorithms
Asked by: carbon-ga
List Price: $4.00
Posted: 27 Jul 2003 15:22 PDT
Expires: 26 Aug 2003 15:22 PDT
Question ID: 235739
Suppose I want to generate all possible 'words' from a character set
like 'abc'.  The desired word sequence would be
0 a
1 b
2 c
3 aa
4 ab
and so forth.  The question is, given sequence number 3 and character
set 'abc', how can we determine 'aa'?  Hopefully it is clear that I
mean the question in the most general sense.  That is, given a
sequence number and any character set, can we determine what 'word'
we're on?
Answer  
Subject: Re: Mapping possibility ids to possibilities
Answered By: mathtalk-ga on 28 Jul 2003 12:42 PDT
Rated:4 out of 5 stars
 
Hi, carbon-ga:

Let's start with the direction that goes from "aa" to 3.  Then I think
the reverse direction will make more sense to explain.

The numbering you give starts with zero, an important point to keep in
mind.  Also the empty sequence (length zero) is not listed.

Given an "alphabet" of M symbols (M = 3 in the illustration) and a
sequence of length N (N = 2 in the illustration), we can first ask how
many sequences of length less than N (but more than length zero) will
precede it.

Note that there are M sequences of length 1, M^2 sequences of length
2, etc.

Within the sequences of a given length, the ordering is "dictionary"
(lexicographic) and thus we can determine the relative position by
converting the sequence as a radix N number.  Treat the symbols of the
"alphabet" as zero-index values in the assigned order.  Here:

a = 0

b = 1

c = 2

Therefore base-3 we have "aa" equivalent to zero.  Add this to the
number of sequences of shorter length (here, of length 1, there are
3), and we get the index of the sequence on the list:

3. aa

Now let's consider the reverse direction, in which we are given (say)
K = 3 and we wish to "locate" the corresponding sequence.

We begin by comparing K to the number of one symbol sequences M.  If K
>= M, then we subtract K - M and continue, next comparing K - M to the
number M^2 of two symbol sequences.

Eventually we find the length into which the sequence falls in this
manner.  In the case illustrated, K - M = 0 < M^2 = 9, so we have a
sequence of length 2.

More generally, the result might be of the form:

0 <= K - M - M^2 - ... - M^(L-1) < M^L

indicating a sequence of length L.  Let:

K' = K - M - M^2 - ... - M^(L-1)

be the corresponding "reduction" of K, and convert this number into
base M using the M symbols of the "alphabet" as if they were values
0,1,...,M-1.

In this way K = 3 would (for M = 3) map to the length two sequence:

K' = 3 - 3 = 0

00 is then "aa" after replacing the digits with symbols.

regards, mathtalk-ga


regards, mathtalk-ga

Request for Answer Clarification by carbon-ga on 28 Jul 2003 15:41 PDT
I understand your explanation of how to determine the 'word' length
from the sequence number.

I don't understand how to determine what symbols make up the sequence.

'aa' seems to work alright, maybe.  3 - 3 = 0, and we already
determined that K = 3 had sequence length 2, so 00 or charset[0]
twice.

But sequence 11 maps to 'cc'.  11 - 3 = 8.  We can't do 8 - 9.  Please
baby-talk how I arrive at charset indices 2,2 ('cc') from 11.  Thank
you.

Clarification of Answer by mathtalk-ga on 28 Jul 2003 19:18 PDT
Hi, carbon-ga:

I think you are on track here, just missing a bit at the end.

If the index K = 11, then subtracting K - M = 11 - 3 = 8, but we
cannot subtract M^2 = 9 from that.  So, two things we know:

1) The length of the sequence will be 2 (because we are somewhere
beyond the range of the sequences of length 1 but not over into the
sequences of length 3).

2) Among sequences of length 2, ours will correspond to K' = K - M =
8.

That is, we convert 8 into a base 3 expression (base M, in general). 
This is a matter of dividing by 3 (repeatedly) and writing down the
remainders:

8 = 2*3 + 2 (first remainder is 2)

2 = 0*3 + 2 (last remainder is 2)

That is, 8 (base 10) = 22 (base 3).

Now replace the "digits" (remainders) by the symbols, using this
correspondance or map:

0 = a

1 = b

2 = c

Since our "digits" base 3 are 22, the respective sequence is "cc".

I hope that clears things up for you.  For future reference it is
probably best to wait until after you gotten any clarification needed
before doing the rating.

regards, mathtalk-ga
carbon-ga rated this answer:4 out of 5 stars
I think mathtalk has nailed it.

Comments  
Subject: Re: Mapping possibility ids to possibilities
From: farblest-ga on 28 Jul 2003 13:13 PDT
 
Hi,

I’m assuming you want only words that are subsets of the given set –
so, for instance, ‘aa’ can’t be a word because the set ‘abc’ has only
one ‘a’.

I’ll put my words in order using the following algorithm:
Words with less characters come before words with more characters.
Words with the same number of characters are ordered by the first
differing character’s place in the original character set.

I’ll show my answer in two ways: first with a math/coding algorithm,
then with an example.

The math/coding algorithm first:

Some definitions:
Let S be the original character set.
Let X be the sequence number you are looking for.
Let s be the word you are looking for.

Let N be the original length of S.
Let n be the final length of s.

I’ll assume you know about permutations.  If not, look at
http://mathworld.wolfram.com/Permutation.html

Step 1: Find n.

Find the smallest n’ such that X - SIGMA( N P n’) <= 0, n’ = 1 to N.
Then, n = n’.
Let X = X - SIGMA(N P n’), n’ = 1 to n – 1.
Step 1 time analysis: O(n) time.

Step 2: Find character i of s.

Find the smallest m such that  [(N-i) P (n-i)] * m – x <= 0.
Then, element i in s is element m in S.
Set X to X - [(N-i) P (n-i)] * (m-1).
Remove element m from S.
Step 2 analysis: O(n) time for each character i (actually, a little
less).

And now, an example.

Our character set S is ‘abcd’.
Let’s look for element 33.

Step 1: Find n.

(4 P 1) + (4 P 2) = 4 + 12 = 16
33 – 16 > 0
(4 P 1) + (4 P 2) + (4 P 3) = 4 + 12 + 24 = 40
33 – 40 <= 0
So: Our word has length 3 (n = 3).
X = 33 – [(4 P 1) + (4 P 2)] = 33 – (4 + 12) =  17

Step 2: Find all characters of s.

2a: Find character 1 of s.
(3 P 2) * 1 = 6, 17 – 6 > 0
(3 P 2) * 2 = 12, 17 – 12 > 0
(3 P 2) * 3 = 18, 17 – 18 <= 0, so
Element 1 in s is element 3 in S (element 3 is ‘c’)
X = 17 – (3 P 2)*2 = 5
Remove ‘c’ from S: new S is ‘abd’

2b: Find character 2 of s.
(2 P 1) * 1 = 2, 5 – 2 > 0
(2 P 1) * 2 = 4, 5 – 4 > 0
(2 P 1) * 3 = 6, 5 – 6 <= 0
Element 2 in s is element 3 in S (element 3, notice, is now ‘d’)
X = 5 – (2 P 1)*2 = 1
Remove ‘d’ from S: new S is ‘ab’

2c: Find character 3 of s.
(1 P 0) * 1 = 1, 1 – 1 <=0
Element 3 in s is element 1 in S (element 1 in S is ‘a’)
Remove ‘a’ from S: new S is ‘b’

There’s element 33: ‘cda’

Why does it work?

Step 1:
(n P k) returns the number of words of length k that can be made with
n letters.  Subtracting the sum from the original sequence number
until we find a negative number gives us an upper bound on the length
of our word.

Step 2:
When we get to our word, let’s think about it in two different parts:
its first character, and the rest of the word.  How many words of
length n start with this character? The answer is (N-i) P (n-i) – that
is, a permutation of all the possible characters, *except for those
already used*.  We can ‘go down the list’ of S by increasing m until
we find the correct first character.

Here's the list of possible words from 'abcd':
1.	a
2.	b
3.	c
4.	d
5.	ab
6.	ac
7.	ad
8.	ba
9.	bc
10.	bd
11.	ca
12.	cb
13.	cd
14.	da
15.	db
16.	dc
17.	abc
18.	abd
19.	acb
20.	acd
21.	adb
22.	adc
23.	bac
24.	bad
25.	bca
26.	bcd
27.	bda
28.	bdc
29.	cab
30.	cad
31.	cba
32.	cbd
33.	cda
34.	cdb
35.	dab
36.	dac
37.	dba
38.	dbc
39.	dca
40.	dcb
41.	abcd
42.	abdc
43.	acbd
44.	acdb
45.	adbc
46.	adcb
47.	bacd
48.	badc
49.	bcad
50.	bcda
51.	bdac
52.	bdca
53.	cabd
54.	cadb
55.	cbad
56.	cbda
57.	cdab
58.	cdba
59.	dabc
60.	dacb
61.	dbac
62.	dbca
63.	dcab
64.	dcba

Good luck.  If this helps, consider making a donation to Howard Dean's
Presidency:
http://www.blogforamerica.com
Subject: Re: Mapping possibility ids to possibilities
From: carbon-ga on 28 Jul 2003 22:34 PDT
 
I'm sorry to add a comment in the wrong place, but I see no other
alternatives, and I just wanted to set the record straight should
anyone check this thread.

After requesting clarification on mathtalk's answer, Answers told me
that I should rate the answer.  I thought that was strange, but I did
so anyway, giving a rating of 4/5, because while the answer was good,
I felt that it wasn't complete.

But it would have been much better for me to have rated the answer
after receiving the clarification.  If I had done so, I would have
given it a 5/5.

I'm afraid I've done mathtalk a disservice due to inexperience using
Answers.  Sorry.
Subject: Re: Mapping possibility ids to possibilities
From: tne-ga on 02 Aug 2003 17:47 PDT
 
I think all he wanted was given a charachter set of size "x" find the
"nth" string. I belive this simply means converting "n" in "decimal"
form to "n" in "xecimal" form and substituting the characters.

a=0,b=1,c=2, => (x=3)
0
1
2
10
11

decimal number 
4 in base 10 would be 11 in base 3
ans 4 = bb

How to convert a number n from base 10 to base x

int digits array[size];
int i=0;
while (n div x >= x) {
 digits[i] = n mod x;
 n = n div x;
 i++
}
digits[i]=n;
for(int j=i; j>=0; j--) {
 print(digits[j]);
}

I think that should do it unless my code is buggy....

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy