Google Answers Logo
View Question
 
Q: buckets, balls, marking, replacement -- what is the distribution? ( No Answer,   8 Comments )
Question  
Subject: buckets, balls, marking, replacement -- what is the distribution?
Category: Science > Math
Asked by: ken13-ga
List Price: $25.00
Posted: 18 Sep 2006 19:15 PDT
Expires: 18 Oct 2006 19:15 PDT
Question ID: 766437
Consider a bucket with initially unmarked balls in it (say, M balls). 
We now do the following:

1) Randomly choose a ball from the bucket
2) Mark it (if the ball is already marked, we can mark it again) 
3) Put the marked ball back in the bucket 
4) Repeat (1) thru (3) N times 

At the end of the procedure above, what is the mean number of marked
balls?  This is easy, I can answer it.  What is the distribution of
marked balls (or, at the least, the variance of the number of marked
balls)?  I cannot answer this and seek help from google answers.

The mean number of marked balls is as follows: (1 ? (1/M)) is the
probability that a randomly chosen ball is not marked after the first
trial.  After N trials, the probability of any randomly chosen ball
being not marked is (1 ? (1/M))^N.  So, the probability of a given
ball being marked is 1 ? (1 ? (1/M))^N.  And, thus the mean number of
marked balls is M *(1 ? (1 ? (1/M))^N).  I first thought that the
distribution of marked balls (or of unmarked balls) would be binomial,
but appears not to be.  The distribution (experimentally validated) is
normal-shaped and with a variance less than that of the binomial
distribution.

A 5 star answer will tell me 1) an expression for the distribution of
marked balls, and/or 2) how to analytically compute the variance.

I need this result for a paper I am working on.  I will be happy to
acknowledge the answerer in the paper if he/she desires this.

Any ideas?
Answer  
There is no answer at this time.

Comments  
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: ansel001-ga on 23 Sep 2006 15:37 PDT
 
You can compare your question to rolling dice.  Marking a ball that is
drawn is equivalent to rolling a number.  After a given number of
rolls, how many numbers have been rolled at least once?  To find the
distribution, first create a state transition matrix.  For example, if
you have already rolled two different numbers, what is the probability
of rolling a third number on your next roll?  There is a 2/3 chance of
rolling a new number and a 1/3 chance of rolling a number you already
rolled before.

State transition matrix								

	To							
From	0	1	2	3	4	5	6	Total
0	0    	1    						1    
1		 1/6	 5/6					1    
2			 1/3	 2/3				1    
3				 1/2	 1/2			1    
4					 2/3	 1/3		1    
5						 5/6	 1/6	1    
6							1    	1    

Now apply these probabilities to get the distribution below.
								
Number	   Probability Distribution of Number of Different Numbers	
Of Dice                 That Will Come Up at Least Once
Rolled	0	1	2	3	4	5	6	Total
0	1.0000	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000	1.0000
1	0.0000	1.0000	0.0000	0.0000	0.0000	0.0000	0.0000	1.0000
2	0.0000	0.1667	0.8333	0.0000	0.0000	0.0000	0.0000	1.0000
3	0.0000	0.0278	0.4167	0.5556	0.0000	0.0000	0.0000	1.0000
4	0.0000	0.0046	0.1620	0.5556	0.2778	0.0000	0.0000	1.0000
5	0.0000	0.0008	0.0579	0.3858	0.4630	0.0926	0.0000	1.0000
6	0.0000	0.0001	0.0199	0.2315	0.5015	0.2315	0.0154	1.0000
7	0.0000	0.0000	0.0068	0.1290	0.4501	0.3601	0.0540	1.0000
8	0.0000	0.0000	0.0023	0.0690	0.3646	0.4501	0.1140	1.0000
9	0.0000	0.0000	0.0008	0.0360	0.2776	0.4966	0.1890	1.0000
10	0.0000	0.0000	0.0003	0.0185	0.2031	0.5064	0.2718	1.0000

As you can see, after 4 rolls of the die, the distribution of the
number of different numbers rolled is:

1   .0046
2   .1620
3   .5556
4   .2778

   1.0000

Hope this helps.
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: berkeleychocolate-ga on 25 Sep 2006 14:44 PDT
 
This is very similar to an earlier question that ansel001 and I and
others worked  on. I don't  remember its number. Perhaps ansel001
does. The transition matrix is diagonalizable, and the matrix Q
diagonalizing it is essentially Pascal's triangle, made into a
triangular matrix. Then one can theoretically compute the distribution
after n draws since one can write down a closed expression for any
entry of the nth power of the matrix.
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: ansel001-ga on 25 Sep 2006 16:50 PDT
 
I also remember the similar question, but not its number.  If there is
a closed form for the equation as berkeleychocolate surmises, I am
unaware of it.
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: barneca-ga on 26 Sep 2006 10:18 PDT
 
in case it helps anyone, the question was, i believe, #736681.

just searched for berkeleychocolate and ansel001 and this was the only
one that looked similar.

-cab
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: berkeleychocolate-ga on 28 Sep 2006 17:33 PDT
 
Actually the mean number of balls marked is M-(M-1)*(1-1/M)^N, not
what you originally calculated. The reason your formula is close but
not quite correct is that the trials are not independent. So you can't
just raise (1-1/M) to a power.

The formula above was determined just for M=3 and M=4 by diagonalizing
the transition matrix of ansel001 along the lines discussed in the
earlier problem #736681. The pattern was so clear that I jumped to the
conclusion of the above formula.

I looked at the standard deviation, but the pattern was not so clear.
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: berkeleychocolate-ga on 28 Sep 2006 22:32 PDT
 
I computed the variance (square of standard deviation) for M = 3, 4,
and 5 and the pattern now seems clear: The variance is

(M - 1)* (1 - 1/M)^N + (M - 1)*(M - 2)*(1 - 2/M)^N - (M - 1)^2 * (1 - 1/M)^(2N).

Perhaps someone wants to check this for a few M's and N's to confirm the pattern.
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: qresq-ga on 29 Sep 2006 19:14 PDT
 
Let X_i be 1 if i'th ball is unmarked. You ask for information on N-X where 
X=\sum X_i.

as you note, E(X) = \sum E(X_i) = N E(X_1), and the expectation is
known since E(X_1) = (1-1/M)^N.

For the variance, 
E(X^2) = \sum_{i,j} E(X_i X_j) = N E(X_1^2) + {N \choose 2} E(X_1 X_2).
note that
E(X_1 X_2) = (1-2/M)^N, which gives a closed formula for the variance.

The third and other moments can be calculated similarly.

As for the complete distribution, there is no obvious distribution,
but there are several cases. If N<<sqrt(M) there are N marked balls.
If N<<M there are approximately N-N^2/2M, with gaussian error. This
probably works if N is of order M. Once N is of order M\log M there is
a poisson number of unmarked balls.
Subject: Re: buckets, balls, marking, replacement -- what is the distribution?
From: berkeleychocolate-ga on 02 Oct 2006 15:55 PDT
 
Using ansel001's numbers for M=6, N=4 we get a mean of 3.1066. Using
my formula for the mean 6-5*(1-1/6)^3 we get 3.1065 (My N is 3, not 4
since the first draw doesn't count. A ball gets marked.)

Using the formula of ken13 and qresq we get 6*( 1-(1-1/6)^4 ) = 5.4823
or 6*( 1-(1-1/6)^3 ) = 5.5787.

If you accept ansel001's matrix, then their formula is not correct. As
I said, they are incorrectly assuming independence. (This was already
discussed in problem  #736681.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy