

Subject:
buckets, balls, marking, replacement  what is the distribution?
Category: Science > Math Asked by: ken13ga List Price: $25.00 
Posted:
18 Sep 2006 19:15 PDT
Expires: 18 Oct 2006 19:15 PDT Question ID: 766437 
Consider a bucket with initially unmarked balls in it (say, M balls). We now do the following: 1) Randomly choose a ball from the bucket 2) Mark it (if the ball is already marked, we can mark it again) 3) Put the marked ball back in the bucket 4) Repeat (1) thru (3) N times At the end of the procedure above, what is the mean number of marked balls? This is easy, I can answer it. What is the distribution of marked balls (or, at the least, the variance of the number of marked balls)? I cannot answer this and seek help from google answers. The mean number of marked balls is as follows: (1 ? (1/M)) is the probability that a randomly chosen ball is not marked after the first trial. After N trials, the probability of any randomly chosen ball being not marked is (1 ? (1/M))^N. So, the probability of a given ball being marked is 1 ? (1 ? (1/M))^N. And, thus the mean number of marked balls is M *(1 ? (1 ? (1/M))^N). I first thought that the distribution of marked balls (or of unmarked balls) would be binomial, but appears not to be. The distribution (experimentally validated) is normalshaped and with a variance less than that of the binomial distribution. A 5 star answer will tell me 1) an expression for the distribution of marked balls, and/or 2) how to analytically compute the variance. I need this result for a paper I am working on. I will be happy to acknowledge the answerer in the paper if he/she desires this. Any ideas? 

There is no answer at this time. 

Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: ansel001ga on 23 Sep 2006 15:37 PDT 
You can compare your question to rolling dice. Marking a ball that is drawn is equivalent to rolling a number. After a given number of rolls, how many numbers have been rolled at least once? To find the distribution, first create a state transition matrix. For example, if you have already rolled two different numbers, what is the probability of rolling a third number on your next roll? There is a 2/3 chance of rolling a new number and a 1/3 chance of rolling a number you already rolled before. State transition matrix To From 0 1 2 3 4 5 6 Total 0 0 1 1 1 1/6 5/6 1 2 1/3 2/3 1 3 1/2 1/2 1 4 2/3 1/3 1 5 5/6 1/6 1 6 1 1 Now apply these probabilities to get the distribution below. Number Probability Distribution of Number of Different Numbers Of Dice That Will Come Up at Least Once Rolled 0 1 2 3 4 5 6 Total 0 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 1 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 2 0.0000 0.1667 0.8333 0.0000 0.0000 0.0000 0.0000 1.0000 3 0.0000 0.0278 0.4167 0.5556 0.0000 0.0000 0.0000 1.0000 4 0.0000 0.0046 0.1620 0.5556 0.2778 0.0000 0.0000 1.0000 5 0.0000 0.0008 0.0579 0.3858 0.4630 0.0926 0.0000 1.0000 6 0.0000 0.0001 0.0199 0.2315 0.5015 0.2315 0.0154 1.0000 7 0.0000 0.0000 0.0068 0.1290 0.4501 0.3601 0.0540 1.0000 8 0.0000 0.0000 0.0023 0.0690 0.3646 0.4501 0.1140 1.0000 9 0.0000 0.0000 0.0008 0.0360 0.2776 0.4966 0.1890 1.0000 10 0.0000 0.0000 0.0003 0.0185 0.2031 0.5064 0.2718 1.0000 As you can see, after 4 rolls of the die, the distribution of the number of different numbers rolled is: 1 .0046 2 .1620 3 .5556 4 .2778 1.0000 Hope this helps. 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: berkeleychocolatega on 25 Sep 2006 14:44 PDT 
This is very similar to an earlier question that ansel001 and I and others worked on. I don't remember its number. Perhaps ansel001 does. The transition matrix is diagonalizable, and the matrix Q diagonalizing it is essentially Pascal's triangle, made into a triangular matrix. Then one can theoretically compute the distribution after n draws since one can write down a closed expression for any entry of the nth power of the matrix. 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: ansel001ga on 25 Sep 2006 16:50 PDT 
I also remember the similar question, but not its number. If there is a closed form for the equation as berkeleychocolate surmises, I am unaware of it. 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: barnecaga on 26 Sep 2006 10:18 PDT 
in case it helps anyone, the question was, i believe, #736681. just searched for berkeleychocolate and ansel001 and this was the only one that looked similar. cab 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: berkeleychocolatega on 28 Sep 2006 17:33 PDT 
Actually the mean number of balls marked is M(M1)*(11/M)^N, not what you originally calculated. The reason your formula is close but not quite correct is that the trials are not independent. So you can't just raise (11/M) to a power. The formula above was determined just for M=3 and M=4 by diagonalizing the transition matrix of ansel001 along the lines discussed in the earlier problem #736681. The pattern was so clear that I jumped to the conclusion of the above formula. I looked at the standard deviation, but the pattern was not so clear. 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: berkeleychocolatega on 28 Sep 2006 22:32 PDT 
I computed the variance (square of standard deviation) for M = 3, 4, and 5 and the pattern now seems clear: The variance is (M  1)* (1  1/M)^N + (M  1)*(M  2)*(1  2/M)^N  (M  1)^2 * (1  1/M)^(2N). Perhaps someone wants to check this for a few M's and N's to confirm the pattern. 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: qresqga on 29 Sep 2006 19:14 PDT 
Let X_i be 1 if i'th ball is unmarked. You ask for information on NX where X=\sum X_i. as you note, E(X) = \sum E(X_i) = N E(X_1), and the expectation is known since E(X_1) = (11/M)^N. For the variance, E(X^2) = \sum_{i,j} E(X_i X_j) = N E(X_1^2) + {N \choose 2} E(X_1 X_2). note that E(X_1 X_2) = (12/M)^N, which gives a closed formula for the variance. The third and other moments can be calculated similarly. As for the complete distribution, there is no obvious distribution, but there are several cases. If N<<sqrt(M) there are N marked balls. If N<<M there are approximately NN^2/2M, with gaussian error. This probably works if N is of order M. Once N is of order M\log M there is a poisson number of unmarked balls. 
Subject:
Re: buckets, balls, marking, replacement  what is the distribution?
From: berkeleychocolatega on 02 Oct 2006 15:55 PDT 
Using ansel001's numbers for M=6, N=4 we get a mean of 3.1066. Using my formula for the mean 65*(11/6)^3 we get 3.1065 (My N is 3, not 4 since the first draw doesn't count. A ball gets marked.) Using the formula of ken13 and qresq we get 6*( 1(11/6)^4 ) = 5.4823 or 6*( 1(11/6)^3 ) = 5.5787. If you accept ansel001's matrix, then their formula is not correct. As I said, they are incorrectly assuming independence. (This was already discussed in problem #736681. 
If you feel that you have found inappropriate content, please let us know by emailing us at answerssupport@google.com with the question ID listed above. Thank you. 
Search Google Answers for 
Google Home  Answers FAQ  Terms of Service  Privacy Policy 