Q: Solve this statistical question (Answered, rated 4 out of 5 stars, 3 Comments)
Question  
Subject: Solve this statistical question
Category: Science > Math
Asked by: shikibobo-ga
List Price: $20.00
Posted: 03 Jun 2003 08:05 PDT
Expires: 03 Jul 2003 08:05 PDT
Question ID: 212436
Clyde takes the same 30-minute walk every day. During his walks, he
listens to music on his iPod. His iPod contains 611 songs in memory,
which Clyde listens to on a randomly shuffling basis.

Last week a certain song started playing at precisely the same point
along his walk 3 times in 5 days. My question: What is the precise
mathematical probability of this highly unusual occurrence?

Clyde allows that the second and third time, there may have been up to
a 3-second variance from the exact spot where the song began the first
time, but that "all three times were within +/- 3 seconds at the
most."

I would like 3 answers, changing one variable:
1. perfect alignment, no variance - 3 hits at exactly the same point
along the walk
2. with up to +/- 3 seconds variance - a 6-second spread
3. with up to +/- 6 seconds variance - a 12-second spread

To answer this question you will have to determine or make an educated
guess regarding average song length. All songs are in the Contemporary
Christian category. If a crooked number, like 3:56, back it up. (Good
backup merits a tip.) Otherwise choose a round number like 4:00.

You might also need to know how far into the walk the exact song-start
location is. I'll ask Clyde and post that info as a clarification.

If the Random function is not truly random with respect to iPod track
selection and playback, that is important to know. If this affects
your analysis, explain how it affected the analysis and how it
increased the degree of difficulty and I will tip you accordingly.

I require a thorough mathematical explanation. Walk me through the
math and explain all assumptions, variables, and statistical
variation. Your answer must be mathematically and scientifically
unassailable.

Clarification of Question by shikibobo-ga on 03 Jun 2003 13:46 PDT
Clyde reports that the point where the song began is "3/4 of the way
through the 1 mile walk."

Also my request that your answer be "unassailable" was perhaps too
extreme. I do not want to scare off a Researcher who can provide a
reasoned, methodical answer. Your methods and math need to be solid
and defensible, but I understand your answer is ultimately going to be
an approximation built on approximations.

The point of the exercise is to make clear that the odds of this
happening are astronomical - whether it's 60 million to 1 or 600
million to 1 is of lesser importance. But I want the analysis to be
solid because it might make its way into publication.
Answer  
Subject: Re: Solve this statistical question
Answered By: haversian-ga on 04 Jun 2003 20:12 PDT
Rated: 4 out of 5 stars
 
Hi there!

Clyde has too much time on his hands:)

Ok, first of all we have to know how random the shuffle algorithm is. 
Information on this is hard to come by, with one site (
http://www.audiorevolution.com/equip/ipod/ ) saying the iPod's
algorithm is good, and another (
http://www.v-2.org/displayArticle.php?article_num=330 ) complaining
that it favors a dozen or so files over the others.  I will presume
first that the algorithm is "perfect" - that is, each song has
precisely the same probability of playing next as any other.

Next, an assumption.  Since we do not know the lengths of Clyde's
songs, nor precisely what time the "same time each day" was, I am
assuming that each song has an equal probability of beginning at any
given second.  This may not be a valid assumption.  Consider as a
simplification 2 songs, one 1 minute long, one 10 minutes long.  The
probability of a song beginning at 1 minute is 50% (the first song
must be the short one: short-short and short-long qualify, long-short
and long-long do not), whereas the probability of a song beginning at
2 minutes is only 25% (the first two songs must both be short: of
short-short-short; short-short-long; short-long-short;
short-long-long; long-short-short; long-short-long; long-long-short;
long-long-long, only 2 of 8 have a song starting at minute 2).  As the time
interval lengthens, the probabilities even out; as the number of songs
increases, the probabilities even out.  So, since this is an unknown,
I am assuming the simplest and most likely case, that they are equal.
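
The two-song example can be checked by brute force. This is only a
sketch: it assumes each pick is independent and equally likely (songs
may repeat), and the function name is made up for illustration.

```python
from itertools import product

SHORT, LONG = 1, 10  # song lengths in minutes, as in the example above


def prob_song_starts_at(minute, n_songs=3):
    """Fraction of equally likely sequences in which some song
    starts exactly `minute` minutes after playback begins."""
    sequences = list(product((SHORT, LONG), repeat=n_songs))
    hits = 0
    for seq in sequences:
        elapsed = 0
        for length in seq:
            elapsed += length
            if elapsed == minute:  # the next song starts right here
                hits += 1
                break
            if elapsed > minute:   # overshot; no song starts at `minute`
                break
    return hits / len(sequences)


print(prob_song_starts_at(1))  # 0.5
print(prob_song_starts_at(2))  # 0.25
```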

Next, 1 more assumption.  I assume that by "at the same time" you mean
"during the same second", and that "within 3 seconds" means "during the
same second or the two adjacent seconds".  This addresses the concern
of a commenter below who notes that no two events will occur at
*precisely* the same time - there is always some small but nonzero
variation.

Another assumption: song length.  This establishes the exact
probability of *any* song starting during any given second.  Shorter
songs imply a greater probability of a song starting; longer songs
imply a lower probability.  The number we choose is unimportant to the
analysis since it is merely a scaling factor.  That is, should Clyde's
songs be, on average, half as long as mine, his chances of a song
starting at any given second will be twice what mine are.  The math
remains the same, and you can multiply the numbers by 2 if you would
like.  With that said, and with one more disclaimer (I'm not big into
Christian anything), I do have a diverse collection of audio, spanning
books on CD, classical, jazz, '80s, movie soundtracks, Celtic,
electronic/alternative/new-age, and many other categories.  According
to Winamp, there are a total of 11464 tracks spanning 2377220 seconds,
or an average 207.36 seconds per song.  Thus there is a 1/207.36
(0.48224%) probability of any song beginning at any given second (with
our simplifying assumption from above taken into account).
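
For anyone who wants to substitute their own library, the arithmetic
is short. The track count and total duration below are the Winamp
figures quoted above; the variable names are just for illustration.

```python
total_tracks = 11464
total_seconds = 2377220

avg_song_seconds = total_seconds / total_tracks
# Simplifying assumption from above: a song is equally likely
# to begin during any given second.
p_any_song_starts = 1 / avg_song_seconds

print(f"{avg_song_seconds:.2f}")    # 207.36
print(f"{p_any_song_starts:.5%}")   # 0.48224%
```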

Yet more assumptions!  "3 times in 5 days" indicates the phenomenon
occurred on the first, 5th, and one other day.  After all, if it
happened 3 days in a row, we would be hearing "3 days in a row" and
the question would have been asked with perhaps more incredulity.  We
have 2 probabilities to calculate and multiply: first the chance of it
happening on a given day, and then the chance of it happening on the
requisite 3 of 5 days.  We assume that the probabilities associated
with song distribution do not change from day to day.

At long last, on to the calculation:

We have 32 possible day-patterns (which days the phenomenon occurs:
Y/Y/Y/Y/Y, Y/Y/Y/Y/N, Y/Y/Y/N/Y, Y/Y/Y/N/N, Y/Y/N/Y/Y, Y/Y/N/Y/N,
etc.), of which 3 interest us: Y/Y/N/N/Y, Y/N/Y/N/Y, and Y/N/N/Y/Y.
That is, regardless of the probability of song S starting during
second T on any given day, the chance of it happening on 3 of 5 days,
according to my interpretation of the question, is only 3/32nds as
high.
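
The pattern count is easy to confirm by enumeration. A small sketch,
assuming the interpretation above (the phenomenon occurs on day 1, on
day 5, and on exactly one day in between):

```python
from itertools import product

# All Y/N patterns over 5 days.
patterns = ["/".join(p) for p in product("YN", repeat=5)]

# Keep patterns with a hit on day 1, a hit on day 5, and 3 hits total.
wanted = [p for p in patterns
          if p.startswith("Y") and p.endswith("Y") and p.count("Y") == 3]

print(len(patterns))  # 32
print(wanted)         # ['Y/Y/N/N/Y', 'Y/N/Y/N/Y', 'Y/N/N/Y/Y']
```

Note that using 3/32 as a multiplier treats all 32 patterns as equally
likely, which is the simplification adopted in this answer; the sketch
only counts patterns.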

During second T we have a 0.48224% chance of a song starting.  Since
there are 611 songs, we have 1/611th that chance (0.00078927%) of the
*particular* song starting at the right time.

Multiplying that by the 3/32 from before, we get 0.000073994%, or 1
chance in 1,351,460.  Pretty slim.
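
The same multiplication, written out so the inputs can be changed.
This simply reproduces the arithmetic above under the same
assumptions; the variable names are illustrative only.

```python
AVG_SONG_SECONDS = 2377220 / 11464   # average track length from above
N_SONGS = 611
PATTERN_FACTOR = 3 / 32              # 3 qualifying day-patterns out of 32

p_any_song = 1 / AVG_SONG_SECONDS        # some song starts in second T
p_this_song = p_any_song / N_SONGS       # the particular song starts in second T
p_part1 = p_this_song * PATTERN_FACTOR   # scaled by the day-pattern factor

print(f"{p_this_song:.8%}")              # 0.00078927%
print(f"1 in {round(1 / p_part1):,}")    # 1 in 1,351,460
```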


There is one more wrinkle however.  The first day is really "free". 
Suppose the song had not played on day 1.  Then Clyde would not have
mentioned it.  Thus, the question only becomes important because it
happened the 2nd and 3rd time, not because it happened the first. 
After all, on day 1, *some* song started playing at *some* time, and
if it happened twice more in the following 4 days, it would be
important; otherwise not.  Without rehashing the calculations above
(which I can do if it's not obvious what changes I am making), the
final chances are 1 in 675,730.
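
The revised arithmetic is not spelled out here. One reading that
reproduces the quoted figure, offered as an assumption rather than as
the confirmed method, is that day 1 is dropped and 3 of the 16
patterns over the remaining 4 days qualify:

```python
from itertools import product

AVG_SONG_SECONDS = 2377220 / 11464
N_SONGS = 611
p_this_song = 1 / AVG_SONG_SECONDS / N_SONGS

# Assumption: day 1 is "free", so only days 2-5 are enumerated;
# day 5 and exactly one other remaining day must be a hit.
remaining = list(product("YN", repeat=4))
wanted = [p for p in remaining if p[-1] == "Y" and p.count("Y") == 2]

pattern_factor = len(wanted) / len(remaining)   # 3/16 under this reading
odds = round(1 / (p_this_song * pattern_factor))

print(len(wanted), len(remaining))  # 3 16
print(f"1 in {odds:,}")             # 1 in 675,730
```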


That was part 1.  Fortunately, parts 2 and 3 are easy.  We know the
probability that the right song will start during the right second. 
The probability that the right song will start during any of 6
qualifying seconds is 6 times as high; during any of 12, 12 times as high.  That
gives 1:112,622 and 1:56,311 respectively.
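
The window scaling can be sketched as below. It takes the one-second
figure of 1 in 675,730 as given and assumes, as above, that the
probability grows linearly with the width of the window; the function
name is hypothetical.

```python
BASE_ODDS = 675_730  # "1 in N" for a one-second window, from part 1


def odds_for_window(window_seconds):
    """Return N in '1 in N' for a window of the given width in seconds."""
    return round(BASE_ODDS / window_seconds)


for window in (6, 12):
    print(f"{window}-second spread: 1 in {odds_for_window(window):,}")
# 6-second spread: 1 in 112,622
# 12-second spread: 1 in 56,311
```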



Probability has some very nonintuitive areas to it, so if my
explanation is unclear or seems wrong, don't hesitate to ask about it.
 If you take issue with any of the many assumptions I have made to
answer your question, I can change them in my analysis.  This has been
an interesting problem, and I would be more than happy to revisit it.

-Haversian

Request for Answer Clarification by shikibobo-ga on 10 Jun 2003 11:56 PDT
It happened again on the 9th day, and Clyde says he took special note
of the next song - it was one he did not remember ever hearing before.

Let's assume pure randomness for the purposes of the calculation.

Let's also eliminate the "window" by asking simply, What are the
chances that out of 611 possible songs, a certain song would be the
5th song played in a random sequence on day 1, day 2, day 5, and day
9?

Clarification of Answer by haversian-ga on 11 Jun 2003 12:43 PDT
> Let's assume pure randomness for the purposes of the calculation. 
    More precisely, that each song has an equal chance of playing,
regardless of what has played before?

> Let's also eliminate the "window" by asking simply, What are the
chances that out of 611 possible songs, a certain song would be the
5th song played in a random sequence on day 1, day 2, day 5, and day
9?

Again, we assume day 1 is "free", leaving us multiplying the
probability of the song *NOT* playing as #5 on days 3, 4, 6, 7, and 8
by the probability of the song playing as #5 on days 2, 5 and 9.  If
each song is independently chosen, the odds of it playing at any given
time are 1/611; the odds of it not playing are 610/611.  Thus, we have
1/611 * 610/611 * 610/611 * 1/611 * 610/611 * 610/611 * 610/611 *
1/611 = 84459630100000 / 19423598036535983659681 = 1 in 230 million.
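
The product can be checked with exact fractions. A minimal sketch,
assuming each pick is independent and day 1 is "free" as described;
the list of hit and miss days is taken from the question.

```python
from fractions import Fraction

N_SONGS = 611
hit = Fraction(1, N_SONGS)   # the song plays as #5 on a given day
miss = 1 - hit               # it does not

# Days 2 through 9: True where the song played as #5 (days 2, 5, 9).
days_2_to_9 = [True, False, False, True, False, False, False, True]

p = Fraction(1)
for is_hit in days_2_to_9:
    p *= hit if is_hit else miss

print(p)  # 84459630100000/19423598036535983659681
print(f"1 in {float(1 / p) / 1e6:.0f} million")  # 1 in 230 million
```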
shikibobo-ga rated this answer: 4 out of 5 stars

Comments  
Subject: Re: Solve this statistical question
From: endo-ga on 03 Jun 2003 11:20 PDT
 
I'll post this as a comment because I don't have the answers to all
your questions. One thing is sure though: there is no such thing as a
'random sequence of numbers'. The best you can achieve is
pseudo-random. The algorithm used on the iPod is probably not very
good at randomizing song orders; another example of this is Winamp's
so-called shuffle. So certain songs are more 'privileged' than others
and tend to appear more often. Then there's the fact that
Clyde probably skips some songs he doesn't really want to listen to,
and so invariably gets back to the privileged songs. Your question can
be reduced to: 'what is the probability that a given song will be
played after a given time'. But this is not easy to answer because we
don't know the probability of a given song appearing, we cannot assume
it to be equal for all songs.
Thanks.
endo
Subject: Re: Solve this statistical question
From: funkywizard-ga on 03 Jun 2003 19:36 PDT
 
It is likely that the iPod is not using a random list at all. Most
players use a "Shuffle mode", which, at least for me in Winamp, tends
to play the same songs in the same order, though not in the order in
which they are listed. The result is that the songs are not played in
the order in which they were added, but are still always played in a
definite order. To alleviate this, one should either start playing
music from a different song in the list, or change the songs that are
available to play.
Subject: Re: Solve this statistical question
From: racecar-ga on 04 Jun 2003 11:51 PDT
 
The answer to your question #1, 'perfect alignment', is ZERO, at least
if you regard the time a song begins as a continuous variable.  You
need to specify a time interval, even if it's quite small, like 0.1
second, to get a nonzero probability.  The other possibility is to
regard the times as discrete: each song has an exact length, and the
starting time is exactly the same on each day, so a song cannot start
at any time, but only at a time that can be made by adding the lengths
of a number of songs together.  If this is the case, you need to
specify the exact lengths of all 611 songs to get an answer.

