We are running a test of two different versions of an online ad. We
alternate which version is served to a particular customer, and a
customer sees only that one version for their entire history on our
site.
To get a sense of whether we have enough data to work with, we decided
to run the test as A-B-A, so we end up with two groups of customers
that have seen exactly the same thing.
So far our data looks like this:
Segment   Visitors   Orders   Conversion Rate
A(1)         10375      312             3.01%
B            10706      326             3.05%
A(2)         10299      352             3.42%
The reason the number of visitors is not equal across the three
segments is that we cannot set the tracking cookie on some visitors'
web browsers, so those people are excluded from the test.
The questions are these:
1) Is the difference in the number of visitors in each segment a
"normal" sort of difference? To me, the disparity seems too large. I
could understand numbers like 104, 107 and 103, but after 10,000
assignments I would expect the percentage differences to be smaller.
I would like a statistical explanation of what a "normal" difference
would be; please ask questions if this is not clear. Here is an
example of the sort of answer I am looking for: "If you ran a random
assignment trial with at least 10,000 assignments in each segment, you
would have a 50% chance of having a trial with this much difference
between your groups in 4.5 trials," or the same statement with "in
115 trials."
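In case it helps, here is a sketch of the check I imagine applies to
this question: a chi-square goodness-of-fit test of the three visitor
counts against equal expected counts. I am assuming equal assignment
probability for all three segments; please correct me if this is the
wrong tool.

```python
# Chi-square goodness-of-fit: are the three visitor counts consistent
# with equal random assignment into three segments?
import math

observed = [10375, 10706, 10299]   # A(1), B, A(2) visitors
total = sum(observed)
expected = total / 3               # assumes equal assignment probability

chi2 = sum((o - expected) ** 2 / expected for o in observed)

# With 2 degrees of freedom, the chi-square survival function has the
# closed form exp(-x/2), so no stats library is needed here.
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

If I have this right, the p-value would translate into the "one trial
in N shows this much imbalance" phrasing I asked for above.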
2) What about the difference in conversion between the two A
segments? What is the likelihood of seeing this much difference
between two equal segments? I would like the answer in the same sort
of format.
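For this second question, my guess is that a two-proportion z-test is
the standard tool; a sketch with my numbers follows, assuming that
test is appropriate here (corrections welcome):

```python
# Two-proportion z-test: is the A(1) vs A(2) conversion difference
# plausible for two samples drawn from the same population?
import math

n1, x1 = 10375, 312    # A(1): visitors, orders
n2, x2 = 10299, 352    # A(2): visitors, orders

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p2 - p1) / se

# Two-sided p-value from the standard normal, via the complementary
# error function: P(|Z| > z) = erfc(z / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```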