Question
 Subject: Statistics of poker results Category: Science > Math Asked by: delard-ga List Price: \$100.00 Posted: 16 Sep 2006 07:04 PDT Expires: 16 Oct 2006 07:04 PDT Question ID: 765810
```Hi, The question is how to use statistics to establish whether one poker player is better than another.

Some context:
1) You have the amount won or lost on every hand each player has played.
2) Each has played a large number of hands (tens of thousands at least).
3) Each plays in the same kind of games (NL holdem full ring cash games as it happens), with the same blind levels etc.
4) Each player has a different long term expectation (i.e. mean profit) from each hand - but you do not know what this is, other than what can be implied from the data.
5) Each player has a different level of variance (standard deviation) - you do not know what this is - but if it helps it is not unreasonable to assume both players have the same standard deviation.
6) The amount won/lost on each hand is not normally distributed. However if you sum 100 consecutive hands together, these '100 hand' results are much closer to a normal distribution. For the sake of this question you can assume '100 hand' results are normally distributed.
7) The nature of poker is such that the standard deviation of results is high compared to the long term mean result - i.e. in a given session of a few hundred hands you expect the typical result to be massively different from the long term average for that many hands. Because of this it is quite possible to have long periods of unusually good or bad results - meaning that player A could have a much higher long term expectation than player B - but by chance show a much lower result over a given sample of, e.g., 20k hands.

So far, I have taken each player's results and aggregated them into '100 hand' results, then calculated the mean and standard deviation of the '100 hand' results. I've then calculated the standard error and 95% confidence intervals for the mean (of each player).
This gives me some indication of which is the better player of course - but to make any definitive statement I need to wait until the upper 95% CI of player A is less than the lower 95% CI of player B. This works but takes a LOT of hands (i.e. hundreds of thousands - and I don't have that many!).

Instead, I would like to use alternative statistical techniques to calculate the probability that player A has a higher long term mean than player B. I suspect that if done properly this could yield a high degree of confidence a lot quicker than the CI method above. As more and more data is added this probability would converge towards 0 or 1.

Slightly more generally - I'd like to be able to calculate the probability that the long term mean of player A is X% higher than player B's.

Note - if this question is phrased in such a way that makes it very difficult - but you think the same underlying point can be answered another way - then go ahead and rephrase it.

Please provide details of statistical techniques which can calculate this (assuming they do exist!) - along with details of the calculations which need to be performed. Don't assume I know much more than high school level math. You can assume I'm fine with implementing the calculations in Excel using VB (i.e. you don't need to tell me how to use some stats application - I will be coding the formulae).

thanks - Delard```
Request for Question Clarification by elmarto-ga on 16 Sep 2006 07:58 PDT
```Hello! It might help if you provide us the actual numbers for "mean profit per hand" and "standard deviation of the profit per hand" for both players.

Also, I see that you are interested in "the probability that the long term mean of player A is X% higher than player B". Do you need this for many different values of X? Or do you want to base this X on the actual sample means?
For example, if the sample mean of player A is 10% higher than the sample mean of player B, are you simply interested in the probability that player A's mean is 10% greater than player B's, or do you also need to know the probability that it's 5% higher, 2% higher, etc?

Thank you very much, elmarto```
Clarification of Question by delard-ga on 16 Sep 2006 09:04 PDT
```Hi, To answer your request for clarification - the "mean profit per hand" is a number somewhere between 10% and 20% for these players (expressed in units of big blind size), or alternatively 10 - 20 BB/100 hands. It's hard to be more precise than that at the moment as the samples are too small - however the precise current value isn't very important, as the point of this question is to be able to rerun the analysis as more data arrives.

The "standard deviation of the profit per hand" is a number around 85 BB/100 (i.e. calculated using samples of 100 hands).

Yes - I'm interested in "the probability that the long term mean of player A is X% higher than player B" for different values of X - not just the observed sample mean difference. The motivation here is to be able to ask "is one player significantly better than the other".

For example - let's say we have 30k hands on both players - and those samples imply a mean of 12% for player A and 17% for player B. Although player B has performed much better over this small sample - it's very plausible that A has a higher long term mean than B - all it takes is for A to have had slightly bad luck and B a bit of good luck. However if we were to ask whether A has a significantly higher long term mean (say 10% higher) then I would imagine even this much data is enough to say that is unlikely.

I hope this clarifies the question - please don't hesitate to ask more questions if there's anything that isn't clear.

thanks - Delard```
Clarification of Question by delard-ga on 16 Sep 2006 11:22 PDT
```Hi, I'm really keen to get a solution to this (there's a bet involved!)
- so I've upped the price to \$100 in case this question represents too much work for \$50. However note I'm looking for a solution here - ie something I can understand and implement - not just some links to web pages on hypothesis testing.

thanks - Delard```
There is no answer at this time.
Subject: Re: Statistics of poker results From: berkeleychocolate-ga on 16 Sep 2006 11:35 PDT
 ```I believe you are looking for pie in the sky. That is, you can't improve on the confidence interval approach until you know more. Here is my reasoning: Let us assume that each player has a probability p of winning a hand and that the outcomes are independent from hand to hand. Then it doesn't matter how much or little is bet since the plays are independent, so we can say each pot is \$1. Then the result of each hand is a Bernoulli trial, and the winnings after n hands are the sum of these trials, which is binomial and approximately normal (by the central limit theorem - and the approximation is very good after 100+ hands). Then the confidence interval approach is the only way to test which player is better. There is no other statistical method.

I'm sure that in the real world the above assumption is not correct. The outcomes are not independent and the probability of winning is related to the amount bet. This is the psychological element in the game. If you have some handle on how that works, then you can build a better model and answer your question.```
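The simplified model in the comment above (every hand is an independent win of a \$1 pot with probability p) can be checked numerically. The sketch below compares the exact binomial probability of winning at most k of n hands with its normal approximation. The function names and the continuity correction are illustrative choices, not something from the thread, and the value of p is left to the reader.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(wins <= k) for n independent hands, each won with probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to the same probability, with a continuity correction."""
    mu = n * p                      # mean number of hands won
    sigma = sqrt(n * p * (1 - p))   # standard deviation of the number of hands won
    return NormalDist(mu, sigma).cdf(k + 0.5)


if __name__ == "__main__":
    # Hypothetical inputs: 100 hands, p = 0.5, at most 50 wins.
    print(binomial_cdf(50, 100, 0.5), normal_approx_cdf(50, 100, 0.5))
```

Trying values of p far from 0.5 gives a feel for how many hands this toy model needs before the two numbers agree closely. It says nothing about real poker results, where pot sizes vary.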
 Subject: Re: Statistics of poker results From: delard-ga on 16 Sep 2006 12:37 PDT
 ```Hi, The actual distribution of the individual hand results is a complicated thing - i.e. as you say, the assumption that each hand can be modelled as probability p of winning 1 isn't enough. A lot of the hands are zero, a lot have a small loss, then there are a small number of hands with significant results - hopefully skewed towards there being more big hands you win than lose - and winning bigger pots vs losing smaller ones. It isn't normal or binomial - and even doing the 100-hand trick doesn't get you a perfect normal distribution - just hopefully close enough that it doesn't matter.

My problems with the Confidence Interval method of deciding if player A is better than player B (i.e. look at whether the lower 95% CI of A's mean is higher than the upper 95% CI of B's mean) are:
1) It seems to take an immense amount of data to reach this 95% confidence. By cloning my data I've done this on large samples - and it takes about 250k hands to get a 5% spread on the 95% CI - that is, if a player has a mean of 15%, after 250k hands the CI would be 12.5% - 17.5%. This means that if there was a 5% difference between the two players you would need 250k hands on them to be 95% confident that one was better than the other. Intuitively this seems wrong - there is less chance than that of such a big difference being by chance over such a large sample.
2) Using the CI method I don't know how to test that player A is X% better than player B, i.e. significantly better.
3) I'd like the calculation to give me a confidence - rather than just a yes/no on 95%.

But maybe you are right - and some elaboration of the CI method is as good as it gets....

- Delard```
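One way to get a single probability instead of a yes/no from two separate intervals is to work with the difference of the two sample means. The sketch below is a minimal illustration and not a method given in the thread. It assumes the 100-hand results are normal, the two players' results are independent, and a flat prior on each long-term mean. It also treats the margin as an absolute gap in the same units as the means (e.g. BB/100), which is only one reading of "X% higher". All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def block_sums(hand_results, block_size=100):
    """Sum consecutive hands into fixed-size blocks, dropping any partial block at the end."""
    n_blocks = len(hand_results) // block_size
    return [sum(hand_results[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]


def mean_and_se(blocks):
    """Sample mean of the block results and the standard error of that mean."""
    return mean(blocks), stdev(blocks) / sqrt(len(blocks))


def prob_a_beats_b(mean_a, se_a, mean_b, se_b, margin=0.0):
    """Approximate P(long-term mean of A exceeds that of B by more than `margin`).

    Assumption: normal block results, independent players, flat prior.
    """
    se_diff = sqrt(se_a ** 2 + se_b ** 2)  # standard error of the difference in means
    return NormalDist().cdf((mean_a - mean_b - margin) / se_diff)
```

With the illustrative figures from the clarification (300 blocks each, SD about 85 BB/100, sample means 12 and 17), `prob_a_beats_b(12, 85 / sqrt(300), 17, 85 / sqrt(300))` comes out at roughly 0.24. When both players have the same standard error, waiting for two 95% intervals to stop overlapping requires a gap of about 3.92 standard errors, while a 95% interval on the difference excludes zero at about 2.77. That may be part of why the overlap rule feels too slow.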
 Subject: Re: Statistics of poker results From: dcjohn-ga on 16 Sep 2006 14:58 PDT
 ```It's a fascinating question, and one I've been pondering for a while without any solid answer. (And I teach graduate courses in research methods.) And it's even more complicated than the issues you've laid out. For example, how do we operationalize "goodness" in a poker player? Optimal poker strategy is situational (it's a game--in the game theory sense), and so it's easy to imagine a player who outperforms another in a low-stakes, online situation but does much worse relative to the same "opponent" in a high-stakes, face-to-face game. (The stakes themselves don't really change things, of course--it's a matter of the different likely player styles and abilities that you find at the different stake levels.) So... you'll probably want to be careful about the generalizability of whatever result you come up with, or at least be mindful of the variation in the game type you're getting the data from. Just a suggestion: raise this in the 2+2 online poker community forum. It'll lead to some interesting discussion.```
 Subject: Re: Statistics of poker results From: berkeleychocolate-ga on 16 Sep 2006 15:07 PDT
 ```Just to reiterate a point I tried to make in reply to your last comments: Since the trials are not independent there is no reason to believe that the sums are approximately normal. (Also, even if they were independent, since the bet sizes vary, the sums cannot be assumed to be normal.) Therefore you cannot rely on confidence intervals, which are based on the assumption that the distribution is normal (or at least that it is a known distribution). The answers your confidence interval gives are not reliable. You should not have any confidence in your confidence interval approach!```
 Subject: Re: Statistics of poker results From: delard-ga on 16 Sep 2006 15:59 PDT
 ```Hi,

dcjohn - you are right about not generalising - I'm looking for a way to compare 2 players who both play online at the same type of games and stakes against the same types of opponents.

berkeleychocolate - you are also right that the 100-hand session results themselves don't have a normal distribution - in fact I ran the data through a 'normal test' stats package and it confirmed that it doesn't quite fit a normal distribution. However - I suspect that the 100-hand samples are close enough to a normal distribution that the results implied from normal type analysis are still interesting. Certainly the CI results yield figures that seem to roughly match intuition. I suspect that to solve the 'non normal' issue you have to move away from doing 100-hand samples altogether and start looking at the real distribution of the individual hand results - which is very un-normal - but could be analysed in various ways.

Anyway - for the sake of this question I'm happy to just assume the 100-hand samples have a perfect normal distribution - and worry about that assumption later. One step at a time :).

- Delard```
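The thread does not name a specific way to analyse the raw, non-normal results. One commonly used option, swapped in here purely as an illustration, is a bootstrap: resample each player's results with replacement many times and count how often A's resampled mean beats B's. The sketch below makes no normality assumption, but it does assume the resampled results are independent of each other. The function name, the default number of resamples, and the fixed seed are all hypothetical choices.

```python
import random


def bootstrap_prob_a_beats_b(results_a, results_b, margin=0.0, n_resamples=2000, seed=0):
    """Share of bootstrap resamples in which A's mean exceeds B's by more than `margin`.

    Resamples each player's results with replacement, so it does not assume
    normality, but it does treat the individual results as independent.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    wins = 0
    for _ in range(n_resamples):
        mean_a = sum(rng.choices(results_a, k=len(results_a))) / len(results_a)
        mean_b = sum(rng.choices(results_b, k=len(results_b))) / len(results_b)
        if mean_a - mean_b > margin:
            wins += 1
    return wins / n_resamples
```

The returned share is only a rough indication of how strongly the data favour A, not an exact probability. Passing 100-hand block totals instead of single hands keeps the run time manageable in pure Python and preserves any dependence between hands inside a block.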