```
Suppose I want to do some statistical tests like this problem. What can I do?

A survey on car defects was done in Dane County, Wisconsin, based on a
random sample of people who had just purchased used cars. Here we will
look at brake defects. Each car was classified as to whether it was
purchased from a car dealer or from a private owner.

            Defect.   Not Defect.
  Dealer      931        2723
  Private    1690        3498

Is there a relationship between a car having defective brakes and
whether it was purchased from a dealer or a private owner? Do an
appropriate test.
```

Request for Question Clarification by mathtalk-ga on 17 Mar 2004 21:59 PST

```
Hi, chanchai-ga:

Would you care for a succinct explanation of the chi-square test for
approaching problems like this?

regards, mathtalk-ga
```

Clarification of Question by chanchai-ga on 18 Mar 2004 06:18 PST

```
Hi mathtalk-ga,

You can do whatever you want to get the best answer.

Thanks
```

Request for Question Clarification by mathtalk-ga on 18 Mar 2004 18:51 PST

```
Hi, Chanchai-ga:

I'd be happy to tailor my explanation to your background, if only I
knew a bit more about your studies/interests. Was my previous Answer
too technical, on target, or not geeky enough for your tastes?

Of course, if you prefer me to take my best shot and ask for
clarification later, that will work also.

regards, mathtalk-ga
```

Clarification of Question by chanchai-ga on 18 Mar 2004 22:09 PST

```
Hi mathtalk-ga,

Your previous answer is OK for me. I have a background in mathematics,
so I should be able to understand your work. You can go ahead and take
your best shot. If I don't understand, I will ask you later.

Thanks,
```
```
Hi, chanchai-ga:

This type of problem is treated by a branch of mathematics called
"statistical inference". Given the data you present, we are asked to
evaluate whether it suggests that there is "a relationship between a
car having defective brakes and whether it was purchased from a dealer
or a private owner".

What is often done in a situation like this is to apply (Pearson's)
chi-square test. The idea is to consider the chance that, _if_ there
were essentially no distinction between cars purchased from a dealer
and cars purchased from a private owner, we'd get results that depart
from a mutual "average" incidence of defective brakes at least as much
as the observed results do.

Before we dive into the relatively mindless details of computing this
chance, it's important to realize that this probability is, strictly
speaking, _not_ the same as the chance that there is no "relationship"
between the type of sale and the likelihood of defective brakes. More
formally, we would draw up the definition of conditional probabilities
and invoke Bayes' formula as a way of rigorously bridging the gulf
between "the probability of getting results like this, given no
difference in populations" and "the probability of no difference in
populations, given that we got results like this". Let's leave it at
that, though, and proceed to the chi-square test.

Here we have the simplest of cases, the "two outcome" situation:
either a car has defective brakes or it does not. Assuming that the
two groups, dealer sales and private sales, are actually samples of a
common population (the "null hypothesis", meaning no real distinction
between them), we would expect samples of different sizes to depart
randomly from the perfect average according to a binomial model. When
the samples are as large as they are here, it is a quite practical
simplification to use a continuous model, the normal distribution, for
the sample averages instead.
The chi-square test is most easily carried out by hand if we total
both the rows and columns in the 2x2 table that you've already
provided:

                       Defect.   Not Defect.   Totals by Group
  Dealer                  931        2723           3654
  Private                1690        3498           5188
  Totals by Outcome      2621        6221           8842

where the lower right-hand corner, the "grand total", is the sum
either of the group totals or of the outcome totals.

We next use these totals to find an "expected" value for each of the
original four entries, based on the null hypothesis. That is, if the
two samples (groups) are drawn from a common population, then by
merging them together we get the best available estimate of their
common fractions of defective and nondefective brakes. Here we see
that, of the grand total of 8842 cars sold, we have altogether these
fractions of the two outcomes, rounded appropriately:

  cars with defective brakes:    2621/8842 = 0.2964
  cars with nondefective brakes: 6221/8842 = 0.7036

Now apply these two outcome fractions to the respective group's sample
sizes. In the dealer category we have 3654 cars, and in private owner
sales, 5188. Multiply each sample size by the fractions above, and we
get the estimated "expected" value for all four entries:

                       Expected   Expected
                       Defect.   Not Defect.   Totals by Group
  Dealer                 1083        2571           3654
  Private                1538        3650           5188
  Totals by Outcome      2621        6221           8842

Here I've done a bit of rounding to keep the numbers "nice". The
actual proportions would of course produce decimal fractions, but the
large numbers of observations make it practical for our purposes to
round to whole integers.

Next we find the four differences between observed values O and
expected values E in each category, O - E. For example, in the upper
left-hand corner, we observed 931 cars sold by dealers with defective
brakes, but (assuming an average 0.2964 fraction of all cars have
these) we expected 1083.
The observed value is less than the expected, so O - E in this entry
is:

  O - E = 931 - 1083 = -152

The undercount here must be exactly offset by an overcount in the
complementary category of observed minus expected cars sold by dealers
without defective brakes, and also in private owner sales of cars that
turned out to _have_ defective brakes. And each of those overcounts
must be offset by an undercount in the observed minus expected cars
without defective brakes sold by private owners:

                       O - E     O - E
                       Defect.   Not Defect.   Totals by Group
  Dealer                -152       +152               0
  Private               +152       -152               0
  Totals by Outcome        0          0               0

As you can see from this example, the 2x2 (two outcomes, two groups)
format means that the O - E calculation only has to be done once!

Now Pearson's chi-square statistic is a single number that we're about
to calculate. Once we have it, we can look it up in the appropriate
table to find out how likely it is that the number would be as big as
it is, if the "null hypothesis" holds. As to the interpretation of
that, hold on for just one more minute...

The chi-square statistic is the sum of the four ratios (O - E)^2 / E,
one ratio for each of the four categories. That is, in our problem,
152 squared is:

  152^2 = 23104

and we'd have these terms:

  (23104/1083) + (23104/2571) + (23104/1538) + (23104/3650)

which works out to roughly 51.67. Admittedly we've rounded here and
there a little to keep the numbers whole up to the end, but I'll point
you to a Web page calculator here:

[GraphPad QuickCalcs: Analyze a 2x2 Contingency Table]
http://www.graphpad.com/quickcalcs/Contingency1.cfm

where you can just enter your four original data values and have the
crunching done for you; their value will be just slightly higher
(51.767 for chi-square without Yates' correction).

Now the interpretation of this statistic depends on what are called
the "degrees of freedom" in the "measurements".
This is a fancy term, to be sure, but for a contingency table it boils
down to (number of rows - 1) times (number of columns - 1). With two
groups and two possible outcomes (defective vs. nondefective brakes),
the degrees of freedom ('df') is therefore 1.

Here's a table that gives "cut-off" values for the chi-square
statistic with varying degrees of freedom (df) and "levels of
significance" (which amount to probabilities such as mentioned
earlier):

[Chi-square Table]
http://www.ento.vt.edu/~sharov/PopEcol/tables/chisq.html

In that table we see that, for df = 1, the significance level
P = 0.001 corresponds to a chi-square value of 10.83. Our value, 51.67
or so, is much bigger... and so one would say the observations are
"statistically significant" at the 0.001 level. In fact the chi-square
value we got is so big that it's statistically significant even at the
0.0001 level.

But what does all of that mean? Well, if the two given samples had
been drawn from a uniform population, then (based on certain normal
distribution approximations implied by the use of Pearson's chi-square
statistic) it is estimated that a value as large as 51.67 would occur
"by chance" less than once in ten thousand times; hence "significant"
at the 0.0001 level.

Actually, I didn't find a chi-square table that gives cut-offs
comparable to the value 51.67. The highest cut-off (for one degree of
freedom) that I found cited was for significance level 0.000001 (one
in one million), and even that was only 23.94, less than half the
result we have. It's fair to say that the chance of getting a
chi-square statistic as big as 51.67 with these sample sizes is
extremely small, especially in comparison to the levels of
significance ordinarily used in surveys and other "social science"
applications (where 5% is frequently used).
One should be wary, however, of turning this "math fact" around and
concluding that _therefore_ it is likely that there is a relationship
between the type of car sale and the defectiveness of brakes. This is
where "mathematical rigor" requires some additional knowledge, often
of a kind impractical to obtain, which is why most of us would be
forgiven for throwing deduction out the window and just saying,
"Since there's less than one chance in a million this occurred by
chance, chances are it wasn't by chance!"

Pearson's chi-square statistic worked very nicely in your case. With
the large number of observations available in your data, it left
little room for doubt that there really is a difference between the
two underlying groups (cars sold by dealers vs. cars sold by private
owners). Often one can only obtain much smaller numbers of
observations, whether because of budgetary or opportunity limitations,
and then the appropriateness of an underlying (continuous) normal
approximation to a (discrete) binomial model comes into question. A
usual rule of thumb is not to use the chi-square test unless all four
entries in the 2x2 "contingency" table have expected values of at
least 5.

There is also a conservative "correction" (Yates' correction) that is
often applied with fairly small numbers (and one degree of freedom).
Conservative in this context means making it harder to conclude, at
some constant (low) level of significance, that the results did _not_
occur "by chance", in effect by handicapping the chi-square statistic
toward smaller values. For example, if Yates' correction were applied
to our data (see the options in the calculator page linked above), the
statistic would turn out to be about 51.43 instead of 51.77 (still a
mind-bogglingly large result, in terms of statistical significance).

One sometimes sees binomial model computations touted as being
"exact", as they are in some sense.
Certainly when there are only a handful of observations to work with,
the binomial model gives the best feel for how likely it is that the
numbers could have occurred "randomly". Yates' correction can then be
viewed as a shortcut to nudge the chi-square statistic toward
agreement with the binomial model's levels of significance in an
intermediate range of sample sizes. By the point at which all four
entries have expected values of 30 or more, one is on pretty safe
ground using Pearson's chi-square test without a correction. Of
course, the advent of computers has spoiled us all with the luxury of
carrying computations to far more decimal places than our data really
justify!

regards, mathtalk-ga
```
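The steps walked through in the answer above (row/column totals, expected counts under the null hypothesis, the chi-square sum, its df = 1 tail probability, and Yates' correction) can be sketched in a few lines of standard-library Python. The variable names here are my own; the only formula not spelled out in the answer is the standard identity that the df = 1 chi-square tail probability P(X > x) equals erfc(sqrt(x / 2)).

```python
# Worked example: Pearson's chi-square test on the 2x2 table from the question.
from math import erfc, sqrt

observed = [[931, 2723],    # Dealer:  defect, not defect
            [1690, 3498]]   # Private: defect, not defect

row_totals = [sum(row) for row in observed]          # 3654, 5188
col_totals = [sum(col) for col in zip(*observed)]    # 2621, 6221
grand_total = sum(row_totals)                        # 8842

# Expected count in each cell under the null hypothesis:
# (row total) * (column total) / (grand total)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson's chi-square statistic: sum of (O - E)^2 / E over the four cells
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))

# For df = 1, the tail probability P(X > x) is erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi_sq / 2))

# Yates' continuity correction: shrink each |O - E| by 0.5 before squaring
chi_sq_yates = sum((abs(o - e) - 0.5) ** 2 / e
                   for o_row, e_row in zip(observed, expected)
                   for o, e in zip(o_row, e_row))

print(round(chi_sq, 3))        # about 51.77, matching the GraphPad value
print(round(chi_sq_yates, 3))  # about 51.43 with Yates' correction
print(p_value)                 # far below the 0.0001 significance level
```

Carrying the expected counts as exact fractions rather than rounding them to integers is what makes this agree with the calculator's 51.767 instead of the hand-computed 51.67.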