Q: Basic statistics ( Answered 1 out of 5 stars,   7 Comments )
Question  
Subject: Basic statistics
Category: Science
Asked by: lauren411-ga
List Price: $20.00
Posted: 18 Oct 2005 21:24 PDT
Expires: 17 Nov 2005 20:24 PST
Question ID: 582026
In statistics, why do we use the square root of the average of the
squared deviations (i.e., the standard deviation) rather than the
average of the absolute values of the deviations?  I tried looking
this up on Google, and I found a lot of relevant sites, but amazingly,
every one of them said, either literally or in essence, "We COULD use
the average of the absolute deviations, but a better approach is to
use the standard deviation."  That's great, but I want to know WHY
this is better.
Answer  
Subject: Re: Basic statistics
Answered By: livioflores-ga on 20 Oct 2005 07:15 PDT
Rated:1 out of 5 stars
 
Hi lauren411!!


The short answer to your question is that the Standard Deviation (STD) is
more intuitive to work with because it is in the same units as the raw
data; that is, the STD is in the same units as the mean. This is why the
STD is preferred over the variance as a measure of variability.

See the following definition:
"The Standard Deviation is the square root of the variance.
Whether you use the Standard Deviation or the Variance is a matter of
preference. Mathematicians tend to use the Variance as the more
"natural" unit. Engineers tend to prefer the Standard Deviation,
because it is in the same units as whatever is being studied."
"MiC Quality: Introduction to Statistics - Standard Deviation":
http://www.margaret.net/statistics/mic-free/is10.htm


There is a nice explanation I found that shows why we "feel" the STD to
be a more intuitive measure of variability (and it is, in a way, a
mathematical "demonstration" of this fact).
Two reasons for the use of the STD are:
(1) the standard deviation is another kind of average distance to the mean, and
(2) the standard deviation corresponds to actual ruler distance between
the sample and the mean.
Reason (1) is quite obvious: it follows from the formula used to
calculate it (the STD is, by definition, the square root of the
variance).
To explain reason (2), note that the Standard Deviation can be directly
related to the Euclidean geometric distance between the sample and its
mean.
Indeed, the square of the distance between two points, one with
coordinates xi, i = 1,2,...,n, the other with coordinates yi, i =
1,2,...,n, is:
(x1-y1)^2 + (x2-y2)^2 + . . . + (xn-yn)^2.
If you replace each yi by the sample mean, the result is the variance's
numerator. The standard deviation is the square root of the variance,
so it is the ordinary distance between the sample and its mean, divided
by the square root of n-1.
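This geometric fact is easy to check numerically. A minimal sketch in Python, using a made-up sample (the numbers are illustrative only):

```python
import math

# Hypothetical sample, chosen only for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(sample)
mean = sum(sample) / n

# Euclidean distance between the sample point (x1, ..., xn)
# and the point (mean, ..., mean).
distance = math.sqrt(sum((x - mean) ** 2 for x in sample))

# Sample standard deviation (n - 1 in the denominator).
sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))

# The STD is exactly that ruler distance divided by sqrt(n - 1).
assert math.isclose(sd, distance / math.sqrt(n - 1))
```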
Summed up from the Measures of Variation paragraph of "Notes to
Accompany Chapter 3:   Numerical Data Description" by Prof. Stanley L.
Sclove - University of Illinois at Chicago - College of Business
Administration:
http://www.uic.edu/classes/mba/mba503/981/ntsch03.htm#5


For additional reference see the following paragraph:
"The idea of the variance is straightforward: it is the average of the squares
of the deviations of the observations from their mean. The details we have just
presented, however, raise some questions.
Why do we square the deviations? Why not just average the distances of the 
observations from their mean? There are two reasons, neither of them obvious.
First, the sum of the squared deviations of any set of observations from their
mean is the smallest that the sum of squared deviations from any number can 
possibly be. This is not true of the unsquared distances. So squared
deviations point to the mean as center in a way that distances do not.
Second, the standard deviation turns out to be the natural measure of
spread for a particularly important class of symmetric unimodal
distributions, the normal distributions.
We will meet the normal distributions in the next section. We
commented earlier that the usefulness of many statistical procedures
is tied to distributions of particular shapes. This is distinctly true
of the standard deviation.
Why do we emphasize the standard deviation rather than the variance? 
One reason is that s, not s^2, is the natural measure of spread for
normal distributions. There is also a more general reason to prefer s
to s^2. Because the variance involves squaring the deviations, it does
not have the same unit
of measurement as the original observations. The variance of the metabolic
rates, for example, is measured in squared calories. Taking the square root
remedies this. The standard deviation measures spread about the mean in
the original scale."
From "Reading from book on Standard Deviation" at Terry Berna's Math
Page at Souhegan High School:
http://www.sprise.com/shs/terryberna/Reading%20-%20Standard%20deviation.pdf
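The first reason in the passage above, that the sum of squared deviations is smallest when taken from the mean, can be checked with a short sketch (the data are made up for illustration):

```python
# Sketch: the sum of squared deviations is minimised at the mean.
data = [1.0, 3.0, 4.0, 5.0, 7.0]  # made-up numbers
mean = sum(data) / len(data)

def sum_sq(c):
    """Sum of squared deviations of the data from the centre c."""
    return sum((x - c) ** 2 for x in data)

# Any other centre gives a strictly larger sum of squared deviations.
for c in [mean - 1, mean - 0.1, mean + 0.1, mean + 1]:
    assert sum_sq(c) > sum_sq(mean)
```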


Search strategy:
"use standard deviation because"
"standard deviation because"
"standard deviation rather than the variance"
"standard deviation instead"
"standard deviation instead of the variance"


I hope that this helps you. Feel free to request a clarification if you
need it.

Regards,
livioflores-ga

Clarification of Answer by livioflores-ga on 20 Oct 2005 13:27 PDT
Hi!!

Yes, I missed the point; I am sorry for that.

The answer in this case is probably less intuitive than the "instead of
variance" one, and I think you can find it in the suggested reference
text:
"3.5.3.1. The Mean Absolute Deviation:
A way to measure the variability or spread of a set of numbers is by
computing their average distance to the mean, called the Mean Absolute
Deviation. The distances from the mean are the absolute values |xi -
m|, i = 1,2,...,n. The Mean Absolute Deviation (M.A.D.) is their
ordinary arithmetic average. Usually we use the Standard Deviation
instead. Two reasons for this are: (1) the standard deviation is
another kind of average distance to the mean and (2) the standard
deviation corresponds to actual ruler distance between the sample and
the mean. "
From the Measures of Variation paragraph of "Notes to Accompany
Chapter 3:   Numerical Data Description" by Prof. Stanley L. Sclove -
University of Illinois at Chicago - College of Business
Administration:
http://www.uic.edu/classes/mba/mba503/981/ntsch03.htm#5

At this point we can continue with the "Euclidean ruler" explanation
given in my previous answer:
Reason (1) is quite obvious: it follows from the formula used to
calculate it (the STD is, by definition, the square root of the
variance).
To explain reason (2), note that the Standard Deviation can be directly
related to the Euclidean geometric distance between the sample and its
mean.
Indeed, the square of the distance between two points, one with
coordinates xi, i = 1,2,...,n, the other with coordinates yi, i =
1,2,...,n, is:
(x1-y1)^2 + (x2-y2)^2 + . . . + (xn-yn)^2.
If you replace each yi by the sample mean, the result is the variance's
numerator. The standard deviation is the square root of the variance,
so it is the ordinary distance between the sample and its mean, divided
by the square root of n-1.
Summed up from the Measures of Variation paragraph of "Notes to
Accompany Chapter 3:   Numerical Data Description" by Prof. Stanley L.
Sclove - University of Illinois at Chicago - College of Business
Administration:
http://www.uic.edu/classes/mba/mba503/981/ntsch03.htm#5


At this point another reference given becomes relevant:
"Why do we square the deviations? Why not just average the distances of the 
observations from their mean? There are two reasons, neither of them obvious.
First, the sum of the squared deviations of any set of observations from their
mean is the smallest that the sum of squared deviations from any number can 
possibly be. This is not true of the unsquared distances. So squared 
deviations point to the mean as center in a way that distances do not.
Second, the standard deviation turns out to be the natural measure of
spread for a particularly important class of symmetric unimodal
distributions, the normal distributions."
From "Reading from book on Standard Deviation" at Terry Berna's Math
Page at Souhegan High School:
http://www.sprise.com/shs/terryberna/Reading%20-%20Standard%20deviation.pdf


The above explains why the variance is preferred over the average
absolute deviation; since you now know why the STD is used rather than
the variance, the conclusion is obvious.

But there is more: I found a text that discusses a related topic and
states in its final paragraph:
"In short, variance is a more powerful concept than MAD, because
predictions about population parameters can be made from sample data.
And, like those steak knives, there is even more: there is a theorem,
the variance theorem, which shows that variances of independent
(uncorrelated) variables are additive. This powerful idea underpins
regression and analysis of variance. MAD is not additive, and hence it
is a much less useful concept in the structure of statistics."
From "I'm Not Mad About MAD" (copyright Education Queensland)
http://exploringdata.cqu.edu.au/docs/why_var2.doc
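The additivity claim can be illustrated with a small simulation. This sketch uses only Python's standard library; the sample size, seed, and tolerances are arbitrary choices for illustration:

```python
import random
import statistics

random.seed(0)
n = 100_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 2) for _ in range(n)]  # independent of x
s = [a + b for a, b in zip(x, y)]

def mad(v):
    """Mean absolute deviation from the mean."""
    m = statistics.fmean(v)
    return statistics.fmean(abs(a - m) for a in v)

# Variances of independent variables add (up to sampling noise):
# Var(X + Y) is close to Var(X) + Var(Y) = 1 + 4 = 5.
assert abs(statistics.pvariance(s)
           - (statistics.pvariance(x) + statistics.pvariance(y))) < 0.1

# ...but mean absolute deviations do not add: MAD(x) + MAD(y)
# noticeably overshoots MAD(x + y) here.
assert mad(x) + mad(y) > 1.2 * mad(s)
```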


Finally, at MathForum I found several explanations that could be useful
to you in the thread "Standard Deviation vs. Mean Absolute Deviation".
At the bottom of the first page you will see links to the answers given
by claimed experts on the topic:
http://mathforum.org/kb/message.jspa?messageID=3986517&tstart=0

Just in case, here are the links to the replies:
Re: Standard Deviation vs. Mean Absolute Deviation - Teague, Dan 
http://mathforum.org/kb/message.jspa?messageID=3987571&tstart=0

Re: Standard Deviation vs. Mean Absolute Deviation - dennis roberts   
http://mathforum.org/kb/message.jspa?messageID=3987781&tstart=0

Re: Standard Deviation vs. Mean Absolute Deviation - Olsen Chris  
http://mathforum.org/kb/message.jspa?messageID=3991285&tstart=0


I hope that this helps you now. If you still find the answer wrong or
off topic, you have the right to request a refund by emailing the
editors.

Again, excuse my misunderstanding of the original question.
Regards,
livioflores-ga
lauren411-ga rated this answer:1 out of 5 stars
Very poor answer.  It doesn't address the central issue in the
question: why use standard deviation instead of average absolute value
deviation?  Instead, the answer talks about why we use standard
deviation instead of VARIANCE.  That's an easy question, and not the
one I asked.

Comments  
Subject: Re: Basic statistics
From: iang-ga on 19 Oct 2005 15:07 PDT
 
One problem with working with deviations is that they can cancel each
other out - the average of +240 and -240 is 0, which isn't helpful
information if you're about to stick your fingers into a mains socket!
Working with the square root of the squares gets rid of the negatives
and allows you to focus on the "size" of the numbers.  There may well
be other reasons, of course!

Ian G.
Subject: Re: Basic statistics
From: pforcelli-ga on 19 Oct 2005 15:36 PDT
 
The average absolute value of the deviations would work, but the reason
we square the deviations is so that each one can be visualized
geometrically as the area of a square.  Strange, I know, but true.
Subject: Re: Basic statistics
From: flyinghippo-ga on 20 Oct 2005 09:04 PDT
 
Lauren411,

The problem of deviations cancelling each other (as they always will
if you compute deviations from the mean) can be solved by considering
their absolute values (something I believe you mentioned in your
question). So, in the first example (by Ian G.) -240 and +240 will not
cancel each other but will give you a mean absolute deviation of 240.

The problem with absolute deviations is that you cannot do much with
them mathematically. For example, if you have a portfolio of two
stocks and you know that they move completely independently, the
variance of your portfolio (as a measure of the risk you're taking)
will simply be the sum of the variances of those two stocks. No such
simple formula exists for absolute deviations - so, you are stuck
squaring them.
Consider also Chebyshev's inequality: in any population, if you know
its standard deviation, you know that at least
1 - 1/k^2
of the whole population is within +/- k standard deviations from the
mean. In other words, if you know the scores on an exam average 60
points and the standard deviation is 10 points, you know that at least
3/4 of all people got between 60-2*10=40 and 60+2*10=80 points
(regardless of how the distribution of the scores looks).
In special cases, like the popular Normal distribution (a.k.a. the
Gaussian Curve), you know exactly how much of the curve's area is
within so many standard deviations from the mean. You can also
calculate this area for other distributions, which comes in handy in
calculating all kinds of probabilities: from the risk of a company's
defaulting on its loans to the odds of a child having a disease if you
find a certain mutation.

None of these convenient calculations exist for the absolute
deviations. Squaring the numbers is not too high of a price to pay for
being able to do a lot of useful calculation with the result.
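Chebyshev's bound is easy to verify on any dataset. A sketch with made-up, deliberately non-normal exam scores:

```python
import statistics

# Made-up exam scores, chosen only for illustration.
scores = [40, 45, 55, 58, 60, 60, 62, 65, 75, 80, 95, 25]
mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)

for k in (1.5, 2, 3):
    within = sum(1 for s in scores if abs(s - mean) <= k * sd)
    # Chebyshev: at least 1 - 1/k^2 of the data lies within
    # k standard deviations of the mean.
    assert within / len(scores) >= 1 - 1 / k ** 2
```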

By the way, you don't have to subtract each point from the mean before
you square the difference. There is a neat shortcut: square each value
as it is, sum those squares, then subtract n times the square of the
mean. For example, if your numbers are 1, 3, 4, 5, 7:
* First, calculate the mean: (1+3+4+5+7)/5=4
* Then you are supposed to find the difference of each individual number
from the mean. Don't waste your time! Just square each number as it
is: 1, 9, 16, 25, 49
and sum those squares: 1+9+16+25+49=100
* Square the mean of your original readings, 4^2=16, and multiply by the
number of readings: 16*5=80. Subtract that from the sum of squares you
computed before: 100-80=20
* If you want the population variance, divide this number by the number
of readings: 20/5=4
If you want the sample variance, divide it by (n-1): 20/(5-1)=5
This is your variance - you just accomplished it with one
subtraction instead of five! If you have a larger number of
measurements, this trick will save you a lot of time.
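The shortcut, sketched in Python with the same numbers as the example:

```python
# Shortcut: sum of squares minus n times the squared mean.
data = [1, 3, 4, 5, 7]
n = len(data)
mean = sum(data) / n               # 4.0
sum_sq = sum(x * x for x in data)  # 1 + 9 + 16 + 25 + 49 = 100

pop_var = (sum_sq - n * mean ** 2) / n           # (100 - 80) / 5 = 4.0
sample_var = (sum_sq - n * mean ** 2) / (n - 1)  # 20 / 4 = 5.0

# Same results as the textbook definition with five subtractions.
assert pop_var == sum((x - mean) ** 2 for x in data) / n
assert sample_var == sum((x - mean) ** 2 for x in data) / (n - 1)
```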

Good luck,

FlyingHippo
Subject: Re: Basic statistics
From: bozo99-ga on 20 Oct 2005 16:34 PDT
 
Variance has useful algebra related to it.
V = SUM( (x{i} - x{av} )^2 )
  = SUM (  (  ... expand the squared bit ...)  )
  = SUM ( x{i}^2 ) - n.x{av}^2
unless I've fluffed my algebra remembered from ages ago.

I speculate that before calculators it was easier to get a standard
deviation by summing both a column of x{i} and a column of x{i}^2 
(and a short finishing step) than by calculating the mean and then
doing a series of subtractions
and additions.   (My guess is this is the main reason.)

Also algebraically you can think of the contribution of one data point
without knowing in advance whether it is above or below the mean and
whether you have to  multiply that contribution by -1.
Subject: Re: Basic statistics
From: llcooldl-ga on 23 Oct 2005 06:51 PDT
 
I think the "Answer" to this question was perfect, it explains exactly
why standard deviation is used!
Subject: Re: Basic statistics
From: mrmoto-ga on 25 Oct 2005 01:46 PDT
 
Some of the answers and comments so far have been a bit confusing.
As a graduate student in math, I hope to be able to give some
clarification.

There is a strong relationship between mean and standard deviation,
and analogously between median and sum of absolute deviations (from
the median).


Short Answer:

 standard deviation
  - is very easy to calculate, and to work with in formulas
  - has all sorts of "nice" properties (e.g. see
      http://en.wikipedia.org/wiki/Standard_deviation)
  - is the "best" measure of distance from the average, _if_ the
      data has a normal distribution

 average of absolute values of deviations 
  - is a bit cumbersome and harder to manipulate algebraically
  - is more robust with respect to outliers
  - is more appropriate to calculate with respect to deviations
      from median



Longer Answer:

Some people have noted that it's easier to work with squares than
absolute values.  This is true, but there's more to it than that.

Consider the notion of "average".  If you want the average
of a set of n numbers, the standard approach is to use the
_arithmetic mean_; it usually provides a good idea of "average", so
long as the numbers have a normal distribution.

If there are many outliers, on the other hand, the arithmetic mean
will give a distorted representation of the numbers.  In this case,
the _median_ is usually more appropriate.

For example, let's say you have the following data:
0,1,1,1,2,2,2,3,24.

The arithmetic mean is 4, so in fact all of the numbers
except 24 are "below average" -- this is a bit unsettling.

The median is 2, which is (probably) more meaningful here.


Now, to come to the point about deviation.
Using the numbers from the above example, the deviations
from the mean are -4,-3,-3,-3,-2,-2,-2,-1,20.

The standard deviation is approximately 7 -- you can see that
the presence of the 24 has skewed not only the mean but also
the standard deviation. (You probably wouldn't think of describing
the numbers as "four, plus or minus 7".)

So should we use the average of the absolute values of the deviations
from the mean instead?  Not necessarily -- if you do so, you're
implicitly agreeing that the mean is appropriate, which it isn't here.
But, just to see what happens:  the average of the absolute deviations
from the mean is about 4.5.

The average of the absolute deviations from the _median_, on the other
hand, is about 3.  (The deviations from the median are -2,-1,-1,-1,0,0,0,1,22.)
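The figures in this example can be reproduced with Python's statistics module:

```python
import statistics

data = [0, 1, 1, 1, 2, 2, 2, 3, 24]

mean = statistics.fmean(data)      # 4.0
median = statistics.median(data)   # 2
sd = statistics.pstdev(data)       # about 7.1

mad_mean = statistics.fmean(abs(x - mean) for x in data)      # about 4.44
mad_median = statistics.fmean(abs(x - median) for x in data)  # about 3.11

assert mean == 4.0 and median == 2
assert 7 < sd < 7.2
assert mad_median < mad_mean < sd  # MAD from the median is smallest
```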

The reason why you might want to calculate the sum of absolute deviations
from the median rather than the mean is as follows:

* The mean is the number x that minimises the sum of the squares of
    deviations from x (i.e., it gives the smallest possible standard
    deviation)

* The median is the number x that minimises the sum of the absolute
    values of the deviations from x (and therefore gives the smallest
    possible average of these absolute values)
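Both minimisation facts can be checked numerically by scanning a grid of candidate centres (a sketch, using the example data above):

```python
data = [0, 1, 1, 1, 2, 2, 2, 3, 24]   # the example data above
mean = sum(data) / len(data)          # 4.0
median = sorted(data)[len(data) // 2] # 2 (middle of 9 values)

def sum_sq(c):  # sum of squared deviations from c
    return sum((x - c) ** 2 for x in data)

def sum_abs(c):  # sum of absolute deviations from c
    return sum(abs(x - c) for x in data)

# Scan candidate centres 0.0, 0.1, ..., 25.0: the mean wins on
# squared error, the median wins on absolute error.
candidates = [i / 10 for i in range(0, 251)]
assert min(candidates, key=sum_sq) == mean
assert min(candidates, key=sum_abs) == median
```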


Here's another related example.

Suppose you're trying to find a "best fit line" through some points.
Usually you would calculate the "least squares" fit, which minimises
the sum of the squares of the deviations from the line.  This method
is good most of the time, but will be affected by outliers.

A better approach when there are outliers is to find the line that
minimises the sum of the absolute distances from the line.  This is
entirely analogous to the above example.

However, there's a bit of a catch -- and this is where squares of
deviations truly come in handy.

The former minimisation problem is very easy to solve -- you
can often do it by hand for relatively small data, and most
scientific calculators also have this capability.

The latter problem is harder, because of the awkwardness of the
absolute value function in calculus.  It is still possible to solve,
but requires iterative methods, or linear programming techniques.
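As a rough illustration (a brute-force grid search over made-up points, not the linear-programming approach from the book), one can compare the two fits:

```python
# Compare least-squares and least-absolute-deviations fits of a line
# y = a*x + b through points with one outlier (made-up data; a real
# solver would use linear programming for the absolute-value fit).
points = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0), (4, 20.0)]  # last is an outlier

def sse(a, b):  # sum of squared errors
    return sum((y - (a * x + b)) ** 2 for x, y in points)

def sae(a, b):  # sum of absolute errors
    return sum(abs(y - (a * x + b)) for x, y in points)

grid = [i / 10 for i in range(-50, 51)]  # slopes/intercepts -5.0 .. 5.0
ls = min(((a, b) for a in grid for b in grid), key=lambda p: sse(*p))
lad = min(((a, b) for a in grid for b in grid), key=lambda p: sae(*p))

# The absolute-deviation fit keeps the slope of the four collinear
# points (slope 1); least squares is dragged up by the outlier.
assert lad[0] == 1.0
assert ls[0] > 2
```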

A good resource for this is R. Vanderbei's book on linear programming
at:

  http://www.princeton.edu/~rvdb/LPbook/index.html

(see Part 1, Chapter 12, which discusses mean, median, best-fit lines,
etc.)



In conclusion:  the average of absolute values of deviations is cumbersome,
but more robust when there are outliers.

I hope this helps!
Subject: Re: Basic statistics
From: benreaves-ga on 27 Oct 2005 00:07 PDT
 
The SD penalizes outliers more than the MAD does.
