Google Answers: Maths

Q: Maths ( Answered, 0 Comments )

Question

Subject: Maths
Category: Science > Math
Asked by: lifeafterdeath-ga
List Price: $200.00

Posted: 09 Jun 2003 09:34 PDT
Expires: 09 Jul 2003 09:34 PDT
Question ID: 215091

[Note: Please answer all the 'star points' of each question and read
everything carefully. thanks]


The database application was developed by a group of students for a
travel agency. They were required to put in 200 records for the
flight details. For each customer, the database is to be searched for
flight availability.
The students ran their search query at different times of the day to
simulate different busy periods on the network. They recorded the time
taken for each query to run. 50 readings were taken and recorded. The
query run times, in seconds, are as follows:

        110  110  115  114  116
        120  112  116  112  117
        120  112  114  115  120
        119  114  115  120  100
        113  113  110  110  119
        115  116  115  115  114
        118  114  116  116  112
        115  116  115  115  114
        120  117  117  117  115
        120  116  118  139  118


Q1.) Considering the first column of the data as a sample of the 50
recordings.
* Draw up a frequency table with tally chart for it.
* Explain what are 'mean , median, and mode'. Find these averages for
this part of the data.
* Calculate the standard deviation of the data .

Q2.) Taking into account all the 50 recordings:
* Draw up the frequency table .
* Find mean, mode and median for the whole set of data.
* Compare these averages with the averages of part one  of this
question and draw your conclusion .
* From the above calculations, what is the most probable time taken to
complete a query.
* Define total range, upper and lower quartile, also inter-quartile
range of the data.

Q3.) Complete the grouping of the above data as;
        100-104, 105-109,
     * What is the class width ?
     * Draw the grouped frequency distribution table.
  (give your answers correct to one decimal place)

Answer

Subject: Re: Maths
Answered By: elmarto-ga on 09 Jun 2003 11:27 PDT

Hello lifeafterdeath!
Here are the answers to your questions:

Question 1
----------

* Draw up a frequency table with tally chart for it.

  Time taken    Tally    Freq.    Freq. in %
     110        |          1         10 %
     113        |          1         10 %
     115        ||         2         20 %
     118        |          1         10 %
     119        |          1         10 %
     120        ||||       4         40 %
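
If you'd like to check the tally by machine, here is a minimal Python
sketch. The variable names (first_column, rows) are just my own
choices for illustration, and the list is the first column of your
data typed in by hand.

```python
from collections import Counter

# First column of the 50 readings (one value per row of the data).
first_column = [110, 120, 120, 119, 113, 115, 118, 115, 120, 120]

counts = Counter(first_column)
n = len(first_column)

# One row per distinct time: (time, tally marks, frequency, percentage).
rows = []
for value in sorted(counts):
    freq = counts[value]
    rows.append((value, "|" * freq, freq, 100 * freq / n))

print(f"{'Time':>5} {'Tally':<6} {'Freq.':>5} {'%':>6}")
for value, tally, freq, pct in rows:
    print(f"{value:>5} {tally:<6} {freq:>5} {pct:>5.1f}%")
```

The frequencies should add up to 10 and the percentages to 100, which
is a quick sanity check on any hand-drawn tally chart.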

* Explain what are 'mean , median, and mode'. Find these averages for
this part of the data.

Mean: "The mean is the sum of all the scores divided by the number of
scores [...] The mean is a good measure of central tendency for
roughly symmetric distributions but can be misleading in skewed
distributions since it can be greatly influenced by extreme scores.
Therefore, other statistics such as the median may be more informative
for distributions such as reaction time or family income that are
frequently very skewed."

Mean
http://www.ruf.rice.edu/~lane/hyperstat/A15885.html

Median: "The median is the middle of a distribution: half the scores
are above the median and half are below the median. The median is less
sensitive to extreme scores than the mean and this makes it a better
measure than the mean for highly skewed distributions. The median
income is usually more informative than the mean income, for example."

Median
http://www.ruf.rice.edu/~lane/hyperstat/A27533.html

Mode: "The mode is the most frequently occurring score in a
distribution and is used as a measure of central tendency. The
advantage of the mode as a measure of central tendency is that its
meaning is obvious. Further, it is the only measure of central
tendency that can be used with nominal data."

Mode
http://www.ruf.rice.edu/~lane/hyperstat/A10032.html

For the 10 observations provided above, the mean, median and mode are
the following:

Mean = (110+113+115+115+118+119+120+120+120+120) / 10
     = 1170 / 10
     = 117 seconds

Median:
"When there is an odd number of numbers, the median is simply the
middle number. For example, the median of 2, 4, and 7 is 4.
When there is an even number of numbers, the median is the mean of the
two middle numbers. Thus, the median of the numbers 2, 4, 7, 12 is
(4+7)/2 = 5.5"

In this case, there is an even number of observations (ten). If we
sort the numbers from lowest to highest, we get the following list:

110 113 115 115 118 119 120 120 120 120

It can be easily seen in this list that the two middle numbers (the
5th and 6th) are 118 and 119. Thus, the MEDIAN is equal to
(118+119)/2 = 118.5 seconds

Mode:
As I said before, the mode is the most frequently observed number.
Looking at the frequency table, you can see that 120 is the number
most frequently observed (it's observed 4 times). Thus, the mode is
120 seconds.
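
These three values can be double-checked with Python's standard
statistics module. This is only a sketch; the variable names are
illustrative and the list is the first column of the data.

```python
import statistics

# First column of the 50 readings.
first_column = [110, 120, 120, 119, 113, 115, 118, 115, 120, 120]

mean = statistics.mean(first_column)      # sum of values / number of values
median = statistics.median(first_column)  # average of the two middle values
mode = statistics.mode(first_column)      # most frequently occurring value

print("Sorted:", sorted(first_column))
print("Mean:  ", mean)
print("Median:", median)
print("Mode:  ", mode)
```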

* Calculate the standard deviation of the data .

If you need information on how the standard deviation is computed,
please check the following pages, as writing formulas is a bit messy
here.

Standard Deviation and Variance
http://davidmlane.com/hyperstat/A16252.html
http://davidmlane.com/hyperstat/A40397.html

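So, first we compute the sample variance for the ten observations,
using the formula provided in the sites shown above. The squared
deviations from the mean (117) add up to 114, and for a sample this
sum is divided by n - 1 = 9:

Variance = 114 / 9 = 12.666...

Now, the standard deviation is simply the square root of the variance,
so we find that:

Standard deviation = 3.559... seconds (3.6 seconds to one decimal
place)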
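
Since writing the formula out is messy, here is the same computation
as a short Python sketch instead. It uses the sample formula (dividing
by n - 1), which is what gives the figures above; I've also included
the population formula (dividing by n) in case your course uses that
one. The variable names are my own.

```python
import math

# First column of the 50 readings.
first_column = [110, 120, 120, 119, 113, 115, 118, 115, 120, 120]

n = len(first_column)
mean = sum(first_column) / n

# Sum of squared deviations from the mean.
sum_sq = sum((x - mean) ** 2 for x in first_column)

# Sample version: divide by n - 1.
sample_variance = sum_sq / (n - 1)
sample_sd = math.sqrt(sample_variance)

# Population version: divide by n.
population_variance = sum_sq / n
population_sd = math.sqrt(population_variance)

print(f"Sum of squared deviations: {sum_sq}")
print(f"Sample variance:     {sample_variance:.3f}, sd: {sample_sd:.3f}")
print(f"Population variance: {population_variance:.3f}, sd: {population_sd:.3f}")
```

With only ten observations the two versions differ noticeably, so it's
worth checking which one you're expected to report.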

Question 2
----------

* Draw up the frequency table .

Here is the frequency table for all the observations

----------------------
Time | Freq.
----------+-----------
100 | 1
110 | 4
112 | 4
113 | 2
114 | 6
115 | 10
116 | 7
117 | 4
118 | 3
119 | 2
120 | 6
139 | 1
----------------------

* Find mean, mode and median for the whole set of data.

Now that you know how to compute these three numbers, I'll simply
write the solution and not the procedure.

Mean = 115.58 seconds

Median = 115 seconds

Mode = 115 seconds
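
If you want to verify these figures yourself, a minimal Python sketch
follows. The list is the full set of 50 readings typed in row by row;
the variable names are illustrative only.

```python
import statistics
from collections import Counter

# All 50 readings, row by row.
times = [
    110, 110, 115, 114, 116,
    120, 112, 116, 112, 117,
    120, 112, 114, 115, 120,
    119, 114, 115, 120, 100,
    113, 113, 110, 110, 119,
    115, 116, 115, 115, 114,
    118, 114, 116, 116, 112,
    115, 116, 115, 115, 114,
    120, 117, 117, 117, 115,
    120, 116, 118, 139, 118,
]

# Frequency of each distinct time, as in the frequency table.
frequency = Counter(times)
for value in sorted(frequency):
    print(f"{value:>4} | {frequency[value]}")

mean = statistics.mean(times)
median = statistics.median(times)
mode = statistics.mode(times)
print(f"Mean = {mean:.2f}, Median = {median}, Mode = {mode}")
```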

* Compare these averages with the averages of part one of this
question and draw your conclusion .

I'll summarize here the results so far:

            Sample w/10 observations    Full data
  Mean               117                 115.58
  Median             118.5               115
  Mode               120                 115

As you can see, the results obtained using the sample with ten
observations are reasonably close to the ones obtained using all
observations: the mean differs by about 1.4 seconds, while the median
and mode of the sample are 3.5 and 5 seconds higher. This suggests
(and is often the case for many data sets) that a fairly small sample
can give a usable approximation to the population mean, median and
mode, although the approximation is not exact and depends on which
observations end up in the sample. It's important to know this,
because it's obviously easier to do computations with 10 observations
rather than with 50 observations. This fact is used extensively by
people doing "serious" statistics. Take, for example, the companies
that measure how many people watch a TV show. They don't ask every
household in the country what shows they are watching. Rather, they
randomly take a relatively small number of households, ask them this
information, and draw conclusions for the whole country based on a
small portion of it.
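
How close a 10-observation sample gets depends on which ten
observations it contains. As a rough illustration (this goes slightly
beyond what the question asks), the Python sketch below treats each of
the five columns as its own sample of ten and compares its mean with
the mean of all 50 readings. The names are my own.

```python
# All 50 readings, row by row (5 values per row).
times = [
    110, 110, 115, 114, 116,
    120, 112, 116, 112, 117,
    120, 112, 114, 115, 120,
    119, 114, 115, 120, 100,
    113, 113, 110, 110, 119,
    115, 116, 115, 115, 114,
    118, 114, 116, 116, 112,
    115, 116, 115, 115, 114,
    120, 117, 117, 117, 115,
    120, 116, 118, 139, 118,
]

n_columns = 5
# Every 5th value starting at offset c gives column c of the table.
columns = [times[c::n_columns] for c in range(n_columns)]
column_means = [sum(col) / len(col) for col in columns]
full_mean = sum(times) / len(times)

for i, m in enumerate(column_means, start=1):
    print(f"Column {i}: mean = {m:.2f} (full data: {full_mean:.2f})")
```

The column means range from 114.0 to 117.3 seconds, all within about
2 seconds of the full-data mean of 115.58.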

* From the above calculations, what is the most probable time taken to
complete a query.

This is a difficult question to answer without more information. Let's
see what the options are.

Let's assume first that time is not "continuous", as seems to be the
case from the data you show. We would be assuming here that a query
can only take a whole number of seconds to complete. In this case, I
would say the most probable time taken to complete the query is the
mode, that is, 115 seconds. This is the run time that was recorded
most often (10 of the 50 readings), so it is the natural estimate of
the most probable time taken.

Things change if you consider time to be continuous (that is, it's
possible to take 113 seconds, or 112.38, or 120.298483 up to an
infinite number of decimals). In this case, the probability of taking
any single time to complete the query is zero. When assuming
"continuous" time, you can ask what's the probability of taking, say,
between 112.3 and 115.82 seconds, and in this case it will be a
positive number. However, if you ask "what's the probability of taking
exactly 120 seconds?" the answer is zero. Thus, there is no "most
probable time taken": every exact time happens with probability zero.
However, in this case, one could ask: "If I had to predict the number
of seconds a query will take, what number should I guess in order to
minimize the sum of 'mistakes' of my prediction?"

The "mistakes" of the prediction are usually computed as the sum of
the squares of the differences between the observed values and the
predicted value. In this case, the number that minimizes this sum of
mistakes is precisely the mean, that is, 115.58 seconds.

Deviations from the mean and median
http://www.ruf.rice.edu/~lane/hyperstat/A41417.html
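
To see numerically that the mean is the guess with the smallest sum of
squared "mistakes", here is a small Python sketch. It simply tries
every guess from 100.00 to 140.00 in steps of 0.01 (a grid I chose for
illustration) and keeps the one with the smallest sum; the function
and variable names are my own.

```python
# All 50 readings, row by row.
times = [
    110, 110, 115, 114, 116,
    120, 112, 116, 112, 117,
    120, 112, 114, 115, 120,
    119, 114, 115, 120, 100,
    113, 113, 110, 110, 119,
    115, 116, 115, 115, 114,
    118, 114, 116, 116, 112,
    115, 116, 115, 115, 114,
    120, 117, 117, 117, 115,
    120, 116, 118, 139, 118,
]


def sum_squared_errors(guess, data):
    """Sum of squared differences between each observation and the guess."""
    return sum((x - guess) ** 2 for x in data)


mean = sum(times) / len(times)

# Candidate guesses: 100.00, 100.01, ..., 140.00.
candidates = [100 + i / 100 for i in range(4001)]
best = min(candidates, key=lambda g: sum_squared_errors(g, times))

print(f"Mean of the data:          {mean:.2f}")
print(f"Best guess found on grid:  {best:.2f}")
```

The best guess found by brute force coincides with the mean, which is
what the theory predicts.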

* Define total range, upper and lower quartile, also inter-quartile
range of the data.

All the definitions used for this question were taken from the
following page

5-Number Summary
http://www.si.umich.edu/libhelp/toolkit/analyze5numSummBoxplot.html

Interquartile range
http://hades.ph.tn.tudelft.nl/Internal/PHServices/Documentation/MathWorld/math/math/i/i170.htm

"The lower and upper range are the lowest and highest values in the
data set"
Thus,
Lower Range = 100 seconds
Upper Range = 139 seconds
Total Range = 139 - 100 = 39 seconds

"The lower quartile is the median of the lower half of the data set
and the upper quartile is the median of the upper half of the data
set"

Thus,
Lower Quartile = 114 seconds
Upper Quartile = 117 seconds

As you can see in the page provided above, the interquartile range is
basically upper quartile minus the lower quartile, so:

Inter-quartile Range = 117 - 114 = 3 seconds
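
The "median of each half" rule quoted above can be sketched in Python
as follows. This is just one way to do it, with names of my own
choosing; other quartile conventions exist and may give slightly
different values on other data sets.

```python
import statistics

# All 50 readings, row by row.
times = [
    110, 110, 115, 114, 116,
    120, 112, 116, 112, 117,
    120, 112, 114, 115, 120,
    119, 114, 115, 120, 100,
    113, 113, 110, 110, 119,
    115, 116, 115, 115, 114,
    118, 114, 116, 116, 112,
    115, 116, 115, 115, 114,
    120, 117, 117, 117, 115,
    120, 116, 118, 139, 118,
]

data = sorted(times)
half = len(data) // 2

# Lower and upper halves of the sorted data (25 values each here;
# for an odd number of values the middle one would be left out).
lower_half = data[:half]
upper_half = data[-half:]

total_range = data[-1] - data[0]
lower_quartile = statistics.median(lower_half)
upper_quartile = statistics.median(upper_half)
interquartile_range = upper_quartile - lower_quartile

print(f"Lowest = {data[0]}, Highest = {data[-1]}, Range = {total_range}")
print(f"Lower quartile = {lower_quartile}, Upper quartile = {upper_quartile}")
print(f"Inter-quartile range = {interquartile_range}")
```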

Question 3
----------

The definition of the class width is the following:

Clarification of Answer by elmarto-ga on 09 Jun 2003 12:09 PDT

I'm sorry, I didn't paste the answer correctly. It's missing the last
part. Here it is:

Question 3 
---------- 
 
The definition of the class width is the following:

"[The class width is the] difference between two consecutive lower
class limits or lower class boundaries", while the definition of
"class limits" or "class boundaries" is "numbers used to separate
classes"

Elementary Statistics
http://www.ec.erau.edu/cce/faculty/baty2/211A_erau/211A_day1/211_day1_lecture2.ppt

Unfortunately, the page is in PowerPoint format. In case you don't
have PowerPoint, I'll explain: when drawing a grouped frequency table,
you first have to define what these "groups" or "classes" are. In this
question, the data is grouped as "100-104", "105-109", "110-114", etc.
For the grouped frequency table, you'll then count how many
observations fall in the "100-104" class, how many fall in the
"105-109" class, etc.

So, returning to the class width, the lower class boundaries are 100,
105, 110, etc, as these are the numbers that separate the classes.
Now, the class width is simply the difference between two consecutive
lower class boundaries, so, the class width is 105-100 = 5.

Note: even though you have only given two classes (100-104 and
105-109), I assumed that the following classes will be 110-114,
115-119, 120-124, 125-129, 130-134, 135-139. I've made this assumption
because it's common practice, when drawing grouped frequency tables,
to use classes with equal class width.


* Draw the grouped frequency distribution table.

In order to do this, we have to see how many observations fall in each
class. For example, in class 100-104, we find that there is only 1
observation (100), while in class 105-109 there are none. By doing
this with all classes, we obtain:

  Class       Freq.      Freq. in %
  100-104      1            2  %
  105-109      0            0  %
  110-114      16           32 %
  115-119      26           52 %
  120-124      6            12 %
  125-129      0            0  %
  130-134      0            0  %
  135-139      1            2  %
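
The grouping can also be done mechanically. Here is a minimal Python
sketch, assuming (as I did above) equal classes of width 5 starting at
100; the variable names are illustrative.

```python
from collections import Counter

# All 50 readings, row by row.
times = [
    110, 110, 115, 114, 116,
    120, 112, 116, 112, 117,
    120, 112, 114, 115, 120,
    119, 114, 115, 120, 100,
    113, 113, 110, 110, 119,
    115, 116, 115, 115, 114,
    118, 114, 116, 116, 112,
    115, 116, 115, 115, 114,
    120, 117, 117, 117, 115,
    120, 116, 118, 139, 118,
]

class_width = 5   # difference between consecutive lower class limits
lowest = 100      # lower limit of the first class

# Class index of each reading: 0 for 100-104, 1 for 105-109, and so on.
counts = Counter((t - lowest) // class_width for t in times)
n_classes = (max(times) - lowest) // class_width + 1

table = []
for k in range(n_classes):
    lo = lowest + k * class_width
    hi = lo + class_width - 1
    freq = counts.get(k, 0)
    table.append((f"{lo}-{hi}", freq, round(100 * freq / len(times), 1)))

print(f"{'Class':<10}{'Freq.':>6}{'%':>8}")
for label, freq, pct in table:
    print(f"{label:<10}{freq:>6}{pct:>8.1f}")
```

Percentages are rounded to one decimal place, as the question asks.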

-------------------------------------

I've also found that I forgot to specify the frequency in percentage
in Question 2. Here is the table again:

---------------------------------------
     Time |      Freq.      Freq. in %
----------+----------------------------
      100 |          1           2 %
      110 |          4           8 %
      112 |          4           8 %
      113 |          2           4 %
      114 |          6          12 %
      115 |         10          20 %
      116 |          7          14 %
      117 |          4           8 %
      118 |          3           6 %
      119 |          2           4 %
      120 |          6          12 %
      139 |          1           2 %
---------------------------------------

There. I hope these answers were clear enough. If you still have any
questions, just request a clarification, I will be more than happy to
clarify anything you need. Otherwise, I await your comments and final
rating.

Best wishes!
elmarto

Clarification of Answer by elmarto-ga on 09 Jun 2003 12:12 PDT

The search terms I used in Google were:

mean median mode hyperstat
"class width"
"tally chart"
"standard deviation" hyperstat
"lower quartile"
"interquartile range"

For future reference, I suggest using HyperStat Online to find good
term definitions:

HyperStat Online
http://davidmlane.com/hyperstat/


Best luck!
elmarto
