Subject: data set analysis
Category: Science
Asked by: tferer-ga
List Price: $5.00
Posted: 05 May 2004 09:48 PDT
Expires: 04 Jun 2004 09:48 PDT
Question ID: 341517
I have a huge set of numbers (trip times from point A to point B). Most of the numbers fall into a "normal" range. Occasionally, when there is a delay, trip times go up for a short period. How do I find the number that represents a "normal", un-delayed trip?
There is no answer at this time.
Subject: Re: data set analysis
From: pctyszka-ga on 05 May 2004 13:03 PDT
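Just a comment: depending on what you are using the "normal" number for, I would try one of two things. Either calculate the average, (A + B + C + ... + Z) / (number of values), or find the median by sorting all the numbers from shortest time to longest and taking the middle one. With an odd count that is the value at position (number of values + 1) / 2. The median is less affected by the occasional delayed trip than the average is.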
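To make the comparison concrete, here is a minimal Python sketch of both calculations. The `trip_times` list is illustrative only; it reuses the sample data from the worked example later in this thread, and the variable names are my own.

```python
from statistics import mean, median

# Illustrative trip times in hours (the sample from the worked example
# later in this thread); the two largest values stand in for delayed trips.
trip_times = [12, 13, 13, 14, 15, 15, 16, 18, 18, 19, 24, 25, 27, 43, 44]

average = mean(trip_times)   # sum of all values / number of values
middle = median(trip_times)  # middle value after sorting

print(f"mean   = {average:.2f}")
print(f"median = {middle}")
```

On this sample the two long trips pull the mean above the median, which is the effect to watch for when delays are mixed into the data.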
Subject: Re: data set analysis
From: prssurcookr-ga on 06 May 2004 18:24 PDT
Calculating a mean and standard deviation for your population would be most informative.
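As a rough sketch of what that looks like in Python, the snippet below computes the mean and the population standard deviation, then applies the "2.5 standard deviations from the mean" rule of thumb quoted later in this thread. The data and variable names are illustrative assumptions, not the asker's real trip times.

```python
from statistics import mean, pstdev

# Illustrative trip times in hours (same sample as the worked example
# later in this thread), not real data from the asker.
trip_times = [12, 13, 13, 14, 15, 15, 16, 18, 18, 19, 24, 25, 27, 43, 44]

avg = mean(trip_times)
sd = pstdev(trip_times)  # population standard deviation

# Rule of thumb from the thread: flag values more than 2.5 standard
# deviations above the mean as possible delays.
cutoff = avg + 2.5 * sd
flagged = [t for t in trip_times if t > cutoff]

print(f"mean = {avg:.2f}, standard deviation = {sd:.2f}")
print(f"cutoff = {cutoff:.2f}, flagged = {flagged}")
```

On this small sample nothing is flagged: the two long trips inflate the standard deviation enough that neither exceeds the cutoff. That is one reason to compare the result with a median-based rule before trusting it.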
Subject: Re: data set analysis
From: tobytyler-ga on 08 May 2004 07:10 PDT
"I have a huge set of numbers (trip times from point a to b) most of the numbers fall into a "normal" range ....occasionally, when there is a delay, trip times go up for a short period. how do i find the number that represents a "normal" un-delayed trip?"

The way I read the problem, you have to:

Step (1) Determine which values are outliers (the delayed time values).
Step (2) Reject those from your data set.
Step (3) Find the average of the rest.

My high school mathematics book suggests that outliers may be either
(A) more than twice the interquartile range from the median; or
(B) more than 2.5 times the standard deviation from the mean (for continuous data).

If you have discrete data such as travel times, it might be easiest to use (A), twice the interquartile range.

An example problem:

Suppose my travel data is
12 13 13 14 15 15 16 18 18 19 24 25 27 43 44 hours

Step (1)
The median = 18 (the middle score)
Q1 = 14 (the middle score of the bottom half)
Q3 = 25 (the middle score of the top half)
The interquartile range is Q3 - Q1 = 25 - 14 = 11

Step (2)
We reject data greater than the median + 2 * interquartile range = 18 + 2*11 = 40, which removes 43 and 44.
We are left with
12 13 13 14 15 15 16 18 18 19 24 25 27 hours

Step (3)
The median of what is left is 16. (If you prefer the average, as in Step (3) above, the mean is 229/13, about 17.6.)

Extras

1) If you have an even number of scores, the median is the average of the middle two scores, e.g. the median of (1 5 6 9) is (5+6)/2 = 5.5.
2) A spreadsheet program makes it easy to sort the data automatically from smallest to greatest.
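The three steps above can be sketched in Python as follows. This is only one way to read the method: the function names `quartiles` and `reject_delays` and the `factor` parameter are my own, Q1 and Q3 are taken as the medians of the lower and upper halves as in the worked example, and only values above the cutoff are rejected, since delays make trips longer.

```python
from statistics import mean, median

def quartiles(sorted_values):
    """Return (Q1, median, Q3) using the halves method from the example:
    Q1 and Q3 are the medians of the lower and upper halves, and the
    middle value is left out of both halves when the count is odd.
    Needs at least two values, otherwise median() raises an error."""
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]
    upper = sorted_values[half + 1:] if n % 2 else sorted_values[half:]
    return median(lower), median(sorted_values), median(upper)

def reject_delays(values, factor=2.0):
    """Drop values above median + factor * IQR; return (kept, cutoff)."""
    data = sorted(values)
    q1, med, q3 = quartiles(data)
    cutoff = med + factor * (q3 - q1)
    kept = [v for v in data if v <= cutoff]
    return kept, cutoff

# Sample data from the worked example above (hours).
trip_times = [12, 13, 13, 14, 15, 15, 16, 18, 18, 19, 24, 25, 27, 43, 44]

kept, cutoff = reject_delays(trip_times)
print("cutoff:", cutoff)
print("kept:", kept)
print("median of kept:", median(kept))
print(f"mean of kept: {mean(kept):.2f}")
```

With the sample data this reproduces the hand calculation: the cutoff is 40, the values 43 and 44 are dropped, and the median of the remaining 13 values is 16. Other quartile conventions (for example those used by spreadsheet functions) can give slightly different Q1 and Q3, so the cutoff may shift a little on real data.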