If you see me write something that looks like this:
1 -[ 4 22 70 ]- 296
you're looking at a five-number summary. This is a handy way of summarising a numerical data set. Traditionally, the values are
minimum -[ 25th percentile | median | 75th percentile ]- maximum
The interpretation of the summary above is as follows.
- The typical data (the values in the middle) lie between 4 and 70 (between the 25th and 75th percentiles).
- The median, 22, is the value that divides the bottom half and top half of the data. So 50% of the data lie below 22 and 50% of the data lie above 22.
- The minimum value is 1.
- The maximum value is 296.
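As a rough sketch of how such a summary could be computed, here is one way to do it with Python's standard library. The function names `five_number_summary` and `format_summary` are made up for this illustration, and the "inclusive" quantile method is just one of several common conventions; other tools may give slightly different quartiles on small data sets.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    # quantiles(..., n=4) returns the three cut points that split the data
    # into four equally sized groups: the quartiles.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def format_summary(summary):
    """Render the summary in the 'min -[ q1 median q3 ]- max' layout."""
    lo, q1, median, q3, hi = summary
    return f"{lo:g} -[ {q1:g} {median:g} {q3:g} ]- {hi:g}"


print(format_summary(five_number_summary([1, 2, 3, 4, 5])))
# prints: 1 -[ 2 3 4 ]- 5
```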
I prefer replacing the minimum with the 1st percentile and the maximum with the 99th percentile since these are more stable (less sensitive to outliers in the data set).
1st perc. -[ 25th perc. | median | 75th perc. ]- 99th perc.
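To illustrate why the percentile-based endpoints are more stable, here is a small Python sketch. The function name `robust_five_number_summary` is hypothetical, and the "inclusive" quantile method is an assumption; the point is only that a single extreme value moves the maximum a long way but barely moves the 99th percentile.

```python
from statistics import quantiles


def robust_five_number_summary(data):
    """Return the 1st, 25th, 50th, 75th and 99th percentiles of data."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the
    # k-th percentile.
    cuts = quantiles(data, n=100, method="inclusive")
    return cuts[0], cuts[24], cuts[49], cuts[74], cuts[98]


# The values 0..99 plus one extreme outlier.
data = list(range(100)) + [10_000]

print(max(data))                         # the maximum is the outlier itself
print(robust_five_number_summary(data))  # the 99th percentile stays near 99
```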