Which Measure of Central Tendency to Use? Mode, Mean, or Median?

From Didaquest

Related Questions

  • When should I use mean vs median?
  • Why is the median useful?
  • What is more accurate, the median or mean?
  • Why is the median better than mean?
  • Why is median better than average?
Your question could be rephrased as “What is the most representative measure of the central tendency of a set of data?” That is what the mean, the median, and the mode each describe. If the data are normally distributed, the mean, median, and mode coincide (at least approximately, in a finite sample), so the choice makes no real difference. If the data are skewed, the mean and median can differ significantly. In that case the calculated mean is pulled toward the long tail and does not truly represent the “center” of the data points, and the median is preferred. If the data are approximately normal, the mean has the advantage that it takes all the data into consideration and not just the middle value. The median, on the other hand, is less affected by outliers or extreme values: adding one extreme high value may shift the mean quite a bit, yet the median is hardly impacted. As for your comment regarding “stability”, that is best assessed with a control chart.
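
To make the outlier point concrete, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up purely for illustration and do not come from the original question.

```python
from statistics import mean, median

# Hypothetical sample: nine similar measurements (made-up values).
values = [10, 11, 11, 12, 12, 12, 13, 13, 14]

# The same sample with one extreme high value added.
with_outlier = values + [100]

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean moves from 12 to 20.8; the median stays at 12.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

A single extreme value drags the mean well away from where most of the data sit, while the median does not move at all in this sample.
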
One thing we need to remember is that all the stats and calculations we do are just an effort to summarize a lot of information in an efficient, useful manner. How you summarize it depends mostly on what you will be doing with the information, but partly just on personal preference.

Estimating “central tendency” is perhaps the single most common and useful summary of a set of data. But which measure to use depends on what you want to do with that information. If you are going to place 100 pieces end to end and want to know the total length, then the average is the best number to use, because the mean multiplied by the number of pieces gives the total. If you want to say “Half the pieces are at least this long,” then the median is best. If you want to say “This is the most common size in the bunch,” then the mode is best.

(Just for the sake of argument, here’s a case where none of these are very useful. If you measure the voltage of a standard wall outlet, the mean and median will both come out to 0 V; it is bimodal with modes at about +/- 155 V (at least in the US). Any electrical engineer will tell you that you need to do an “RMS average” (root mean square) to find a representative measure of the voltage: about 110 V in this case.)

Note that none of this depends on the specific distribution! Which one you use depends on what you are going to do with the summary, not on which distribution the data come from.

It is quite possible that a summary of the central tendency is not enough information. You may want to know the shape, the spread, or the drift of the data. Each of these requires additional calculations, and each comes with choices. For the spread, do you report the range, the standard deviation, the 99% confidence interval, or some other number? And again, you need to decide why you want the information and what you are going to do with it before you choose.

So, back to the original post. Mean and median (and mode, and even RMS average) are all useful and reasonable numbers to calculate. Which to use depends on which fulfills your needs. Unfortunately, the choice is often made by a supplier/customer/boss who doesn’t really know why they want what they want, but you are the one stuck doing it ;-)

Tim F
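
The two examples in the answer above can be checked numerically. What follows is only a rough sketch: the piece lengths are invented, and the wall-outlet voltage is modelled as an ideal sine wave with the 155 V peak quoted in the answer, sampled at an arbitrarily chosen 1000 points over one cycle.

```python
import math
from statistics import mean, median, mode

# Part 1: pieces laid end to end (hypothetical lengths, made-up values).
lengths = [9, 10, 10, 10, 11, 12, 15]

total_from_mean = mean(lengths) * len(lengths)  # mean x count recovers the total
half_point = median(lengths)   # at least half the pieces are this long or longer
most_common = mode(lengths)    # the most frequent length

print("total:", total_from_mean, "median:", half_point, "mode:", most_common)

# Part 2: wall-outlet voltage, modelled as an ideal sine wave (assumption).
PEAK_VOLTS = 155.0  # peak value quoted in the answer
N = 1000            # samples across exactly one cycle (illustrative choice)
samples = [PEAK_VOLTS * math.sin(2 * math.pi * k / N) for k in range(N)]

avg = mean(samples)                              # close to 0 V
mid = median(samples)                            # close to 0 V
rms = math.sqrt(mean([v * v for v in samples]))  # close to PEAK_VOLTS / sqrt(2)

# The samples pile up near the two peaks rather than near zero,
# which is the sense in which the voltage is "bimodal".
near_peaks = sum(1 for v in samples if abs(v) > 0.9 * PEAK_VOLTS)
near_zero = sum(1 for v in samples if abs(v) < 0.1 * PEAK_VOLTS)

print("mean:", round(avg, 6), "median:", round(mid, 6), "rms:", round(rms, 1))
print("samples near a peak:", near_peaks, "samples near zero:", near_zero)
```

Under these assumptions the mean and median of the voltage are both essentially zero, while the RMS value comes out near 110 V, the only one of the three that says anything useful about the outlet.
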