When the data are skewed right, what is the typical relationship between the mean and median?

We are now going to classify data sets into \(\text{3}\) categories that describe the shape of the data distribution: symmetric, left skewed, right skewed. We can use this classification for any data set, but here we will look only at distributions with one peak. Most of the data distributions that you have seen so far have only one peak, so the plots in this section should look familiar.

Distributions with one peak are called unimodal distributions. Unimodal literally means having one mode. (Remember that a mode is a maximum in the distribution.)

A symmetric distribution is one where the left and right hand sides of the distribution are roughly equally balanced around the mean. The histogram below shows a typical symmetric distribution.

[Figure: histogram of a typical symmetric distribution.]

For symmetric distributions, the mean is approximately equal to the median. The tails of the distribution are the parts to the left and to the right, away from the mean. The tail is the part where the counts in the histogram become smaller. For a symmetric distribution, the left and right tails are equally balanced, meaning that they have about the same length.

The figure below shows the box and whisker diagram for a typical symmetric data set.

[Figure: box and whisker diagram of a typical symmetric data set.]

Another property of a symmetric distribution is that its median (second quartile) lies in the middle of its first and third quartiles. Note that the whiskers of the plot (the minimum and maximum) do not have to be equally far away from the median. In the next section on outliers, you will see that the minimum and maximum values do not necessarily match the rest of the data distribution well.

A distribution that is skewed right (also known as positively skewed) is shown below.

[Figure: histogram of a right skewed distribution.]

Now the picture is not symmetric around the mean anymore. For a right skewed distribution, the mean is typically greater than the median. Also notice that the tail of the distribution on the right hand (positive) side is longer than on the left hand side.

[Figure: box and whisker diagram of a right skewed data set.]

From the box and whisker diagram we can also see that the median is closer to the first quartile than the third quartile. The fact that the right hand side tail of the distribution is longer than the left can also be seen.

A distribution that is skewed left has exactly the opposite characteristics of one that is skewed right:

  • the mean is typically less than the median;
  • the tail of the distribution is longer on the left hand side than on the right hand side; and
  • the median is closer to the third quartile than to the first quartile.

The table below summarises the different categories.

                        Symmetric               Skewed right (positive)    Skewed left (negative)
Mean and median         mean ≈ median           typically mean > median    typically mean < median
Tails                   about the same length   longer on the right        longer on the left
Median and quartiles    midway between Q1, Q3   closer to Q1               closer to Q3

Textbook Exercise 11.5

Is the following data set symmetric, skewed right or skewed left? Motivate your answer.

\(\text{27}\) ; \(\text{28}\) ; \(\text{30}\) ; \(\text{32}\) ; \(\text{34}\) ; \(\text{38}\) ; \(\text{41}\) ; \(\text{42}\) ; \(\text{43}\) ; \(\text{44}\) ; \(\text{46}\) ; \(\text{53}\) ; \(\text{56}\) ; \(\text{62}\)

The statistics of the data set are

  • mean: \(\text{41,1}\);
  • first quartile: \(\text{33}\);
  • median: \(\text{41,5}\);
  • third quartile: \(\text{45}\).
We can conclude that the data set is skewed left for two reasons.
  • The mean is less than the median. There is only a very small difference between the mean and median, so this is not a very strong reason.
  • A better reason is that the median is closer to the third quartile than the first quartile.
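
The statistics above can be reproduced with a short script. This is only a sketch: the quartile rule in `quartile` (a hypothetical helper) is an assumption inferred from the listed values, namely that a quartile falling between two data points is taken as their average. Other textbooks and libraries use different rules and can give slightly different quartiles.

```python
from statistics import mean


def quartile(sorted_values, k):
    """Return the k-th quartile (k = 1, 2, 3) of an already sorted list.

    Assumed rule: the quartile sits at 0-based position (n - 1) * k / 4;
    when that position falls between two data values, average them.
    """
    lower, remainder = divmod((len(sorted_values) - 1) * k, 4)
    if remainder == 0:
        return sorted_values[lower]
    return (sorted_values[lower] + sorted_values[lower + 1]) / 2


data = sorted([27, 28, 30, 32, 34, 38, 41, 42, 43, 44, 46, 53, 56, 62])
data_mean = mean(data)
q1, median, q3 = (quartile(data, k) for k in (1, 2, 3))

print("mean:", round(data_mean, 1))
print("Q1, median, Q3:", q1, median, q3)
# The two indicators used in the answer above.
print("mean < median:", data_mean < median)
print("median closer to Q3:", (q3 - median) < (median - q1))
```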

A data set with this histogram:

[Figure: histogram of the data set.]

skewed right

A data set with this box and whisker plot:

[Figure: box and whisker plot of the data set.]

skewed right

A data set with this frequency polygon:

[Figure: frequency polygon of the data set.]

skewed left

The following data set:

\(\text{11,2}\) ; \(\text{5}\) ; \(\text{9,4}\) ; \(\text{14,9}\) ; \(\text{4,4}\) ; \(\text{18,8}\) ; \(-\text{0,4}\) ; \(\text{10,5}\) ; \(\text{8,3}\) ; \(\text{17,8}\)

The statistics of the data set are

  • mean: \(\text{9,99}\);
  • first quartile: \(\text{6,65}\);
  • median: \(\text{9,95}\);
  • third quartile: \(\text{13,05}\).
Note that we get contradicting indications from the different ways of determining whether the data is skewed right or left.
  • The mean is slightly greater than the median. This would indicate that the data set is skewed right.
  • The median is slightly closer to the third quartile than the first quartile. This would indicate that the data set is skewed left.
Since these differences are so small and since they contradict each other, we conclude that the data set is symmetric.
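
The reasoning in the two worked answers can be sketched as a small decision rule. The function name `classify_shape` is hypothetical, and the rule is a simplification: it calls the data symmetric whenever the two indicators disagree or either one is tied, and it does not measure how small the differences are.

```python
def classify_shape(mean, q1, median, q3):
    """Combine the mean/median and quartile indicators into one verdict."""
    # Indicator 1: a mean above the median suggests right skew, below suggests left.
    if mean > median:
        by_mean = "right"
    elif mean < median:
        by_mean = "left"
    else:
        by_mean = None

    # Indicator 2: a median nearer Q1 suggests right skew, nearer Q3 suggests left.
    lower_gap = median - q1
    upper_gap = q3 - median
    if lower_gap < upper_gap:
        by_quartiles = "right"
    elif lower_gap > upper_gap:
        by_quartiles = "left"
    else:
        by_quartiles = None

    # Only report skew when both indicators point the same way.
    if by_mean is not None and by_mean == by_quartiles:
        return "skewed " + by_mean
    return "symmetric"


# Summary statistics copied from the two worked answers above.
first = classify_shape(mean=41.1, q1=33, median=41.5, q3=45)
second = classify_shape(mean=9.99, q1=6.65, median=9.95, q3=13.05)
print(first)
print(second)
```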

Two data sets have the same range and interquartile range, but one is skewed right and the other is skewed left. Sketch the box and whisker plot for each of these data sets. Then, invent data (\(\text{6}\) points in each data set) that matches the descriptions of the two data sets.

Learner-dependent answer.
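
One possible approach, offered only as an illustration since many answers are valid: invent a right skewed data set, then reflect every point about the middle of its range. Reflection keeps the range and interquartile range but reverses the direction of the skew. The data values below are made up, and the quartiles come from Python's `statistics.quantiles`, whose interpolation rule may differ from the one used in this textbook.

```python
from statistics import mean, quantiles

# Invented example data: most points are low, with a long tail to the right.
right_skewed = [0, 1, 2, 3, 7, 10]

# Reflect each point about the midpoint of the range to reverse the skew.
low, high = min(right_skewed), max(right_skewed)
left_skewed = sorted(low + high - x for x in right_skewed)


def summary(data):
    """Return the statistics needed to compare the two data sets."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "mean": mean(data),
        "median": median,
    }


right_stats = summary(right_skewed)
left_stats = summary(left_skewed)
print(left_skewed)
print(right_stats)
print(left_stats)
```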


Skewed Distribution / Asymmetric Distribution


What is a Skewed Distribution?

If one tail is longer than the other, the distribution is skewed. These distributions are sometimes called asymmetric or asymmetrical distributions because they are not symmetric. Symmetry means that one half of the distribution is a mirror image of the other half. For example, the normal distribution is a symmetric distribution with no skew: its two tails are exactly the same.


[Figure: a normal curve.]

A left-skewed distribution has a long left tail. Left-skewed distributions are also called negatively-skewed distributions. That’s because there is a long tail in the negative direction on the number line. The mean is also to the left of the peak.

A right-skewed distribution has a long right tail. Right-skewed distributions are also called positive-skew distributions. That’s because there is a long tail in the positive direction on the number line. The mean is also to the right of the peak.



The normal distribution is the distribution you'll come across most often in introductory statistics, but skewed distributions are also common. For example, the chart below plots household income in the U.S. so that the distribution looks negatively skewed, with a very long left tail.

[Figure: income in the U.S. Image: NY Times.]

Interestingly, you can take the same data and present it as a right-skewed distribution. This positively skewed graph plots the number of households in each income bracket:

[Figure: number of households in each income bracket.]

Mean and Median in Skewed Distributions

In a normal distribution, the mean and the median are the same number. In a skewed distribution they become different numbers:

A left-skewed (negative) distribution will usually have the mean to the left of the median.

A right-skewed (positive) distribution will usually have the mean to the right of the median.

Effects on Statistics

The normal distribution is the easiest distribution to work with when you are learning statistics, but real-life distributions are usually skewed. With too much skewness, many statistical techniques don’t work well. As a result, more advanced techniques are used, including logarithmic transformations and quantile regression.

Skewed Left (Negative Skew)

A left skewed distribution is sometimes called a negatively skewed distribution because its long tail is in the negative direction on a number line.

A common misconception is that the position of the peak is what defines skewness; in other words, that a distribution whose peak sits to the left is left skewed. This is incorrect. Three things characterise a distribution that is skewed left:


  1. The mean is to the left of the peak. This is the main idea behind “skewness”, which is technically a measure of how asymmetrically values are distributed around the mean.
  2. The tail is longer on the left.
  3. In most cases, the mean is to the left of the median. This isn’t a reliable test for skewness though, as some distributions (e.g. many multimodal distributions) violate this rule. You should think of this as a “general idea” kind of rule, and not a set-in-stone one.
[Figure: in a left skewed distribution, the mean is to the left of the peak.]

Left Skewed and Numerical Values

Skewness can be shown with a list of numbers as well as on a graph. For example, take the numbers 1, 2, and 3. They are evenly spaced, with 2 as the mean ((1 + 2 + 3) / 3 = 6 / 3 = 2). If you add a number to the far left (think in terms of adding a value to the number line), the distribution becomes left skewed: -10, 1, 2, 3. Similarly, if you add a value to the far right, the set of numbers becomes right skewed:

1, 2, 3, 10.
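
The effect of the extra value can be checked numerically. The sketch below compares the mean and median of each list and computes a skewness coefficient as the third standardised moment (the mean cubed deviation divided by the cube of the population standard deviation). That formula is one common definition, chosen here as an assumption; the text above does not specify one.

```python
from statistics import mean, median


def skewness(values):
    """Third standardised moment: negative for left skew, positive for right skew."""
    centre = mean(values)
    second_moment = mean([(x - centre) ** 2 for x in values])
    third_moment = mean([(x - centre) ** 3 for x in values])
    return third_moment / second_moment ** 1.5


even = [1, 2, 3]
left = [-10, 1, 2, 3]
right = [1, 2, 3, 10]

for name, values in (("even", even), ("left", left), ("right", right)):
    print(name, "mean:", mean(values), "median:", median(values),
          "skewness:", round(skewness(values), 2))
```

The far-left value drags the mean below the median, and the far-right value drags it above, while the median barely moves.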

Left Skewed Boxplot

If the bulk of observations are on the high end of the scale, a boxplot is left skewed. Consequently, the left whisker is longer than the right whisker.

[Figure: a left skewed boxplot, showing a long left whisker. Image: SHU.EDU]

Left Skewed Histogram

Left skewed histograms are histograms with long tails on the left.

[Figure: a left skewed histogram.]

Skewed Right / Positive Skew

A right skewed distribution is sometimes called a positive skew distribution. That’s because the tail is longer in the positive direction of the number line.


Right Skewed Histogram

A histogram is right skewed if its peak sits towards the left and its tail stretches out to the right, in the positive direction.

[Figure: a histogram skewed to the right. Image: SUNY Oswego]

Right Skewed Box Plot

If a box plot is skewed to the right, the box shifts to the left and the right whisker gets longer. As a result, the mean is usually greater than the median.

[Figure: a right skewed box plot. Image: Seton Hall University]

Right Skewed Mean and Median

The rule of thumb is that in a right skewed distribution, the mean is usually to the right of the median.

However, like most rules of thumb, there are exceptions. Most right skewed distributions you come across in elementary statistics will have the mean to the right of the median. The Journal of Statistics Education [1] points out an exception to the rule:


In a data analysis course, skew is calculated with a third moment formula, and under that definition some distributions break the rule of thumb. The following distribution was made from the 2002 General Social Survey, in which respondents stated how many people older than 18 lived in their household. The graph is right skewed, but the mean is clearly to the left of the median.
[Figure: number of adults per household, 2002 General Social Survey. Image: Journal of Statistics Education]

There are other exceptions, most of which involve theoretical mathematics and calculus. The important point to note is that although the mean is generally to the right of the median in a right skewed distribution, it isn’t an absolute fact.
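
A small discrete example shows how such an exception can arise. The counts below are hypothetical, invented purely for illustration; they are not the survey data. Skewness is again taken as the third standardised moment, which is an assumption about the formula meant above.

```python
from statistics import mean, median

# Hypothetical counts (NOT the survey data): value -> number of observations.
counts = {1: 40, 2: 50, 3: 6, 4: 2, 5: 1, 6: 1}
values = [value for value, count in counts.items() for _ in range(count)]

centre = mean(values)
middle = median(values)

# Third standardised moment: positive means the distribution is right skewed.
second_moment = mean([(x - centre) ** 2 for x in values])
third_moment = mean([(x - centre) ** 3 for x in values])
skew = third_moment / second_moment ** 1.5

print("mean:", centre, "median:", middle, "skewness positive:", skew > 0)
```

Here the few large values create a long right tail and a positive skewness, but the heavy count at the lowest value pulls the mean below the median.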

Skew Normal Distribution

[Figure: the probability density function for the skew normal, showing various alphas. Image: skbkekas|Wikimedia Commons.]

The skew normal distribution is a normal distribution with an extra shape parameter, α (alpha). The shape parameter skews the normal distribution to the left or right. As only the skew of the normal distribution is being changed, the skew normal family shares many properties with the normal distribution.

The skew normal also has a number of interesting properties related to alpha:

  • If alpha is zero, the skew normal becomes the normal distribution.
  • If the sign of alpha changes, the distribution is reflected about the y-axis.
  • As alpha increases (in absolute value), the skew also increases.
  • As alpha tends towards infinity, the density converges to the half-normal density (a folded normal density centred at zero).

Therefore, the normal distribution can be seen as a special case of the skew normal distribution.
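
These properties can be checked numerically. The sketch below assumes the standard form of the skew normal density with zero location and unit scale, f(x) = 2 φ(x) Φ(αx), where φ and Φ are the standard normal density and cumulative distribution function. The formula is not given in the text above, and the function names are illustrative.

```python
import math


def normal_pdf(x):
    """Standard normal density, phi(x)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution function, Phi(x)."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def skew_normal_pdf(x, alpha):
    """Skew normal density with location 0 and scale 1 (assumed standard form)."""
    return 2 * normal_pdf(x) * normal_cdf(alpha * x)


x = 0.7
# alpha = 0 recovers the normal density.
print(math.isclose(skew_normal_pdf(x, 0), normal_pdf(x)))
# Changing the sign of alpha mirrors the curve about the y-axis.
print(math.isclose(skew_normal_pdf(x, 3), skew_normal_pdf(-x, -3)))
# For very large alpha the density approaches 2 * phi(x) for x > 0 and 0 for x < 0.
print(math.isclose(skew_normal_pdf(x, 1e6), 2 * normal_pdf(x)),
      skew_normal_pdf(-x, 1e6) < 1e-12)
```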

This is a relatively new distribution, introduced by O’Hagan and Leonard in 1976 in a paper on Bayes’ estimation. The work was a basic overview and it wasn’t until the 1980s that an in-depth analysis of the distribution was published. It is mainly used in threshold autoregressive stochastic processes and in time series analysis, but can also be used to model various phenomena in a wide range of fields from the sciences to the stock market.

References

[1] Journal of Statistics Education. Retrieved April 16, 2021 from: http://www.amstat.org/publications/jse/v13n2/vonhippel.html
Abramowitz, M. and Stegun, I. A. (Eds.). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing. New York: Dover, p. 928, 1972.
Kenney, J. F. and Keeping, E. S. “Skewness.” §7.10 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 100-101, 1962.
O’Hagan, A. and Leonard, T. (1976). Bayes estimation subject to uncertainty about parameter constraints. Biometrika, 63, 201-202.
