Stem and Leaf Plot

What is a stem and leaf plot?

A stem and leaf plot (also called a stem-and-leaf diagram) represents a set of numerical data in a way that shows its distribution. It gets its name because each data value is split into a stem, which is the first digit or digits, and a leaf, which is the last digit. What are the characteristics of the stem and the leaf in this plot? Let us find out.

Characteristics of a Stem and Leaf Plot

Following are the characteristics of a stem and leaf plot –

1. The stem is the first digit or digits, while the leaf is the last digit.
2. The stem and leaf plot is used when your data is not too large, i.e. about 15-150 data points.
3. The stem and leaf plot is drawn in a table with two columns.
4. The stems are listed down in the left column. Each stem is listed, even if some stems have no leaves.
5. The leaves are listed in increasing order in a row to the right of each corresponding stem.

Let us understand it using an example.

Example: Suppose we have the following ages in years for 15 people, obtained from a certain survey – 70, 56, 37, 69, 70, 40, 66, 53, 43, 70, 54, 42, 54, 48, 68

How can we plot the above information using a stem and leaf plot? Let us find out.

Solution

In the stem and leaf plot of this data, the row 3 | 7 represents the age 37. Similarly, the row 4 | 0 2 3 8 represents the ages 40, 42, 43 and 48 from the survey: 4 is the stem in each case, and 0, 2, 3 and 8 are the leaves. From this, we can also say that in this stem and leaf plot the stem unit represents tens and the leaf unit represents ones. The 3 stem can hold any number from 30 to 39, and the same holds for the other stems.
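The plot for this example can also be produced programmatically. The following is a minimal sketch, assuming non-negative whole-number data where the stem is the tens part and the leaf is the ones digit; the function name `stem_and_leaf` is a hypothetical helper chosen for illustration.

```python
from collections import defaultdict

# Ages from the survey in the example above.
ages = [70, 56, 37, 69, 70, 40, 66, 53, 43, 70, 54, 42, 54, 48, 68]


def stem_and_leaf(values):
    """Group non-negative whole numbers into {stem: sorted list of leaves}."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # tens part, ones digit
        rows[stem].append(leaf)
    # List every stem between the smallest and the largest, even if it has no leaves.
    return {stem: rows.get(stem, []) for stem in range(min(rows), max(rows) + 1)}


plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
# 3 | 7
# 4 | 0 2 3 8
# 5 | 3 4 4 6
# 6 | 6 8 9
# 7 | 0 0 0
```

Sorting the values first means the leaves in each row come out in increasing order without a separate sorting step.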

What can be concluded from the above stem and leaf plot? The conclusions can be listed as –

• The minimum age in the survey is 37 years and the maximum age is 70 years.
• The most frequent age (or the mode) in this data is 70 years because it occurs 3 times. There is no other value that occurs more than that.
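These conclusions can be checked directly against the raw ages. The snippet below is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter

ages = [70, 56, 37, 69, 70, 40, 66, 53, 43, 70, 54, 42, 54, 48, 68]

youngest = min(ages)
oldest = max(ages)
# most_common(1) returns the single most frequent value together with its count.
mode_age, mode_count = Counter(ages).most_common(1)[0]

print(youngest, oldest, mode_age, mode_count)  # 37 70 70 3
```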

Now that we have learnt what is meant by stem and leaf plot, let us understand how to read it.

How to read a stem and leaf plot?

As we have learnt, a stem and leaf plot is used for the schematic representation of data. So, how do we read a stem and leaf plot diagram if we have one? Let us understand this by an example.

Suppose we are given the following stem and leaf plot.

What is the interpretation of this plot? Let us find out.

We have learnt that the stems of a stem and leaf plot represent the tens, while the leaves represent the ones. Using this characteristic of the stem and leaf plot, we can identify the following from the above plot –

Minimum Value – Notice that the smallest stem is 14 and that there are two leaves against it, 4 and 7. This means that the stem 14 with the leaves 4 and 7 represents the numbers 144 and 147 respectively. Since these are the lowest values in the stem and leaf plot, the minimum value of the data represented by this plot is 144.

Now, let us look at the maximum value of the data represented by this stem and leaf plot.

Maximum Value – Notice that the largest stem is 18 and that there are two leaves against it, 0 and 8. This means that the stem 18 with the leaves 0 and 8 represents the numbers 180 and 188 respectively. Since these are the largest values in the stem and leaf plot, the maximum value of the data represented by this plot is 188.

Can we also find the mode using the stem and leaf plot? Let us find out.

Mode – We know that the mode of a data set is the value that appears most frequently in it. In a stem and leaf plot, a value that occurs more than once shows up as a leaf repeated within the same row. So, if we find the most repeated leaf in each row and then compare across rows, the leaf repeated the greatest number of times gives the mode. In the above plot, there are five zeros against the stem 16. This means that the number 160 is present 5 times in the data, which is the highest frequency among all the values. Hence, we can conclude that the mode of the given data set is 160.

Another reading from the stem and leaf plot is where most of the data is centred. We can look for the most crowded rows to see where the main cluster of data lies. In the above stem and leaf plot, the data are clustered in the 15 and 16 rows, i.e. from 150 to 169: 150 is the smallest value that row 15 can represent and 169 is the largest value that row 16 can represent. Row 15 has 11 leaves and row 16 has 14 leaves.
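The same reading steps can be sketched in code. The plot below is a hypothetical one made up for illustration: only the 14 row, the 18 row and the five zeros in the 16 row follow the description above, and every other leaf is an assumption.

```python
from collections import Counter

# Hypothetical plot for illustration only: stem = tens, leaf = ones.
plot = {
    14: [4, 7],
    15: [1, 3, 5, 5, 8],
    16: [0, 0, 0, 0, 0, 2, 6],
    17: [1, 4],
    18: [0, 8],
}

# Rebuild each data value from its stem and its leaf.
values = [stem * 10 + leaf for stem, leaves in plot.items() for leaf in leaves]

minimum = min(values)
maximum = max(values)
mode, frequency = Counter(values).most_common(1)[0]
# The stem whose row holds the most leaves marks the main cluster.
busiest_stem = max(plot, key=lambda stem: len(plot[stem]))

print(minimum, maximum, mode, frequency, busiest_stem)  # 144 188 160 5 16
```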

Now that we have understood how to read a stem and leaf plot, let us learn how to make one.

How to make a stem and leaf plot?

By the definition of a stem and leaf plot, we know that the stem and leaf plot is a plot used to represent numerical data by showing its distribution. Let us take an example to understand the plotting of a stem and leaf plot.

Suppose we are given the body mass indexes of 10 individuals as under –

25.0, 25.2, 24.2, 31.5, 17.4, 29.4, 19.2, 20.7, 24.2, 29.7

How will we plot the above values using a stem and leaf plot diagram? Let us find out.

We will use the following steps to make the stem and leaf plot.

1. First of all, we will sort the given data and present it in ascending order. We will get, 17.4, 19.2, 20.7, 24.2, 24.2, 25.0, 25.2, 29.4, 29.7, 31.5
2. Now we will find the largest and the smallest number in the data. From the above values, we can see that the smallest value in the data is 17.4 and the largest value in the data is 31.5.
3. Next, we will determine what values the stems will represent and what values the leaves will represent. The rule is that each stem can consist of any number of digits, but each leaf can have only the single last digit. Also, if the range of values is too great, the numbers can be rounded to limit the number of stems. For the values given to us, we can use the leaf to represent the first decimal place and the stem to represent the rest of the number, i.e. the tens and the ones places.
4. Also, we know that the minimum of our data is 17.4 (whose whole-number part is 17) and the maximum is 31.5 (whose whole-number part is 31), so our stems must go from 17 to 31. The plot will therefore contain 15 rows.
5. We know that the stem and leaf plot is drawn with two columns. Therefore, the stems will be listed down in the left column, ranging from 17 to 31.
6. Next, we will separate each data value into a stem consisting of the tens and ones and a leaf consisting of the first decimal digit. For example, for the data value 17.4, the stem will be 17 and the leaf will be 4. Therefore, we will write 4 in the row of the 17 stem.
7. Similarly, for the next data value, 19.2, the stem will be 19 and the leaf will be 2. We will write 2 in the row of the 19 stem.
8. We will continue until all data values are listed in the stem and leaf plot.
9. In the completed stem and leaf plot, we can see that some stems are empty, namely 18, 21, 22, 23, 26, 27, 28 and 30, as they have no corresponding values.
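The steps above can be sketched in code for the body mass index data. This is a minimal illustration, assuming every value has exactly one decimal place; each value is converted to whole tenths before splitting so that floating-point rounding does not disturb the leaves.

```python
from collections import defaultdict

bmi = [25.0, 25.2, 24.2, 31.5, 17.4, 29.4, 19.2, 20.7, 24.2, 29.7]

rows = defaultdict(list)
for value in sorted(bmi):
    # 17.4 becomes 174 tenths, which splits into stem 17 and leaf 4.
    stem, leaf = divmod(round(value * 10), 10)
    rows[stem].append(leaf)

# Every stem from the smallest to the largest is listed, even without leaves.
stems = range(min(rows), max(rows) + 1)
for stem in stems:
    print(f"{stem} | {' '.join(str(leaf) for leaf in rows.get(stem, []))}")

empty_stems = [stem for stem in stems if stem not in rows]
print("Empty stems:", empty_stems)
# Empty stems: [18, 21, 22, 23, 26, 27, 28, 30]
```

The loop prints 15 rows, one for each stem from 17 to 31, with the leaves 2 2 in the 24 row and 4 7 in the 29 row.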

Now that we have learnt how to plot as well as read stem and leaf plots, let us move on and learn the types of stem and leaf plots that we have.

Types of stem and leaf plots

There are three kinds of stem and leaf plots in general –

1. Simple Stem and Leaf Plots
2. Split stem and leaf plots
3. Back-to-back stem and leaf plots

Let us understand them one by one.

Simple Stem and Leaf Plots

The general stem and leaf plots that we make, such as the ones we have discussed above, are known as simple stem and leaf plots. In these plots, each stem value is listed only once, no matter how many leaves it has.

Let us again take an example. Suppose we need to make a stem and leaf plot of heights in cm of 30 people in a survey, the values of which are –

147, 150, 153, 155, 155, 155, 156, 156, 156, 157, 158, 159, 160, 160, 160, 160, 161, 162, 163, 163, 163, 164, 167, 167, 169, 170, 172, 174, 180, 180

Now, the stem and leaf plot of the above data will be represented as  –

Split stem and leaf plots

When the leaves are too crowded, it may be desired to use split stem and leaf plots, where each stem is split into two equal parts. This may show additional patterns in our data distribution.

Let us understand it through the example we have considered above in the simple stem and leaf plot.

The Split stem and leaf plot of the same data would be –

Now, in the above split stem and leaf plot, we can see that

• The first 14 stem consists of the values from 140 to 144.
• The second 14 stem consists of the values from 145 to 149.
• The first 15 stem consists of the values from 150 to 154.
• The second 15 stem consists of the values from 155 to 159.
• The first 16 stem consists of the values from 160 to 164.
• The second 16 stem consists of the values from 165 to 169, and so on.

An important point to note here is that from the simple stem and leaf plot, we can conclude that the main cluster of data lies between 150 and 169 cm. However, from the split stem and leaf plot, we can conclude that the main cluster of data lies between 155 and 164 cm, which is a more precise conclusion.
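The effect of splitting can be seen by counting the leaves in each row. The sketch below uses the 30 heights from the example; the labels "0-4" and "5-9" for the two halves of a stem are an illustrative naming choice.

```python
from collections import Counter

heights = [147, 150, 153, 155, 155, 155, 156, 156, 156, 157, 158, 159,
           160, 160, 160, 160, 161, 162, 163, 163, 163, 164, 167, 167,
           169, 170, 172, 174, 180, 180]

# Simple plot: one row per stem (the tens part of each height).
simple_counts = Counter(h // 10 for h in heights)

# Split plot: each stem gets a lower row (leaves 0-4) and an upper row (leaves 5-9).
split_counts = Counter(
    (h // 10, "0-4" if h % 10 < 5 else "5-9") for h in heights
)

for (stem, half), count in sorted(split_counts.items()):
    print(f"{stem} ({half}): {count} leaves")
# 14 (5-9): 1 leaves
# 15 (0-4): 2 leaves
# 15 (5-9): 9 leaves
# 16 (0-4): 10 leaves
# 16 (5-9): 3 leaves
# 17 (0-4): 3 leaves
# 18 (0-4): 2 leaves
```

In the simple counts, the 15 and 16 rows hold 11 and 13 leaves. The split counts show that 19 of those 24 values fall in the upper half of 15 and the lower half of 16, i.e. between 155 and 164 cm.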

Back-to-back stem and leaf plots

Back-to-back stem and leaf plots are used to compare the distribution of numerical values across two groups. Let us consider an example.

Suppose we have the following data of heights of 20 males through a survey –

155, 156, 156, 160, 162, 162, 163, 164, 165, 167, 167, 167, 169, 169, 170, 170, 172, 174, 174, 178

Also, we have heights of 20 females through a survey as –

147, 150, 153, 155, 155, 156, 157, 158, 158, 158, 159, 159, 160, 160, 160, 160, 161, 163, 163, 165

For the above data we will make a back-to-back stem and leaf plot comparing males to females as –

In the above back to back stem and leaf plot, it is important to note that –

• The stem represents tens and leaves represent ones.
• The right-most column is for the female leaves and the left-most column is for the male leaves.
• The leaves in the right column are arranged in ascending order from left to right, while the leaves in the left column are arranged in descending order from left to right, so that on both sides the leaves increase as they move away from the stem.
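A back-to-back plot for the two height data sets can be sketched as follows. This is an illustration under the layout described above (male leaves on the left, female leaves on the right); the helper `leaves_by_stem` and the column width of 25 characters are assumptions made for this example.

```python
from collections import defaultdict

males = [155, 156, 156, 160, 162, 162, 163, 164, 165, 167,
         167, 167, 169, 169, 170, 170, 172, 174, 174, 178]
females = [147, 150, 153, 155, 155, 156, 157, 158, 158, 158,
           159, 159, 160, 160, 160, 160, 161, 163, 163, 165]


def leaves_by_stem(values):
    """Return {stem: leaves in ascending order}, with stem = tens, leaf = ones."""
    rows = defaultdict(list)
    for value in sorted(values):
        rows[value // 10].append(value % 10)
    return rows


male_rows = leaves_by_stem(males)
female_rows = leaves_by_stem(females)
all_stems = range(min(males + females) // 10, max(males + females) // 10 + 1)

lines = []
for stem in all_stems:
    # Male leaves are reversed so that they increase away from the stem.
    left = " ".join(str(leaf) for leaf in reversed(male_rows.get(stem, [])))
    right = " ".join(str(leaf) for leaf in female_rows.get(stem, []))
    lines.append(f"{left:>25} | {stem} | {right}")

print("\n".join(lines))
```

The shared stems run from 14 to 17. The 14 row has only a female leaf (7) and the 17 row has only male leaves, which makes the difference between the two groups easy to see.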

Now let us look at the advantages and disadvantages of stem and leaf plots.

Advantages of Stem and Leaf Plots

The following are the advantages of Stem and Leaf Plots –

1. These plots give you a quick overview of the distribution and you can also see the shape of the distribution.
2. They are also useful for highlighting the mode (the most common number in a data set) and for finding outliers.

Disadvantages of Stem and Leaf Plots

A disadvantage of stem and leaf plots is that they are only useful for small data sets, from about 15 to 150 data points.

Key Facts and Summary

1. A stem-and-leaf diagram is a schematic representation of a set of data.
2. In a stem and leaf plot of whole numbers, the stem unit represents tens and the leaf unit represents ones.
3. Each stem can consist of any number of digits, but each leaf can have only the single last digit.
4. In simple stem and leaf plots, each stem value is listed only once, no matter how many leaves it has.
5. There are three kinds of stem and leaf plots in general: simple stem and leaf plots, split stem and leaf plots, and back-to-back stem and leaf plots.