**Introduction**

We know that dispersion is the measurement of variations in the values of the variable. It measures the degree of scatteredness of the observations in a distribution around the central values. We shall now learn about two important measures of dispersion, namely, variance and standard deviation.

**What is Variance?**

The variance of a variable X is the arithmetic mean of the squares of all deviations of X from the arithmetic mean of the observations and is denoted by Var ( X ) or σ ^{2}. In other words, the variance measures the average squared distance of each data point from the mean, where the mean is the average of all data points. Let us now learn about some properties of variance.

**Properties of Variance**

Properties of variance include –

- In statistics and probability, a variance is always a non-negative number.
- Variance always has squared units.
- Variance treats all deviations from the mean as the same regardless of their direction.

**What is Standard Deviation?**

The positive square root of the variance of a variate X is known as its standard deviation and is denoted by σ. In other words, the standard deviation describes how spread out a group of numbers is around the mean, and it is obtained by taking the square root of the variance.

Thus,

Standard Deviation = $\sqrt{Var ( X )}$

It is important to note here that standard deviation and variance are both measures that tell us how spread out the numbers are. Because variance is expressed in squared units, it is harder to interpret directly. The standard deviation is expressed in the same units as the data, so it can be read as a typical distance of the observations from the mean.

Let us now learn about some properties of standard deviation.

**Properties of Standard Deviation**

Properties of standard deviation include –

- Standard deviation is sensitive to extreme values. A single very extreme value can increase the standard deviation and misrepresent the dispersion.
- For two data sets with the same mean, the one with the larger standard deviation is the one in which the data is more spread out from the centre.
- Standard deviation is equal to 0 if all values are equal. This is because all values are then equal to the mean.
- Standard deviation is useful when comparing the spread of two separate data sets that have approximately the same mean.
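Two of these properties are easy to check numerically. The sketch below uses `statistics.pstdev` from Python's standard library, which divides by n as in the definition above. The data sets are made up purely for illustration.

```python
from statistics import pstdev

# Hypothetical data sets, chosen only to illustrate the properties above.
equal_values = [7, 7, 7, 7, 7]
typical = [48, 50, 51, 49, 52]
with_outlier = [48, 50, 51, 49, 152]  # same data, but one value is extreme

sd_equal = pstdev(equal_values)    # every value equals the mean
sd_typical = pstdev(typical)
sd_outlier = pstdev(with_outlier)

print(sd_equal)                    # zero spread
print(sd_typical, sd_outlier)      # the single extreme value inflates the result
```

The second and third lists differ in one observation only, yet their standard deviations differ greatly, which is why an extreme value can misrepresent the dispersion of the rest of the data.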

Now let us learn about the calculations of variance and standard deviation in different cases, namely,

- Individual Observations
- Discrete Frequency Distribution
- Continuous or Grouped Frequency Distribution

**Variance of Individual Observations**

If x_{1}, x_{2}, x_{3}, ……, x_{n} are n values of a variable X, then,

Variance ( X ) = $\frac{\sum\limits^n_{i=1}( x_i - \overline{X})^{2}}{n}$

Also,

Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} x_i^{2} - \left[ \frac{1}{n} \sum\limits^n_{i=1} x_i \right]^{2}$

Another formula for variance is given by

Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} d_i^{2} - \left[ \frac{1}{n} \sum\limits^n_{i=1} d_i \right]^{2}$, where d_{i} = x_{i} – A are the deviations from an arbitrary point A.
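A quick numerical check can make the deviation form concrete. The sketch below compares the definition (the mean of squared deviations from the mean) with the form based on deviations d_i = x_i − A, taking every frequency as 1. The data set, the point A = 5 and the function names are assumptions made for illustration.

```python
def variance_by_definition(xs):
    """Mean of the squared deviations from the arithmetic mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def variance_by_deviations(xs, a):
    """Shortcut form using deviations d_i = x_i - a from an arbitrary point a."""
    n = len(xs)
    d = [x - a for x in xs]
    return sum(di ** 2 for di in d) / n - (sum(d) / n) ** 2


data = [2, 4, 4, 4, 5, 5, 7, 9]             # hypothetical observations
v_def = variance_by_definition(data)
v_dev = variance_by_deviations(data, a=5)   # a = 5 is an arbitrary choice
v_raw = variance_by_deviations(data, a=0)   # a = 0: mean of squares minus squared mean
print(v_def, v_dev, v_raw)
```

All three values agree, which illustrates that the choice of A affects only the arithmetic, not the result.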

In the case of individual observations, variance and standard deviation may be computed by applying any of the above three formulas. We can now define the algorithm for finding the variance when deviations are taken from the actual mean. The algorithm is as follows –

**Algorithm for finding the variance when deviations are taken from the actual mean**

- Compute the mean $\overline{X}$ of the given observations x_{1}, x_{2}, x_{3}, x_{4}, ……. x_{n}.
- Take the deviations of the observations from the mean, i.e. find $x_i - \overline{X}$ ; i = 1, 2, 3, ………, n.
- Square the deviations obtained in the above step and obtain the sum $\sum\limits^n_{i=1}( x_i - \overline{X})^{2}$ .
- Divide the sum $\sum\limits^n_{i=1}( x_i - \overline{X})^{2}$ obtained in the above step by n. This gives us the variance of X.
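The four steps above can be sketched in Python as follows. The function name `variance_steps` and the sample data are illustrative assumptions. The function returns the intermediate columns as well, so they can be laid out in a table.

```python
import math


def variance_steps(observations):
    """Follow the four steps above and return the intermediate results."""
    n = len(observations)
    mean = sum(observations) / n                     # step 1: mean
    deviations = [x - mean for x in observations]    # step 2: x_i - mean
    squared = [d ** 2 for d in deviations]           # step 3: squared deviations
    variance = sum(squared) / n                      # step 4: divide the sum by n
    return mean, deviations, squared, variance


# Hypothetical data, used only to exercise the steps.
mean, deviations, squared, variance = variance_steps([4, 8, 6, 2])
std_dev = math.sqrt(variance)
print(mean, deviations, squared, variance, std_dev)
```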

Let us understand this through an example.

**Example** Compute the variance and the standard deviation of the following data –

65, 68, 58, 44, 48, 45, 60, 62, 60, 50

**Solution** We have been given the data – 65, 68, 58, 44, 48, 45, 60, 62, 60, 50. We are required to find the variance and the standard deviation of the given data.

Let $\overline{X}$ be the mean of the given set of observations. Then,

$\overline{X} = \frac{65+ 68+ 58+ 44 + 48+ 45 + 60+ 62+ 60 + 50}{10} = \frac{560}{10}$ = 56

Let us now compute the variance of the given data

| x_{i} | $x_i - \overline{X} = x_i - 56$ | $( x_i - \overline{X} )^{2}$ |
| --- | --- | --- |
| 65 | 9 | 81 |
| 58 | 2 | 4 |
| 68 | 12 | 144 |
| 44 | -12 | 144 |
| 48 | -8 | 64 |
| 45 | -11 | 121 |
| 60 | 4 | 16 |
| 62 | 6 | 36 |
| 60 | 4 | 16 |
| 50 | -6 | 36 |
| Total | | $\sum ( x_i - \overline{X} )^{2}$ = 662 |

We can see that the number of observations is 10. Therefore, n = 10

Variance = $\frac{\sum\limits^n_{i=1}( x_i - \overline{X})^{2}}{n} = \frac{662}{10}$ = 66.2

Hence, Standard Deviation = $\sqrt{Variance} = \sqrt{66.2}$ ≈ 8.14

Therefore, for the given set of observations,

**Variance = 66.2 and Standard Deviation ≈ 8.14**
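The result can be cross-checked with Python's standard library. The sketch below relies on `pvariance` and `pstdev` from the `statistics` module, which divide by n as in the formula used here. The same module's `variance` and `stdev` divide by n − 1 instead, so they return larger values and should not be confused with the measures defined in this article.

```python
from statistics import pstdev, pvariance, variance

data = [65, 68, 58, 44, 48, 45, 60, 62, 60, 50]

pop_var = pvariance(data)     # divides by n, matching the formula in this section
pop_sd = pstdev(data)         # square root of the population variance
sample_var = variance(data)   # divides by n - 1, so it is larger

print(pop_var, round(pop_sd, 3), round(sample_var, 2))
```

Here the standard deviation evaluates to roughly 8.136.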

**Variance of Discrete Frequency Distribution**

If the values x_{i} of the variable X or the frequencies f_{i} (or both) are large, we take deviations of the values of the variable X from an arbitrary point, say A. If d_{i} = x_{i} – A , i = 1, 2, 3, …….., n, then the formula of variance reduces to

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right]^{2}$, where N = $\sum f_i$

Sometimes, the deviations d_{i} = x_{i} – A are divisible by a common number, say h. If we define

u_{i} = $\frac{x_i-A}{h} = \frac{d_i}{h}$, i = 1, 2, 3, ……., n, then we obtain the following formula for variance,

Var ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$
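One way to see where the factor h^{2} comes from is sketched below. It uses the definitions above and writes N = $\sum f_i$. The shorthand $\overline{u} = \frac{1}{N} \sum f_i u_i$ is introduced only for this derivation and is not used elsewhere in the article.

$$
\begin{aligned}
x_i &= A + h u_i \quad \Rightarrow \quad \overline{X} = A + h \overline{u}, \\
x_i - \overline{X} &= h ( u_i - \overline{u} ), \\
\mathrm{Var}(X) &= \frac{1}{N} \sum f_i ( x_i - \overline{X} )^{2} = h^{2} \cdot \frac{1}{N} \sum f_i ( u_i - \overline{u} )^{2} \\
&= h^{2} \left[ \frac{1}{N} \sum f_i u_i^{2} - 2 \overline{u} \cdot \frac{1}{N} \sum f_i u_i + \overline{u}^{2} \right] \\
&= h^{2} \left[ \frac{1}{N} \sum f_i u_i^{2} - \overline{u}^{2} \right].
\end{aligned}
$$

Setting h = 1, so that u_i = d_i, gives the formula in terms of d_i.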

Let us now learn about the algorithm that is used to obtain the variance using the formula –

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i ( x_i - \overline{X} )^{2}$

**Algorithm for finding Variance of a Discrete Frequency Distribution**

- Obtain the given frequency distribution.
- Find the mean $\overline{X}$ of the given frequency distribution.
- Compute the deviations ( x_{i} – $\overline{X}$ ) from the mean $\overline{X}$ .
- Find the squares of the deviations obtained in the previous step.
- Multiply the squared deviations by the respective frequencies and obtain the total $\sum\limits^n_{i=1} f_i ( x_i - \overline{X})^{2}$.
- Divide the total obtained in the previous step by N = $\sum{f_i}$ to obtain the variance.
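As a minimal sketch of these steps, the function below takes the values and their frequencies as two parallel lists. The name `frequency_variance` and the small distribution used to exercise it are assumptions for illustration.

```python
def frequency_variance(values, frequencies):
    """Variance of a discrete frequency distribution, deviations from the actual mean."""
    total = sum(frequencies)                                         # N = sum of f_i
    mean = sum(f * x for x, f in zip(values, frequencies)) / total   # (1/N) * sum f_i x_i
    weighted = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies))
    return weighted / total                                          # divide the total by N


var = frequency_variance([1, 2, 3], [1, 2, 1])   # hypothetical distribution
print(var)
```

For this small distribution the mean is 2, only the values 1 and 3 contribute a squared deviation of 1 each, and the total of 2 divided by N = 4 gives 0.5.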

Let us understand the above algorithm through an example.

**Example** Find the variance and standard deviation of the following frequency distribution

| Variable ( x_{i} ) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency ( f_{i} ) | 4 | 4 | 5 | 15 | 8 | 5 | 4 | 5 |

**Solution** We have been given the frequency distribution as –

| Variable ( x_{i} ) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency ( f_{i} ) | 4 | 4 | 5 | 15 | 8 | 5 | 4 | 5 |

We are required to find the variance and the standard deviation of this frequency distribution. Let us use the above algorithm for this purpose in the following table –

| Variable x_{i} | Frequency f_{i} | f_{i}x_{i} | x_{i} – $\overline{X}$ = x_{i} – 9 | (x_{i} – $\overline{X}$ )^{2} | f_{i} (x_{i} – $\overline{X}$ )^{2} |
| --- | --- | --- | --- | --- | --- |
| 2 | 4 | 8 | -7 | 49 | 196 |
| 4 | 4 | 16 | -5 | 25 | 100 |
| 6 | 5 | 30 | -3 | 9 | 45 |
| 8 | 15 | 120 | -1 | 1 | 15 |
| 10 | 8 | 80 | 1 | 1 | 8 |
| 12 | 5 | 60 | 3 | 9 | 45 |
| 14 | 4 | 56 | 5 | 25 | 100 |
| 16 | 5 | 80 | 7 | 49 | 245 |
| Total | N = $\sum{f_i}$ = 50 | $\sum{f_i}{x_i}$ = 450 | | | $\sum{f_i}( x_i - \overline{X})^{2}$ = 754 |

Here, N = 50

$\sum{f_i}{x_i}$ = 450

and,

$\sum{f_i}( x_i - \overline{X})^{2}$ = 754

Therefore, $\overline{X} = \frac{1}{N} \sum{f_i}{x_i} = \frac{450}{50}$ = 9

We know that,

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1}{f_i}( x_i - \overline{X})^{2}$

Substituting the obtained values in the above formula, we get

Var ( X ) = $\frac{754}{50}$ = 15.08

Now that we have found the value of the variance, we shall calculate the value of the standard deviation.

We know that,

Standard Deviation = $\sqrt{Variance}$

Therefore, Standard Deviation = $\sqrt{15.08}$ = 3.88

Hence, for the given frequency distribution,

**Variance = 15.08 and Standard Deviation = 3.88**
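A frequency distribution is shorthand for a list in which each value x_i is repeated f_i times. As a cross-check, the sketch below expands the table into its 50 individual observations and applies `statistics.pvariance` from Python's standard library, which divides by the number of observations. The variable names are illustrative.

```python
from statistics import pvariance

values = [2, 4, 6, 8, 10, 12, 14, 16]
frequencies = [4, 4, 5, 15, 8, 5, 4, 5]

# Repeat each value f_i times to recover the individual observations.
observations = [x for x, f in zip(values, frequencies) for _ in range(f)]

count = len(observations)         # should equal N, the sum of the frequencies
expanded_var = pvariance(observations)
print(count, expanded_var)
```

The expanded list gives the same variance as the frequency-table calculation, so both routes describe the same data.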

In practice, this algorithm is rarely used to calculate the variance and standard deviation, because the calculation becomes tedious and time consuming when the actual mean is a fraction.

Instead, we can compute the variance by using the formula

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right]^{2}$ , where d_{i} = x_{i} – A, for which we may use the following algorithm –

**Alternate Algorithm for finding the variance of a Discrete Frequency Distribution**

- Take the deviations of the observations from an assumed mean, say A, and denote these deviations by d_{i}.
- Multiply the deviations by the respective frequencies and obtain the total $\sum{f_i}{d_i}$ .
- Obtain the squares of the deviations obtained in the first step, i.e. ( d_{i} )^{2}.
- Multiply the squared deviations by the respective frequencies and obtain the total $\sum{f_i}{d_i}^{2}$ .
- Substitute the values in the formula

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right]^{2}$
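The steps above can be sketched as a short Python function. The name `assumed_mean_variance` and the small distribution are hypothetical. Calling the function with two different assumed means illustrates that the choice of A does not change the variance.

```python
def assumed_mean_variance(values, frequencies, assumed_mean):
    """Variance via deviations d_i = x_i - A from an assumed mean A."""
    total = sum(frequencies)                                       # N
    d = [x - assumed_mean for x in values]                         # d_i
    sum_fd = sum(f * di for f, di in zip(frequencies, d))          # sum of f_i d_i
    sum_fd2 = sum(f * di ** 2 for f, di in zip(frequencies, d))    # sum of f_i d_i^2
    return sum_fd2 / total - (sum_fd / total) ** 2


values = [10, 20, 30, 40]     # hypothetical distribution
frequencies = [1, 3, 4, 2]
v1 = assumed_mean_variance(values, frequencies, assumed_mean=20)
v2 = assumed_mean_variance(values, frequencies, assumed_mean=30)
print(v1, v2)                 # both choices of A give the same variance
```

An assumed mean close to the centre of the data keeps the deviations small, which is the reason for using this form in hand calculation.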

Let us understand this algorithm through an example

**Example** Calculate the variance and standard deviation from the data given below –

| Size of Item | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 3 | 7 | 22 | 60 | 85 | 32 | 8 |

**Solution** We have been given the data –

| Size of Item | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 3 | 7 | 22 | 60 | 85 | 32 | 8 |

We are required to find the variance and standard deviation of this data for which we shall go by the algorithm defined above. We shall get the results in the following table –

| Size of Item x_{i} | f_{i} | d_{i} = x_{i} – 6.5 | (d_{i})^{2} | f_{i}d_{i} | f_{i}d_{i}^{2} |
| --- | --- | --- | --- | --- | --- |
| 3.5 | 3 | -3 | 9 | -9 | 27 |
| 4.5 | 7 | -2 | 4 | -14 | 28 |
| 5.5 | 22 | -1 | 1 | -22 | 22 |
| 6.5 | 60 | 0 | 0 | 0 | 0 |
| 7.5 | 85 | 1 | 1 | 85 | 85 |
| 8.5 | 32 | 2 | 4 | 64 | 128 |
| 9.5 | 8 | 3 | 9 | 24 | 72 |
| Total | N = $\sum{f_i}$ = 217 | | | $\sum{f_i}{d_i}$ = 128 | $\sum{f_i}{d_i}^{2}$ = 362 |

Here, N = 217, $\sum{f_i}{d_i}$ = 128 and $\sum{f_i}{d_i}^{2}$ = 362

Now, we shall find the variance of the given data.

We know that

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right]^{2}$

Substituting the values we get,

Var ( X ) = $\frac{362}{217} - \left( \frac{128}{217} \right)^{2}$ = 1.668 – 0.348 = 1.320

Now that we have obtained the value of the variance, we shall calculate the value of the standard deviation.

We know that

Standard deviation = $\sqrt{Variance}$

Therefore,

Standard deviation = $\sqrt{1.320}$ = 1.149

Hence, for the given data, we have,

**Variance = 1.320 and Standard deviation = 1.149**
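When two rounded terms are subtracted, the last digit of the answer can depend on how each term was rounded. As a sketch, the substitution can be repeated with exact fractions using Python's standard `fractions` module. The three totals are taken from the table above, and the variable names are illustrative.

```python
from fractions import Fraction
import math

n = 217          # N = sum of f_i, from the table
sum_fd = 128     # sum of f_i * d_i
sum_fd2 = 362    # sum of f_i * d_i^2

# Exact rational value of (1/N) * sum f d^2 - ((1/N) * sum f d)^2
variance = Fraction(sum_fd2, n) - Fraction(sum_fd, n) ** 2
std_dev = math.sqrt(variance)

print(variance, float(variance), std_dev)
```

Carried at full precision, the variance is approximately 1.3203 and the standard deviation approximately 1.149.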

**Variance of a Grouped or Continuous Frequency Distribution**

In a grouped or continuous frequency distribution any of the methods discussed so far for a discrete frequency distribution can be used. Let us understand the algorithm for computing variance of a grouped or continuous frequency distribution.

**Algorithm for computing variance of a grouped or continuous frequency distribution**

- Find the mid-points of the various classes.
- Take the deviations of these mid-points from an assumed mean. Denote these deviations by d_{i}.
- Divide the deviations in the previous step by the class interval h and denote them by u_{i} = $\frac{d_i}{h}$.
- Multiply the frequency of each class with the corresponding u_{i} and obtain $\sum{f_i}{u_i}$ .
- Square the values of u_{i} and multiply them with the corresponding frequencies and obtain $\sum{f_i}{u_i}^{2}$ .
- Substitute the values of $\sum{f_i}{u_i}, \sum{f_i}{u_i}^{2}$ and N = $\sum{f_i}$ in the formula

Var ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$
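A minimal sketch of this step-deviation procedure is given below. It assumes that the classes are supplied as (lower, upper) pairs of equal width h. The function name `grouped_variance` and the three-class distribution are hypothetical.

```python
import math


def grouped_variance(class_limits, frequencies, assumed_mean, h):
    """Step-deviation method: u_i = (mid_i - A) / h, Var = h^2 [mean(u^2) - mean(u)^2]."""
    total = sum(frequencies)                                         # N
    mids = [(lo + hi) / 2 for lo, hi in class_limits]                # class mid-points
    u = [(m - assumed_mean) / h for m in mids]                       # step deviations u_i
    mean_u = sum(f * ui for f, ui in zip(frequencies, u)) / total
    mean_u2 = sum(f * ui ** 2 for f, ui in zip(frequencies, u)) / total
    return h ** 2 * (mean_u2 - mean_u ** 2)


classes = [(0, 10), (10, 20), (20, 30)]   # hypothetical classes of width 10
var = grouped_variance(classes, [1, 2, 1], assumed_mean=15, h=10)
print(var, math.sqrt(var))
```

Because every observation in a class is represented by the class mid-point, the result is an approximation to the variance of the underlying raw data.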

Let us understand the above algorithm using an example.

**Example** Calculate the mean and standard deviation for the following distribution

| Marks | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 | 60 – 70 | 70 – 80 | 80 – 90 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| No of Students | 3 | 6 | 13 | 15 | 14 | 5 | 4 |

**Solution** We have been given the distribution above and are required to find its mean, variance and standard deviation.

In order to do so, we shall follow the steps of the algorithm defined above, taking the assumed mean A = 55 and the class interval h = 10, to obtain the following table –

| Marks | Frequency f_{i} | Mid Values x_{i} | u_{i} = $\frac{x_i - 55}{10}$ | f_{i}u_{i} | u_{i}^{2} | f_{i}u_{i}^{2} |
| --- | --- | --- | --- | --- | --- | --- |
| 20-30 | 3 | 25 | -3 | -9 | 9 | 27 |
| 30-40 | 6 | 35 | -2 | -12 | 4 | 24 |
| 40-50 | 13 | 45 | -1 | -13 | 1 | 13 |
| 50-60 | 15 | 55 | 0 | 0 | 0 | 0 |
| 60-70 | 14 | 65 | 1 | 14 | 1 | 14 |
| 70-80 | 5 | 75 | 2 | 10 | 4 | 20 |
| 80-90 | 4 | 85 | 3 | 12 | 9 | 36 |
| Total | N = $\sum{f_i}$ = 60 | | | $\sum{f_i}{u_i}$ = 2 | | $\sum{f_i}{u_i}^{2}$ = 134 |

N = 60

$\sum{f_i}{u_i}$ = 2

$\sum{f_i}{u_i}^{2}$ = 134

A = 55 and h = 10

Mean = $\overline{X} = A + h \left( \frac{1}{N} \sum{f_i}{u_i} \right) = 55 + 10 \left( \frac{2}{60} \right)$ = 55.333

Now, we will calculate the variance

Variance ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$

= $100 \left[ \frac{134}{60} - \left( \frac{2}{60} \right)^{2} \right]$ = 100 ( 2.2333 – 0.0011 ) = 223.22

Now, let us calculate the standard deviation

Standard deviation = $\sqrt{Variance} = \sqrt{223.22}$ = 14.94

Hence, for the given distribution,

**Variance ( X ) = 223.22 and Standard deviation = 14.94**

**Remember**

- Dispersion is the measurement of variations in the values of the variable. It measures the degree of scatteredness of the observations in a distribution around the central values.
- The variance of a variable X is the arithmetic mean of the squares of all deviations of X from the arithmetic mean of the observations and is denoted by Var ( X ) or σ^{2}.
- The positive square root of the variance of a variate X is known as its standard deviation and is denoted by σ.
- For individual observations, Variance ( X ) = $\frac{\sum\limits^n_{i=1} ( x_i - \overline{X} )^{2}}{n}$

Also, Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} x_i^{2} - \left[ \frac{1}{n} \sum\limits^n_{i=1} x_i \right]^{2}$

Another formula for variance is given by Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} d_i^{2} - \left[ \frac{1}{n} \sum\limits^n_{i=1} d_i \right]^{2}$, where d_{i} = x_{i} – A.

- For discrete frequency distributions, we have,

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i ( x_i - \overline{X} )^{2}$

Also, Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right]^{2}$

Sometimes, the deviations d_{i} = x_{i} – A are divisible by a common number, say h. If we define

u_{i} = $\frac{x_i-A}{h} = \frac{d_i}{h}$, i = 1, 2, 3, ……., n, then we obtain the following formula for variance,

Var ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$
