Introduction
We know that dispersion is the measurement of variations in the values of the variable. It measures the degree of scatteredness of the observations in a distribution around the central values. We shall now learn about two important measures of dispersion, namely, variance and standard deviation.
What is Variance?
The variance of a variable X is the arithmetic mean of the squares of all deviations of X from the arithmetic mean of the observations and is denoted by Var ( X ) or $\sigma^2$. In other words, the variance measures the average squared distance of each data point from the mean, that is, from the average of all data points. Let us now learn about some properties of variance.
Properties of Variance
Properties of variance include –
- In statistics and probability, a variance is always a non-negative number.
- Variance always has squared units.
- Variance treats all deviations from the mean as the same regardless of their direction.
What is Standard Deviation?
The positive square root of the variance of a variate X is known as its standard deviation and is denoted by σ. In other words, standard deviation describes how spread out a group of numbers is from the mean, and it is obtained by taking the square root of the variance.
Thus,
Standard Deviation = $\sqrt{Var ( X )}$
It is important to note here that standard deviation and variance are both measures that tell how spread out the numbers are. Variance is expressed in squared units, so it is harder to interpret directly. The standard deviation is expressed in the same units as the data, so it can be read as a typical distance of the observations from the mean.
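The relationship between the two measures can be checked with a minimal Python sketch. It assumes Python's standard `statistics` module, whose `pvariance` and `pstdev` functions divide by n as in the definitions above. The list `lengths_cm` is a hypothetical data set chosen only for illustration.

```python
import math
import statistics

# Hypothetical data: five lengths measured in centimetres.
lengths_cm = [2, 4, 4, 6, 9]

# Population variance (divides by n); its unit is square centimetres.
variance = statistics.pvariance(lengths_cm)

# Population standard deviation; its unit is centimetres, like the data.
std_dev = statistics.pstdev(lengths_cm)

# The standard deviation is the positive square root of the variance.
print(variance, std_dev, math.isclose(std_dev, math.sqrt(variance)))
```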
Let us now learn about some properties of standard deviation.
Properties of Standard Deviation
Properties of standard deviation include –
- Standard deviation is sensitive to extreme values. A single very extreme value can increase the standard deviation and misrepresent the dispersion.
- For two data sets with the same mean, the one with the larger standard deviation is the one in which the data is more spread out from the centre.
- Standard deviation is equal to 0 if all values are equal. This is because all values are then equal to the mean.
- Standard deviation is useful when comparing the spread of two separate data sets that have approximately the same mean.
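The properties listed above can be illustrated with a short Python sketch. All four data sets are hypothetical and were chosen by us so that three of them share the mean 50. `statistics.pstdev` is the standard-library population standard deviation.

```python
import statistics

# Hypothetical data sets, for illustration only.
equal_values = [7, 7, 7, 7]            # every value equals the mean
narrow = [48, 49, 50, 51, 52]          # mean 50, values close to the centre
wide = [30, 40, 50, 60, 70]            # mean 50, values far from the centre
with_outlier = [48, 49, 50, 51, 152]   # 'narrow' with one extreme value

sd_equal = statistics.pstdev(equal_values)
sd_narrow = statistics.pstdev(narrow)
sd_wide = statistics.pstdev(wide)
sd_outlier = statistics.pstdev(with_outlier)

print(sd_equal)              # zero, because all values are equal
print(sd_narrow < sd_wide)   # same mean, but 'wide' is more spread out
print(sd_outlier > sd_narrow)  # a single extreme value inflates the result
```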
Now let us learn about the calculations of variance and standard deviation in different cases, namely,
- Individual Observations
- Discrete Frequency Distribution
- Continuous or Grouped Frequency Distribution
Variance of Individual Observations
If x1, x2, x3, ……, xn are n values of a variable X, then,
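Variance ( X ) = $\frac{\sum\limits^n_{i=1}( x_i - \overline{X})^{2}}{n}$
Also,
Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} x_i^2 - \left[ \frac{1}{n} \sum\limits^n_{i=1} x_i \right]^2$
Another formula for variance is given by
Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} d_i^2 - \left[ \frac{1}{n} \sum\limits^n_{i=1} d_i \right]^2$, where $d_i = x_i - A$ are the deviations of the observations from an arbitrary point A.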
In the case of individual observations, variance and standard deviation may be computed by applying any of the above three formulas. We can now define the algorithm for finding the variance when deviations are taken from the actual mean. The algorithm is as follows –
Algorithm for finding the variance when deviations are taken from the actual mean
- Compute the mean $\overline{X}$ of the given observations x1, x2, x3, x4, ……. xn.
- Take the deviations of the observations from the mean, i.e. find $x_i - \overline{X}$ ; i = 1, 2, 3, ………, n.
- Square the deviations obtained in the above step and obtain the sum $\sum\limits^n_{i=1}( x_i - \overline{X})^{2}$ .
- Divide the sum $\sum\limits^n_{i=1}( x_i - \overline{X})^{2}$ obtained in the above step by n. This gives us the variance of X.
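The four steps above can be sketched in Python as follows. This is a minimal illustration and not a definitive implementation. The function name `variance_about_mean` and the list `sample` are hypothetical names introduced here.

```python
def variance_about_mean(observations):
    """Variance with deviations taken from the actual mean (divides by n)."""
    n = len(observations)
    mean = sum(observations) / n                       # step 1: compute the mean
    deviations = [x - mean for x in observations]      # step 2: deviations from the mean
    sum_of_squares = sum(d ** 2 for d in deviations)   # step 3: sum of squared deviations
    return sum_of_squares / n                          # step 4: divide by n


# Hypothetical observations: mean 6, squared deviations 9 + 1 + 1 + 9 = 20.
sample = [3, 5, 7, 9]
result = variance_about_mean(sample)
print(result)  # 5.0
```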
Let us understand this through an example.
Example Compute the variance and the standard deviation of the following observations –
65, 68, 58, 44, 48, 45, 60, 62, 60, 50
Solution We have been given the data – 65, 68, 58, 44, 48, 45, 60, 62, 60, 50. We are required to find the variance and the standard deviation of the given data.
Let $\overline{X}$ be the mean of the given set of observations. Then,
$\overline{X} = \frac{65+ 68+ 58+ 44 + 48+ 45 + 60+ 62+ 60 + 50}{10} = \frac{560}{10}$ = 56
Let us now compute the variance of the given data
$x_i$ | $x_i - \overline{X} = x_i - 56$ | $( x_i - \overline{X} )^{2}$ |
65 | 9 | 81 |
68 | 12 | 144 |
58 | 2 | 4 |
44 | -12 | 144 |
48 | -8 | 64 |
45 | -11 | 121 |
60 | 4 | 16 |
62 | 6 | 36 |
60 | 4 | 16 |
50 | -6 | 36 |
Total | | $\sum ( x_i - \overline{X} )^{2}$ = 662 |
We can see that the number of observations is 10. Therefore, n = 10
Variance = $\frac{\sum\limits^n_{i=1}( x_i - \overline{X})^{2}}{n} = \frac{662}{10}$ = 66.2
Hence, Standard Deviation = $\sqrt{Variance} = \sqrt{66.2}$ ≈ 8.14
Therefore, for the given set of observations,
Variance = 66.2 and Standard Deviation ≈ 8.14
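As a cross-check, the following minimal Python sketch evaluates three equivalent expressions for the variance of the observations in this example: the mean of the squared deviations from the actual mean, the mean of the squares minus the square of the mean, and the same shortcut applied to deviations from an assumed point A. The value A = 60 is our own arbitrary choice and is not part of the example.

```python
import math

data = [65, 68, 58, 44, 48, 45, 60, 62, 60, 50]
n = len(data)

# Formula 1: mean of squared deviations from the actual mean.
mean = sum(data) / n
var_direct = sum((x - mean) ** 2 for x in data) / n

# Formula 2: mean of the squares minus the square of the mean.
var_shortcut = sum(x * x for x in data) / n - (sum(data) / n) ** 2

# Formula 3: same shortcut on deviations d_i = x_i - A (A is an assumption).
A = 60
d = [x - A for x in data]
var_assumed = sum(di * di for di in d) / n - (sum(d) / n) ** 2

# All three agree up to floating-point rounding.
print(math.isclose(var_direct, var_shortcut), math.isclose(var_direct, var_assumed))
```

Shifting every observation by the same constant A does not change the spread, which is why the third expression gives the same variance for any choice of A.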
Variance of Discrete Frequency Distribution
If the values xi of the variable X or the frequencies fi (or both) are large, we take deviations of the values of the variable X from an arbitrary point, say A. If di = xi – A , i = 1, 2, 3, …….., n, then the formula of variance reduces to
Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^2 - \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right]^2$, where N = $\sum{f_i}$
Sometimes, di = xi – A are divisible by a common number say, h. If we define
ui = $\frac{x_i-A}{h} = \frac{d_i}{h}$, i = 1, 2, 3, ……., n, then we obtain the following formula for variance,
Var ( X ) = $h^2 \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^2 - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^2 \right]$
Let us now learn about the algorithm that is used to obtain the variance using the formula –
Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i ( x_i - \overline{X} )^2$
Algorithm for finding Variance of a Discrete Frequency Distribution
- Obtain the given frequency distribution.
- Find the mean $\overline{X}$ of the given frequency distribution.
- Compute deviations ( xi – $\overline{X}$ ) from the mean $\overline{X}$ .
- Find the squares of deviations obtained in the previous step.
- Multiply the squared deviations by respective frequencies and obtain the total [ $\sum\limits^n_{i=1} f_i ( x_i - \overline{X})^{2}$].
- Divide the total obtained in the previous step by N = $\sum{f_i}$ to obtain the variance.
Let us understand the above algorithm through an example.
Example Find the variance and standard deviation of the following frequency distribution
Variable ( xi ) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
Frequency ( fi ) | 4 | 4 | 5 | 15 | 8 | 5 | 4 | 5 |
Solution We have been given the frequency distribution as –
Variable ( xi ) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
Frequency ( fi ) | 4 | 4 | 5 | 15 | 8 | 5 | 4 | 5 |
We are required to find the variance and the standard deviation of this frequency distribution. Let us use the above algorithm for this purpose in the following table –
Variable $x_i$ | Frequency $f_i$ | $f_i x_i$ | $x_i - \overline{X} = x_i - 9$ | $( x_i - \overline{X} )^{2}$ | $f_i ( x_i - \overline{X} )^{2}$ |
2 | 4 | 8 | -7 | 49 | 196 |
4 | 4 | 16 | -5 | 25 | 100 |
6 | 5 | 30 | -3 | 9 | 45 |
8 | 15 | 120 | -1 | 1 | 15 |
10 | 8 | 80 | 1 | 1 | 8 |
12 | 5 | 60 | 3 | 9 | 45 |
14 | 4 | 56 | 5 | 25 | 100 |
16 | 5 | 80 | 7 | 49 | 245 |
Total | N = $\sum{f_i}$ = 50 | $\sum{f_i}{x_i}$ = 450 | | | $\sum{f_i}( x_i - \overline{X})^{2}$ = 754 |
Here, N = 50
$\sum{f_i}{x_i}$= 450
and,
$\sum{f_i}( x_i - \overline{X})^{2}$ = 754
Therefore, $\overline{X} = \frac{1}{N} \sum{f_i}{x_i}$
= $\frac{450}{50}$ = 9
Also,
We know that,
Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1}{f_i}( x_i - \overline{X})^{2}$
Substituting the obtained values in the above formula, we get
Var ( X ) = $\frac{754}{50}$ = 15.08
Now that we have found the value of the variance, we shall calculate the value of the standard deviation.
We know that,
Standard Deviation = $\sqrt{Variance}$
Therefore, Standard Deviation = $\sqrt{Variance} = \sqrt{15.08}$ = 3.88
Hence, for the given frequency distribution,
Variance = 15.08 and Standard Deviation = 3.88
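The algorithm used in this example can be sketched in Python as follows. This is a minimal illustration under our own naming. The function `frequency_variance` is hypothetical and simply follows the steps listed above for the given frequency distribution.

```python
import math


def frequency_variance(values, freqs):
    """Mean and variance of a discrete frequency distribution (divides by N)."""
    N = sum(freqs)                                           # total frequency
    mean = sum(f * x for x, f in zip(values, freqs)) / N     # mean of the distribution
    # Squared deviations from the mean, weighted by their frequencies.
    total = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return mean, total / N


x = [2, 4, 6, 8, 10, 12, 14, 16]
f = [4, 4, 5, 15, 8, 5, 4, 5]

mean, var_x = frequency_variance(x, f)
sd_x = math.sqrt(var_x)
print(mean, round(var_x, 2), round(sd_x, 2))  # 9.0 15.08 3.88
```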
In practice, this algorithm is rarely used to calculate the variance and standard deviation, because if the actual mean is a fraction, the calculation becomes quite tedious and time consuming.
In order to compute the variance by using the following formula,
Var ( X ) = $\frac{1}{N} \sum f_i d_i^2 - \left[ \frac{1}{N} \sum f_i d_i \right]^2$ , where di = xi – A, we may use the following algorithm –
Alternate Algorithm for finding the variance of a Discrete Frequency Distribution
- Take the deviations of observations from an assumed mean say, A and denote these deviations by di.
- Multiply the deviations by the respective frequencies and obtain the total $\sum{f_i}{d_i}$ .
- Obtain the squares of the deviations obtained in the first step, i.e. $d_i^2$ .
- Multiply the squared deviations by respective frequencies and obtain the total $\sum{f_i}{d_i}^{2}$ .
- Substitute the values in the formula
Var ( X ) = $\frac{1}{N} \sum f_i d_i^2 - \left[ \frac{1}{N} \sum f_i d_i \right]^2$
Let us understand this algorithm through an example
Example Calculate the variance and standard deviation from the data given below –
Size of Item | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 |
Frequency | 3 | 7 | 22 | 60 | 85 | 32 | 8 |
Solution We have been given the data –
Size of Item | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 |
Frequency | 3 | 7 | 22 | 60 | 85 | 32 | 8 |
We are required to find the variance and standard deviation of this data for which we shall go by the algorithm defined above. We shall get the results in the following table –
Size of Item $x_i$ | $f_i$ | $d_i = x_i - 6.5$ | $d_i^2$ | $f_i d_i$ | $f_i d_i^2$ |
3.5 | 3 | -3 | 9 | -9 | 27 |
4.5 | 7 | -2 | 4 | -14 | 28 |
5.5 | 22 | -1 | 1 | -22 | 22 |
6.5 | 60 | 0 | 0 | 0 | 0 |
7.5 | 85 | 1 | 1 | 85 | 85 |
8.5 | 32 | 2 | 4 | 64 | 128 |
9.5 | 8 | 3 | 9 | 24 | 72 |
Total | N = $\sum{f_i}$ = 217 | | | $\sum{f_i}{d_i}$ = 128 | $\sum{f_i}{d_i}^{2}$ = 362 |
Here, N = 217, $\sum{f_i}{d_i}$ = 128 and $\sum{f_i}{d_i}^{2}$ = 362
Now, we shall find the variance of the given data.
We know that
Var ( X ) = $\frac{1}{N} \sum f_i d_i^2 - \left[ \frac{1}{N} \sum f_i d_i \right]^2$
Substituting the values we get,
Var ( X ) = $\frac{362}{217} - ( \frac{128}{217} )^{2}$ = 1.668 – 0.348 = 1.320
Now that we have obtained the value of the variance, we shall calculate the value of the standard deviation.
We know that
Standard deviation = $\sqrt{Variance}$
Therefore,
Standard deviation = $\sqrt{1.320}$ = 1.149
Hence, for the given data, we have,
Variance = 1.320 and Standard deviation = 1.149
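The assumed-mean algorithm used in this example can be sketched in Python as follows. This is a minimal illustration. The function name `assumed_mean_variance` is hypothetical, and the frequencies are those used in the working table of the solution. The second call with A = 7.5 is our own addition, included to show that the result does not depend on the choice of A.

```python
import math


def assumed_mean_variance(values, freqs, A):
    """Variance from deviations d_i = x_i - A about an assumed mean A."""
    N = sum(freqs)
    d = [x - A for x in values]                              # deviations from A
    sum_fd = sum(f * di for f, di in zip(freqs, d))          # sum of f_i * d_i
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))    # sum of f_i * d_i^2
    return sum_fd2 / N - (sum_fd / N) ** 2


sizes = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
freqs = [3, 7, 22, 60, 85, 32, 8]   # frequencies as in the working table

var_a = assumed_mean_variance(sizes, freqs, A=6.5)   # A as in the solution
var_b = assumed_mean_variance(sizes, freqs, A=7.5)   # a different, arbitrary A
sd_a = math.sqrt(var_a)

print(var_a, sd_a, math.isclose(var_a, var_b))
```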
Variance of a Grouped or Continuous Frequency Distribution
In a grouped or continuous frequency distribution any of the methods discussed so far for a discrete frequency distribution can be used. Let us understand the algorithm for computing variance of a grouped or continuous frequency distribution.
Algorithm for computing variance of a grouped or continuous frequency distribution
- Find the mid-points of various classes.
- Take the deviations of these mid-points from an assumed mean. Denote these deviations by di.
- Divide the deviations in the previous step by the class interval h and denote them by ui = $\frac{d_i}{h}$
- Multiply the frequency of each class with the corresponding ui and obtain $\sum{f_i}{u_i}$ .
- Square the values of ui and multiply them with the corresponding frequencies and obtain $\sum{f_i}{u_i}^{2}$ .
- Substitute the values of $\sum{f_i}{u_i}, \sum{f_i}{u_i}^{2}$ and N = $\sum{f_i}$ in the formula
Var ( X ) = $h^2 \left[ \frac{1}{N} \sum f_i u_i^2 - \left( \frac{1}{N} \sum f_i u_i \right)^2 \right]$
Let us understand the above algorithm using an example.
Example Calculate the mean, variance and standard deviation for the following distribution
Marks | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 | 60 – 70 | 70 – 80 | 80 – 90 |
No of Students | 3 | 6 | 13 | 15 | 14 | 5 | 4 |
Solution We have been given the following distribution –
Marks | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 | 60 – 70 | 70 – 80 | 80 – 90 |
No of Students | 3 | 6 | 13 | 15 | 14 | 5 | 4 |
We are required to find its mean, variance and standard deviation. In order to do so, we shall follow the steps of the algorithm defined above, taking the assumed mean A = 55 and the class interval h = 10, to obtain the following table –
Class | Freq $f_i$ | Mid Values $x_i$ | $u_i = \frac{x_i - 55}{10}$ | $f_i u_i$ | $u_i^2$ | $f_i u_i^2$ |
20-30 | 3 | 25 | -3 | -9 | 9 | 27 |
30-40 | 6 | 35 | -2 | -12 | 4 | 24 |
40-50 | 13 | 45 | -1 | -13 | 1 | 13 |
50-60 | 15 | 55 | 0 | 0 | 0 | 0 |
60-70 | 14 | 65 | 1 | 14 | 1 | 14 |
70-80 | 5 | 75 | 2 | 10 | 4 | 20 |
80-90 | 4 | 85 | 3 | 12 | 9 | 36 |
Total | N = $\sum{f_i}$ = 60 | | | $\sum{f_i}{u_i}$ = 2 | | $\sum{f_i}{u_i}^{2}$ = 134 |
Here, N = 60
$\sum{f_i}{u_i}$ = 2
$\sum{f_i}{u_i}^{2}$ = 134
and h = 10
Mean = $\overline{X} = A + h \left( \frac{1}{N} \sum f_i u_i \right) = 55 + 10 \left( \frac{2}{60} \right)$ = 55.333
Now, we will calculate the variance
Variance ( X ) = $h^2 \left[ \frac{1}{N} \sum f_i u_i^2 - \left( \frac{1}{N} \sum f_i u_i \right)^2 \right]$
= $100 \left[ \frac{134}{60} - \left( \frac{2}{60} \right)^2 \right]$ = 100 [ 2.2333 – 0.0011 ] = 223.22
Now, let us calculate the standard deviation
Standard deviation = $\sqrt{Variance} = \sqrt{223.22}$ = 14.94
Hence, for the given distribution,
Mean = 55.333, Variance ( X ) = 223.22 and Standard deviation = 14.94
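The step-deviation algorithm for a grouped distribution can be sketched in Python as follows. This is a minimal illustration. The function name `grouped_variance` and its parameter names are hypothetical. The classes and frequencies are those of the example, with A = 55 and h = 10 as in the working table.

```python
import math


def grouped_variance(classes, freqs, assumed_mean, width):
    """Mean and variance of a grouped distribution by the step-deviation method."""
    midpoints = [(lo + hi) / 2 for lo, hi in classes]           # class mid-points x_i
    u = [(m - assumed_mean) / width for m in midpoints]         # u_i = (x_i - A) / h
    N = sum(freqs)
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))             # sum of f_i * u_i
    sum_fu2 = sum(f * ui ** 2 for f, ui in zip(freqs, u))       # sum of f_i * u_i^2
    mean = assumed_mean + width * sum_fu / N
    variance = width ** 2 * (sum_fu2 / N - (sum_fu / N) ** 2)
    return mean, variance


classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [3, 6, 13, 15, 14, 5, 4]

mean, var_x = grouped_variance(classes, freqs, assumed_mean=55, width=10)
sd_x = math.sqrt(var_x)
print(mean, var_x, sd_x)
```

Dividing the deviations by h keeps the intermediate numbers small, and the factor $h^2$ in the formula restores the original scale.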
Remember
- Dispersion is the measurement of variations in the values of the variable. It measures the degree of scatteredness of the observations in a distribution around the central values.
- The variance of a variable X is the arithmetic mean of the squares of all deviations of X from the arithmetic mean of the observations and is denoted by Var ( X ) or $\sigma^2$.
- The positive square root of the variance of a variate X is known as its standard deviation and is denoted by σ.
- For individual observations, Variance ( X ) = $\frac{\sum\limits^n_{i=1} ( {x_i}- \overline{X} )^{2}}{n}$
Also, Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} x_i^2 - \left[ \frac{1}{n} \sum\limits^n_{i=1} x_i \right]^2$
Another formula for variance is given by Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} d_i^2 - \left[ \frac{1}{n} \sum\limits^n_{i=1} d_i \right]^2$, where $d_i = x_i - A$
- For discrete frequency distributions, we have,
Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1}{f_i}({x_i} - \overline{X})^{2}$
Also, Var ( X ) = $\frac{1}{N} \sum f_i d_i^2 - \left[ \frac{1}{N} \sum f_i d_i \right]^2$, where di = xi – A
Sometimes, di = xi – A are divisible by a common number say, h. If we define
ui = $\frac{{x_i}-A}{h} = \frac{d_i}{h}$, i = 1, 2, 3, ……., n, then we obtain the following formula for variance, Var ( X ) = $h^2 \left[ \frac{1}{N} \sum f_i u_i^2 - \left( \frac{1}{N} \sum f_i u_i \right)^2 \right]$