Standard Deviation And Variance

Introduction

We know that dispersion is the measurement of variations in the values of the variable. It measures the degree of scatteredness of the observations in a distribution around the central values. We shall now learn about two important measures of dispersion, namely, variance and standard deviation.

What is Variance?

The variance of a variable X is the arithmetic mean of the squares of all deviations of X from the arithmetic mean of the observations, and is denoted by Var ( X ) or $\sigma^{2}$. In other words, the variance measures the average squared distance of each data point from the mean, where the mean is the average of all data points. Let us now learn about some properties of variance.

Properties of Variance

Properties of variance include – 

  1. In statistics and probability, variance is always a non-negative number.
  2. Variance is always expressed in squared units of the data.
  3. Because the deviations are squared, variance treats all deviations from the mean the same way regardless of their direction.

What is Standard Deviation?

The positive square root of the variance of a variate X is known as its standard deviation and is denoted by σ. In other words, the standard deviation describes how spread out a group of numbers is from the mean, and it is obtained by taking the square root of the variance.

Thus,

Standard Deviation = $\sqrt{Var ( X )}$

It is important to note here that standard deviation and variance are both measures of how spread out the numbers are. Because variance is expressed in squared units, it is harder to interpret directly. The standard deviation is expressed in the same units as the data, so it can be read as a typical distance of the observations from the mean.

Let us now learn about some properties of standard deviation.

Properties of Standard Deviation

Properties of standard deviation include – 

  1. Standard deviation is sensitive to extreme values. A single very extreme value can increase the standard deviation and misrepresent the dispersion.
  2. For two data sets with the same mean, the one with the larger standard deviation is the one in which the data is more spread out from the centre.
  3. Standard deviation is equal to 0 if all values are equal. This is because all values are then equal to the mean.
  4. Standard deviation is useful when comparing the spread of two separate data sets that have approximately the same mean.
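Properties 1 and 3 can be checked numerically. The sketch below is only an illustration: the data values are made up, and it relies on `statistics.pstdev` from Python's standard library, which computes the population standard deviation (dividing by n, as this article does).

```python
from statistics import pstdev

# Illustrative data only (not taken from the article).
equal_values = [7, 7, 7, 7, 7]
typical = [10, 12, 11, 13, 12]
with_outlier = typical + [100]   # the same data with one extreme value appended

sd_equal = pstdev(equal_values)    # every value equals the mean, so the spread is zero
sd_typical = pstdev(typical)
sd_outlier = pstdev(with_outlier)  # the single extreme value inflates the spread

print(sd_equal, sd_typical, sd_outlier)
```

The third number printed is many times larger than the second, even though only one observation was added.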

Now let us learn about the calculations of variance and standard deviation in different cases, namely,

  1. Individual Observations
  2. Discrete Frequency Distribution
  3. Continuous or Grouped Frequency Distribution

Variance of Individual Observations

If $x_1, x_2, x_3, \ldots, x_n$ are n values of a variable X, then,

Variance ( X ) = $\frac{\sum\limits^n_{i=1}( x_i - \overline{X})^{2}}{n}$

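Also,

Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} x_i^{2} - \left( \frac{1}{n} \sum\limits^n_{i=1} x_i \right)^{2}$

Another formula for variance, where $d_i = x_i - A$ are the deviations from an arbitrary point A, is given by

Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} d_i^{2} - \left( \frac{1}{n} \sum\limits^n_{i=1} d_i \right)^{2}$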
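As a supplementary check (not part of the original text), expanding the square in the defining formula and using $\overline{X} = \frac{1}{n} \sum\limits^n_{i=1} x_i$ shows why the shortcut form gives the same value as the definition:

$\frac{1}{n} \sum\limits^n_{i=1} ( x_i - \overline{X} )^{2} = \frac{1}{n} \sum\limits^n_{i=1} x_i^{2} - 2 \overline{X} \cdot \frac{1}{n} \sum\limits^n_{i=1} x_i + \overline{X}^{2} = \frac{1}{n} \sum\limits^n_{i=1} x_i^{2} - \overline{X}^{2}$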

In the case of individual observations, variance and standard deviation may be computed by applying any of the above three formulas. We can now define the algorithm for finding the variance when deviations are taken from the actual mean. The algorithm is as follows – 

Algorithm for finding the variance when deviations are taken from the actual mean

  1. Compute the mean $\overline{X}$ of the given observations $x_1, x_2, x_3, \ldots, x_n$.
  2. Take the deviations of the observations from the mean, i.e. find $x_i - \overline{X}$ for i = 1, 2, 3, …, n.
  3. Square the deviations obtained in the above step and obtain the sum $\sum\limits^n_{i=1}( x_i - \overline{X})^{2}$.
  4. Divide the sum $\sum\limits^n_{i=1}( x_i - \overline{X})^{2}$ obtained in the above step by n. This gives us the variance of X.
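The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `population_variance` and the sample data are hypothetical, and the function divides by n, as the algorithm does.

```python
from math import sqrt

def population_variance(values):
    """Variance with deviations taken from the actual mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n                      # step 1: compute the mean
    deviations = [x - mean for x in values]     # step 2: deviations from the mean
    sum_sq = sum(d * d for d in deviations)     # step 3: sum of squared deviations
    return sum_sq / n                           # step 4: divide by n

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values, not from the article
variance = population_variance(data)
std_dev = sqrt(variance)          # standard deviation is the positive square root
print(variance, std_dev)
```

For this made-up data set the mean is 5, the squared deviations sum to 32, and so the variance is 4 and the standard deviation is 2.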

Let us understand this through an example.

Example Compute the variance and the standard deviation of the following observations – 

  65, 68, 58, 44, 48, 45, 60, 62, 60, 50

Solution We have been given the data – 65, 68, 58, 44, 48, 45, 60, 62, 60, 50. We are required to find the variance and the standard deviation of the given data. 

Let $\overline{X}$ be the mean of the given set of observations. Then,

$\overline{X} = \frac{65+ 68+ 58+  44 + 48+  45 + 60+  62+  60 + 50}{10} = \frac{560}{10}$ = 56

Let us now compute the variance of the given data

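| $x_i$ | $x_i - \overline{X} = x_i - 56$ | $( x_i - \overline{X} )^{2}$ |
| --- | --- | --- |
| 65 | 9 | 81 |
| 58 | 2 | 4 |
| 68 | 12 | 144 |
| 44 | -12 | 144 |
| 48 | -8 | 64 |
| 45 | -11 | 121 |
| 60 | 4 | 16 |
| 62 | 6 | 36 |
| 60 | 4 | 16 |
| 50 | -6 | 36 |
| Total | | $\sum ( x_i - \overline{X} )^{2}$ = 662 |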

We can see that the number of observations is 10. Therefore, n = 10

Variance = $\frac{\sum\limits^n_{i=1}( x_i - \overline{X})^{2}}{n} = \frac{662}{10}$ = 66.2

Hence, Standard Deviation = $\sqrt{Variance} = \sqrt{66.2}$ ≈ 8.14

Therefore, for the given set of observations,

Variance = 66.2 and Standard Deviation ≈ 8.14
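The result can be cross-checked with Python's standard `statistics` module. This is offered as an optional check, assuming the population versions `pvariance` and `pstdev` (which divide by n, as in this article) rather than the sample versions `variance` and `stdev` (which divide by n – 1 and would give different numbers).

```python
from statistics import mean, pvariance, pstdev

observations = [65, 68, 58, 44, 48, 45, 60, 62, 60, 50]

x_bar = mean(observations)        # arithmetic mean of the observations
var_x = pvariance(observations)   # population variance (divides by n)
sd_x = pstdev(observations)       # positive square root of the variance

print(x_bar, var_x, sd_x)
```

Running this gives a mean of 56, a variance of 66.2 and a standard deviation of about 8.136.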

Variance of Discrete Frequency Distribution

If the values $x_i$ of the variable X or the frequencies $f_i$ (or both) are large, we take deviations of the values of X from an arbitrary point, say A. If $d_i = x_i - A$, i = 1, 2, 3, …, n, and N = $\sum f_i$, then the formula for variance reduces to

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right)^{2}$

Sometimes, the deviations $d_i = x_i - A$ are divisible by a common number, say h. If we define

$u_i = \frac{x_i - A}{h} = \frac{d_i}{h}$, i = 1, 2, 3, …, n, then we obtain the following formula for variance,

Var ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$
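The reasoning behind this formula can be filled in briefly. The following is a supplementary sketch in which $\overline{U} = \frac{1}{N} \sum f_i u_i$ is notation introduced here for the mean of the $u_i$. Since $x_i = A + h u_i$,

$\overline{X} = \frac{1}{N} \sum f_i ( A + h u_i ) = A + h \overline{U}$, so that $x_i - \overline{X} = h ( u_i - \overline{U} )$

and therefore

$\frac{1}{N} \sum f_i ( x_i - \overline{X} )^{2} = h^{2} \cdot \frac{1}{N} \sum f_i ( u_i - \overline{U} )^{2} = h^{2} \left[ \frac{1}{N} \sum f_i u_i^{2} - \overline{U}^{2} \right]$

Taking h = 1 gives the corresponding formula in terms of the deviations from A, which shows that shifting the data by A does not change the variance.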

Let us now learn about the algorithm that is used to obtain the variance using the formula – 

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i ( x_i - \overline{X} )^{2}$

Algorithm for finding Variance of a Discrete Frequency Distribution

  1. Obtain the given frequency distribution.
  2. Find the mean $\overline{X}$ of the given frequency distribution.
  3. Compute the deviations $x_i - \overline{X}$ from the mean $\overline{X}$.
  4. Find the squares of the deviations obtained in the previous step.
  5. Multiply the squared deviations by the respective frequencies and obtain the total $\sum\limits^n_{i=1} f_i ( x_i - \overline{X})^{2}$.
  6. Divide the total obtained in the previous step by N = $\sum{f_i}$ to obtain the variance.
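The steps above translate fairly directly into code. The following is a minimal sketch, with a hypothetical function name `frequency_variance` and a small made-up distribution; it assumes the values and their frequencies are supplied as two lists of equal length.

```python
from math import sqrt

def frequency_variance(values, freqs):
    """Variance of a discrete frequency distribution (divides by N = sum of f_i)."""
    total = sum(freqs)                                          # N
    mean = sum(f * x for x, f in zip(values, freqs)) / total    # step 2: the mean
    weighted_sq = sum(f * (x - mean) ** 2                       # steps 3 to 5
                      for x, f in zip(values, freqs))
    return weighted_sq / total                                  # step 6

# Illustrative distribution, not from the article.
xs = [1, 2, 3, 4]
fs = [2, 3, 3, 2]
var_x = frequency_variance(xs, fs)
print(var_x, sqrt(var_x))
```

Here N = 10 and the mean is 2.5, so the weighted sum of squared deviations is 10.5 and the variance is 1.05.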

Let us understand the above algorithm through an example.

Example Find the variance and standard deviation of the following frequency distribution

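| Variable ( $x_i$ ) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency ( $f_i$ ) | 4 | 4 | 5 | 15 | 8 | 5 | 4 | 5 |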

Solution We have been given the frequency distribution as  – 

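| Variable ( $x_i$ ) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency ( $f_i$ ) | 4 | 4 | 5 | 15 | 8 | 5 | 4 | 5 |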

We are required to find the variance and the standard deviation of this frequency distribution. Let us use the above algorithm for this purpose in the following table – 

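| Variable $x_i$ | Frequency $f_i$ | $f_i x_i$ | $x_i - \overline{X} = x_i - 9$ | $( x_i - \overline{X} )^{2}$ | $f_i ( x_i - \overline{X} )^{2}$ |
| --- | --- | --- | --- | --- | --- |
| 2 | 4 | 8 | -7 | 49 | 196 |
| 4 | 4 | 16 | -5 | 25 | 100 |
| 6 | 5 | 30 | -3 | 9 | 45 |
| 8 | 15 | 120 | -1 | 1 | 15 |
| 10 | 8 | 80 | 1 | 1 | 8 |
| 12 | 5 | 60 | 3 | 9 | 45 |
| 14 | 4 | 56 | 5 | 25 | 100 |
| 16 | 5 | 80 | 7 | 49 | 245 |
| Total | N = $\sum{f_i}$ = 50 | $\sum{f_i}{x_i}$ = 450 | | | $\sum{f_i}( x_i - \overline{X})^{2}$ = 754 |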

Here, N = 50

$\sum{f_i}{x_i}$= 450

and,

$\sum{f_i}( x_i – \overline{X})^{2}$ = 754

Therefore, $\overline{X} = \frac{1}{N} \sum{f_i}{x_i}$

 = $\frac{450}{50}$ = 9

Also,

We know that, 

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1}{f_i}( x_i - \overline{X})^{2}$

Substituting the obtained values in the above formula, we get

Var ( X ) = $\frac{754}{50}$ = 15.08

Now that we have found the value of the variance, we shall calculate the value of the standard deviation.

We know that,

Standard Deviation = $\sqrt{Variance}$

Therefore, Standard Deviation = $\sqrt{Variance} = \sqrt{15.08}$ ≈ 3.88

Hence, for the given frequency distribution,

Variance = 15.08 and Standard Deviation = 3.88

In practice, this algorithm is rarely used to calculate the variance and standard deviation, because when the actual mean is a fraction the calculation becomes tedious and time-consuming.

In order to compute the variance by using the following formula, 

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right)^{2}$, where $d_i = x_i - A$, we may use the following algorithm – 

Alternate Algorithm for finding the variance of a Discrete Frequency Distribution

  1. Take the deviations of the observations from an assumed mean, say A, and denote these deviations by $d_i$.
  2. Multiply the deviations by the respective frequencies and obtain the total $\sum{f_i}{d_i}$.
  3. Obtain the squares of the deviations found in the first step, i.e. $d_i^{2}$.
  4. Multiply the squared deviations by the respective frequencies and obtain the total $\sum{f_i}{d_i}^{2}$.
  5. Substitute the values in the formula

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right)^{2}$
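A short sketch of this alternate algorithm is given below. The function name `variance_assumed_mean` and the data are hypothetical and chosen only for illustration. The usage lines try several assumed means to show that the choice of A affects the intermediate totals but not the final variance.

```python
def variance_assumed_mean(values, freqs, assumed_mean):
    """Variance via deviations d_i = x_i - A from an assumed mean A."""
    total = sum(freqs)                                       # N
    devs = [x - assumed_mean for x in values]                # step 1: d_i
    sum_fd = sum(f * d for f, d in zip(freqs, devs))         # step 2: sum of f_i d_i
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, devs))    # steps 3-4: sum of f_i d_i^2
    return sum_fd2 / total - (sum_fd / total) ** 2           # step 5: substitute

xs = [10, 20, 30, 40]   # illustrative values, not from the article
fs = [1, 2, 2, 1]
results = [variance_assumed_mean(xs, fs, a) for a in (0, 20, 25, 30)]
print(results)
```

All four results agree up to floating-point rounding. A convenient A is usually a value near the middle of the data, which keeps the deviations small.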

Let us understand this algorithm through an example.

Example Calculate the variance and standard deviation from the data given below – 

| Size of Item | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 3 | 7 | 22 | 60 | 85 | 32 | 8 |

Solution We have been given the data – 

| Size of Item | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 3 | 7 | 22 | 60 | 85 | 32 | 8 |

We are required to find the variance and standard deviation of this data for which we shall go by the algorithm defined above. We shall get the results in the following table – 

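| Size of Item $x_i$ | $f_i$ | $d_i = x_i - 6.5$ | $d_i^{2}$ | $f_i d_i$ | $f_i d_i^{2}$ |
| --- | --- | --- | --- | --- | --- |
| 3.5 | 3 | -3 | 9 | -9 | 27 |
| 4.5 | 7 | -2 | 4 | -14 | 28 |
| 5.5 | 22 | -1 | 1 | -22 | 22 |
| 6.5 | 60 | 0 | 0 | 0 | 0 |
| 7.5 | 85 | 1 | 1 | 85 | 85 |
| 8.5 | 32 | 2 | 4 | 64 | 128 |
| 9.5 | 8 | 3 | 9 | 24 | 72 |
| Total | N = $\sum{f_i}$ = 217 | | | $\sum{f_i}{d_i}$ = 128 | $\sum{f_i}{d_i}^{2}$ = 362 |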

Here, N = 217, $\sum{f_i}{d_i}$ = 128 and $\sum{f_i}{d_i}^{2}$  = 362

Now, we shall find the variance of the given data.

We know that

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right)^{2}$

Substituting the values we get,

Var ( X ) = $\frac{362}{217} - \left( \frac{128}{217} \right)^{2}$ = 1.668 – 0.348 = 1.320

Now that we have obtained the value of the variance, we shall calculate the value of the standard deviation.

We know that 

Standard deviation = $\sqrt{Variance}$

Therefore,

Standard deviation = $\sqrt{1.320}$ ≈ 1.149

Hence, for the given data, we have,

Variance ≈ 1.320 and Standard deviation ≈ 1.149

Variance of a Grouped or Continuous Frequency Distribution

In a grouped or continuous frequency distribution any of the methods discussed so far for a discrete frequency distribution can be used. Let us understand the algorithm for computing variance of a grouped or continuous frequency distribution.

Algorithm for computing variance of a grouped or continuous frequency distribution

  1. Find the mid-points of various classes.
  2. Take the deviations of these mid-points from an assumed mean. Denote these deviations by di.
  3. Divide the deviations in the previous step by the class width h and denote the results by $u_i = \frac{d_i}{h}$.
  4. Multiply the frequency of each class with the corresponding $u_i$ and obtain $\sum{f_i}{u_i}$.
  5. Square the values of $u_i$, multiply them with the corresponding frequencies and obtain $\sum{f_i}{u_i}^{2}$.
  6. Substitute the values of $\sum{f_i}{u_i}$, $\sum{f_i}{u_i}^{2}$ and N = $\sum{f_i}$ in the formula

Var ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$
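The six steps can be sketched as below. This is a hedged illustration, not a definitive implementation: the function name `grouped_variance` is hypothetical, the classes are made up, and the code assumes that all classes have the same width and are supplied as (lower, upper) pairs.

```python
def grouped_variance(class_limits, freqs, assumed_mean):
    """Step-deviation variance for equal-width classes given as (lower, upper) pairs."""
    width = class_limits[0][1] - class_limits[0][0]          # class width h
    mids = [(lo + hi) / 2 for lo, hi in class_limits]        # step 1: mid-points
    us = [(m - assumed_mean) / width for m in mids]          # steps 2-3: u_i = d_i / h
    total = sum(freqs)                                       # N
    sum_fu = sum(f * u for f, u in zip(freqs, us))           # step 4: sum of f_i u_i
    sum_fu2 = sum(f * u * u for f, u in zip(freqs, us))      # step 5: sum of f_i u_i^2
    return width ** 2 * (sum_fu2 / total - (sum_fu / total) ** 2)   # step 6

# Illustrative classes, not from the article.
classes = [(0, 10), (10, 20), (20, 30)]
counts = [2, 5, 3]
var_x = grouped_variance(classes, counts, assumed_mean=15)
print(var_x)
```

For these made-up classes the mid-points are 5, 15 and 25, giving $\sum{f_i}{u_i}$ = 1 and $\sum{f_i}{u_i}^{2}$ = 5 with N = 10, so the variance is 100 ( 0.5 – 0.01 ) = 49 up to floating-point rounding.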

Let us understand the above algorithm using an example.

Example Calculate the mean, variance and standard deviation for the following distribution

| Marks | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 | 60 – 70 | 70 – 80 | 80 – 90 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| No of Students | 3 | 6 | 13 | 15 | 14 | 5 | 4 |

Solution We have been given the above distribution and are required to find its mean, variance and standard deviation.

In order to do so, we shall follow the steps of the algorithm defined above, taking the assumed mean A = 55 and the class width h = 10, to obtain the following table – 

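| Marks | Frequency $f_i$ | Mid Value $x_i$ | $u_i = \frac{x_i - 55}{10}$ | $f_i u_i$ | $u_i^{2}$ | $f_i u_i^{2}$ |
| --- | --- | --- | --- | --- | --- | --- |
| 20-30 | 3 | 25 | -3 | -9 | 9 | 27 |
| 30-40 | 6 | 35 | -2 | -12 | 4 | 24 |
| 40-50 | 13 | 45 | -1 | -13 | 1 | 13 |
| 50-60 | 15 | 55 | 0 | 0 | 0 | 0 |
| 60-70 | 14 | 65 | 1 | 14 | 1 | 14 |
| 70-80 | 5 | 75 | 2 | 10 | 4 | 20 |
| 80-90 | 4 | 85 | 3 | 12 | 9 | 36 |
| Total | N = $\sum{f_i}$ = 60 | | | $\sum{f_i}{u_i}$ = 2 | | $\sum{f_i}{u_i}^{2}$ = 134 |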

N = 60

$\sum{f_i}{u_i}$= 2

$\sum{f_i}{u_i}^{2}$ = 134

and h = 10, with the assumed mean A = 55.

Mean = $\overline{X} = A + h \left( \frac{1}{N} \sum{f_i}{u_i} \right) = 55 + 10 \left( \frac{2}{60} \right)$ ≈ 55.333

Now, we will calculate the variance 

Variance ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$

 = $100 \left[ \frac{134}{60} - \left( \frac{2}{60} \right)^{2} \right]$ = 100 ( 2.2333 – 0.0011 ) ≈ 223.22

Now, let us calculate the standard deviation

Standard deviation = $\sqrt{Variance} = \sqrt{223.22}$ ≈ 14.94

Hence, for the given distribution,

Mean ≈ 55.33, Variance ( X ) ≈ 223.22 and Standard deviation ≈ 14.94

Remember

  1. Dispersion is the measurement of variations in the values of the variable. It measures the degree of scatteredness of the observations in a distribution around the central values.
  2. The variance of a variable X is the arithmetic mean of the squares of all deviations of X from the arithmetic mean of the observations and is denoted by Var ( X ) or $\sigma^{2}$.
  3. The positive square root of the variance of a variate X is known as its standard deviation and is denoted by σ.
  4. For individual observations, Variance ( X ) = $\frac{\sum\limits^n_{i=1} ( {x_i} - \overline{X} )^{2}}{n}$

Also, Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} x_i^{2} - \left( \frac{1}{n} \sum\limits^n_{i=1} x_i \right)^{2}$

Another formula for variance, with $d_i = x_i - A$ for an arbitrary point A, is given by Variance ( X ) = $\frac{1}{n} \sum\limits^n_{i=1} d_i^{2} - \left( \frac{1}{n} \sum\limits^n_{i=1} d_i \right)^{2}$

  5. For discrete frequency distributions, with N = $\sum{f_i}$, we have,

Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1}{f_i}({x_i} - \overline{X})^{2}$

Also, Var ( X ) = $\frac{1}{N} \sum\limits^n_{i=1} f_i d_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i d_i \right)^{2}$

Sometimes, the deviations $d_i = x_i - A$ are divisible by a common number, say h. If we define

$u_i = \frac{{x_i} - A}{h} = \frac{d_i}{h}$, i = 1, 2, 3, …, n, then we obtain the following formula for variance,

Var ( X ) = $h^{2} \left[ \frac{1}{N} \sum\limits^n_{i=1} f_i u_i^{2} - \left( \frac{1}{N} \sum\limits^n_{i=1} f_i u_i \right)^{2} \right]$
