Link to the previous post : https://statinfer.com/104-3-2-descriptive-statistics-mean-and-median/

In the previous post we looked at descriptive statistics. In this post we will look at dispersion measures and implement them in Python.

This post continues from the previous ones, so we will keep working with the data imported in 104.3.1 and 104.3.2.

Dispersion Measures : Variance and Standard Deviation

Dispersion

Knowing the central tendency alone is not enough.
Two variables might have the same mean and still be very different.
Look at these two variables: the profit details of two companies, A and B, for the last 14 quarters, in millions (MM).

Company A	Company B
43	17
44	15
0	12
25	17
20	15
35	18
-8	12
13	15
-10	12
-8	13
32	18
11	18
-8	14
21	14

The average profit is 15 in both cases.
However, Company B has performed more consistently than Company A.
Company A even made losses in some quarters.
Measures of dispersion become vital in such cases.
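
As a quick check of the numbers above, the following sketch uses only Python's standard library to confirm that the two companies share the same mean while their lowest and highest quarters differ sharply. The list names are illustrative and are not part of the original notebook.

```python
from statistics import mean

# Quarterly profits copied from the table above
company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
company_b = [17, 15, 12, 17, 15, 18, 12, 15, 12, 13, 18, 18, 14, 14]

mean_a = mean(company_a)
mean_b = mean(company_b)

# Same centre, very different spread around it
print("Mean A:", mean_a, "| Mean B:", mean_b)
print("A ranges from", min(company_a), "to", max(company_a))
print("B ranges from", min(company_b), "to", max(company_b))

# Quarters in which each company made a loss
loss_quarters_a = [p for p in company_a if p < 0]
loss_quarters_b = [p for p in company_b if p < 0]
print("Loss quarters A:", len(loss_quarters_a), "| B:", len(loss_quarters_b))
```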

Variance and Standard deviation

Dispersion quantifies how far each point deviates from the mean value.
Variance is the average of the squared distances of each point from the mean.
Variance is a fairly good measure of dispersion.
The variance in profit is about 352 for Company A and about 4.9 for Company B.

$\sigma^{2} = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}}{n}$

Variance Calculation

Company A

Value	Value – Mean	(Value – Mean)^2
43	28	784
44	29	841
0	-15	225
25	10	100
20	5	25
35	20	400
-8	-23	529
13	-2	4
-10	-25	625
-8	-23	529
32	17	289
11	-4	16
-8	-23	529
21	6	36
Mean = 15		Variance = 352

Company B

Value	Value – Mean	(Value – Mean)^2
17	2	4
15	0	0
12	-3	9
17	2	4
15	0	0
18	3	9
12	-3	9
15	0	0
12	-3	9
13	-2	4
18	3	9
18	3	9
14	-1	1
14	-1	1
Mean = 15		Variance = 4.9
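
The two variance calculations can be reproduced in a few lines of plain Python. This is a minimal sketch of the divide-by-n formula given earlier, using the profit figures from the Dispersion section; `population_variance` is a hypothetical helper name, not something from the original notebook.

```python
company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
company_b = [17, 15, 12, 17, 15, 18, 12, 15, 12, 13, 18, 18, 14, 14]


def population_variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n


var_a = population_variance(company_a)
var_b = population_variance(company_b)

# Rounded, these match the figures quoted in the text (about 352 and 4.9)
print("Variance A:", round(var_a, 1))
print("Variance B:", round(var_b, 1))
```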

Standard Deviation

Standard deviation is just the square root of the variance.
Variance gives a good idea of dispersion, but it is on the order of squares.
As the formula makes clear, variance is expressed in squared units of the original data.
Standard deviation is a dispersion measure expressed in the same units as the original data.

$\sigma = \sqrt{\frac{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}}{n}}$
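
To see the formula at work, here is a small sketch that takes the square root of the divide-by-n variance for the two companies' profits. `population_std` is a hypothetical helper name; the cross-check uses `statistics.pstdev` from the standard library, which applies the same population formula.

```python
import math
from statistics import pstdev

company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
company_b = [17, 15, 12, 17, 15, 18, 12, 15, 12, 13, 18, 18, 14, 14]


def population_std(values):
    """Square root of the divide-by-n variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)


std_a = population_std(company_a)
std_b = population_std(company_b)

# The results are in the same units as the profits themselves
print("SD A:", round(std_a, 2))
print("SD B:", round(std_b, 2))

# Cross-check against the standard library implementation
print(math.isclose(std_a, pstdev(company_a)))
print(math.isclose(std_b, pstdev(company_b)))
```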

Variance and Standard deviation in Python

Divide the Income data into two sets: USA vs. others.
Find the variance of “education-num” in those two sets. Which one has the higher variance?

In [12]:

# Keep only the USA rows; note that the country string is written with a leading space
usa_income=Income_Data[Income_Data["native-country"]==' United-States']
usa_income.shape

Out[12]:

(29170, 15)

In [13]:

# All remaining rows form the "others" set
other_income=Income_Data[Income_Data["native-country"]!=' United-States']
other_income.shape

Out[13]:

(3391, 15)

Variance and SD for USA

In [14]:

var_usa=usa_income["education-num"].var()
var_usa

Out[14]:

5.735862879538104

In [15]:

std_usa=usa_income["education-num"].std()
std_usa

Out[15]:

2.394966154152936

Variance and SD for Others

In [16]:

var_other=other_income["education-num"].var()
var_other

Out[16]:

13.567613037808737

In [17]:

std_other=other_income["education-num"].std()
std_other

Out[17]:

3.6834240914954033
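
One caveat when comparing these outputs with the formulas earlier in the post: the formulas divide by n, whereas pandas `Series.var()` and `Series.std()` default to `ddof=1`, which divides by n - 1 (the sample version). The sketch below illustrates the difference with the standard library on Company A's profits rather than on the Income data; in pandas, passing `ddof=0` gives the divide-by-n version.

```python
from statistics import pvariance, variance

company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
n = len(company_a)

# Divides by n, as in the formula used in this post
var_population = pvariance(company_a)

# Divides by n - 1, which is what pandas does by default (ddof=1)
var_sample = variance(company_a)

print("Population variance:", round(var_population, 2))
print("Sample variance:", round(var_sample, 2))

# The two differ only by the factor n / (n - 1)
ratio = var_sample / var_population
print("Ratio:", ratio, "| n / (n - 1):", n / (n - 1))
```

With 29,170 and 3,391 rows, as in the two Income subsets, the factor n / (n - 1) is very close to 1, so the choice barely changes the outputs shown here. It matters more for small samples such as the 14 quarters above.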

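Comparing the outputs, the variance of “education-num” is higher for the other countries (about 13.57) than for the USA (about 5.74), so education levels are more spread out in that group.

Practice : Variance and Standard deviation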

Dataset: “./Online Retail Sales Data/Online Retail.csv”
What are the variance and s.d. of “UnitPrice”?
What are the variance and s.d. of “Quantity”?
Which one of these two variables is more consistent?

In [18]:

var_UnitPrice=Retail['UnitPrice'].var()
var_UnitPrice

Out[18]:

9362.469164424467

In [19]:

std_UnitPrice=Retail['UnitPrice'].std()
std_UnitPrice

Out[19]:

96.75985306119716

In [20]:

var_quantity=Retail['Quantity'].var()
var_quantity

Out[20]:

47559.39140913822

In [21]:

std_quantity=Retail['Quantity'].std()
std_quantity

Out[21]:

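218.08115784986612

Going by these numbers, “UnitPrice” (s.d. about 96.76) is less dispersed than “Quantity” (s.d. about 218.08). The two variables are measured in different units, so treat this comparison as a rough guide.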


The next post is about percentiles and quartiles in Python.
Link to the next post : https://statinfer.com/104-3-4-percentiles-quartiles-in-python/

104.3.3 Dispersion Measures in Python
