204.1.5 R-Squared in Python

A basic statistical measure of how well a regression line fits the data.

Link to the previous post: https://statinfer.com/204-1-4-how-good-is-my-regression-line/

 

In this post we move on to a statistical measure of goodness of fit.

R-Squared

  • A good fit will have
    • SSE (Minimum or Maximum?)
    • SSR (Minimum or Maximum?)
    • And we know SST = SSE + SSR
    • SSE/SST (Minimum or Maximum?)
    • SSR/SST (Minimum or Maximum?)
  • The coefficient of determination is the proportion of the total variation in the dependent variable that is explained by variation in the independent variable.
  • The coefficient of determination is also called R-squared and is denoted R^2
R^2 = SSR / SST

where 0 <= R^2 <= 1 (for a least-squares line fitted with an intercept)
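The definition above can be checked by hand on a small example. The sketch below uses made-up toy numbers (not the airline data from this series) and assumes NumPy's polyfit for the least-squares line; it computes SST, SSR and SSE and confirms that SSR/SST and 1 - SSE/SST give the same R^2.

```python
import numpy as np

# Hypothetical toy data, NOT the airline data set used in this series
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares line with an intercept (polyfit returns the slope first)
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

sst = np.sum((y - y.mean()) ** 2)      # total variation in y
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the line
sse = np.sum((y - y_hat) ** 2)         # unexplained (residual) variation

r_squared = ssr / sst

print(f"SST = {sst:.4f}, SSR + SSE = {ssr + sse:.4f}")
print(f"R^2 = SSR/SST = {r_squared:.4f}")
print(f"1 - SSE/SST   = {1 - sse / sst:.4f}")
```

Because SST is fixed for a given data set, a better fit means a smaller SSE, a larger SSR, and therefore an R^2 closer to 1.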

Practice: R-Squared

(We are continuing with the Python session from posts 204.1.1 – 204.1.4; we have already built the models required for this practice session.)

  • What is the R-squared value of the Passengers vs Promotion_Budget model?
  • What is the R-squared value of the Passengers vs Inter_metro_flight_ratio model?
In [19]:
# What is the R-squared value of the Passengers vs Promotion_Budget model?
fitted1.summary()
Out[19]:
OLS Regression Results
Dep. Variable:      Passengers        R-squared:           0.933
Model:              OLS               Adj. R-squared:      0.932
Method:             Least Squares     F-statistic:         1084.
Date:               Wed, 27 Jul 2016  Prob (F-statistic):  1.66e-47
Time:               11:48:27          Log-Likelihood:      -751.34
No. Observations:   80                AIC:                 1507.
Df Residuals:       78                BIC:                 1511.
Df Model:           1
Covariance Type:    nonrobust

                    coef       std err    t        P>|t|   [95.0% Conf. Int.]
Intercept           1259.6058  1361.071   0.925    0.358   -1450.078  3969.290
Promotion_Budget    0.0695     0.002      32.923   0.000   0.065      0.074

Omnibus:            26.624            Durbin-Watson:       1.831
Prob(Omnibus):      0.000             Jarque-Bera (JB):    5.188
Skew:               -0.128            Prob(JB):            0.0747
Kurtosis:           1.779             Cond. No.            2.67e+06
In [20]:
# What is the R-squared value of the Passengers vs Inter_metro_flight_ratio model?
fitted2.summary()
Out[20]:
OLS Regression Results
Dep. Variable:             Passengers        R-squared:           0.242
Model:                     OLS               Adj. R-squared:      0.232
Method:                    Least Squares     F-statistic:         24.90
Date:                      Wed, 27 Jul 2016  Prob (F-statistic):  3.58e-06
Time:                      11:48:27          Log-Likelihood:      -848.30
No. Observations:          80                AIC:                 1701.
Df Residuals:              78                BIC:                 1705.
Df Model:                  1
Covariance Type:           nonrobust

                           coef       std err    t       P>|t|   [95.0% Conf. Int.]
Intercept                  2.044e+04  4993.747   4.093   0.000   1.05e+04  3.04e+04
Inter_metro_flight_ratio   3.507e+04  7027.768   4.990   0.000   2.11e+04  4.91e+04

Omnibus:                   10.172            Durbin-Watson:       1.385
Prob(Omnibus):             0.006             Jarque-Bera (JB):    10.098
Skew:                      0.822             Prob(JB):            0.00641
Kurtosis:                  3.573             Cond. No.            9.48
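Reading R-squared off the full summary works, but the value can also be pulled out as a number. The sketch below is only an illustration: it assumes the models in this series were fitted with the statsmodels formula API, and it uses synthetic stand-in data with hypothetical names (demo, demo_fit) rather than the airline data, so its output will not match the tables above. With the session's own models, the equivalent call would be fitted1.rsquared or fitted2.rsquared.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (hypothetical values, NOT the airline data set)
rng = np.random.default_rng(0)
budget = rng.uniform(400_000, 900_000, size=80)
passengers = 1000 + 0.07 * budget + rng.normal(0, 3000, size=80)
demo = pd.DataFrame({"Promotion_Budget": budget, "Passengers": passengers})

# Single-predictor OLS model, same form as the practice models
demo_fit = smf.ols("Passengers ~ Promotion_Budget", data=demo).fit()

# R-squared is available directly as an attribute of the fitted results
print("R-squared from statsmodels:", round(demo_fit.rsquared, 3))

# Cross-check against the definition R^2 = SSR / SST
y = demo["Passengers"]
sst = np.sum((y - y.mean()) ** 2)
ssr = np.sum((demo_fit.fittedvalues - y.mean()) ** 2)
print("R-squared from SSR/SST:    ", round(ssr / sst, 3))
```

One naming caveat: in statsmodels, the results attribute ssr is the sum of squared residuals (called SSE in this post), while ess is the explained sum of squares (called SSR here) and centered_tss is the total sum of squares (SST).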

The next post is about multiple regression in Python.

Link to the next post: https://statinfer.com/204-1-6-multiple-regression-in-python/

21st June 2017
