
204.2.2 Logistic Function to Regression

From function to regression.

Link to the previous post: https://statinfer.com/204-2-1-logistic-regression-why-do-we-need-it/


In the last post we saw that linear regression cannot be used when the final output is binary (yes or no), because it is hard to fit a binary output with a linear function.

To solve this problem we can turn to a different kind of function, with the logistic function as the first choice.

A Logistic Function

This is what a logistic function looks like:

[Figure: the S-shaped logistic curve]

The Logistic function

  • We want a model whose predictions are probabilities between 0 and 1, that is, an S-shaped curve.
  • There are lots of S-shaped curves. We use the logistic model:

$$P(y = 1 \mid x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$
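
As a minimal sketch of the S-shape described above, the snippet below evaluates a logistic function at a few inputs. The function name `logistic` and the sample inputs are illustrative choices, not taken from the original post.

```python
import math

def logistic(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical sample inputs, chosen only to show the shape of the curve.
for z in (-6, -2, 0, 2, 6):
    print(z, round(logistic(z), 4))
```

Large negative inputs give values close to 0, large positive inputs give values close to 1, and an input of 0 gives exactly 0.5. The form 1 / (1 + e^(-z)) used here is algebraically the same as e^z / (1 + e^z).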

Logistic Regression Output

  • In logistic regression, we predict the probability of an outcome rather than the outcome value itself.
  • Y is binary: it takes only the two values 1 and 0. Instead of predicting 1 or 0 directly, we predict the probability of 1 and the probability of 0.
  • This suits binary categorical outputs such as YES vs NO, WIN vs LOSS, or Fraud vs Non-Fraud.
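
The points above can be sketched in a few lines: given a predicted probability of 1, the probability of 0 is its complement, and a yes/no label needs a cutoff. The helper names, the example probability 0.8, and the 0.5 cutoff are all assumptions made for illustration. The post itself does not state a cutoff.

```python
def class_probabilities(p_one):
    """Return (P(Y=0), P(Y=1)) given the predicted probability of Y=1."""
    if not 0.0 <= p_one <= 1.0:
        raise ValueError("p_one must be a probability between 0 and 1")
    return 1.0 - p_one, p_one

def to_label(p_one, threshold=0.5):
    """Convert a probability into a 0/1 label; the 0.5 cutoff is an assumption."""
    return 1 if p_one >= threshold else 0

p_zero, p_one = class_probabilities(0.8)  # 0.8 is a made-up probability
print(p_zero, p_one, to_label(p_one))
```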

Practice: Logistic Regression

  • Dataset: Product Sales Data/Product_sales.csv
  • Fit a logistic regression line between Age and buying
  • Will a 4-year-old customer buy the product?
  • If Age is 105, will that customer buy the product?
In [8]:
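import pandas as pd
sales = pd.read_csv("datasets\\Product Sales Data\\Product_sales.csv")

import statsmodels.api as sm

# Build a logistic regression line between Age and buying.
# No constant term is added, matching the summary output, which lists only the Age coefficient.
logit = sm.Logit(sales["Bought"], sales["Age"])
logit
<statsmodels.discrete.discrete_model.Logit at 0x203ba4ac630>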
In [9]:
result = logit.fit()
Optimization terminated successfully.
         Current function value: 0.584320
         Iterations 5
<statsmodels.discrete.discrete_model.BinaryResultsWrapper at 0x203bbd90e48>
In [10]:
result.summary()
                        Logit Regression Results
Dep. Variable:    Bought              No. Observations:   467
Model:            Logit               Df Residuals:       466
Method:           MLE                 Df Model:           0
Date:             Sun, 16 Oct 2016    Pseudo R-squ.:      0.1478
Time:             14:35:42            Log-Likelihood:     -272.88
converged:        True                LL-Null:            -320.21
                                      LLR p-value:        nan

       coef     std err   z       P>|z|   [95.0% Conf. Int.]
Age    0.0294   0.003     8.813   0.000   0.023    0.036
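
Two numbers in the output above can be checked by hand, which helps in reading the table. The sketch below assumes that Pseudo R-squ. is McFadden's pseudo R-squared and that the optimizer's "Current function value" is the average negative log-likelihood per observation. The variable names are illustrative, and the values are copied from the output.

```python
# Values copied from the fit log and summary table above.
n_obs = 467
log_likelihood = -272.88
ll_null = -320.21

# McFadden's pseudo R-squared: 1 - LL(model) / LL(null model).
pseudo_r2 = 1 - log_likelihood / ll_null

# Average negative log-likelihood per observation.
avg_neg_ll = -log_likelihood / n_obs

print(round(pseudo_r2, 4))
print(round(avg_neg_ll, 4))
```

The first value reproduces the reported 0.1478. The second comes out at about 0.5843, in line with the reported function value of 0.584320.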
In [11]:
# Confidence interval of each coefficient

print(result.conf_int())
            0         1
Age  0.022851  0.035923
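
To see where this interval comes from, the sketch below works backwards from the two endpoints. It assumes the interval is a symmetric, normal-based (Wald) 95% interval, that is, coefficient plus or minus a critical value times the standard error. The variable names are illustrative.

```python
from statistics import NormalDist

# Interval endpoints copied from result.conf_int() above.
lower, upper = 0.022851, 0.035923

# Two-sided 95% critical value of the standard normal (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

coef = (lower + upper) / 2                # midpoint of the interval
std_err = (upper - lower) / (2 * z_crit)  # half-width divided by the critical value
z_stat = coef / std_err

print(round(coef, 4), round(std_err, 4), round(z_stat, 2))
```

Under that assumption the midpoint recovers the Age coefficient of about 0.0294, the standard error is about 0.0033 (shown rounded as 0.003 in the summary), and the ratio is close to the reported z value of 8.813.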
In [12]:
# One more way of fitting the model
from sklearn.linear_model import LogisticRegression
logistic = LogisticRegression()
logistic.fit(sales[["Age"]], sales["Bought"])
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
In [13]:
# Will a 4-year-old customer buy the product?
logistic.predict(pd.DataFrame({"Age": [4]}))
In [14]:
# If Age is 105, will that customer buy the product?
logistic.predict(pd.DataFrame({"Age": [105]}))
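
The two questions can also be answered by hand from the statsmodels summary. The sketch below is a rough check rather than the post's own method: it plugs the reported Age coefficient (0.0294) into the logistic function with no intercept, since the summary lists no constant term. The function name `prob_buy` and the 0.5 cutoff are assumptions.

```python
import math

AGE_COEF = 0.0294  # Age coefficient copied from the summary table above

def prob_buy(age, coef=AGE_COEF):
    """Predicted probability of buying for a model with no intercept term."""
    return 1.0 / (1.0 + math.exp(-coef * age))

for age in (4, 105):
    p = prob_buy(age)
    # The 0.5 cutoff for turning a probability into yes/no is an assumption.
    print(age, round(p, 3), "buys" if p >= 0.5 else "does not buy")
```

With this coefficient the probability is about 0.53 at Age 4 and about 0.96 at Age 105. Because the model has no intercept, the probability is exactly 0.5 at Age 0 and rises from there, so every positive age lands above the cutoff. A model fitted with an intercept, such as scikit-learn's default `fit_intercept=True`, may give a different answer for small ages.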

The next post is on multiple logistic regression.
Link to the next post: https://statinfer.com/204-2-3-multiple-logistic-regression/
