In this series we explore logistic regression models. For starters, we recap linear regression and check whether it works in every situation.
import pandas as pd
sales=pd.read_csv("datasets\\Product Sales Data\\Product_sales.csv")
#What are the variables in the dataset?
sales.columns.values
#Build a predictive model for Bought vs Age
### we need to use the statsmodels package, which enables many statistical methods to be used in Python
import statsmodels.formula.api as smf
model = smf.ols(formula='Bought ~ Age', data=sales)
fitted = model.fit()
fitted.summary()
#What is R-Square?
fitted.rsquared
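As a rough illustration of what the R-squared value measures, it can be computed by hand as 1 - SS_res / SS_tot, the share of the variance in the outcome that the fitted line explains. The sketch below uses plain Python and a small made-up data set, not Product_sales.csv; the names `ages`, `bought`, `fit_line` and `r_squared` are assumptions introduced only for this example.

```python
# Hypothetical toy data, NOT taken from Product_sales.csv
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
bought = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def fit_line(x, y):
    """Ordinary least squares for a single predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """R-squared = 1 - (residual sum of squares / total sum of squares)."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


r2 = r_squared(ages, bought)
print(round(r2, 4))
```

On the real data, `fitted.rsquared` should report the same quantity for the statsmodels fit.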
#If Age is 4 then will that customer buy the product?
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(sales[["Age"]], sales[["Bought"]])
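age1=4
# predict() expects a 2-D input with one column per feature, not a bare scalar
predict1=lr.predict(pd.DataFrame({"Age": [age1]}))
predict1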
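#If Age is 105 then will that customer buy the product?
age2=105
# predict() expects a 2-D input with one column per feature, not a bare scalar
predict2=lr.predict(pd.DataFrame({"Age": [age2]}))
predict2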
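The output of such non-linear functions cannot be justified with a linear model. A yes/no outcome such as Bought is not a straight-line function of Age, and a fitted line is unbounded, so its predictions at extreme ages need not correspond to a valid answer.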
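To make the problem concrete, here is a minimal sketch in plain Python that fits a straight line to a made-up 0/1 outcome and evaluates it at the two ages used above. The data and the names `ages`, `bought`, `fit_line` and `predict` are hypothetical and are not taken from Product_sales.csv.

```python
# Hypothetical toy data, NOT taken from Product_sales.csv
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
bought = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # binary outcome: 1 = bought


def fit_line(x, y):
    """Ordinary least squares for a single predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(ages, bought)


def predict(age):
    """Value of the fitted straight line at the given age."""
    return intercept + slope * age


for age in (4, 105):
    value = predict(age)
    # A value outside [0, 1] cannot be read as "bought" or as a probability
    print(age, round(value, 3), "within [0, 1]:", 0 <= value <= 1)
```

On this toy data both predictions fall outside the 0 to 1 range, which is the kind of result a linear fit can produce for a binary target.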