import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# Draw a box plot for balance variable
plt.boxplot(bank.balance)
Outliers are present in the balance variable.
# Get relevant percentiles and see their distribution
bank['balance'].quantile([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1])
# Draw a box plot for age variable
plt.boxplot(bank.age)
No outliers are present in the age variable.
# Get relevant percentiles and see their distribution
bank['age'].quantile([0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1])
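The box plots above mark points as outliers when they fall beyond the whiskers, which matplotlib by default extends to the farthest data point within 1.5 times the interquartile range (IQR) of the quartiles. The same rule can be applied directly to a column to flag, list, and drop those rows. The following is only a minimal sketch: the `sample` DataFrame is a made-up stand-in for `bank`, and `iqr_fences` is a hypothetical helper written for this illustration, not a pandas or matplotlib function.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's `bank` data (values are made up)
sample = pd.DataFrame(
    {"balance": [120, 250, 310, 400, 450, 520, 610, 700, 15000]}
)

def iqr_fences(series, k=1.5):
    """Return (lower, upper) limits at k * IQR beyond the quartiles.

    Hypothetical helper; k=1.5 matches the default box plot whisker rule.
    """
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = iqr_fences(sample["balance"])

# Boolean mask: True where the value lies outside the fences
is_outlier = (sample["balance"] < lower) | (sample["balance"] > upper)

outliers = sample[is_outlier]   # rows to inspect or log
cleaned = sample[~is_outlier]   # rows kept after removal

print(outliers)
```

With the real data, replacing `sample["balance"]` with `bank["balance"]` should behave the same way, assuming the column is numeric. Whether to drop, cap, or keep the flagged rows depends on the analysis; the 1.5 multiplier is a convention, not a fixed rule.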
The next post is about creating graphs in Python.
Link to the next post: https://statinfer.com/104-3-6-creating-graphs-in-python/
Great tutorial. I am currently trying to figure out how to actually target the outliers, log them, and then remove them from the dataframe. Your title insinuates that there is a function that actually detects the outliers. Do you know of any methods that can do this or what would be the best algorithm?