Link to the previous post: https://statinfer.com/104-2-1-importing-data-in-python/
In this post we will cover the basic tasks we can perform on a dataset after importing it into Python.
We will work through the following tasks, each listed as a comment in the code:
import pandas as pd  # import the pandas library
Sales_country = pd.read_csv("datasets\\Superstore Sales Data\\Sales_by_country_v1.csv")
print(Sales_country)
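The path above uses escaped Windows backslashes. As a minimal sketch of a more portable way to build the same kind of path, the snippet below uses pathlib. The temporary folder, the file name demo_sales.csv, and the two sample rows are hypothetical stand-ins so that the example runs without the original dataset; only the column names come from this post.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in file so the sketch runs anywhere; the post itself
# reads Sales_by_country_v1.csv from its own datasets folder.
demo_dir = Path(tempfile.mkdtemp())
demo_file = demo_dir / "Superstore Sales Data" / "demo_sales.csv"
demo_file.parent.mkdir(parents=True)
demo_file.write_text("custCountry,unitsSold\nIndia,10\nUSA,25\n")

# Path objects join folder names with "/" on every operating system,
# and pd.read_csv accepts them directly.
demo_sales = pd.read_csv(demo_file)
print(demo_sales)
```

With the real data, the same idea would be Path("datasets") / "Superstore Sales Data" / "Sales_by_country_v1.csv", assuming the script runs from the folder that contains datasets.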
#How many rows and columns are there in this dataset?
Sales_country.shape
#Print only column names in the dataset
Sales_country.columns.values
#Print first 10 observations
Sales_country.head(10)
#Print the last 5 observations
Sales_country.tail(5)
#Get the summary of the dataset
Sales_country.describe()
#Print the structure of the data
Sales_country.apply(lambda x: [x.unique()]) # this is close to str() in R.
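pandas also has built-in ways to inspect structure that may be closer to str() in R than listing unique values. The sketch below shows info(), dtypes and nunique() on a small made-up DataFrame; the name sales_demo and its four rows are hypothetical, and only the column names custCountry and unitsSold are taken from this post.

```python
import pandas as pd

# Hypothetical stand-in data; only the column names come from the post.
sales_demo = pd.DataFrame({
    "custCountry": ["India", "USA", "India", "UK"],
    "unitsSold": [10, 25, 7, 12],
})

# Column names, non-null counts and data types in one summary
sales_demo.info()

# Data type of each column
col_types = sales_demo.dtypes
print(col_types)

# Number of distinct values in each column
distinct_counts = sales_demo.nunique()
print(distinct_counts)
```

On the real data, the same three calls can be made on Sales_country.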
#Describe the field unitsSold
Sales_country.unitsSold.describe()
#Describe the field custCountry
Sales_country.custCountry.describe() #describe() won't give much info about a string variable, so we will create a frequency table
Sales_country.custCountry.value_counts() #frequency table
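A frequency table is often easier to read as proportions. As a small sketch, value_counts() takes a normalize=True argument that returns shares instead of counts. The DataFrame sales_demo and its values are made up for illustration; only the column name custCountry comes from this post.

```python
import pandas as pd

# Hypothetical stand-in data; only the column name comes from the post.
sales_demo = pd.DataFrame({"custCountry": ["India", "USA", "India", "UK"]})

# Absolute frequency of each country
counts = sales_demo["custCountry"].value_counts()
print(counts)

# Relative frequency (share of rows) of each country
shares = sales_demo["custCountry"].value_counts(normalize=True)
print(shares)
```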
#Create a new dataset by taking first 30 observations from this data
sales_new=Sales_country.head(30)
#Print the resultant data
print(sales_new)
#Remove (delete) the new dataset
del sales_new
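It can help to see what del actually does: it removes the variable name, not the original data. The sketch below uses a made-up 50-row DataFrame (sales_demo and sales_subset are hypothetical names) and adds .copy(), which is an optional step not used in the code above, so that the subset is clearly independent of the source.

```python
import pandas as pd

# Hypothetical stand-in data with 50 rows
sales_demo = pd.DataFrame({"unitsSold": list(range(50))})

# Take the first 30 rows; .copy() makes the subset independent of sales_demo
sales_subset = sales_demo.head(30).copy()
subset_shape = sales_subset.shape
print(subset_shape)

# del removes the name sales_subset; sales_demo is not affected
del sales_subset

# Referring to the deleted name now raises NameError
try:
    sales_subset
    subset_exists = True
except NameError:
    subset_exists = False

print(subset_exists, sales_demo.shape)
```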
The next post is about manipulating datasets in Python.
Link to the next post: https://statinfer.com/104-2-3-manipulting-datasets-in-python/