Statinfer

103.2.3.b Sub Setting-An Example

Learn through practice

In the previous section we saw Sub Setting Example 1.

Here we import the automobile dataset and then perform various subsetting operations on it.

  1. Import: “./Automobile Data Set/AutoDataset.csv”
  2. Create a new dataset containing only Toyota cars.
  3. Create a new dataset for all cars with city.mpg greater than 30 and engine.size less than 120.
  4. Create a new dataset by taking only sedan cars. Keep only four variables (make, body.style, fuel.type, price) in the final dataset.
  5. Create a new dataset by taking cars whose make is Audi, BMW or Porsche. Drop two variables from the resulting dataset (price and normalized.losses).

Solutions

1. Import: “./Automobile Data Set/AutoDataset.csv”

>auto_data <- read.csv("C:\\Amrita\\Datavedi\\Automobile Data Set\\AutoDataset.csv")
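The task gives a relative path, while the command above uses an absolute Windows path that exists only on the author's machine. Below is a minimal sketch of a more portable import. The tiny stand-in CSV written to a temporary folder is hypothetical and is there only so the snippet runs on its own. The hyphenated header names are also an assumption about how the raw file looks.

```r
# Hypothetical stand-in for AutoDataset.csv so the sketch runs anywhere.
demo_dir <- file.path(tempdir(), "Automobile Data Set")
dir.create(demo_dir, showWarnings = FALSE)
demo_csv <- file.path(demo_dir, "AutoDataset.csv")
writeLines(c("make,city-mpg,engine-size",
             "toyota,35,92",
             "audi,24,109"), demo_csv)

# file.path() joins path components with a separator that works on any OS.
auto_demo <- read.csv(demo_csv, stringsAsFactors = FALSE)

# read.csv() passes column names through make.names() by default,
# so a header such as "city-mpg" is read in as city.mpg.
names(auto_demo)
str(auto_demo)
```

With the real file, replacing `demo_csv` with `file.path("Automobile Data Set", "AutoDataset.csv")` should work as long as the working directory (see `getwd()`) is the folder that contains "Automobile Data Set".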

2. Create a new dataset containing only Toyota cars

>toyota_data <- subset(auto_data, make == "toyota")
>head(toyota_data)
##     symboling normalized.losses   make fuel.type aspiration num.of.doors
## 151         1                87 toyota       gas        std          two
## 152         1                87 toyota       gas        std          two
## 153         1                74 toyota       gas        std         four
## 154         0                77 toyota       gas        std         four
## 155         0                81 toyota       gas        std         four
## 156         0                91 toyota       gas        std         four
##     body.style drive.wheels engine.location wheel.base length width height
## 151  hatchback          fwd           front       95.7  158.7  63.6   54.5
## 152  hatchback          fwd           front       95.7  158.7  63.6   54.5
## 153  hatchback          fwd           front       95.7  158.7  63.6   54.5
## 154      wagon          fwd           front       95.7  169.7  63.6   59.1
## 155      wagon          4wd           front       95.7  169.7  63.6   59.1
## 156      wagon          4wd           front       95.7  169.7  63.6   59.1
##     curb.weight engine.type num.of.cylinders engine.size fuel.system bore
## 151        1985         ohc             four          92        2bbl 3.05
## 152        2040         ohc             four          92        2bbl 3.05
## 153        2015         ohc             four          92        2bbl 3.05
## 154        2280         ohc             four          92        2bbl 3.05
## 155        2290         ohc             four          92        2bbl 3.05
## 156        3110         ohc             four          92        2bbl 3.05
##     stroke compression.ratio horsepower peak.rpm city.mpg highway.mpg
## 151   3.03                 9         62     4800       35          39
## 152   3.03                 9         62     4800       31          38
## 153   3.03                 9         62     4800       31          38
## 154   3.03                 9         62     4800       31          37
## 155   3.03                 9         62     4800       27          32
## 156   3.03                 9         62     4800       27          32
##     price
## 151  5348
## 152  6338
## 153  6488
## 154  6918
## 155  7898
## 156  8778
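The same row filter can also be written with bracket indexing, which helps show what subset() does behind the scenes. The sketch below uses a small toy data frame with hypothetical values (including one missing make) in place of auto_data.

```r
# Toy data frame (hypothetical values) standing in for auto_data.
cars_demo <- data.frame(
  make     = c("toyota", "audi", "toyota", NA),
  city.mpg = c(35, 24, 31, 28),
  stringsAsFactors = FALSE
)

# subset() keeps rows where the condition is TRUE and drops rows
# where the condition evaluates to NA.
by_subset <- subset(cars_demo, make == "toyota")

# With bracket indexing, which() is needed to get the same behaviour;
# a bare logical index would return an all-NA row for the missing make.
by_bracket <- cars_demo[which(cars_demo$make == "toyota"), ]

# Both approaches select the same rows.
all.equal(by_subset, by_bracket)
```

Note that the comparison is case-sensitive: `make == "Toyota"` would match nothing if the values are stored in lowercase, as they are in the output above.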

 


3. Create a new dataset for all cars with city.mpg greater than 30 and engine.size less than 120.

>auto_data1 <- subset(auto_data, (city.mpg > 30) & (engine.size < 120))
>head(auto_data1)

##    symboling normalized.losses      make fuel.type aspiration num.of.doors
## 19         2               121 chevrolet       gas        std          two
## 20         1                98 chevrolet       gas        std          two
## 21         0                81 chevrolet       gas        std         four
## 22         1               118     dodge       gas        std          two
## 23         1               118     dodge       gas        std          two
## 25         1               148     dodge       gas        std         four
##    body.style drive.wheels engine.location wheel.base length width height
## 19  hatchback          fwd           front       88.4  141.1  60.3   53.2
## 20  hatchback          fwd           front       94.5  155.9  63.6   52.0
## 21      sedan          fwd           front       94.5  158.8  63.6   52.0
## 22  hatchback          fwd           front       93.7  157.3  63.8   50.8
## 23  hatchback          fwd           front       93.7  157.3  63.8   50.8
## 25  hatchback          fwd           front       93.7  157.3  63.8   50.6
##    curb.weight engine.type num.of.cylinders engine.size fuel.system bore
## 19        1488           l            three          61        2bbl 2.91
## 20        1874         ohc             four          90        2bbl 3.03
## 21        1909         ohc             four          90        2bbl 3.03
## 22        1876         ohc             four          90        2bbl 2.97
## 23        1876         ohc             four          90        2bbl 2.97
## 25        1967         ohc             four          90        2bbl 2.97
##    stroke compression.ratio horsepower peak.rpm city.mpg highway.mpg price
## 19   3.03              9.50         48     5100       47          53  5151
## 20   3.11              9.60         70     5400       38          43  6295
## 21   3.11              9.60         70     5400       38          43  6575
## 22   3.23              9.41         68     5500       37          41  5572
## 23   3.23              9.40         68     5500       31          38  6377
## 25   3.23              9.40         68     5500       31          38  622
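To see how the combined condition works, it can help to build the logical vector separately. The sketch below uses a toy data frame with hypothetical values. It relies on the element-wise `&` operator, since `&&` is meant for single values such as `if` conditions.

```r
# Toy data frame (hypothetical values) standing in for auto_data.
cars_demo <- data.frame(
  make        = c("chevrolet", "dodge", "audi", "toyota"),
  city.mpg    = c(47, 31, 24, 35),
  engine.size = c(61, 90, 109, 122),
  stringsAsFactors = FALSE
)

# Each comparison yields one TRUE/FALSE per row; & combines them row by row.
keep <- (cars_demo$city.mpg > 30) & (cars_demo$engine.size < 120)
keep

# The same condition inside subset(), where columns are referred to by name.
small_efficient <- subset(cars_demo, city.mpg > 30 & engine.size < 120)
small_efficient$make

# Sanity check: the number of TRUE values equals the number of rows kept.
sum(keep) == nrow(small_efficient)
```

Here the last toy row fails the filter even though its city.mpg is above 30, because both conditions must hold for a row to be kept.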

4. Create a new dataset by taking only sedan cars. Keep only four variables (make, body.style, fuel.type, price) in the final dataset.

>auto_data2 <- subset(auto_data, body.style == "sedan" , select = c(make, body.style,fuel.type,price))
>head(auto_data2)
##    make body.style fuel.type price
## 4  audi      sedan       gas 13950
## 5  audi      sedan       gas 17450
## 6  audi      sedan       gas 15250
## 7  audi      sedan       gas 17710
## 9  audi      sedan       gas 23875
## 11  bmw      sedan       gas 16430
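The `select` argument takes unquoted column names, which is convenient at the console. R's own documentation describes subset() as a convenience function for interactive use and suggests standard `[` indexing for programming. A sketch of that alternative, on a toy data frame with hypothetical values, could look like this:

```r
# Toy data frame (hypothetical values) standing in for auto_data.
cars_demo <- data.frame(
  make       = c("audi", "bmw", "toyota"),
  body.style = c("sedan", "sedan", "hatchback"),
  fuel.type  = c("gas", "gas", "gas"),
  price      = c(13950, 16430, 5348),
  horsepower = c(102, 101, 62),
  stringsAsFactors = FALSE
)

# Column names are ordinary strings here, so they can be stored in a
# variable and reused, for example inside a function.
keep_cols <- c("make", "body.style", "fuel.type", "price")

# Rows are chosen by the logical condition, columns by name.
sedans <- cars_demo[cars_demo$body.style == "sedan", keep_cols]
sedans
```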



5. Create a new dataset by taking cars whose make is Audi, BMW or Porsche. Drop two variables from the resulting dataset (price and normalized.losses).

>auto_data3 <- subset(auto_data, (make == "audi") | (make == "bmw") | (make == "porsche"), select = c(-price, -normalized.losses))
>head(auto_data3)
##   symboling make fuel.type aspiration num.of.doors body.style drive.wheels
## 4         2 audi       gas        std         four      sedan          fwd
## 5         2 audi       gas        std         four      sedan          4wd
## 6         2 audi       gas        std          two      sedan          fwd
## 7         1 audi       gas        std         four      sedan          fwd
## 8         1 audi       gas        std         four      wagon          fwd
## 9         1 audi       gas      turbo         four      sedan          fwd
##   engine.location wheel.base length width height curb.weight engine.type
## 4           front       99.8  176.6  66.2   54.3        2337         ohc
## 5           front       99.4  176.6  66.4   54.3        2824         ohc
## 6           front       99.8  177.3  66.3   53.1        2507         ohc
## 7           front      105.8  192.7  71.4   55.7        2844         ohc
## 8           front      105.8  192.7  71.4   55.7        2954         ohc
## 9           front      105.8  192.7  71.4   55.9        3086         ohc
##   num.of.cylinders engine.size fuel.system bore stroke compression.ratio
## 4             four         109        mpfi 3.19    3.4              10.0
## 5             five         136        mpfi 3.19    3.4               8.0
## 6             five         136        mpfi 3.19    3.4               8.5
## 7             five         136        mpfi 3.19    3.4               8.5
## 8             five         136        mpfi 3.19    3.4               8.5
## 9             five         131        mpfi 3.13    3.4               8.3
##   horsepower peak.rpm city.mpg highway.mpg
## 4        102     5500       24          30
## 5        115     5500       18          22
## 6        110     5500       19          25
## 7        110     5500       19          25
## 8        110     5500       19          25
## 9        140     5500       17          20
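When the list of makes grows, the chain of `==` comparisons joined by `|` becomes hard to read. One common alternative is the `%in%` operator, sketched below on a toy data frame with hypothetical values. The names `wanted_makes` and `drop_cols` are introduced here for illustration only.

```r
# Toy data frame (hypothetical values) standing in for auto_data.
cars_demo <- data.frame(
  make              = c("audi", "bmw", "porsche", "toyota"),
  normalized.losses = c(164, 192, 186, 87),
  city.mpg          = c(24, 23, 19, 35),
  price             = c(13950, 16430, 22018, 5348),
  stringsAsFactors  = FALSE
)

wanted_makes <- c("audi", "bmw", "porsche")

# %in% is TRUE for every row whose make appears in wanted_makes.
# A minus sign in select drops the listed columns instead of keeping them.
premium <- subset(cars_demo, make %in% wanted_makes,
                  select = -c(price, normalized.losses))
names(premium)

# Bracket-indexing equivalent: keep every column except the dropped ones.
drop_cols <- c("price", "normalized.losses")
premium2 <- cars_demo[cars_demo$make %in% wanted_makes,
                      setdiff(names(cars_demo), drop_cols)]
all.equal(premium, premium2)
```

Unlike `==`, `%in%` never returns NA, so rows with a missing make are simply treated as non-matching.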

With these two examples we have learned much more about subsetting in R.

In the next post we will see Calculated Fields in R.
