103.2.3.b Subsetting: An Example

Learn through practice

In the previous section we saw Subsetting Example 1.

Here we import the automobile dataset and then perform several subsetting operations on it.

  1. Import : “./Automobile Data Set/AutoDataset.csv”
  2. Create a new dataset for exclusively Toyota cars.
  3. Create a new dataset for all cars with city.mpg greater than 30 and engine size less than 120.
  4. Create a new dataset by taking only sedan cars. Keep only four variables (make, body style, fuel type, price) in the final dataset.
  5. Create a new dataset by taking Audi, BMW or Porsche company makes. Drop two variables from the resultant dataset (price and normalized losses).

Solutions

1. Import : “./Automobile Data Set/AutoDataset.csv”

>auto_data <- read.csv("C:\\Amrita\\Datavedi\\Automobile Data Set\\AutoDataset.csv")
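The column names used in the solutions below (city.mpg, engine.size, body.style and so on) contain dots. If the CSV header uses hyphens or spaces (an assumption, since the file itself is not shown here), read.csv() rewrites them into valid R names by default. A minimal sketch with a hypothetical three-column file written to a temporary location:

```r
# Hypothetical mini CSV; the header names are assumed for illustration only
tmp <- tempfile(fileext = ".csv")
writeLines(c("make,city-mpg,engine size",
             "toyota,35,92",
             "audi,24,109"), tmp)

demo_data <- read.csv(tmp)   # check.names = TRUE is the default
names(demo_data)             # "make" "city.mpg" "engine.size"

# A quick structural check is worth running after any import
str(demo_data)

unlink(tmp)                  # remove the temporary file
```

Running names(auto_data) right after the real import shows the exact spellings to use inside subset().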

2. Create a new dataset for exclusively Toyota cars

>toyota_data <- subset(auto_data, make == "toyota")
>head(toyota_data)
##     symboling normalized.losses   make fuel.type aspiration num.of.doors
## 151         1                87 toyota       gas        std          two
## 152         1                87 toyota       gas        std          two
## 153         1                74 toyota       gas        std         four
## 154         0                77 toyota       gas        std         four
## 155         0                81 toyota       gas        std         four
## 156         0                91 toyota       gas        std         four
##     body.style drive.wheels engine.location wheel.base length width height
## 151  hatchback          fwd           front       95.7  158.7  63.6   54.5
## 152  hatchback          fwd           front       95.7  158.7  63.6   54.5
## 153  hatchback          fwd           front       95.7  158.7  63.6   54.5
## 154      wagon          fwd           front       95.7  169.7  63.6   59.1
## 155      wagon          4wd           front       95.7  169.7  63.6   59.1
## 156      wagon          4wd           front       95.7  169.7  63.6   59.1
##     curb.weight engine.type num.of.cylinders engine.size fuel.system bore
## 151        1985         ohc             four          92        2bbl 3.05
## 152        2040         ohc             four          92        2bbl 3.05
## 153        2015         ohc             four          92        2bbl 3.05
## 154        2280         ohc             four          92        2bbl 3.05
## 155        2290         ohc             four          92        2bbl 3.05
## 156        3110         ohc             four          92        2bbl 3.05
##     stroke compression.ratio horsepower peak.rpm city.mpg highway.mpg
## 151   3.03                 9         62     4800       35          39
## 152   3.03                 9         62     4800       31          38
## 153   3.03                 9         62     4800       31          38
## 154   3.03                 9         62     4800       31          37
## 155   3.03                 9         62     4800       27          32
## 156   3.03                 9         62     4800       27          32
##     price
## 151  5348
## 152  6338
## 153  6488
## 154  6918
## 155  7898
## 156  8778
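The same row filter can also be written with bracket indexing. The sketch below uses a small hypothetical data frame (not the real automobile data) to show one practical difference: subset() silently drops rows where the condition is NA, while a plain logical index keeps them as all-NA rows.

```r
# Hypothetical toy data standing in for auto_data
toy <- data.frame(
  make     = c("toyota", "audi", NA, "toyota"),
  city.mpg = c(35, 24, 30, 31),
  stringsAsFactors = FALSE
)

by_subset  <- subset(toy, make == "toyota")        # NA condition is treated as FALSE
by_bracket <- toy[toy$make == "toyota", ]          # NA condition yields an all-NA row
by_which   <- toy[which(toy$make == "toyota"), ]   # which() discards NA positions

nrow(by_subset)    # 2
nrow(by_bracket)   # 3
nrow(by_which)     # 2
```

If a column may contain missing values, subset() or which() is usually the safer choice for interactive work.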

 


3. Create a new dataset for all cars with city.mpg greater than 30 and engine size less than 120.

>auto_data1 <- subset(auto_data, (city.mpg > 30) & (engine.size < 120))
>head(auto_data1)

##    symboling normalized.losses      make fuel.type aspiration num.of.doors
## 19         2               121 chevrolet       gas        std          two
## 20         1                98 chevrolet       gas        std          two
## 21         0                81 chevrolet       gas        std         four
## 22         1               118     dodge       gas        std          two
## 23         1               118     dodge       gas        std          two
## 25         1               148     dodge       gas        std         four
##    body.style drive.wheels engine.location wheel.base length width height
## 19  hatchback          fwd           front       88.4  141.1  60.3   53.2
## 20  hatchback          fwd           front       94.5  155.9  63.6   52.0
## 21      sedan          fwd           front       94.5  158.8  63.6   52.0
## 22  hatchback          fwd           front       93.7  157.3  63.8   50.8
## 23  hatchback          fwd           front       93.7  157.3  63.8   50.8
## 25  hatchback          fwd           front       93.7  157.3  63.8   50.6
##    curb.weight engine.type num.of.cylinders engine.size fuel.system bore
## 19        1488           l            three          61        2bbl 2.91
## 20        1874         ohc             four          90        2bbl 3.03
## 21        1909         ohc             four          90        2bbl 3.03
## 22        1876         ohc             four          90        2bbl 2.97
## 23        1876         ohc             four          90        2bbl 2.97
## 25        1967         ohc             four          90        2bbl 2.97
##    stroke compression.ratio horsepower peak.rpm city.mpg highway.mpg price
## 19   3.03              9.50         48     5100       47          53  5151
## 20   3.11              9.60         70     5400       38          43  6295
## 21   3.11              9.60         70     5400       38          43  6575
## 22   3.23              9.41         68     5500       37          41  5572
## 23   3.23              9.40         68     5500       31          38  6377
## 25   3.23              9.40         68     5500       31          38  622
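Behind the scenes, the condition passed to subset() is a logical vector with one TRUE or FALSE per row, and the single & combines the two comparisons element by element. (The doubled form && is meant for single TRUE/FALSE values, as in if() statements, so it is not the right operator here.) A short sketch on hypothetical values makes the intermediate vector visible:

```r
# Hypothetical toy data; the values are illustrative only
toy <- data.frame(
  make        = c("chevrolet", "dodge", "audi", "bmw"),
  city.mpg    = c(47, 31, 24, 35),
  engine.size = c(61, 90, 109, 130)
)

# One TRUE/FALSE per row: both comparisons must hold
keep <- (toy$city.mpg > 30) & (toy$engine.size < 120)
keep        # TRUE TRUE FALSE FALSE
sum(keep)   # 2 rows satisfy both conditions

small_efficient <- subset(toy, (city.mpg > 30) & (engine.size < 120))

# Sanity checks that the filter did what was intended
all(small_efficient$city.mpg > 30)      # TRUE
all(small_efficient$engine.size < 120)  # TRUE
```

The same all() checks can be run on auto_data1 to confirm the filter on the real data.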

4. Create a new dataset by taking only sedan cars. Keep only four variables (make, body style, fuel type, price) in the final dataset.

>auto_data2 <- subset(auto_data, body.style == "sedan" , select = c(make, body.style,fuel.type,price))
>head(auto_data2)
##    make body.style fuel.type price
## 4  audi      sedan       gas 13950
## 5  audi      sedan       gas 17450
## 6  audi      sedan       gas 15250
## 7  audi      sedan       gas 17710
## 9  audi      sedan       gas 23875
## 11  bmw      sedan       gas 16430
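Notice that the row labels in the output (4, 5, 6, 7, 9, 11) are the row numbers from the original dataset, not a fresh 1, 2, 3 sequence. The sketch below, on a hypothetical four-row data frame, shows how the labels carry over and how they can be reset if a clean sequence is preferred:

```r
# Hypothetical toy data; the values are illustrative only
toy <- data.frame(
  make       = c("audi", "bmw", "toyota", "audi"),
  body.style = c("hatchback", "sedan", "wagon", "sedan"),
  price      = c(17859, 16430, 6918, 13950)
)

# Filter rows and keep two columns in one call
sedans <- subset(toy, body.style == "sedan", select = c(make, price))
rownames(sedans)          # "2" "4"  (original row positions are kept)

# Reset to a fresh 1..n sequence
rownames(sedans) <- NULL
rownames(sedans)          # "1" "2"
```

Keeping the original labels is handy for tracing rows back to the source data, so resetting them is optional.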



5. Create a new dataset by taking Audi, BMW or Porsche company makes. Drop two variables from the resultant dataset (price and normalized losses).

>auto_data3 <- subset(auto_data, (make == "audi") | (make == "bmw") | (make == "porsche"), select = c(-price, -normalized.losses))
>head(auto_data3)
##   symboling make fuel.type aspiration num.of.doors body.style drive.wheels
## 4         2 audi       gas        std         four      sedan          fwd
## 5         2 audi       gas        std         four      sedan          4wd
## 6         2 audi       gas        std          two      sedan          fwd
## 7         1 audi       gas        std         four      sedan          fwd
## 8         1 audi       gas        std         four      wagon          fwd
## 9         1 audi       gas      turbo         four      sedan          fwd
##   engine.location wheel.base length width height curb.weight engine.type
## 4           front       99.8  176.6  66.2   54.3        2337         ohc
## 5           front       99.4  176.6  66.4   54.3        2824         ohc
## 6           front       99.8  177.3  66.3   53.1        2507         ohc
## 7           front      105.8  192.7  71.4   55.7        2844         ohc
## 8           front      105.8  192.7  71.4   55.7        2954         ohc
## 9           front      105.8  192.7  71.4   55.9        3086         ohc
##   num.of.cylinders engine.size fuel.system bore stroke compression.ratio
## 4             four         109        mpfi 3.19    3.4              10.0
## 5             five         136        mpfi 3.19    3.4               8.0
## 6             five         136        mpfi 3.19    3.4               8.5
## 7             five         136        mpfi 3.19    3.4               8.5
## 8             five         136        mpfi 3.19    3.4               8.5
## 9             five         131        mpfi 3.13    3.4               8.3
##   horsepower peak.rpm city.mpg highway.mpg
## 4        102     5500       24          30
## 5        115     5500       18          22
## 6        110     5500       19          25
## 7        110     5500       19          25
## 8        110     5500       19          25
## 9        140     5500       17          20
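When the list of makes grows, a chain of | comparisons becomes hard to read. One common alternative is the %in% operator, which tests membership in a vector. The sketch below uses a hypothetical data frame with made-up values; the negative select works the same way as in the solution above.

```r
# Hypothetical toy data; the values are illustrative only
toy <- data.frame(
  make              = c("audi", "bmw", "toyota", "porsche"),
  price             = c(13950, 16430, 5348, 32528),
  normalized.losses = c(164, 192, 87, 186),
  city.mpg          = c(24, 23, 35, 17)
)

wanted <- c("audi", "bmw", "porsche")

# %in% replaces (make == "audi") | (make == "bmw") | (make == "porsche")
german <- subset(toy, make %in% wanted,
                 select = -c(price, normalized.losses))

names(german)   # "make" "city.mpg"
nrow(german)    # 3
```

Adding another make then only means extending the wanted vector.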

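Together with the previous example, these exercises cover the main uses of subset() in R: filtering rows on a single condition, combining conditions with & and |, and keeping or dropping columns with select.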

In the next post we will see Calculated Fields in R.
