Link to the previous post: https://statinfer.com/104-2-2-practice-working-with-datasets-in-python/
In this post we will see how to split an imported dataset into subsets.
Sub-setting the data
- Dataset: “./World Bank Data/GDP.csv”
Out[29]:
array(['Country_code', 'Rank', 'Country', 'GDP'], dtype=object)
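The code cell that produced this output is not shown, so the following is only a minimal sketch of how the column names could be listed with pandas. The variable name `gdp` is an assumption, and to keep the example self-contained it reads four rows (taken from the outputs in this post) from an in-memory string instead of the actual CSV file.

```python
import io
import pandas as pd

# Stand-in for "./World Bank Data/GDP.csv": four rows copied from this post.
# With the real file you would pass its path to pd.read_csv instead.
csv_text = """Country_code,Rank,Country,GDP
JPN,3,Japan,4601461
RUS,10,Russian Federation,1860598
IDN,16,Indonesia,888538
NOR,26,Norway,499817
"""

gdp = pd.read_csv(io.StringIO(csv_text))

# Column names of the imported DataFrame
column_names = gdp.columns.values
print(column_names)
```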
- New dataset with selected rows
    Country_code  Rank             Country      GDP
2            JPN     3               Japan  4601461
9            RUS    10  Russian Federation  1860598
15           IDN    16           Indonesia   888538
25           NOR    26              Norway   499817
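One way to pick rows like these is positional indexing with `iloc`. The sketch below is an assumption about the approach, not the original cell. It uses only the first ten rows of the data, so it selects positions 2 and 9; with the full file the list would be `[2, 9, 15, 25]`. The country codes other than JPN and RUS are ISO alpha-3 codes filled in for illustration, and the name `gdp` is hypothetical.

```python
import pandas as pd

# First ten rows of the GDP data as shown in this post.
# Country codes other than JPN and RUS are filled in for illustration.
gdp = pd.DataFrame({
    "Country_code": ["USA", "CHN", "JPN", "DEU", "GBR",
                     "FRA", "BRA", "ITA", "IND", "RUS"],
    "Rank": list(range(1, 11)),
    "Country": ["United States", "China", "Japan", "Germany", "United Kingdom",
                "France", "Brazil", "Italy", "India", "Russian Federation"],
    "GDP": [17419000, 10354832, 4601461, 3868291, 2988893,
            2829192, 2346076, 2141161, 2048517, 1860598],
})

# Select rows by integer position; the original row labels are kept
selected_rows = gdp.iloc[[2, 9]]
print(selected_rows)
```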
- New dataset by keeping selected columns
Out[33]:
| | Country | Rank |
|---|---|---|
| 0 | United States | 1 |
| 1 | China | 2 |
| 2 | Japan | 3 |
| 3 | Germany | 4 |
| 4 | United Kingdom | 5 |
| 5 | France | 6 |
| 6 | Brazil | 7 |
| 7 | Italy | 8 |
| 8 | India | 9 |
| 9 | Russian Federation | 10 |
| 10 | Canada | 11 |
| 11 | Australia | 12 |
| 12 | Korea, Rep. | 13 |
| 13 | Spain | 14 |
| 14 | Mexico | 15 |
| 15 | Indonesia | 16 |
| 16 | Netherlands | 17 |
| 17 | Turkey | 18 |
| 18 | Saudi Arabia | 19 |
| 19 | Switzerland | 20 |
| 20 | Sweden | 21 |
| 21 | Nigeria | 22 |
| 22 | Poland | 23 |
| 23 | Argentina | 24 |
| 24 | Belgium | 25 |
| 25 | Norway | 26 |
| 26 | Austria | 27 |
| 27 | Iran, Islamic Rep. | 28 |
| 28 | Thailand | 29 |
| 29 | United Arab Emirates | 30 |
| … | … | … |
| 164 | Maldives | 165 |
| 165 | Faeroe Islands | 166 |
| 166 | Lesotho | 167 |
| 167 | Liberia | 168 |
| 168 | Bhutan | 169 |
| 169 | Cabo Verde | 170 |
| 170 | Central African Republic | 171 |
| 171 | Belize | 172 |
| 172 | Djibouti | 173 |
| 173 | Seychelles | 174 |
| 174 | Timor-Leste | 175 |
| 175 | St. Lucia | 176 |
| 176 | Antigua and Barbuda | 177 |
| 177 | Solomon Islands | 178 |
| 178 | Guinea-Bissau | 179 |
| 179 | Grenada | 180 |
| 180 | Gambia, The | 181 |
| 181 | St. Kitts and Nevis | 182 |
| 182 | Vanuatu | 183 |
| 183 | Samoa | 184 |
| 184 | St. Vincent and the Grenadines | 185 |
| 185 | Comoros | 186 |
| 186 | Dominica | 187 |
| 187 | Tonga | 188 |
| 188 | São Tomé and Principe | 189 |
| 189 | Micronesia, Fed. Sts. | 190 |
| 190 | Palau | 191 |
| 191 | Marshall Islands | 192 |
| 192 | Kiribati | 193 |
| 193 | Tuvalu | 194 |
194 rows × 2 columns
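Keeping only some columns can be done by indexing the DataFrame with a list of column names. This is a small sketch of that idea, assuming pandas; the names `gdp` and `country_rank` are hypothetical, and the sample holds just the first twelve rows shown in this post rather than all 194.

```python
import pandas as pd

# First twelve rows (Rank, Country, GDP) as they appear in this post
gdp = pd.DataFrame({
    "Rank": list(range(1, 13)),
    "Country": ["United States", "China", "Japan", "Germany", "United Kingdom",
                "France", "Brazil", "Italy", "India", "Russian Federation",
                "Canada", "Australia"],
    "GDP": [17419000, 10354832, 4601461, 3868291, 2988893, 2829192,
            2346076, 2141161, 2048517, 1860598, 1785387, 1454675],
})

# A list of names keeps those columns, in the order given, for every row
country_rank = gdp[["Country", "Rank"]]
print(country_rank)
```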
- New dataset with selected rows and columns
Out[34]:
| | Country | GDP |
|---|---|---|
| 0 | United States | 17419000 |
| 1 | China | 10354832 |
| 2 | Japan | 4601461 |
| 3 | Germany | 3868291 |
| 4 | United Kingdom | 2988893 |
| 5 | France | 2829192 |
| 6 | Brazil | 2346076 |
| 7 | Italy | 2141161 |
| 8 | India | 2048517 |
| 9 | Russian Federation | 1860598 |
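Row and column selection can be combined. The sketch below shows one possible way, assuming pandas: slice the first ten rows by position, then keep two columns. The names `gdp` and `top10_gdp` are hypothetical, and the twelve sample rows are copied from this post.

```python
import pandas as pd

# First twelve rows (Rank, Country, GDP) as they appear in this post
gdp = pd.DataFrame({
    "Rank": list(range(1, 13)),
    "Country": ["United States", "China", "Japan", "Germany", "United Kingdom",
                "France", "Brazil", "Italy", "India", "Russian Federation",
                "Canada", "Australia"],
    "GDP": [17419000, 10354832, 4601461, 3868291, 2988893, 2829192,
            2346076, 2141161, 2048517, 1860598, 1785387, 1454675],
})

# Rows at positions 0..9, then only the Country and GDP columns
top10_gdp = gdp.iloc[0:10][["Country", "GDP"]]
print(top10_gdp)
```

An equivalent label-based form is `gdp.loc[0:9, ["Country", "GDP"]]`; note that `loc` includes the end label while `iloc` excludes the end position.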
- New dataset with selected rows and excluded columns
Out[35]:
| | Rank | Country | GDP |
|---|---|---|---|
| 0 | 1 | United States | 17419000 |
| 1 | 2 | China | 10354832 |
| 2 | 3 | Japan | 4601461 |
| 3 | 4 | Germany | 3868291 |
| 4 | 5 | United Kingdom | 2988893 |
| 5 | 6 | France | 2829192 |
| 6 | 7 | Brazil | 2346076 |
| 7 | 8 | Italy | 2141161 |
| 8 | 9 | India | 2048517 |
| 9 | 10 | Russian Federation | 1860598 |
| 10 | 11 | Canada | 1785387 |
| 11 | 12 | Australia | 1454675 |
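Excluding a column instead of listing the ones to keep can be sketched with `DataFrame.drop`. This is an assumption about the approach, not the original cell. The sample holds only the first twelve rows, so the row slice here keeps all of them; with the full file it would keep the first 12 of 194 rows. Country codes other than JPN and RUS are ISO alpha-3 codes filled in for illustration, and the names `gdp` and `without_code` are hypothetical.

```python
import pandas as pd

# First twelve rows of the GDP data as shown in this post.
# Country codes other than JPN and RUS are filled in for illustration.
gdp = pd.DataFrame({
    "Country_code": ["USA", "CHN", "JPN", "DEU", "GBR", "FRA",
                     "BRA", "ITA", "IND", "RUS", "CAN", "AUS"],
    "Rank": list(range(1, 13)),
    "Country": ["United States", "China", "Japan", "Germany", "United Kingdom",
                "France", "Brazil", "Italy", "India", "Russian Federation",
                "Canada", "Australia"],
    "GDP": [17419000, 10354832, 4601461, 3868291, 2988893, 2829192,
            2346076, 2141161, 2048517, 1860598, 1785387, 1454675],
})

# Keep rows at positions 0..11 and drop the Country_code column.
# drop returns a new DataFrame; gdp itself is left unchanged.
without_code = gdp.iloc[0:12].drop(columns=["Country_code"])
print(without_code)
```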
The next post is a practice session on manipulating datasets in Python.
Link to the next post: https://statinfer.com/104-2-4-practice-manipulating-dataset-in-python/