import pandas as pd
# Load the elections dataset from a remote CSV file
url = "https://raw.githubusercontent.com/fahadsultan/csc272/main/data/elections.csv"
elections = pd.read_csv(url)
Selection: subset of columns
To select a column in a DataFrame, we can use bracket notation: the name of the DataFrame followed by the column name in square brackets, df['column_name'].
For example, to select the column named Candidate from the elections DataFrame, we can use the following code:
candidates = elections['Candidate']
print(candidates)
0 Andrew Jackson
1 John Quincy Adams
2 Andrew Jackson
3 John Quincy Adams
4 Andrew Jackson
...
177 Jill Stein
178 Joseph Biden
179 Donald Trump
180 Jo Jorgensen
181 Howard Hawkins
Name: Candidate, Length: 182, dtype: object
This extracts a single column as a Series. We can confirm this by checking the type of the output.
type(candidates)
pandas.core.series.Series
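The same check can be sketched without downloading the CSV. The snippet below assumes a small hand-typed DataFrame named toy (a hypothetical stand-in, not the real elections data) and confirms that single-bracket selection returns a Series that carries the column label as its name.

```python
import pandas as pd

# Hypothetical three-row stand-in for the elections data (illustration only)
toy = pd.DataFrame({
    "Candidate": ["Andrew Jackson", "John Quincy Adams", "Jill Stein"],
    "Party": ["Democratic-Republican", "National Republican", "Green"],
})

# Single brackets with one column label return a Series
col = toy["Candidate"]

print(isinstance(col, pd.Series))  # True
print(col.name)                    # Candidate
print(len(col))                    # 3
```

The Series keeps the row index of the DataFrame it came from, which is why the printed output above shows the labels 0 through 181 next to each candidate.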
To select multiple columns, we can pass a list of column names. For example, to select both the Candidate and Party columns from the elections DataFrame, we can use the following line of code:
elections[['Candidate', 'Party']]
| | Candidate | Party |
---|---|---|
0 | Andrew Jackson | Democratic-Republican |
1 | John Quincy Adams | Democratic-Republican |
2 | Andrew Jackson | Democratic |
3 | John Quincy Adams | National Republican |
4 | Andrew Jackson | Democratic |
... | ... | ... |
177 | Jill Stein | Green |
178 | Joseph Biden | Democratic |
179 | Donald Trump | Republican |
180 | Jo Jorgensen | Libertarian |
181 | Howard Hawkins | Green |
182 rows × 2 columns
This extracts multiple columns as a DataFrame. We can confirm this as well by checking the type of the output.
type(elections[['Candidate', 'Party']])
This is how we can select columns in a DataFrame. Next, let’s learn how to filter rows.
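To make the difference between a column label and a list of labels concrete, here is a minimal sketch on a hand-typed DataFrame named toy (hypothetical, used so the snippet runs without a download). It illustrates two points: the result's columns follow the order of the list, and a one-element list still returns a DataFrame rather than a Series.

```python
import pandas as pd

# Hypothetical three-row stand-in for the elections data (illustration only)
toy = pd.DataFrame({
    "Year": [1824, 1828, 2016],
    "Candidate": ["Andrew Jackson", "John Quincy Adams", "Jill Stein"],
    "Party": ["Democratic-Republican", "National Republican", "Green"],
})

# A list of labels returns a DataFrame; columns come back in the order listed
pair = toy[["Party", "Candidate"]]

# A one-element list still returns a DataFrame (with a single column)
single = toy[["Candidate"]]

print(list(pair.columns))                # ['Party', 'Candidate']
print(isinstance(single, pd.DataFrame))  # True
print(single.shape)                      # (3, 1)
```

In other words, the inner brackets build a Python list and the outer brackets do the selection, so toy["Candidate"] and toy[["Candidate"]] hold the same values but have different types.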
[]
The [] selection operator is the most baffling of all, yet the most commonly used. It takes only a single argument, which may be one of the following:
- A list of column labels
- A single column label
- A slice of row numbers
Say we wanted the first four rows of our elections DataFrame.
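One way to do this is to pass a slice of row numbers to []. The sketch below uses a hand-typed DataFrame named toy (a hypothetical stand-in that copies the first five Candidate and Party rows shown earlier) so that it runs without a download. On the real data, the equivalent expression would be elections[0:4].

```python
import pandas as pd

# Hypothetical stand-in: first five Candidate/Party rows of the elections data
toy = pd.DataFrame({
    "Candidate": ["Andrew Jackson", "John Quincy Adams", "Andrew Jackson",
                  "John Quincy Adams", "Andrew Jackson"],
    "Party": ["Democratic-Republican", "Democratic-Republican", "Democratic",
              "National Republican", "Democratic"],
})

# A slice inside [] selects rows by position; the stop position is excluded
first_four = toy[0:4]

print(first_four.shape)        # (4, 2)
print(list(first_four.index))  # [0, 1, 2, 3]
```

Note that a slice selects rows while a label or list of labels selects columns, which is a large part of why [] can feel inconsistent.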