Line Plots (Temporal Data)

In this notebook, we will learn how to create line plots using the Matplotlib library in Python. Line plots are useful for visualizing the relationship between two numeric variables when one of them is time.

A line plot displays information as a series of data points, called ‘markers’, connected by straight line segments.

Creating a Line Plot

To create a line plot, we can use the plot() function from the Matplotlib library, called in the examples below as the ax.plot() method of an Axes object. In its simplest form, plot() takes two arguments: the x-axis values and the y-axis values. The x-axis values are typically time values, while the y-axis values are the data values.

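Here is an example that loads a dataset of election results with Pandas and then creates a simple line plot of the winning candidate's percentage of votes by year using the Matplotlib library: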

import pandas as pd 

url = "https://raw.githubusercontent.com/fahadsultan/csc272/main/data/elections.csv"

elections = pd.read_csv(url)

elections.head()
   Year          Candidate                  Party  Popular vote  Result          %
0  1824     Andrew Jackson  Democratic-Republican        151271    loss  57.210122
1  1824  John Quincy Adams  Democratic-Republican        113142     win  42.789878
2  1828     Andrew Jackson             Democratic        642806     win  56.203927
3  1828  John Quincy Adams    National Republican        500897    loss  43.796073
4  1832     Andrew Jackson             Democratic        702735     win  54.574789

from matplotlib import pyplot as plt

plt.style.use('dark_background')

fig, ax = plt.subplots()

winners = elections[elections['Result']=='win']

ax.plot(winners['Year'], winners['%']);

ax.set_xlabel('Year')
ax.set_ylabel('Percentage of Votes')
ax.set_title('Winning Percentage of Votes by Year');

Marker Styles

Markers are used to highlight individual data points on a line plot. Matplotlib provides a variety of marker styles that can be used to customize the appearance of the data points.

Here are some common marker styles that can be used in line plots:

  • o: Circle marker
  • s: Square marker
  • ^: Triangle marker
  • *: Star marker
  • x: X marker
  • +: Plus marker

To specify a marker style in a line plot, we can use the marker parameter of the plot() function. The marker parameter takes a string value that represents the desired marker style.

Here is an example of how to create a line plot with square markers using the Matplotlib library:

fig, ax = plt.subplots()

winners = elections[elections['Result']=='win']

ax.plot(winners['Year'], winners['%'], marker='s');

ax.set_xlabel('Year')
ax.set_ylabel('Percentage of Votes')
ax.set_title('Winning Percentage of Votes by Year');
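To compare the marker codes listed above side by side, the following minimal sketch draws one short line per style. The x values and the vertical offsets are made up purely for illustration; only the marker codes come from the list above.

```python
import matplotlib.pyplot as plt

# Marker codes from the list above, paired with a readable name
marker_styles = {'o': 'circle', 's': 'square', '^': 'triangle',
                 '*': 'star', 'x': 'x', '+': 'plus'}

# Hypothetical x values, used only to give each line a few points
x = [0, 1, 2, 3, 4]

fig, ax = plt.subplots()

# Draw one horizontal line per marker style, offset vertically
for offset, (code, name) in enumerate(marker_styles.items()):
    y = [offset] * len(x)
    ax.plot(x, y, marker=code, label=f"'{code}': {name}")

ax.set_xlabel('x (arbitrary units)')
ax.set_ylabel('Line index')
ax.set_title('Common Marker Styles')
ax.legend();
```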

Markers are particularly useful when:

  • we have a small number of data points and want to highlight individual data points on the plot

  • we have multiple lines on the same plot and want to distinguish between them

  • we want to use grayscale or black-and-white printing, where color is not available to distinguish between lines

fig, ax = plt.subplots()

democrats = elections[elections['Party']=='Democratic']
republicans = elections[elections['Party']=='Republican']

ax.plot(democrats['Year'], democrats['%'], marker='s', label='Democrats');
ax.plot(republicans['Year'],  republicans['%'],  marker='o', label='Republicans');

ax.legend()

ax.set_xlabel('Year')
ax.set_ylabel('Percentage of Votes')
ax.set_title('Percentage of Votes by Year, by Party');

Note that the example above is not an ideal use of markers: the data points are so close together that the markers are hard to tell apart and add clutter to the plot. It does, however, demonstrate how to use different marker styles in a line plot.
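One possible way to reduce this kind of clutter is Matplotlib's markevery parameter, which draws a marker on only some of the points while the line still passes through all of them. The sketch below uses made-up data (the years and percentages are hypothetical, not taken from the elections dataset) and is one option among several, not a recommendation specific to this dataset.

```python
import matplotlib.pyplot as plt

# Hypothetical data: 40 evenly spaced "years" and a made-up percentage series
years = list(range(1900, 2060, 4))
pct = [50 + (i % 7) - 3 for i in range(len(years))]

fig, ax = plt.subplots()

# markevery=5 draws a marker on every 5th point only;
# the line itself still connects all 40 points
ax.plot(years, pct, marker='s', markevery=5, label='Hypothetical series')

ax.legend()
ax.set_xlabel('Year')
ax.set_ylabel('Percentage (made-up values)')
ax.set_title('Fewer Markers with markevery');
```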

Marker Size

Markers can be customized further by changing their size. The size of the markers can be adjusted using the markersize parameter of the plot() function. The markersize parameter takes a numeric value, measured in points, that represents the desired size of the markers.

Here is an example of how to create a line plot with different marker sizes using the Matplotlib library:

fig, ax = plt.subplots()

democrats = elections[elections['Party']=='Democratic']
republicans = elections[elections['Party']=='Republican']

ax.plot(democrats['Year'], democrats['%'], marker='o', label='Democrats', markersize=10);
ax.plot(republicans['Year'],  republicans['%'],  marker='s', label='Republicans', markersize=5);

ax.legend();

ax.set_xlabel('Year');
ax.set_ylabel('Percentage of Votes');

ax.set_title('Percentage of Votes by Year, by Party');

Line Styles

In addition to markers, line plots can also be customized by changing the style of the lines. Matplotlib provides a variety of line styles that can be used to customize the appearance of the lines.

Here are some common line styles that can be used in line plots:

  • -: Solid line
  • --: Dashed line
  • -.: Dash-dot line
  • :: Dotted line

To specify a line style in a line plot, we can use the linestyle parameter of the plot() function. The linestyle parameter takes a string value that represents the desired line style.

Here is an example of how to create a line plot with different line styles using the Matplotlib library:

fig, ax = plt.subplots()

democrats = elections[elections['Party']=='Democratic']
republicans = elections[elections['Party']=='Republican']

ax.plot(democrats['Year'], democrats['%'], label='Democrats', linestyle='-');
ax.plot(republicans['Year'],  republicans['%'],  label='Republicans', linestyle=':');

ax.legend();

ax.set_xlabel('Year');
ax.set_ylabel('Percentage of Votes');

ax.set_title('Percentage of Votes by Year, by Party');
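The four line style codes listed above can be compared directly with a short sketch such as the one below. The x values and offsets are hypothetical. Matplotlib also accepts the longer names 'solid', 'dashed', 'dashdot', and 'dotted' as equivalents of the short codes, which are shown here in the legend labels.

```python
import matplotlib.pyplot as plt

# Short codes from the list above, paired with Matplotlib's equivalent long names
line_styles = [('-', 'solid'), ('--', 'dashed'), ('-.', 'dashdot'), (':', 'dotted')]

# Hypothetical x values, used only to give each line some length
x = [0, 1, 2, 3, 4]

fig, ax = plt.subplots()

# Draw one horizontal line per style, offset vertically
for offset, (code, long_name) in enumerate(line_styles):
    ax.plot(x, [offset] * len(x), linestyle=code, label=f"'{code}' ({long_name})")

ax.set_xlabel('x (arbitrary units)')
ax.set_ylabel('Line index')
ax.set_title('Common Line Styles')
ax.legend();
```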

Line Width

Lines can be customized further by changing their width. The width of the lines can be adjusted using the linewidth parameter of the plot() function. The linewidth parameter takes a numeric value, measured in points, that represents the desired width of the lines.

Here is an example of how to create a line plot with thick lines using the Matplotlib library. Both lines use linewidth=10, which is much wider than Matplotlib's default of 1.5:

fig, ax = plt.subplots()

democrats = elections[elections['Party']=='Democratic']
republicans = elections[elections['Party']=='Republican']

ax.plot(democrats['Year'],    democrats['%'],    label='Democrats',   linewidth=10);
ax.plot(republicans['Year'],  republicans['%'],  label='Republicans', linewidth=10);

ax.legend();

ax.set_xlabel('Year');
ax.set_ylabel('Percentage of Votes');

ax.set_title('Percentage of Votes by Year, by Party');
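Line width can also differ between lines, for instance to emphasize one series over another. The following is a minimal sketch with made-up data; the series names and values are hypothetical and only illustrate the parameter.

```python
import matplotlib.pyplot as plt

# Hypothetical data, used only to illustrate different line widths
years = [2000, 2004, 2008, 2012, 2016]
series_a = [48, 51, 53, 51, 48]
series_b = [52, 49, 47, 49, 52]

fig, ax = plt.subplots()

# The thicker line draws the eye first; the thinner line recedes
ax.plot(years, series_a, label='Series A (emphasized)', linewidth=4)
ax.plot(years, series_b, label='Series B', linewidth=1.5)

ax.legend()
ax.set_xlabel('Year')
ax.set_ylabel('Percentage (made-up values)')
ax.set_title('Two Lines with Different Widths');
```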

Line Color

Lines can also be customized by changing their color. The color of the lines can be adjusted using the color parameter of the plot() function. The color parameter takes a color specification, most commonly a string such as a color name ('blue') or a hex code ('#1f77b4').

Here is an example of how to create a line plot with different line colors using the Matplotlib library:

fig, ax = plt.subplots()

dems = elections[elections['Party']=='Democratic']
reps = elections[elections['Party']=='Republican']

ax.plot(dems['Year'],    dems['%'],    label='Democrats',   color='blue');
ax.plot(reps['Year'],  reps['%'],  label='Republicans', color='red');

ax.legend();

ax.set_xlabel('Year');
ax.set_ylabel('Percentage of Votes');

ax.set_title('Percentage of Votes by Year, by Party');
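Besides color names, Matplotlib accepts several other color formats. The sketch below passes a named color, a hex string, and an RGB tuple to the color parameter, then uses matplotlib.colors.to_hex to show each one in a common format. The data values are hypothetical.

```python
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors

# Three ways to specify a color: name, hex string, RGB tuple (values in 0-1)
color_specs = ['blue', '#d62728', (0.2, 0.6, 0.2)]

# Hypothetical x values, used only to give each line some length
x = [0, 1, 2, 3]

fig, ax = plt.subplots()

# Draw one horizontal line per color specification
for offset, spec in enumerate(color_specs):
    ax.plot(x, [offset] * len(x), color=spec, label=mcolors.to_hex(spec))

ax.set_xlabel('x (arbitrary units)')
ax.set_ylabel('Line index')
ax.set_title('Different Ways to Specify a Color')
ax.legend();
```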

Seaborn and Pandas for Line Plots

In addition to using the Matplotlib library directly, we can also create line plots using the Seaborn and Pandas libraries in Python. Seaborn and Pandas provide high-level interfaces for creating line plots that are more user-friendly and require less code.

Here is an example of how to create a line plot using the Seaborn library in Python:

import seaborn as sns 

url = 'https://raw.githubusercontent.com/fahadsultan/csc272/main/data/elections.csv'
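data = pd.read_csv(url)

# Keep only the winning candidate of each election
winners = data[data['Result']=='win']

fig, ax = plt.subplots()

# hue and style are both mapped to Party, so each party gets its own color and marker
sns.lineplot(data    = winners,
             x       = "Year",
             y       = "%",
             hue     = "Party",
             style   = "Party",
             markers = True,
             dashes  = False,
             ax      = ax);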

ax.set_xlabel('Year');
ax.set_ylabel('Percentage of Votes');
ax.set_title('Winning Percentage of Votes by Year, by Party');

Here is an example of how to create a line plot using the plot() method of a Pandas DataFrame:

url = 'https://raw.githubusercontent.com/fahadsultan/csc272/main/data/elections.csv'
data = pd.read_csv(url)

fig, ax = plt.subplots()

dems = data[data['Party']=='Democratic']
reps = data[data['Party']=='Republican']

dems.plot(x='Year', y='%', kind='line', marker='s', color='blue', label='Democrats', ax=ax);
reps.plot(x='Year', y='%', kind='line', marker='o', color='red', label='Republicans' ,ax=ax);

ax.set_xlabel('Year');
ax.set_ylabel('Percentage of Votes');

ax.set_title('Percentage of Votes by Year, by Party');

Note on Seaborn: when the data contain several observations at the same x value, sns.lineplot() automatically aggregates them and plots their mean at each time point. The shaded region it draws around the line represents the 95% confidence interval for that mean. We’ll talk more about confidence intervals in a later lecture.

Please Don’t

Use line graphs where the x-axis is not time

Plotting a line graph where the x-axis is not time is confusing, because the connecting line suggests a continuous progression from one point to the next. Use a bar graph instead.
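As a rough illustration of the alternative, the sketch below plots values for unordered categories as a bar graph using Matplotlib's ax.bar(). The category names and counts are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical categorical data: the categories have no natural order in time
categories = ['Group A', 'Group B', 'Group C']
counts = [12, 7, 15]

fig, ax = plt.subplots()

# One bar per category; no line implies a progression between categories
ax.bar(categories, counts)

ax.set_xlabel('Group')
ax.set_ylabel('Count (made-up values)')
ax.set_title('Counts by Group');
```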

Forget to label

All visualizations should, at minimum, contain the following:

  1. A clear and descriptive title.
  2. Axis labels that give the name of the variable and its units of measurement.
  3. Tick labels on each axis that show the range of values plotted.
  4. A legend, if applicable.

Don’t be like this guy:

[Figure: a chart titled “Are we stuck?” whose x-axis is marked “wave1” through “wave4”]

Note that the figure above is missing axis labels. No, “wave1”, “wave2”, “wave3” and “wave4” are not proper labels for the x-axis. “Are we stuck?” is also not a very informative title.
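The labeling checklist can be wrapped in a small helper so that it is applied the same way to every plot. The function label_plot below is a hypothetical helper written for this sketch, not part of Matplotlib, and the data are made up.

```python
import matplotlib.pyplot as plt

def label_plot(ax, title, xlabel, ylabel, show_legend=True):
    """Apply title, axis labels, and (optionally) a legend to an Axes.

    Hypothetical helper for illustration; not a Matplotlib function.
    """
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    # A legend only makes sense if at least one plotted line has a label
    handles, labels = ax.get_legend_handles_labels()
    if show_legend and labels:
        ax.legend()
    return ax

# Hypothetical data, used only to demonstrate the helper
years = [2000, 2004, 2008, 2012]
values = [50, 55, 53, 58]

fig, ax = plt.subplots()
ax.plot(years, values, marker='o', label='Hypothetical series')

label_plot(ax,
           title='Share by Year',
           xlabel='Year',
           ylabel='Share (%)');
```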

Use too many lines

Use the wrong colors

If the data is categorical, then use qualitative colormaps. Do not use sequential colormaps for categories. If your data ranges from negative to positive values, then use diverging colormaps. If your data ranges from low to high values, then use sequential colormaps.
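The sketch below looks up one built-in Matplotlib colormap of each kind with plt.get_cmap(). The specific choices ('tab10', 'viridis', 'coolwarm') are just examples of each family, and the list of categories is illustrative.

```python
import matplotlib.pyplot as plt

qualitative = plt.get_cmap('tab10')    # distinct colors, suited to categories
sequential = plt.get_cmap('viridis')   # ordered colors, suited to low-to-high values
diverging = plt.get_cmap('coolwarm')   # two hues around a midpoint, suited to negative-to-positive values

# Categorical data: take one distinct color per category by integer index
parties = ['Democratic', 'Republican', 'Whig']
party_colors = {party: qualitative(i) for i, party in enumerate(parties)}

# Numeric data: sample the colormap at positions between 0 (low) and 1 (high)
low_color = sequential(0.0)
high_color = sequential(1.0)

# Diverging data: 0.5 is the midpoint between the two extremes
midpoint_color = diverging(0.5)

for party, rgba in party_colors.items():
    print(party, rgba)
```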