Representing Data Graphically: Learn It 2

Displaying a Categorical Variable Across Multiple Populations or Groups

Both pie charts and bar graphs are good visual representations of a categorical variable from a single population or group. But what can we do if we want to compare a categorical variable across multiple groups?

Side-by-side bar charts and stacked bar charts are extensions of bar graphs or pie charts that allow us to conduct comparisons between multiple data sets. These bar charts will help us to explore how to display and interpret changes in a categorical variable of interest when comparing multiple populations or groups of interest.

side-by-side bar graphs

Side-by-side bar graphs display a categorical variable for more than one group by drawing several bars for each group, one bar for each category of the variable.

Let’s look at an example.

The 2016 presidential race was very different from the one in 2020.

In 2016, fewer people turned out to vote,[1] more people were deemed ineligible ([latex]6[/latex] million felons in 2016[2] compared to [latex]5.1[/latex] million felons in 2020),[3] and the election results were much closer.

In 2016, Hillary Clinton won the popular vote, but fewer than [latex]80,000[/latex] of the [latex]137[/latex] million votes cast determined Donald Trump's election as president.[4]

Looking to our future, one question might be, “If we increase legitimate voter participation, will one party benefit?” We can better answer this question if we study the voting patterns of different groups within the United States.

CNN used an exit poll to estimate the 2020 presidential voting patterns by race.[5] The following table shows the results, where the rows describe the groups of interest (White, Black, Latinx, Asian, and Other) and the columns represent the vote choices (Biden, Trump, or Other).

Presidential 2020 Voting Patterns Percentage by Race
| Race | Biden | Trump | Other |
|---|---|---|---|
| White | [latex]41[/latex] | [latex]58[/latex] | [latex]1[/latex] |
| Black | [latex]87[/latex] | [latex]12[/latex] | [latex]1[/latex] |
| Latinx | [latex]65[/latex] | [latex]32[/latex] | [latex]3[/latex] |
| Asian | [latex]61[/latex] | [latex]34[/latex] | [latex]5[/latex] |
| Other | [latex]55[/latex] | [latex]41[/latex] | [latex]4[/latex] |

Among Asians, for example, [latex]61[/latex]% voted for Biden, [latex]34[/latex]% voted for Trump, and the remaining [latex]5[/latex]% voted for someone else.
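Reading a row of the table can also be sketched in code. The minimal Python sketch below stores the exit-poll percentages from the table in a dictionary and checks that each group's percentages account for all of its responses. The names `CANDIDATES`, `VOTES`, `row_total`, and `describe` are illustrative choices, not part of the original lesson.

```python
# Exit-poll percentages copied from the table above.
# Order within each list: Biden, Trump, Other.
CANDIDATES = ["Biden", "Trump", "Other"]
VOTES = {
    "White":  [41, 58, 1],
    "Black":  [87, 12, 1],
    "Latinx": [65, 32, 3],
    "Asian":  [61, 34, 5],
    "Other":  [55, 41, 4],
}

def row_total(group):
    """Sum the percentages reported for one group (hypothetical helper)."""
    return sum(VOTES[group])

def describe(group):
    """Build a one-line reading of a single row of the table."""
    parts = [f"{pct}% {name}" for name, pct in zip(CANDIDATES, VOTES[group])]
    return f"{group}: " + ", ".join(parts)

for group in VOTES:
    print(describe(group), "| total:", row_total(group))
```

Each row totals 100 because the percentages are computed within a group, not across the whole sample.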

Translating the table to a visual might aid in the comparison between the groups.

A side-by-side bar graph of how America voted in 2020, estimated using a CNN exit poll. The horizontal axis is labeled Race and the vertical axis is labeled Percent. To the right, a key labeled "Vote" shows that blue represents Biden, red represents Trump, and yellow represents Other. Across the horizontal axis, the bars are grouped into sections labeled White, Black, Latinx, Asian, and Other. Above each section are three bars, one of each color in the key.

The groups of interest (White, Black, Latinx, Asian, and Other) are listed on the horizontal axis, and the percentage associated with each vote choice is shown on the vertical axis.
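A graph like the one described above could be drawn roughly as follows. This is a minimal sketch, assuming the matplotlib library is installed; the helper names `bar_offsets` and `plot_side_by_side`, the bar width, and the exact color names are assumptions made for illustration and are not how the original figure was produced.

```python
CANDIDATES = ["Biden", "Trump", "Other"]
COLORS = ["blue", "red", "yellow"]  # chosen to match the key in the figure description
VOTES = {
    "White":  [41, 58, 1],
    "Black":  [87, 12, 1],
    "Latinx": [65, 32, 3],
    "Asian":  [61, 34, 5],
    "Other":  [55, 41, 4],
}

def bar_offsets(n_bars, width):
    """Horizontal offsets that center n_bars of the given width on a group's tick."""
    return [(i - (n_bars - 1) / 2) * width for i in range(n_bars)]

def plot_side_by_side(votes, width=0.25):
    """Draw one cluster of bars per group, with one bar per vote choice."""
    # Imported here so the helper above can be used without matplotlib.
    import matplotlib.pyplot as plt

    groups = list(votes)
    ticks = list(range(len(groups)))
    fig, ax = plt.subplots()
    for j, offset in enumerate(bar_offsets(len(CANDIDATES), width)):
        xs = [t + offset for t in ticks]            # shift this choice's bars sideways
        heights = [votes[g][j] for g in groups]     # percentage for this choice in each group
        ax.bar(xs, heights, width, label=CANDIDATES[j], color=COLORS[j])
    ax.set_xticks(ticks)
    ax.set_xticklabels(groups)
    ax.set_xlabel("Race")
    ax.set_ylabel("Percent")
    ax.legend(title="Vote")
    return fig
```

Calling `plot_side_by_side(VOTES)` and then displaying or saving the returned figure should give five clusters of three bars each. The offsets simply place the three bars of a cluster next to one another around the group's position on the horizontal axis.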

When percentages of an entire group are reported, the heights of the bars within each group should total [latex]100[/latex], representing [latex]100\%[/latex] of all responses within that group. A side-by-side bar graph that shows percentages within groups (as opposed to the number of ballots actually cast within groups) does not support conclusions about counts. It supports conclusions only about relative proportions or percentages within each group.
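To see why percentages within groups say nothing about counts, consider the short sketch below. The ballot counts are invented purely for illustration and do not come from the exit poll: two hypothetical groups of very different sizes produce identical percentages, so bars for the two groups would look exactly the same.

```python
def to_percentages(counts):
    """Convert a list of raw counts into percentages of the group total."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# Hypothetical counts (Biden, Trump, Other), invented for illustration only.
small_group = [61, 34, 5]          # 100 respondents in total
large_group = [6100, 3400, 500]    # 10,000 respondents in total

print(to_percentages(small_group))
print(to_percentages(large_group))
```

Both calls give the same three percentages even though one group is one hundred times larger than the other, which is why a percentage-based graph cannot be used to compare counts.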

stacked bar graphs

Stacked bar graphs display the same data as a contingency table or a side-by-side bar graph, but they stack each group's categories into a single bar.

This type of chart gives a different view of the comparison between groups: each group gets a single bar whose height totals [latex]100\%[/latex] for that group.

In a stacked bar chart, each bar represents the responses of one group. The height of each color within that bar represents the percentage of a particular response, and the combination of all colors represents the total [latex](100\%)[/latex] of all responses within that group. As with a side-by-side bar chart that plots percentage along the vertical axis, you cannot draw conclusions or make comparisons about the absolute counts of responses within or between groups.
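The stacking can be sketched in code: each segment starts where the segments beneath it end. The following is a minimal sketch, again assuming matplotlib is installed; `segment_bottoms` and `plot_stacked` are hypothetical helper names, and the colors are an assumption.

```python
CANDIDATES = ["Biden", "Trump", "Other"]
COLORS = ["blue", "red", "yellow"]  # assumed colors, one per vote choice
VOTES = {
    "White":  [41, 58, 1],
    "Black":  [87, 12, 1],
    "Latinx": [65, 32, 3],
    "Asian":  [61, 34, 5],
    "Other":  [55, 41, 4],
}

def segment_bottoms(percentages):
    """Starting height of each segment: the running total of the segments below it."""
    bottoms, running = [], 0
    for pct in percentages:
        bottoms.append(running)
        running += pct
    return bottoms

def plot_stacked(votes):
    """Draw one bar per group, split into one colored segment per vote choice."""
    # Imported here so the helper above can be used without matplotlib.
    import matplotlib.pyplot as plt

    groups = list(votes)
    ticks = list(range(len(groups)))
    fig, ax = plt.subplots()
    for j, name in enumerate(CANDIDATES):
        heights = [votes[g][j] for g in groups]
        bottoms = [segment_bottoms(votes[g])[j] for g in groups]
        ax.bar(ticks, heights, bottom=bottoms, label=name, color=COLORS[j])
    ax.set_xticks(ticks)
    ax.set_xticklabels(groups)
    ax.set_xlabel("Race")
    ax.set_ylabel("Percent")
    ax.legend(title="Vote")
    return fig
```

For the Asian group, for example, the Biden segment starts at 0, the Trump segment at 61, and the Other segment at 95, so the full bar reaches 100.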

A single stacked bar chart is very similar to a pie chart, but it uses rectangular regions rather than pie slices to represent each category.

Rather than showing a separate bar for each category, stacked bar charts display the categories as segments within each bar. Some stacked bar charts show counts, while others, such as the ones in the questions below, show percentages. When each segment represents a percentage of the whole, relative differences within a bar are easy to see. As segments grow smaller, however, their percentages become harder to estimate.


  1. Schaul, K., Rabinowitz, K., & Mellnik, T. (2020, December 28). 2020 turnout is the highest in over a century. The Washington Post. https://www.washingtonpost.com/graphics/2020/elections/voter-turnout/
  2. Uggen, C., Larson, R., & Shannon, S. (2016, October 16). 6 million lost voters: State-level estimates of felony disenfranchisement, 2016. The Sentencing Project. https://www.sentencingproject.org/publications/6-million-lost-voters-state-level-estimates-felony-disenfranchisement-2016/
  3. Maxouris, C. (2020, October 15). More than 5 million people with felony convictions can’t vote in this year’s election, advocacy group finds. CNN. https://www.cnn.com/2020/10/15/us/felony-convictions-voting-sentencing-project-study/index.html
  4. Why voting matters: Supreme Court edition. (2018, June 28). Axios. https://www.axios.com/hillary-clinton-2016-election-votes-supreme-court-liberal-justice-1b4bc4fc-9fad-44b4-ab54-9ef86aa9c1f1.html
  5. Exit polls. (2020). CNN Politics. https://www.cnn.com/election/2020/exit-polls/president/national-results