[Python] Comparing Groups: Visualizing Distributions for Continuous Variables (matplotlib/seaborn)

In the previous post, we learned the foundations of data visualization in Python: matplotlib for low-level control, seaborn for statistical graphics, and how to choose accessible colors. One of the most common tasks in social science data analysis is comparing outcomes across different groups.

Whether you’re examining service utilization by program type, client satisfaction across different offices, or screening scores by demographic characteristics, you’ll frequently need to visualize how distributions differ between groups. This post covers multiple approaches to making these comparisons, using both matplotlib and seaborn.

Two Types of Group Comparisons

Before choosing a visualization, you need to think about what type of outcome variable you’re comparing across groups. This determines which plots are appropriate.

Continuous outcomes are numeric variables that can take any value within a range. Examples include PHQ-9 scores (0-27), age, income, number of service hours, or days until follow-up. When comparing continuous outcomes across groups, you’re typically interested in questions like: “Do clients in Program A have higher satisfaction scores than clients in Program B?” or “How does the distribution of wait times differ across clinic locations?”

📊 For continuous outcomes, use: box plots, violin plots, strip/swarm plots, or bar charts with error bars (showing mean ± SD).

Categorical outcomes are variables that take one of a fixed set of distinct categories. The examples here are nominal, meaning the categories have no inherent order: service completion (yes/no), discharge status (completed, dropped out, transferred), or housing situation (housed, shelter, unsheltered). When comparing categorical outcomes across groups, you’re asking questions like: “What proportion of clients completed treatment in each program?” or “How does the distribution of discharge types differ by referral source?”

📊 For categorical outcomes, use: grouped bar charts (showing counts or percentages), stacked bar charts, or mosaic plots.

This post focuses on continuous outcomes, which are the more common case when examining clinical measures, survey scores, and service utilization metrics. We’ll cover categorical outcome comparisons in a future post.

Why Distribution Matters

When comparing groups, it’s tempting to just compare averages: client group A has an average PHQ-9 score of 12, client group B has an average of 14, so group B must be doing worse.

The problem is that averages hide information. Two groups can have identical averages but completely different distributions. One group might be tightly clustered around the mean, while another is spread across the entire range. One might have a few extreme outliers pulling the average in one direction. Without seeing the shape of the distribution, you miss these patterns.
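To make this concrete, here is a minimal sketch with two made-up groups of PHQ-9 scores. The values are hypothetical and chosen by hand so that both groups have exactly the same mean while their spread is very different.

```python
import numpy as np

# Two hypothetical groups of PHQ-9 scores (invented for illustration).
# Both are constructed so that the mean is exactly 12.
tight = np.array([11, 11, 12, 12, 12, 12, 13, 13])   # clustered around the mean
spread = np.array([2, 4, 8, 12, 12, 16, 20, 22])     # spread across the range

for name, scores in [('tight', tight), ('spread', spread)]:
    print(f"{name}: mean={scores.mean():.1f}, "
          f"sd={scores.std(ddof=1):.1f}, "      # sample standard deviation
          f"min={scores.min()}, max={scores.max()}")
```

A chart that shows only the two means would draw these groups as identical, even though one contains clients with minimal symptoms and clients with severe symptoms, and the other does not.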

Here are the types of plot you can use to compare groups for continuous outcomes.

Plot Type | Best For | Limitations
Bar chart with error bars | Traditional statistical presentation (academic papers, formal reports); showing central tendency and spread | Hides distributional details
Box plot | Comparing medians and identifying outliers across multiple groups | Still summarizes the data; misses distribution shape
Violin plot | When distribution shape matters (skewed, bimodal, or other interesting features) | Can be unfamiliar to some audiences
Strip plot / Swarm plot | Showing every observation; smaller datasets | Gets crowded with large datasets
Combined plots (box + strip, violin + strip) | Both summary statistics and individual observations | More visual complexity

Bar Chart with Error Bars (Mean ± SD)

The most basic approach is a bar chart showing the mean for each group. This is what you’ll see in many research papers and reports. Adding error bars (typically showing standard deviation or standard error) gives viewers some sense of the spread in the data.
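Standard deviation and standard error answer different questions, so it helps to see them side by side before plotting. The sketch below uses the same mock PHQ-9 data as the examples in this post. The column names `sem` and `ci95_half_width` are my own, and the 1.96 multiplier assumes a normal approximation for the 95% confidence interval.

```python
import numpy as np
import pandas as pd

# Same mock PHQ-9 data that the examples in this post use
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

stats = data.groupby('region')['phq9_score'].agg(['mean', 'std', 'count'])

# Standard error of the mean: how precisely the mean itself is estimated
stats['sem'] = stats['std'] / np.sqrt(stats['count'])

# Half-width of an approximate 95% confidence interval (normal approximation)
stats['ci95_half_width'] = 1.96 * stats['sem']

print(stats.round(2))
```

The standard deviation describes how much individual clients vary, while the standard error shrinks as the group gets larger. Whichever you plot, say so in the title or caption, because the same bar can look very different depending on the choice.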

Using matplotlib:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Create mock data: PHQ-9 scores for clients across three service regions
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),   # Urban: mean=11, sd=4
        np.random.normal(10, 3, 60),   # Suburban: mean=10, sd=3
        np.random.normal(13, 5, 60)    # Rural: mean=13, sd=5
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)  # Valid PHQ-9 range

# Calculate summary statistics
summary = data.groupby('region')['phq9_score'].agg(['mean', 'std']).reset_index()

# Create bar chart with error bars
plt.figure(figsize=(8, 5))
x = range(len(summary))
plt.bar(x, summary['mean'], 
        yerr=summary['std'],  # Error bars showing standard deviation
        capsize=5,            # Cap width on error bars
        color=['#0077BB', '#EE7733', '#009988'],
        edgecolor='white',
        linewidth=1.5)

plt.xticks(x, summary['region'], fontsize=11)
plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region (Mean ± SD)', fontsize=13)
plt.ylim(0, 25)

# Remove top and right spines for cleaner look
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.tight_layout()
plt.show()

Using seaborn:

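Seaborn’s barplot function calculates the mean and the error bars automatically. By default it draws a 95% confidence interval around the mean; the errorbar parameter (available in seaborn 0.12 and later) lets you switch to the standard deviation instead. Note that seaborn 0.13 and later emit a deprecation warning when palette is passed without hue. Adding hue='region' and legend=False silences it, and the same applies to the other seaborn examples in this post.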

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))
sns.barplot(data=data, x='region', y='phq9_score', 
            palette=['#0077BB', '#EE7733', '#009988'],
            errorbar='sd',  # Show standard deviation (use 'ci' for confidence interval)
            capsize=0.1)

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.show()

Bar charts are familiar and easy to read, but they compress all the information about each group into just two numbers: the height of the bar (mean) and the length of the error bar (spread). We can do better.

Box Plot: Showing the Five-Number Summary

https://phastdata.org/boxplot_help

Box plots (also called box-and-whisker plots) show much more information than bar charts. Each box displays five key statistics: the median (line in the middle), the first quartile (bottom of box), the third quartile (top of box), and the two whisker ends, which mark the range of most data points. By default, matplotlib and seaborn extend each whisker to the most extreme observation that lies within 1.5 times the interquartile range of the box. Points beyond the whiskers are shown as individual dots, representing potential outliers.

Using matplotlib:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))

# Prepare data as list of arrays for matplotlib boxplot
urban = data[data['region'] == 'Urban']['phq9_score']
suburban = data[data['region'] == 'Suburban']['phq9_score']
rural = data[data['region'] == 'Rural']['phq9_score']

bp = plt.boxplot([urban, suburban, rural], 
                  labels=['Urban', 'Suburban', 'Rural'],  # renamed tick_labels in matplotlib 3.9+
                  patch_artist=True,  # Fill boxes with color
                  medianprops={'color': 'black', 'linewidth': 1.5})

# Color each box
colors = ['#0077BB', '#EE7733', '#009988']
for patch, color in zip(bp['boxes'], colors):
    patch.set_facecolor(color)
    patch.set_alpha(0.7)

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.tight_layout()
plt.show()

Using seaborn:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))
sns.boxplot(data=data, x='region', y='phq9_score',
            palette=['#0077BB', '#EE7733', '#009988'])

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.show()

Box plots are excellent for comparing the central tendency (median) and spread (box height) across groups, and for identifying outliers. The box represents the interquartile range (IQR), meaning the middle 50% of your data falls within that box.
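If you want to check what a box plot is drawing, you can compute the same numbers by hand. This is a minimal sketch for one hypothetical group, assuming the common default in which whiskers reach to the most extreme observations within 1.5 × IQR of the box. The variable names are my own.

```python
import numpy as np

# One hypothetical group of scores, clipped to the valid PHQ-9 range
np.random.seed(42)
scores = np.random.normal(13, 5, 60).clip(0, 27)

q1, median, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1                          # height of the box

# "Fences" that decide which points count as potential outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences
whisker_low = scores[scores >= lower_fence].min()
whisker_high = scores[scores <= upper_fence].max()
outliers = scores[(scores < lower_fence) | (scores > upper_fence)]

# Share of observations that fall inside the box
inside_box = np.mean((scores >= q1) & (scores <= q3))

print(f"Q1={q1:.1f}, median={median:.1f}, Q3={q3:.1f}, IQR={iqr:.1f}")
print(f"Whiskers: {whisker_low:.1f} to {whisker_high:.1f}")
print(f"Potential outliers: {len(outliers)}")
print(f"Share inside the box: {inside_box:.0%}")
```

Being flagged as an outlier by this rule only means a point is far from the box. It is a prompt to look at that client's record, not evidence that the value is wrong.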

Violin Plot: Showing the Full Distribution Shape

Violin plots show the full shape of each group’s distribution as a smooth curve, mirrored on both sides of a central axis. They use a technique called kernel density estimation (KDE) to create that smooth curve. You don’t need to understand the math behind KDE to use violin plots effectively. Think of it this way: imagine each data point creates a small “bump” of probability around itself. KDE stacks all these bumps together and draws the outline. Where many data points cluster together, their bumps pile up and the violin gets wider. Where data points are sparse, the violin narrows. It’s essentially a smoothed-out version of a histogram that better reveals the underlying shape of your data.
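The “bumps” idea can be written out in a few lines of NumPy. This is an illustrative sketch rather than the code any plotting library actually runs: the function name `gaussian_kde_by_hand` is my own, the scores are made up, and the bandwidth (the width of each bump) is picked by hand here, whereas plotting libraries normally choose it for you.

```python
import numpy as np

def gaussian_kde_by_hand(points, grid, bandwidth):
    """Average one Gaussian bump per data point, evaluated on a grid."""
    points = np.asarray(points, dtype=float)
    # Rows are grid values, columns are data points
    z = (grid[:, None] - points[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)   # stack the bumps and normalize

# Hypothetical scores with two clusters (around 5-6 and around 16)
scores = np.array([4, 5, 5, 6, 6, 6, 7, 15, 16, 16, 17])
grid = np.linspace(-5, 30, 351)
density = gaussian_kde_by_hand(scores, grid, bandwidth=1.5)

# The curve is a probability density, so the area under it is about 1
area = density.sum() * (grid[1] - grid[0])
peak = grid[density.argmax()]

print(f"Area under the curve: {area:.3f}")
print(f"Highest point of the curve is near a score of {peak:.1f}")
```

A violin is this density curve drawn vertically and mirrored. A smaller bandwidth gives a bumpier violin that follows individual points; a larger one gives a smoother violin that can blur real features such as a second peak.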

Using matplotlib:

Matplotlib’s violin plot is functional but requires more customization:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

urban = data[data['region'] == 'Urban']['phq9_score']
suburban = data[data['region'] == 'Suburban']['phq9_score']
rural = data[data['region'] == 'Rural']['phq9_score']

plt.figure(figsize=(8, 5))

vp = plt.violinplot([urban, suburban, rural], 
                     positions=[1, 2, 3],
                     showmeans=True,
                     showmedians=True)

# Color each violin
colors = ['#0077BB', '#EE7733', '#009988']
for i, body in enumerate(vp['bodies']):
    body.set_facecolor(colors[i])
    body.set_alpha(0.7)

plt.xticks([1, 2, 3], ['Urban', 'Suburban', 'Rural'])
plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.tight_layout()
plt.show()

Using seaborn:

Seaborn makes violin plots much easier and adds a mini box plot inside by default:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))
sns.violinplot(data=data, x='region', y='phq9_score',
               palette=['#0077BB', '#EE7733', '#009988'],
               inner='box')  # Shows box plot inside the violin

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.show()

The inner parameter controls what’s shown inside the violin. Options include 'box' (mini box plot), 'quart' (lines at the quartiles; the longer spelling 'quartile' is also accepted), 'point' (individual points), 'stick' (a line for each observation), or None (nothing inside the violin).

Violin plots are particularly useful when distributions have interesting shapes, like multiple peaks (bimodal distributions) or heavy skew. A box plot would miss these features entirely.
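Here is a small sketch of why that matters. It builds a hypothetical bimodal group (a mixture of two clusters, invented for illustration) and compares what a box plot would summarize, the quartiles, with a rough text histogram of the same scores.

```python
import numpy as np

# Hypothetical bimodal group: half the clients score around 7, half around 17
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(7, 1.5, 200), rng.normal(17, 1.5, 200)])

# What a box plot would summarize
q1, median, q3 = np.percentile(bimodal, [25, 50, 75])
print(f"Q1={q1:.1f}, median={median:.1f}, Q3={q3:.1f}")

# What the distribution actually looks like (bins of width 3)
counts, edges = np.histogram(bimodal, bins=np.arange(0, 27, 3))
for left, count in zip(edges[:-1], counts):
    print(f"{left:4.0f}-{left + 3:<4.0f} {'#' * int(count // 4)}")
```

The median lands in the gap between the two clusters, where very few clients actually score. The box plot would draw a single wide box around it, while the histogram (and a violin plot) shows two separate peaks.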

Strip Plot and Swarm Plot: Showing Every Data Point

Sometimes the best way to understand a distribution is to see every single observation. Strip plots and swarm plots display each data point individually.

A strip plot adds random horizontal “jitter” to prevent points from overlapping:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))
sns.stripplot(data=data, x='region', y='phq9_score',
              palette=['#0077BB', '#EE7733', '#009988'],
              jitter=0.2,  # Amount of horizontal spread
              alpha=0.6,   # Transparency
              size=6)

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region (Individual Clients)', fontsize=13)
plt.show()

A swarm plot arranges points so they don’t overlap at all, creating a shape similar to a violin plot but with discrete points:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))
sns.swarmplot(data=data, x='region', y='phq9_score',
              palette=['#0077BB', '#EE7733', '#009988'],
              size=5)

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region (Individual Clients)', fontsize=13)
plt.show()

Swarm plots work well for small to medium datasets (up to a few hundred points per group). For larger datasets, they become too crowded and strip plots with transparency work better.
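If you prefer to stay in plain matplotlib, the jitter idea is easy to reproduce by hand. This is a sketch of the concept, not seaborn’s implementation: each group sits at an integer x position, and every point gets a small random horizontal offset. The `jitter` width and the dictionary names are my own choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Hypothetical PHQ-9 scores per region, clipped to the valid range
groups = {
    'Urban': rng.normal(11, 4, 60).clip(0, 27),
    'Suburban': rng.normal(10, 3, 60).clip(0, 27),
    'Rural': rng.normal(13, 5, 60).clip(0, 27),
}
colors = ['#0077BB', '#EE7733', '#009988']
jitter = 0.2   # maximum horizontal offset on either side of the group position

fig, ax = plt.subplots(figsize=(8, 5))
x_positions = {}
for i, (name, scores) in enumerate(groups.items()):
    # Group i is centered at x = i; jitter moves points left or right only
    x = i + rng.uniform(-jitter, jitter, size=len(scores))
    x_positions[name] = x
    ax.scatter(x, scores, color=colors[i], alpha=0.6, s=36)

ax.set_xticks(range(len(groups)))
ax.set_xticklabels(list(groups))
ax.set_ylabel('PHQ-9 Score')
ax.set_title('Depression Scores by Region (Individual Clients)')
plt.show()
```

The jitter only changes where a point sits horizontally. The vertical position, which is the actual score, is never altered, so the plot stays honest about the data.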

Combining Plots: Box Plot + Strip Plot

One powerful technique is layering multiple plot types. A common combination is a box plot (for summary statistics) with a strip plot overlay (for individual observations):

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))

# First layer: box plot
sns.boxplot(data=data, x='region', y='phq9_score',
            color='lightgray',      # Neutral color for boxes
            width=0.5,
            linewidth=1.5)

# Second layer: strip plot on top
sns.stripplot(data=data, x='region', y='phq9_score',
              palette=['#0077BB', '#EE7733', '#009988'],
              jitter=0.15,
              alpha=0.6,
              size=5)

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.show()

This gives you the best of both worlds: the box plot provides quick visual reference for medians and quartiles, while the strip plot shows you where the actual data points are.

Similarly, you can combine violin plots with strip plots:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

plt.figure(figsize=(8, 5))

# Violin plot with light fill
sns.violinplot(data=data, x='region', y='phq9_score',
               palette=['#0077BB', '#EE7733', '#009988'],
               inner=None,  # Remove default inner display
               alpha=0.3)

# Strip plot overlay
sns.stripplot(data=data, x='region', y='phq9_score',
              color='black',
              jitter=0.1,
              alpha=0.5,
              size=4)

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region', fontsize=13)
plt.show()

Adding a Second Grouping Variable with Hue

Often you need to compare groups along two dimensions. For example, comparing PHQ-9 scores by region AND by whether clients completed treatment. Seaborn makes this easy with the hue parameter.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data with two grouping variables
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)
data['completed'] = np.random.choice(['Completed', 'Did not complete'], 
                                      size=len(data), 
                                      p=[0.6, 0.4])

plt.figure(figsize=(10, 5))
sns.boxplot(data=data, x='region', y='phq9_score', hue='completed',
            palette=['#0077BB', '#EE7733'])

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region and Treatment Completion', fontsize=13)
plt.legend(title='Treatment Status')
plt.show()

For violin plots, the split parameter creates a particularly elegant comparison when you have exactly two hue categories:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create mock data with two grouping variables
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)
data['completed'] = np.random.choice(['Completed', 'Did not complete'], 
                                      size=len(data), 
                                      p=[0.6, 0.4])

plt.figure(figsize=(10, 5))
sns.violinplot(data=data, x='region', y='phq9_score', hue='completed',
               split=True,  # Each half shows one hue category
               palette=['#0077BB', '#EE7733'],
               inner='quart')

plt.ylabel('PHQ-9 Score', fontsize=11)
plt.title('Depression Scores by Region and Treatment Completion', fontsize=13)
plt.legend(title='Treatment Status')
plt.show()
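Splitting by a second variable also splits your sample, so it is worth checking how many clients end up in each region-by-completion cell before reading too much into the shapes. The sketch below rebuilds the same mock data as the hue examples above and tabulates each cell; `cell_summary` is a name I chose for illustration.

```python
import numpy as np
import pandas as pd

# Same mock data with two grouping variables as in the hue examples
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)
data['completed'] = np.random.choice(['Completed', 'Did not complete'],
                                      size=len(data),
                                      p=[0.6, 0.4])

# One row per region x completion cell: sample size and basic summaries
cell_summary = (
    data.groupby(['region', 'completed'])['phq9_score']
        .agg(['count', 'median', 'min', 'max'])
        .round(1)
)
print(cell_summary)
```

A box or half-violin built from a couple of dozen clients is much less stable than one built from sixty, so small cells deserve a note in the caption or an overlay of the individual points.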

A Note on Comparing Groups Responsibly

Before we close, a word about the ethics of group comparisons in data visualization.

When visualizing differences between groups of people, especially when those groups are defined by race, ethnicity, gender, or other social categories, our design choices matter. Research has shown that certain visualization approaches can inadvertently reinforce harmful stereotypes (Holder & Xiong, 2022).

The key issue is this: visualizations that emphasize differences between groups while hiding variation within groups can encourage viewers to see group membership as the explanation for outcome differences. If a chart shows that Group A has worse outcomes than Group B, viewers may unconsciously conclude that something about Group A’s characteristics causes those outcomes, rather than considering systemic factors like discrimination, resource allocation, or historical context.

What can we do about this?

  1. Show within-group variation. Plots that display individual data points (strip plots, swarm plots) or distribution shapes (violin plots) help viewers see that groups are not monolithic. There’s enormous variation within any group, and the overlap between groups is often substantial.
  2. Provide context. When possible, include information about the systemic factors that might explain disparities. If rural clients have higher depression scores, is that because rural areas have fewer mental health providers? Less insurance coverage? Different economic conditions?
  3. Consider your framing. The way you title and label your visualizations matters. “Depression Burden Across Service Regions” frames the data differently than “Which Regions Have the Most Depressed Clients?”
  4. Think about your comparisons. Instead of always comparing demographic groups to each other, sometimes it’s more useful to compare a group to itself over time, or to compare outcomes before and after an intervention within the same group.
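The first point, overlap between groups, can also be put into a number to report alongside a plot. This is one simple, hypothetical way to do it with the mock data from this post: the share of clients in one region who score below the median of another. The variable name `share_below` is my own, and this is a sketch rather than a standard overlap statistic.

```python
import numpy as np
import pandas as pd

# Same mock PHQ-9 data as the examples in this post
np.random.seed(42)
data = pd.DataFrame({
    'region': ['Urban'] * 60 + ['Suburban'] * 60 + ['Rural'] * 60,
    'phq9_score': np.concatenate([
        np.random.normal(11, 4, 60),
        np.random.normal(10, 3, 60),
        np.random.normal(13, 5, 60)
    ])
})
data['phq9_score'] = data['phq9_score'].clip(0, 27)

medians = data.groupby('region')['phq9_score'].median()
rural = data.loc[data['region'] == 'Rural', 'phq9_score']

# Share of Rural clients whose score is lower than the typical Suburban client
share_below = (rural < medians['Suburban']).mean()

print(f"Rural median: {medians['Rural']:.1f}")
print(f"Suburban median: {medians['Suburban']:.1f}")
print(f"Rural clients scoring below the Suburban median: {share_below:.0%}")
```

A sentence like “a sizable share of rural clients score below the typical suburban client” reminds readers that a difference in medians does not describe every individual in a group.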

This doesn’t mean we should avoid disaggregating data by demographic characteristics. Disaggregation is essential for identifying disparities and targeting interventions. The point is to visualize those disparities in ways that illuminate rather than obscure the underlying causes.

Resources

Seaborn categorical plots tutorial: https://seaborn.pydata.org/tutorial/categorical.html

Seaborn color palettes: https://seaborn.pydata.org/tutorial/color_palettes.html

Urban Institute guide on racial equity in data visualization: https://urban-institute.medium.com/applying-racial-equity-awareness-in-data-visualization-bd359bf7a7ff

  • January 4, 2026