[Python] Introduction to Data Visualization (matplotlib/seaborn)
If you’ve ever stared at a spreadsheet full of numbers and felt overwhelmed, that’s completely normal. Our brains aren’t built to process raw numbers very well. It has been suggested that people remember significantly more of a piece of information when it is presented visually rather than in text alone.
In this series of posts, we’ll learn how to create data visualizations using Python. More importantly, we’ll learn why certain visualizations work and how to make them accessible to everyone. Because a chart that half your audience can’t read isn’t doing its job.
Why Your Brain Loves (Good) Charts
Before we write any code, let’s talk about what happens in your brain when you look at a data visualization. Understanding this will help you make better design choices later.
Pre-attentive Attributes: What Your Eyes Notice First
When you look at any visual scene, your brain processes certain features almost instantly, within about 200-250 milliseconds, before you even consciously think about what you’re seeing (Healey & Enns, 2012). These are called “pre-attentive attributes,” and they include things like color, size, orientation, and position. This is why you can spot a red dot in a sea of blue dots almost immediately. Your visual system does the work for you.
In data visualization, we can use this to our advantage. Want to draw attention to one particular data point? Make it a different color. Want to show that one category has a much larger value? Make that bar taller. Want to indicate groupings? Use the same color for related items. The key is using these attributes intentionally rather than letting your software’s defaults make these choices for you.
The flip side is that using too many pre-attentive attributes at once creates visual chaos. If everything is trying to grab your attention, nothing succeeds.
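As a small preview of the code introduced later in this post, here is a minimal sketch of the “make it a different color” idea. The scores are made up, and the orange-on-gray color choice is just one reasonable assumption, not a rule.

```python
import matplotlib.pyplot as plt

# Made-up PHQ-9 scores, for illustration only
clients = ['Client A', 'Client B', 'Client C', 'Client D', 'Client E']
phq9_scores = [12, 8, 19, 5, 14]

# Give the highest score a contrasting color; keep everything else neutral
highest = max(phq9_scores)
bar_colors = ['darkorange' if score == highest else 'lightgray'
              for score in phq9_scores]

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(clients, phq9_scores, color=bar_colors)
ax.set_xlabel('Client')
ax.set_ylabel('PHQ-9 Score')
ax.set_title('Client C has the highest PHQ-9 score')
plt.show()
```

Only one attribute (color) changes, and only for one bar, so there is a single thing competing for the viewer’s attention.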
Gestalt Principles: How Your Brain Groups Things

The Gestalt psychologists of the early 1900s discovered something interesting: our brains don’t just see individual visual elements. They automatically organize those elements into groups and patterns. Several principles describe how this happens.
- Proximity means we see objects that are close together as belonging to the same group. This is why spacing matters so much in charts. Put bars close together, and viewers will assume they’re related.
- Similarity means we group objects that look alike. Same color? Same group. Same shape? Same group. This is incredibly useful for categorical data.
- Enclosure means we perceive objects within a boundary as a unit. A simple box around a set of data points tells viewers “these belong together.”
- Connection means we see connected objects as related. This is why line charts work: we perceive the line as representing a continuous phenomenon, not just a collection of separate points.
Why does this matter for you? Because your viewers will apply these principles whether you intend them to or not. If you accidentally put unrelated bars close together, or use the same color for categories that have nothing in common, you’ll confuse your audience. The good news is that understanding these principles lets you design visualizations that work with your viewers’ brains instead of against them.
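Proximity and similarity can be seen at work in a grouped bar chart. The sketch below uses invented program names and scores. The bar width and colors are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up average scores, for illustration only
programs = ['Program A', 'Program B', 'Program C']
intake = [15, 12, 17]
follow_up = [9, 10, 11]

positions = np.arange(len(programs))  # one cluster per program
width = 0.38                          # bars within a cluster touch each other

fig, ax = plt.subplots(figsize=(8, 5))
# Proximity: the two bars for a program sit side by side
# Similarity: every intake bar shares one color, every follow-up bar another
ax.bar(positions - width / 2, intake, width, color='steelblue', label='Intake')
ax.bar(positions + width / 2, follow_up, width, color='darkorange', label='Follow-up')
ax.set_xticks(positions)
ax.set_xticklabels(programs)
ax.set_ylabel('Average score')
ax.legend()
plt.show()
```

The gap between clusters tells viewers where one program ends and the next begins, while the repeated colors tell them which bars to compare across programs.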
Python coding: Matplotlib and Seaborn
Python has become one of the most popular languages for data analysis, and two libraries form the foundation of most Python data visualization: matplotlib and seaborn. Think of them as your pencils and paintbrushes for creating charts.
Matplotlib: The Foundation

Matplotlib was created by John D. Hunter, a neurobiologist who wanted to replicate MATLAB’s plotting capabilities in Python. Since its release in 2003, it has become the most widely used visualization library in Python. If you’ve seen a chart in a scientific paper or research presentation made with Python, there’s a good chance matplotlib created it.
Matplotlib is what we call a “low-level” library. This means it gives you a lot of control over every aspect of your visualization, from the exact position of your axes to the precise shade of each color. The trade-off is that it takes more code to create a polished chart. Here’s a simple example:
import matplotlib.pyplot as plt
# Sample data: PHQ-9 scores for 5 clients
clients = ['Client A', 'Client B', 'Client C', 'Client D', 'Client E']
phq9_scores = [12, 8, 19, 5, 14]
# Create the bar chart
plt.figure(figsize=(8, 5))
plt.bar(clients, phq9_scores, color='steelblue')
plt.xlabel('Client')
plt.ylabel('PHQ-9 Score')
plt.title('PHQ-9 Depression Screening Results')
plt.show()

The import matplotlib.pyplot as plt line is something you’ll see in virtually every Python visualization script. It’s the standard way to access matplotlib’s plotting functions. The plt is just a nickname (an “alias”) that saves you from typing matplotlib.pyplot every time.
One thing worth noting: matplotlib uses a “Figure” and “Axes” structure. The Figure is like your canvas or sheet of paper. The Axes are the actual plotting areas within that canvas (and yes, a single Figure can have multiple Axes for creating dashboards or comparison charts). This can be confusing at first because “Axes” sounds like it should refer to the x-axis and y-axis, but it actually refers to the entire plot area. You’ll get used to it.
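To make the Figure/Axes distinction concrete, here is a minimal sketch that puts two Axes on one Figure using plt.subplots. The scores and the variable names ax_left and ax_right are invented for this example.

```python
import matplotlib.pyplot as plt

# Made-up scores, for illustration only
clients = ['Client A', 'Client B', 'Client C']
intake_scores = [12, 8, 19]
followup_scores = [9, 7, 13]

# One Figure (the canvas) holding two Axes (the plotting areas), side by side
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

ax_left.bar(clients, intake_scores, color='steelblue')
ax_left.set_title('Intake')
ax_left.set_ylabel('PHQ-9 Score')

ax_right.bar(clients, followup_scores, color='steelblue')
ax_right.set_title('Follow-up')

# The overall title belongs to the Figure, not to either Axes
fig.suptitle('PHQ-9 scores at two time points')
plt.show()
```

Each Axes has its own x-axis, y-axis, and title. Passing sharey=True keeps the two y-axes on the same scale so the panels can be compared directly.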
Seaborn: Making Statistical Graphics Easier
Seaborn builds on top of matplotlib and is specifically designed for statistical data visualization. It was created by Michael Waskom while he was a graduate student in neuroscience at Stanford. The library integrates tightly with pandas DataFrames, which makes it perfect for the kind of data analysis social workers typically do.
Here’s what makes seaborn appealing: it provides “high-level” functions that handle a lot of the setup work for you. Want a boxplot comparing groups? One line of code. Want to see the distribution of a variable? One line. Seaborn also comes with better-looking default styles than matplotlib’s bare-bones aesthetics.
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
# Create a sample dataset
data = pd.DataFrame({
    'service_type': ['Case Management', 'Case Management', 'Case Management',
                     'Therapy', 'Therapy', 'Therapy',
                     'Group Counseling', 'Group Counseling', 'Group Counseling'],
    'sessions_attended': [12, 8, 15, 10, 6, 9, 4, 7, 5]
})
# Create a box plot with one line
sns.boxplot(data=data, x='service_type', y='sessions_attended')
plt.title('Session Attendance by Service Type')
plt.show()

Seaborn has six built-in color palette variations of its default colors: “deep,” “muted,” “pastel,” “bright,” “dark,” and “colorblind.” Yes, there’s a palette specifically designed for accessibility, which we’ll talk about more in a moment. You can switch between them with a single command:
sns.set_palette("colorblind")
The relationship between matplotlib and seaborn is sometimes confusing for beginners. Here’s a helpful way to think about it: seaborn is built on top of matplotlib. When you create a seaborn plot, you’re actually creating a matplotlib plot with a lot of the tedious configuration already done. This means you can mix and match. Start with seaborn for convenience, then use matplotlib commands to fine-tune specific details.
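Here is a minimal sketch of that mix-and-match workflow, using a shortened, made-up version of the attendance data. It relies on the fact that sns.boxplot returns the matplotlib Axes it drew on. The label text and the y-axis limit are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up attendance data, for illustration only
data = pd.DataFrame({
    'service_type': ['Case Management'] * 3 + ['Therapy'] * 3,
    'sessions_attended': [12, 8, 15, 10, 6, 9],
})

# seaborn draws the plot and hands back the matplotlib Axes
ax = sns.boxplot(data=data, x='service_type', y='sessions_attended')

# From here on, these are ordinary matplotlib Axes methods
ax.set_xlabel('Service type')
ax.set_ylabel('Sessions attended')
ax.set_title('Session Attendance by Service Type')
ax.set_ylim(0, 20)
plt.show()
```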
Color
Color is one of the most powerful tools in data visualization, but it’s also one of the most frequently misused. Let’s talk about how to use it effectively and, just as importantly, how to make sure everyone can see your colors.
The Problem with Red and Green
Here’s something that surprises many people: about 8% of men and 0.5% of women have some form of color vision deficiency, commonly called “color blindness” (Birch, 2012). The most common types involve difficulty distinguishing between red and green hues. This means that the classic “red for bad, green for good” color scheme that feels so intuitive? For a significant portion of your audience, the two colors are hard or impossible to tell apart.
Think about what this means in practice. If you create a chart showing which clients are “at risk” (red) versus “stable” (green), roughly one in twelve men looking at your chart will see two nearly identical colors. They won’t be able to tell which clients need immediate attention. This isn’t a minor inconvenience. It’s a fundamental failure to communicate.
Color vision deficiency comes in several forms. Protanopia and deuteranopia affect the perception of red and green hues. Tritanopia affects blue and yellow perception (this is much rarer). And a very small number of people see no color at all. When you’re designing for accessibility, the most common red-green deficiencies are your primary concern, but good accessible design often helps viewers with other types of deficiencies too.
Choosing Accessible Colors
So what colors should you use? Here are some principles backed by accessibility research and guidelines:
- Blue and orange make a good pair. These colors remain distinguishable for most people with color vision deficiency. If you need a simple two-color scheme, this is often your safest bet.
- Vary lightness, not just hue. If two colors have similar brightness, they can be hard to tell apart even for people with typical color vision, especially in poor lighting or on low-quality displays. Make sure your colors differ in lightness as well as hue.
- Use ColorBrewer palettes. Cynthia Brewer, a cartographer at Penn State, developed a set of carefully designed color palettes specifically for data visualization (colorbrewer2.org). Many of these palettes have been tested for color-blind accessibility.
- Test your visualizations. You can simulate how your visualizations look to people with different types of color vision deficiency. Run your charts through a simulator such as Coblis before finalizing them: https://www.color-blindness.com/coblis-color-blindness-simulator/
Here’s how to use a colorblind-friendly palette in seaborn:
import seaborn as sns
import matplotlib.pyplot as plt
# Set a colorblind-friendly palette globally
sns.set_palette("colorblind")
# Or use a specific ColorBrewer palette known for accessibility
# The "Set2" palette works well for categorical data
custom_palette = sns.color_palette("Set2")
For sequential data (values that go from low to high), the “viridis” palette is an excellent choice. It was specifically designed to be perceptually uniform and to work for people with color vision deficiency:
# Using viridis for a heatmap
# (data_matrix stands for any 2D array or DataFrame of numbers you want to plot)
sns.heatmap(data_matrix, cmap="viridis")
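For a version of the heatmap you can run as-is, here is a minimal sketch. The data_matrix values, program names, and months are invented for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up counts of sessions per program and month, for illustration only
data_matrix = pd.DataFrame(
    np.array([[4, 7, 9, 12],
              [2, 5, 8, 10],
              [1, 3, 4, 6]]),
    index=['Program A', 'Program B', 'Program C'],
    columns=['Jan', 'Feb', 'Mar', 'Apr'],
)

# annot=True prints each value in its cell, so color is not the only cue
ax = sns.heatmap(data_matrix, cmap='viridis', annot=True)
ax.set_title('Sessions per month')
plt.show()
```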
Contrast Ratios: The WCAG Standards
The Web Content Accessibility Guidelines (WCAG), developed by the World Wide Web Consortium, provide specific standards for color contrast. While these were developed for web content, they are a sensible benchmark for data visualizations as well.
For normal-size text, WCAG recommends a contrast ratio of at least 4.5:1 against the background (Level AA). For graphical elements (like the bars in a bar chart or the points in a scatter plot), a ratio of at least 3:1 against adjacent colors is recommended. What do these numbers mean? A contrast ratio of 1:1 means no contrast at all (the same color). A ratio of 21:1 is the maximum (black on white or white on black).
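If you are curious where those ratios come from, here is a rough sketch of the calculation based on the WCAG 2.x definitions of relative luminance and contrast ratio. The function names are my own, and the code only handles six-digit hex colors. For real accessibility checks, an established tool is the safer choice.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color such as '#4682b4' (WCAG 2.x definition)."""
    hex_color = hex_color.lstrip('#')
    # Split 'rrggbb' into three channels scaled to the 0-1 range
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding for each channel
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lighter, darker = sorted(
        [relative_luminance(color_a), relative_luminance(color_b)], reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


# steelblue (#4682b4) bars on a white background
ratio = contrast_ratio('#4682b4', '#ffffff')
print(f'{ratio:.2f}:1')
```

With these definitions, steelblue on white lands between the two thresholds: enough for bars (3:1), but short of the 4.5:1 recommended for normal-size text.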
You can check contrast ratios using tools like WebAIM’s Contrast Checker (webaim.org/resources/contrastchecker/). Enter your foreground and background colors, and it will tell you if they meet accessibility standards.
Here’s a practical tip: if you’re ever unsure whether your colors have enough contrast, convert your visualization to grayscale and see if you can still tell the different elements apart. If everything turns into a similar shade of gray, you need more contrast.
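A quick way to approximate the grayscale test for a handful of colors, without exporting the whole chart, is to compute a gray value for each one. This sketch uses the common Rec. 601 luma weights as an approximation. The gray_level helper is made up for this example, and it is not a substitute for looking at the actual grayscale image.

```python
from matplotlib.colors import to_rgb


def gray_level(color):
    """Approximate gray value (0 = black, 1 = white) using Rec. 601 luma weights."""
    r, g, b = to_rgb(color)
    return 0.299 * r + 0.587 * g + 0.114 * b


# Colors whose gray values are close together will be hard to tell apart
for name in ['blue', 'orange', 'green']:
    print(f'{name}: {gray_level(name):.2f}')
```

If two colors print nearly the same number, consider swapping one of them for a lighter or darker shade.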
Use Line Style
Even with perfect color choices, you shouldn’t rely on color as your only way of conveying information. What if someone prints your chart in black and white? What if they’re viewing it on a screen with poor color reproduction?
Supplement color with other visual cues. Add patterns or textures to bars. Use different line styles (solid, dashed, dotted) in addition to different colors. Include direct labels on your chart elements instead of making viewers match colors to a legend. These redundant cues ensure your visualization communicates even when color fails.
# Example: Using different line styles in addition to colors
# (x, y1, y2, and y3 stand for your own data)
plt.plot(x, y1, color='blue', linestyle='-', label='Group A')
plt.plot(x, y2, color='orange', linestyle='--', label='Group B')
plt.plot(x, y3, color='green', linestyle=':', label='Group C')
plt.legend()  # without this call, the labels above are never displayed
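The same idea carries over to bar charts. Here is a minimal sketch that adds a texture (matplotlib calls it a “hatch”) and a direct value label to each bar. The counts are made up, the hatch patterns are arbitrary picks, and ax.bar_label assumes matplotlib 3.4 or newer.

```python
import matplotlib.pyplot as plt

# Made-up counts, for illustration only
services = ['Case Management', 'Therapy', 'Group Counseling']
clients_served = [42, 31, 18]
colors = ['steelblue', 'darkorange', 'gray']
hatches = ['//', '..', 'xx']  # a different texture for each bar

fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.bar(services, clients_served, color=colors, edgecolor='black')

# Add a texture to each bar so the bars differ even without color
for bar, hatch in zip(bars, hatches):
    bar.set_hatch(hatch)

# Direct labels: write each value above its bar
labels = ax.bar_label(bars)
ax.set_ylabel('Clients served')
plt.show()
```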
Resources
If you want to dig deeper into any of the topics we covered, here are some good starting points:
For matplotlib:
- Official tutorial: https://matplotlib.org/stable/tutorials/index.html
- The pyplot tutorial is a good beginner starting point
For seaborn:
- Official tutorial: https://seaborn.pydata.org/tutorial.html
- The introduction page gives a nice overview of seaborn’s philosophy
For color and accessibility:
- ColorBrewer: https://colorbrewer2.org (select “colorblind safe” to filter palettes)
- WebAIM Contrast Checker: https://webaim.org/resources/contrastchecker/
- Color Oracle (simulator): https://colororacle.org
- Seaborn color palettes documentation: https://seaborn.pydata.org/tutorial/color_palettes.html
