Better Data Visuals With Python: Understanding Planar and Retinal Variables
A how-to on using color, size, and shape to represent your data more clearly.
Planar variables refer to the classic x and y axes used in many charts and graphs to compare two variables through dots, lines, or bars.

This is our chart’s two-dimensional space: one variable on the horizontal axis and another on the vertical axis, forming the basis of many familiar plots.
However, when dealing with more than two variables, we may need to find something “stronger” to represent our data.
This is where it is beneficial to dig into the usefulness of retinal variables, a concept introduced by Jacques Bertin (from Semiology of Graphics, 1967).
Retinal variables, such as color, shape, and size, enable the encoding of additional data dimensions, leveraging the human ability to easily distinguish between these visual properties.

For example, a common technique is using color shading to represent ordered values (increasing or decreasing) for quantitative data.
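To make the idea of color shading for ordered values concrete, here is a minimal sketch that maps a number onto a light-to-dark color ramp by linear interpolation. The helper names `lerp_color` and `shade_for`, the two endpoint colors, and the sample scores are all assumptions made up for illustration; plotting libraries ship their own, more refined colormaps.

```python
def lerp_color(low_rgb, high_rgb, t):
    """Linearly blend two RGB colors; t=0 gives low_rgb, t=1 gives high_rgb."""
    return tuple(round(lo + (hi - lo) * t) for lo, hi in zip(low_rgb, high_rgb))


def shade_for(value, vmin, vmax, low_rgb=(255, 255, 204), high_rgb=(8, 29, 88)):
    """Map a value in [vmin, vmax] to a hex color on a light-to-dark ramp."""
    if vmax == vmin:
        t = 0.0  # avoid dividing by zero when every value is identical
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the range
    r, g, b = lerp_color(low_rgb, high_rgb, t)
    return f"#{r:02x}{g:02x}{b:02x}"


# Hypothetical scores, not taken from any real dataset
scores = [3.0, 5.5, 8.0]
shades = [shade_for(s, vmin=3.0, vmax=8.0) for s in scores]
print(shades)
```

The lowest score gets the lightest shade and the highest score gets the darkest, so the order of the values is readable from the color alone.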
With our newly found knowledge of planar and retinal variables, let’s find a dataset and apply these 2 principles in practice.
Exploration and Representation — A Simple Dataset
Let’s use a simple dataset from the World Happiness Report, which you can download from my GitHub.
The dataset provides yearly happiness scores for various countries from 2008 to 2023. The higher the number (based on a selection of metrics), the happier the country for that year.
Here is a snapshot of the first 10 rows in the dataset:
Each row in the dataset contains a country’s name, its happiness score for a given year, and contributing factors like GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, and perceptions of corruption.
This gives us plenty of dimensions to play with!
And for our examples, let’s focus on the use of both planar and retinal variables to encode information from this data.
We’ll explore several visualizations that compare happiness scores between countries and illustrate changes in happiness over time.
Bar Chart of Happiest Countries (Planar Only)
If we want to find the top 10 happiest countries for a particular year, a simple bar chart is an effective start. For example, here’s a horizontal bar chart of the 10 happiest countries in 2024 (based on their happiness scores):
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the data and give the key columns shorter names
df = pd.read_csv('WHR2024.csv')
df = df.rename(columns={'Country name': 'Country', 'Ladder score': 'Happiness Score'})
# Keep only the 2024 rows, then take the 10 highest scores
df2024 = df[df['Year'] == 2024][['Country', 'Happiness Score']].dropna()
top10 = df2024.sort_values('Happiness Score', ascending=False).head(10)
# Draw a horizontal bar chart: score on the X-axis, country on the Y-axis
plt.figure(figsize=(10, 8))
sns.barplot(data=top10, x='Happiness Score', y='Country', color='blue')
plt.title('Top 10 Happiest Countries in 2024')
plt.xlabel('Happiness Score')
plt.ylabel('Country')
plt.tight_layout()
plt.show()
The code uses sns.barplot() to draw a horizontal bar chart of the top 10 countries by happiness score in 2024. The X-axis shows scores and the Y-axis lists countries, sorted in descending order.
When we run the code, our resulting data visual looks like so:
This chart is a classic planar representation of our dataset — the country names are arranged along the Y-axis and their happiness scores extend along the X-axis as bars.
Now this works great for showing the highest scores, but what about the 10 least happy countries? We’d need to create another bar chart or flip the sorting. Bar charts are clear for ranking, but they only show one snapshot (one year) and one metric at a time.
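As a rough sketch of what "flipping the sorting" could look like, the snippet below picks the lowest-scoring rows with pandas' `nsmallest()`. The `bottom_n` helper, the placeholder country names, and the scores are assumptions invented for this example; they are not values from the World Happiness Report.

```python
import pandas as pd

# Made-up scores for illustration only (not real World Happiness Report values)
sample = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "E"],
    "Happiness Score": [7.1, 3.2, 5.4, 2.8, 6.0],
})


def bottom_n(frame, n, score_col="Happiness Score"):
    """Return the n rows with the lowest score, lowest first."""
    return frame.nsmallest(n, score_col)


least_happy = bottom_n(sample, 3)
print(least_happy)
```

The resulting frame could be passed to sns.barplot() in the same way as the top 10, which still means drawing a second chart.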
And what about how a particular country is trending over time? Is a country becoming more or less happy year to year? How does this compare to its neighbors or countries in the same region?
A bar chart can’t directly show trends over multiple years for many countries.
But a line chart can!
Line Chart of Happiness Trends Over Time (Planar)
To visualize changes over time, we can create a time-series line chart.
For example, we can plot the happiness scores of 5 selected countries from 2011 to 2024:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the data and give the key columns shorter names
df = pd.read_csv('WHR2024.csv')
df = df.rename(columns={'Country name': 'Country', 'Ladder score': 'Happiness Score'})
# Keep only the rows for the countries we want to compare
selected_countries = ['Finland', 'Denmark', 'Sweden', 'Canada', 'Japan']
df_line = df[df['Country'].isin(selected_countries)][['Year', 'Country', 'Happiness Score']]
# One line per country, coloured by the 'Country' column
plt.figure(figsize=(10, 6))
sns.lineplot(data=df_line, x='Year', y='Happiness Score', hue='Country', marker='o')
plt.title('Happiness Score Trends (2011–2024)')
plt.xlabel('Year')
plt.ylabel('Happiness Score')
plt.grid(True)
plt.tight_layout()
plt.show()
The sns.lineplot() function plots happiness scores over time for selected countries, using markers to show data points. It draws one line per country to reveal trends from 2011 to 2024 like so:
In this line chart, each country is represented by a line on the X-Y planar grid (year on the X-axis, happiness score on the Y-axis). We can distinguish the countries by using different coloured lines (colour here is a retinal variable encoding the country).
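Seaborn picks the line colours automatically when we pass hue='Country', but the underlying idea is just a lookup from category to colour. The sketch below is one possible way to build such a lookup with evenly spaced hues; the `hue_palette` helper and the fixed saturation and brightness values are assumptions for illustration, not what seaborn does internally.

```python
import colorsys


def hue_palette(categories):
    """Assign each category an evenly spaced hue, returned as hex colors."""
    n = len(categories)
    palette = {}
    for i, name in enumerate(categories):
        # Spread hues around the color wheel; keep saturation and brightness fixed
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.65, 0.85)
        palette[name] = f"#{round(r * 255):02x}{round(g * 255):02x}{round(b * 255):02x}"
    return palette


countries = ['Finland', 'Denmark', 'Sweden', 'Canada', 'Japan']
palette = hue_palette(countries)
for country, color in palette.items():
    print(country, color)
```

Because hue carries no natural order, it suits categories such as country names. A dictionary like this can also be handed to seaborn through the palette argument if we want to control the colours ourselves.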
A line chart is great for comparing trends, but it can only hold a limited number of countries before it becomes too cluttered.
We picked 5 countries for clarity — if we tried to plot all 150+ countries, it would be an unreadable spaghetti of lines! So, can we create a better representation of our data for a specific year and include all countries?
Yes! A map can leverage spatial positioning as well as color to encode information.
Heatmap by Region (Planar and Retinal)
So far, our visualizations either looked at one year (bar chart) or a handful of countries over time (line chart).
But what if we want to visualize broader trends over time for a bunch of countries?
A heatmap is a good choice here as it uses two planar variables as a grid (one along each axis) and color intensity as a retinal variable to represent values in each cell.
In our happiness data, a heatmap can help show how average scores change over time across regions:
# Calculate average happiness score by region for each year
avg_by_region = df.groupby(['Year', 'Region'])['Happiness Score'].mean().reset_index()
# Pivot the data to have regions as rows and years as columns
pivot = avg_by_region.pivot(index='Region', columns='Year', values='Happiness Score')
# Plot the heatmap
plt.figure(figsize=(10,6))
sns.heatmap(pivot, cmap="YlGnBu", annot=True, fmt=".2f")
plt.title('Average Happiness Score by Region (2015–2022)')
plt.xlabel('Year')
plt.ylabel('Region')
plt.xticks(rotation=45, ha='center')
plt.tight_layout()
plt.show()
In the above code, the data is grouped by Year and Region, and the scores in each group are averaged.
The avg_by_region.pivot() call reshapes the result into a table where each row is a region, each column is a year, and each cell holds that region’s average score for that year.
We then use sns.heatmap() to draw the color-coded grid, using the "YlGnBu" colormap (yellow-green-blue).
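To see what the pivot step does on its own, here is a tiny sketch with a four-row table. The region names "North" and "South" and all of the scores are made up for illustration; only the column names follow the article's code.

```python
import pandas as pd

# Made-up values for illustration only (not real World Happiness Report data)
long_df = pd.DataFrame({
    "Year": [2022, 2022, 2023, 2023],
    "Region": ["North", "South", "North", "South"],
    "Happiness Score": [6.0, 5.0, 6.5, 4.5],
})

# Reshape from one row per (year, region) into a region-by-year grid
grid = long_df.pivot(index="Region", columns="Year", values="Happiness Score")
print(grid)
```

Each row of `grid` is a region and each column is a year, which is the layout sns.heatmap() expects: one cell per row and column pair.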
Cool! Now what does this heatmap tell us?
We can scan across a row to see how a region’s average happiness changed over time (looking for color changes left to right).
We can compare vertically to see differences between regions in the same year.
In this example, the lighter colours show lower values and the darker blues indicate higher values.
The color intensity is a retinal variable here encoding the happiness score magnitude. The planar variables are the discrete positions on the grid (year and region).
Heatmaps are excellent for seeing broad trends and outliers in a tabular dataset. In this case, it shows how average happiness changes by region over time.
Awesome.
Choropleth Map of Happiness (Planar + Retinal)
A choropleth map colors each country based on its happiness score, using geographic position as a planar variable and color intensity as a retinal variable.
Creating a choropleth map in Python is waaaay easier than you think. The code:
import pandas as pd
import plotly.express as px
df = pd.read_csv('WHR2024.csv')
df = df.rename(columns={'Country name': 'Country', 'Ladder score': 'Happiness Score'})
df2024 = df[df['Year'] == 2024][['Country', 'Happiness Score']].dropna()
fig = px.choropleth(
    df2024,
    locations='Country',
    locationmode='country names',
    color='Happiness Score',
    color_continuous_scale='Viridis',
    title='Global Happiness Scores (2024)'
)
fig.show()
The code uses plotly.express.choropleth() to create a world map where each country is shaded by its 2024 happiness score. Our beautiful results:
Yes, it really is that easy (these results are from a screenshot of my Jupyter Notebook).
We can see the world happiness values mapped geographically. Here the planar variables are latitude and longitude, and the retinal variables are a-plenty: color shading encodes the happiness score, while each country’s size and shape help us recognize it on the map.
We’ve covered all 3 aspects of retinal variables here. Nice work!
With the Viridis scale used in the code, brighter colours (green through yellow) represent higher happiness scores, and darker colours (deep purple and blue) represent lower scores.
A choropleth offers a great combination of the two main variable types, allowing our data to be displayed across all countries from a single visualization.
Terrific.
In Summary…
The end game here is to make our data tell a deeper story, without making a mess.
The key is to choose the right tool for the job, and then fine-tune it using planar and retinal variables to highlight the story in the data. We can use:
A bar chart (planar) for ranking values at one point in time.
A line chart (planar with color) to show trends over time.
A heatmap (planar and retinal) to summarize large amounts of data to spot overall patterns.
A choropleth map (planar and retinal) to add geographic context.
We used planar variables (positions on axes or maps) when comparing quantitative values or categories directly.
We layered retinal variables like color hue, color intensity, and size to encode additional dimensions without cluttering the chart.
And lastly, three simple takeaway rules for visualizing our data:
Keep it straightforward
Keep it accurate
Aim for clarity
Good visual encoding makes important information pop out at you!