Leveraging Alluvial Charts To Show Hidden Patterns in World Human Rights Data

A how-to example on using Python and Plotly to visualize global human rights data

Jun 11, 2025

When you work with complex datasets, you often face a challenge: dense, multivariate data that is difficult to interpret at a glance.

Bar charts and scatter plots can help show individual metrics, but what if you want to trace how the data evolves through stages over time?

This is where alluvial plots shine.

Let me show you what an alluvial plot is — and how to use Python and Plotly to build two interactive alluvial-style plots from the global human rights index (HRI) dataset.

The HRI Dataset

This week, I found a terrific global human rights dataset from the CIRIGHTS Human Rights project (link HERE).

This dataset contains extensive research (25 indicators) on how each country supports basic human rights, including:

Physical integrity
Workers’ rights
Empowerment and freedom (for example, free association and speech)
Justice (full list HERE)

On their site, the CIRIGHTS project displays the aggregate index (on a score from 0 to 100) on a global choropleth map.

The data shows how countries move through stages of development in their legal and civil protections.

For this exercise, let’s use two key metrics:

Freedom of association (assn field)
Freedom of speech (speech field)

NOTE: A cleaned version of the HRI dataset is on my GitHub: HERE

We’ll connect each of these metrics with their overall human rights score, categorized from “Poor” to “Excellent.” With an alluvial chart, we can map each country using Year, Metric, and HR Score, like so:

Year → Metric → HR Score

Each country flows from a given year, to a rights category, to a final HR outcome score.
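Here is a minimal sketch of that idea using a handful of invented records (not taken from the HRI data). Each Year → Metric → HR Score path breaks into two links, one per pair of adjacent stages, and identical links are counted to give the width of each band. Note that the two stages are counted independently, so the chart shows aggregate flows rather than one country's full path.

```python
from collections import Counter

# Hypothetical country-year records: (year, metric category, HR score category).
records = [
    ("2020", "High", "Good"),
    ("2020", "High", "Good"),
    ("2020", "Low", "Poor"),
    ("2021", "High", "Excellent"),
    ("2021", "Low", "Poor"),
]

# Each record contributes one link per pair of adjacent stages.
year_to_metric = Counter((year, metric) for year, metric, _ in records)
metric_to_hr = Counter((metric, hr) for _, metric, hr in records)

for (source, target), count in sorted(year_to_metric.items()):
    print(f"{source} -> {target}: {count}")
for (source, target), count in sorted(metric_to_hr.items()):
    print(f"{source} -> {target}: {count}")
```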

This allows us to ask:

Does freedom of association result in better HR scores?
Does freedom of speech correlate with better civil rights outcomes?

From this, we can ask further questions, such as whether there are bottlenecks or weak links in the rights pipeline.

So Why Use Alluvial Plots?

Alluvial plots (a close relative of Sankey diagrams, which is the chart type Plotly provides for building them) are designed to show flows between categories across multiple stages.

They’re excellent for understanding processes where entities change group membership over time or across criteria.

So let’s do this!

Chart 1: Freedom of Association ➔ HR Outcomes

We begin with civil society rights. The assn score reflects freedom of association — the right to join organizations, unions, and civil groups.

For this chart, we categorize each country’s freedom of association into three main “bin” categories:

Low (0–0.5)
Medium (0.5–1.5)
High (1.5–2)

Note: vertical node position is based on link volume, not conceptual hierarchy — so the visual top-to-bottom order may not reflect actual category rankings.

We then categorized each country’s HR outcome into four bins — Poor (≤30), Fair (31–60), Good (61–90), and Excellent (91–120) — based on the hrscorex2 variable, which is a composite human rights score used throughout the dataset.
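To see how these cut-offs behave, here is a small sketch with made-up scores (the values are illustrative, not from the dataset). By default, pd.cut treats each bin as open on the left and closed on the right, so a score of exactly 30 lands in "Poor", and anything beyond the outer edges comes back as missing.

```python
import pandas as pd

# Made-up scores for illustration only.
assn = pd.Series([0, 1, 2])
hr = pd.Series([0, 30, 31, 90, 91, 120, 125])

assn_bins = pd.cut(assn, [-1, 0.5, 1.5, 2], labels=["Low", "Medium", "High"])
hr_bins = pd.cut(hr, [-1, 30, 60, 90, 120], labels=["Poor", "Fair", "Good", "Excellent"])

print(assn_bins.tolist())
print(hr_bins.tolist())      # the last value (125) falls outside every bin
print(hr_bins.isna().sum())  # number of scores that did not fit any bin
```

If the real data has values outside the outer edges, it is worth checking the missing count before building links, since a missing bin has no matching label in the chart.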

Now let’s put together our alluvial chart code using the Python libraries Pandas (to organize our data) and Plotly (to display our chart):

Python Code:

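import pandas as pd
import plotly.graph_objects as go

# df is assumed to already hold the cleaned HRI dataset (for example, loaded with pd.read_csv)
# Use actual field names
fa_df = df[['year', 'assn', 'hrscorex2']].dropna()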

# Bin values
fa_bins = pd.cut(fa_df['assn'], [-1, 0.5, 1.5, 2], labels=["Low", "Medium", "High"])
hrscore_bins = pd.cut(fa_df['hrscorex2'], [-1, 30, 60, 90, 120], labels=["Poor", "Fair", "Good", "Excellent"])
# Prepare plot data
df_plot = pd.DataFrame({
    'Year': fa_df['year'].astype(str),
    'FA_Score': fa_bins.astype(str),
    'HRScore': hrscore_bins.astype(str)
})
# Label ordering
ordered_years = sorted(df_plot['Year'].unique())
ordered_fa = ["Low", "Medium", "High"]
ordered_hr = ["Poor", "Fair", "Good", "Excellent"]
labels = ordered_years + ordered_fa + ordered_hr
label_map = {label: i for i, label in enumerate(labels)}
# Link generator
def get_links(df, source_col, target_col, label_map):
    grouped = df.groupby([source_col, target_col]).size().reset_index(name='value')
    return {
        'source': [label_map[src] for src in grouped[source_col]],
        'target': [label_map[tgt] for tgt in grouped[target_col]],
        'value': grouped['value'].tolist()
    }
# Create link data
link1 = get_links(df_plot, 'Year', 'FA_Score', label_map)
link2 = get_links(df_plot, 'FA_Score', 'HRScore', label_map)
# Build Sankey diagram
fig = go.Figure(data=[go.Sankey(
    node=dict(
        label=labels,
        pad=15,
        thickness=20,
        x=[0]*len(ordered_years) + [0.5]*len(ordered_fa) + [1]*len(ordered_hr)
    ),
    link=dict(
        source=link1['source'] + link2['source'],
        target=link1['target'] + link2['target'],
        value=link1['value'] + link2['value']
    )
)])
fig.update_layout(title_text="Alluvial Plot: Year ➝ Freedom of Association ➝ HR Score")
fig.show()

What this code does:

Selects the year, freedom of association, and HR score columns and drops incomplete rows
Bins assn into 3 ranked categories and maps them to HR outcomes
Uses fixed x values to enforce stage layout: Year ➝ FA ➝ HR Score
Builds an alluvial-style Sankey plot using Plotly
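One quick sanity check for link data like this is to confirm that every stage carries the same total flow as there are rows. The sketch below repeats the group-and-count step on a tiny invented frame; the column names mirror the ones above, but the rows and the count_links helper name are hypothetical.

```python
import pandas as pd

# Hypothetical plot data with the same three columns used above.
toy = pd.DataFrame({
    "Year": ["2020", "2020", "2021", "2021"],
    "FA_Score": ["High", "Low", "High", "High"],
    "HRScore": ["Good", "Poor", "Good", "Excellent"],
})

def count_links(df, source_col, target_col):
    # One row per (source, target) pair, with the number of records in it.
    return df.groupby([source_col, target_col]).size().reset_index(name="value")

stage1 = count_links(toy, "Year", "FA_Score")
stage2 = count_links(toy, "FA_Score", "HRScore")

print(stage1)
print(stage2)
# Both stage totals should equal the number of rows.
print(stage1["value"].sum(), stage2["value"].sum(), len(toy))
```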

And our beautiful chart result:

So what are we actually looking at here?

Year is the first stage of each country’s flow, exposing temporal patterns
Countries with “High” freedom of association were more likely to achieve “Good” or “Excellent” HR scores
Countries with “Low” scores rarely progressed to high HR outcomes

For comparison and contrast, let’s look at a second metric: freedom of speech.

Chart 2: Freedom of Speech ➔ HR Outcomes

This time let’s use the metric speech, which measures freedom of expression and censorship resistance. It’s a core civil liberty that's often suppressed in authoritarian contexts.

We grouped speech into the same categories as in our previous chart (Low, Medium, High).

Note: Plotly again arranges nodes vertically by flow size, not logical hierarchy — so don’t expect “Low” to always appear above “Medium” or “High.”

And as with the first example, we can organize the outcomes into the same four bins (Poor, Fair, Good, Excellent).

Our Python Code:

# Filter data for speech freedom
speech_df = df[['year', 'speech', 'hrscorex2']].dropna()

# Bin values
speech_bins = pd.cut(speech_df['speech'], [-1, 0.5, 1.5, 2], labels=["Low", "Medium", "High"])
hr_bins = pd.cut(speech_df['hrscorex2'], [-1, 30, 60, 90, 120], labels=["Poor", "Fair", "Good", "Excellent"])
# Build plot data
df_plot3 = pd.DataFrame({
    'Year': speech_df['year'].astype(str),
    'SpeechFreedom': speech_bins.astype(str),
    'HRScore': hr_bins.astype(str)
})
# Ordered label mapping
ordered_years = sorted(df_plot3['Year'].unique())
ordered_speech = ["Low", "Medium", "High"]
ordered_hr = ["Poor", "Fair", "Good", "Excellent"]
labels = ordered_years + ordered_speech + ordered_hr
label_map = {label: i for i, label in enumerate(labels)}
# Link builder
def get_links(df, source_col, target_col, label_map):
    grouped = df.groupby([source_col, target_col]).size().reset_index(name='value')
    return {
        'source': [label_map[src] for src in grouped[source_col]],
        'target': [label_map[tgt] for tgt in grouped[target_col]],
        'value': grouped['value'].tolist()
    }
link1 = get_links(df_plot3, 'Year', 'SpeechFreedom', label_map)
link2 = get_links(df_plot3, 'SpeechFreedom', 'HRScore', label_map)
fig = go.Figure(data=[go.Sankey(
    node=dict(
        label=labels,
        pad=15,
        thickness=20,
        x=[0]*len(ordered_years) + [0.5]*len(ordered_speech) + [1]*len(ordered_hr)
    ),
    link=dict(
        source=link1['source'] + link2['source'],
        target=link1['target'] + link2['target'],
        value=link1['value'] + link2['value']
    )
)])
fig.update_layout(title_text="Alluvial Plot: Year ➝ Freedom of Speech ➝ HR Score")
fig.show()

What this code does:

Extracts speech and HR score from the dataset
Bins each into interpretable categories
Defines stage order and maps categorical flows across time ➝ speech rating ➝ outcome
Renders the alluvial-style Sankey chart

Our beautiful chart result:

What do we see in this chart example?

Strong freedom of expression correlates with better overall human rights outcomes
Many countries with “Low” freedom of speech stay clustered in “Poor” or “Fair” HR categories
This metric shows a different dimension of rights, tied more to political and media conditions than labor or association

To the untrained eye, these charts look very similar.

Let’s take a bit of a closer look at the numbers.

So Which Rights Metric Best Predicts HR Outcomes?

We compared how each rights category correlates with human rights scores, based on actual proportions of countries falling into each HR score level.
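Proportions like these can be read off a row-normalized cross-tabulation. The sketch below uses invented categories (not the real HRI values) to show the calculation; metric and outcome are placeholder column names.

```python
import pandas as pd

# Invented rows for illustration only.
toy = pd.DataFrame({
    "metric": ["High", "High", "High", "High", "Low", "Low"],
    "outcome": ["Good", "Good", "Good", "Fair", "Poor", "Fair"],
})

# normalize="index" makes each row sum to 1; multiply by 100 for percentages.
shares = pd.crosstab(toy["metric"], toy["outcome"], normalize="index") * 100
print(shares.round(1))
```

Each row then answers the question "of the records in this rights category, what share ends up in each HR score level?"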

Freedom of Association ➝ HR Score Distribution

Freedom of Speech ➝ HR Score Distribution

What does this mean?

Freedom of Speech shows the stronger association with high HR scores: 75.1% of countries with “High” speech freedom fall into the “Good” HR category, and 0% remain in “Poor.”
Freedom of Association follows, with 66.5% in “Good” and 0.4% in “Poor.”

In short, when it comes to predicting strong human rights outcomes, freedom of speech is the more telling of the two signals we compared in this dataset.

With these two examples to compare, you can repeat the process with any other metric in this dataset to see which one best predicts overall human rights outcomes.

So choose another metric — and give it a go!
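To make that easier, here is one possible helper that wraps the steps from both charts so the metric column becomes a parameter. Treat it as a sketch under assumptions: the name build_sankey_inputs is my own, it assumes the metric uses the same 0 to 2 scale as assn and speech, and the demo frame is invented. Other indicators in the dataset may need different bin edges.

```python
import pandas as pd

METRIC_LABELS = ["Low", "Medium", "High"]
HR_LABELS = ["Poor", "Fair", "Good", "Excellent"]

def build_sankey_inputs(df, metric_col, score_col="hrscorex2"):
    """Return (labels, source, target, value) for Year -> metric -> HR score."""
    data = df[["year", metric_col, score_col]].dropna()
    plot = pd.DataFrame({
        "Year": data["year"].astype(str),
        "Metric": pd.cut(data[metric_col], [-1, 0.5, 1.5, 2], labels=METRIC_LABELS),
        "HRScore": pd.cut(data[score_col], [-1, 30, 60, 90, 120], labels=HR_LABELS),
    }).dropna().astype(str)  # drop rows that fell outside the bin edges

    labels = sorted(plot["Year"].unique()) + METRIC_LABELS + HR_LABELS
    index = {label: i for i, label in enumerate(labels)}

    source, target, value = [], [], []
    for src_col, tgt_col in [("Year", "Metric"), ("Metric", "HRScore")]:
        counts = plot.groupby([src_col, tgt_col]).size()
        for (src, tgt), n in counts.items():
            source.append(index[src])
            target.append(index[tgt])
            value.append(int(n))
    return labels, source, target, value

# Invented demo rows; swap in the real dataframe and any metric column.
demo = pd.DataFrame({
    "year": [2020, 2020, 2021],
    "speech": [2, 0, 2],
    "hrscorex2": [75, 20, 95],
})
labels, source, target, value = build_sankey_inputs(demo, "speech")
print(labels)
print(list(zip(source, target, value)))
```

The returned lists can be passed to go.Sankey as the node labels and the link source, target, and value, in the same way as in the two charts above.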

In Summary…

Alluvial plots are an excellent way to see how entities move between categories from one stage to the next.

These charts reveal structural barriers, promising transitions, and mismatches. For example, strong labor protections without legal accountability might still lead to weak human rights enforcement.

When you need to track change over time, across multiple institutional domains, and still keep the visualization readable, an alluvial plot is a great choice.

Data at Depth
