Modular Streamlit Coding: A Multi-Page Tutorial Showcasing UNHCR Refugee Data
3-page interactive Python application using a modular approach
Streamlit is an open-source app framework that allows data scientists and analysts to create interactive web applications with ease.
Back by popular demand, this article explores the process of creating a multi-page Streamlit application - but this time in a more modular fashion, starting with a core shell of pages.
Using just a few lines of Python, you can turn data scripts into shareable web apps.
Let me step you through how to use Streamlit to create a multi-page interactive application, on this occasion using the UN High Commissioner for Refugees (UNHCR) data set that tracks global refugees by country of origin and country of asylum.
The application will have three pages of data visualizations:
an overview of countries of asylum (top 10 countries for asylum)
country-specific analysis for asylum countries
a global view of countries of asylum using a choropleth map
Let’s build this all from the ground up, starting with a Streamlit shell application and our data set.
Creating A Multipage Streamlit Application Shell
To create a modular multi-page Streamlit application, we will first set up the basic structure of the application.
This involves creating separate functions for each page and a main function to manage navigation between pages.
Here is the code for setting up the 3-page Streamlit application with each page as a shell:
import streamlit as st
# Page 1: Overview
def page_overview():
    st.subheader("Global Asylum Decisions by Year Range (Choose Range)")
    # Visualization code will go here
# Page 2: Country-Specific Analysis with Grouped Bar Chart
def page_country_analysis():
    st.subheader("Country-Specific Analysis")
    # Visualization code will go here
# Page 3: Choropleth Mapping
def page_choropleth():
    st.subheader("Global Distribution of Asylum Decisions")
    # Visualization code will go here
# Main app with navigation
def main():
    st.set_page_config(page_title="Asylum Decisions Dashboard", layout="wide", initial_sidebar_state="expanded")
    st.sidebar.title("Navigation")
    menu_options = ["Global Asylum Decisions", "Country Analysis", "Global Mapping"]
    menu_choice = st.sidebar.selectbox("Go to", menu_options)
    if menu_choice == "Global Asylum Decisions":
        page_overview()
    elif menu_choice == "Country Analysis":
        page_country_analysis()
    elif menu_choice == "Global Mapping":
        page_choropleth()
if __name__ == "__main__":
    main()
The code explained:
Import the Streamlit library.
Define separate functions for each page: page_overview(), page_country_analysis(), and page_choropleth(). Each function contains a subheader to indicate the page's purpose. The data visualization code will be added later.
Define the main() function, which sets the page configuration and handles navigation.
Use st.sidebar.selectbox to create a sidebar menu for navigating between the three pages.
Based on the user's selection, the corresponding page function is called to display the page content.
The main() function is called when the script is executed.
This shell sets up the basic structure of a multi-page Streamlit application. In the following sections, we will add the data visualization code to each page function.
To run this code, we need access to a terminal prompt. I use the built-in terminal in PyCharm, and launch the app with the streamlit run command followed by the name of the script file.
The application opens in your default browser. Here is our first look at the shell application:
Perfect! Now we can start adding to this shell by creating data visualizations for each of our pages.
Accessing And Downloading our Data Set
The UN High Commissioner for Refugees (UNHCR) tracks statistics on refugee movements across the globe.
Their data is freely accessible HERE.
After clicking the link to get to the download page, we can be granular on the data that we select:
For this project, let’s retrieve the country of origin for each refugee and the country of asylum.
With this data, we have refugee data showing movement in two directions:
by country of origin: where asylum seekers from that country are going
by country of asylum: where asylum seekers in that country are coming from
Once we download the dataset, we can open it up in spreadsheet format to see what we are dealing with:
The data fields that we are interested in for this project are:
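Country of origin: where a person seeking asylum is coming from
Country of asylum: where a person is actually seeking asylum
Recognized decisions: the number of asylum applications that were accepted (numeric total by country)
Rejected decisions: the number of asylum applications that were denied (numeric total by country)
The code in this article also relies on the Year and Total decisions columns from the same file.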
Both the country of origin and asylum have a 3-letter ISO code that can be used for reliable unique identification.
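Before building any charts, it can help to confirm that the downloaded file really contains the columns we plan to use. The following is a minimal sketch, not part of the original application: the missing_columns() helper and the tiny in-memory CSV are made up for illustration, and the column names are the ones used by the code in this article.

```python
import io

import pandas as pd

# Columns the dashboard code in this article reads from the CSV.
REQUIRED_COLUMNS = [
    "Year",
    "Country of origin",
    "Country of asylum",
    "Recognized decisions",
    "Rejected decisions",
    "Total decisions",
]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    return [name for name in required if name not in df.columns]


# A made-up one-row CSV that deliberately lacks "Total decisions".
sample_csv = io.StringIO(
    "Year,Country of origin,Country of asylum,Recognized decisions,Rejected decisions\n"
    "2020,Syria,Germany,10,2\n"
)
sample = pd.read_csv(sample_csv)

# An empty list would mean every expected column is present.
print(missing_columns(sample))
```

With the real file, you would pass the DataFrame returned by pd.read_csv() to the same helper and stop early if the list is not empty.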
Now let’s put together each page of our application.
Page 1: Top 10 countries for asylum (Bar/Sunburst Charts)
For our first page, let’s create a stock bar chart, and for fun, a beautiful (but sometimes difficult to interpret) sunburst chart.
We will add the data visualization code to the page_overview() function to display the global asylum decisions by year range.
This involves loading the dataset, filtering the data based on the selected year range, and creating two visualizations: a horizontal bar chart showing the top 10 countries by total asylum decisions and a sunburst chart showing the breakdown by country of origin.
import pandas as pd
import plotly.express as px
# Load dataset
data = pd.read_csv('asylum-decisions.csv')
# Helper function to get asylum decision counts
def get_asylum_counts(df, group_by_column):
    return df.groupby([group_by_column])[
        ['Recognized decisions', 'Rejected decisions', 'Total decisions']].sum().reset_index()
Data Loading: The dataset is loaded using pd.read_csv() and stored in the data variable.
Helper Function: get_asylum_counts(df, group_by_column) groups the data by the specified column and calculates the sum of recognized, rejected, and total decisions.
We need to add this code at the top of our initial shell application to access the data set.
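To see what the helper returns, here is a small sketch that runs it on a handful of made-up rows instead of the real CSV. The toy values are invented for illustration only; the column names match the ones used in this article.

```python
import pandas as pd

# Made-up rows standing in for the UNHCR export (not real figures).
toy = pd.DataFrame({
    "Country of asylum": ["Germany", "Germany", "France"],
    "Country of origin": ["Syria", "Iraq", "Syria"],
    "Recognized decisions": [10, 5, 7],
    "Rejected decisions": [2, 3, 1],
    "Total decisions": [12, 8, 8],
})


def get_asylum_counts(df, group_by_column):
    """Sum the three decision columns for each value of group_by_column."""
    decision_columns = ["Recognized decisions", "Rejected decisions", "Total decisions"]
    return df.groupby([group_by_column])[decision_columns].sum().reset_index()


# One row per country of asylum, with the decision columns summed.
counts = get_asylum_counts(toy, "Country of asylum")
print(counts)
```

Passing 'Country of origin' as the second argument would give the same kind of summary from the other direction.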
Here is the complete code for the page_overview() function with data visualization:
# Page 1: Overview
def page_overview():
    st.subheader("Global Asylum Decisions by Year Range (Choose Range)")
    year_filter = st.slider("Year Range", int(data['Year'].min()), int(data['Year'].max()),
                            (int(data['Year'].min()), int(data['Year'].max())))
    filtered_data = data[(data['Year'] >= year_filter[0]) & (data['Year'] <= year_filter[1])]
    asylum_counts = get_asylum_counts(filtered_data, 'Country of asylum')
    top_countries = asylum_counts.sort_values(by='Total decisions', ascending=False).head(10)
    fig_bar = px.bar(top_countries, x='Total decisions', y='Country of asylum', orientation='h',
                     title="Top 10 Countries by Total Asylum Decisions",
                     color='Total decisions', color_continuous_scale=px.colors.sequential.YlOrRd)
    fig_bar.update_layout(showlegend=False, height=400, yaxis={'categoryorder': 'total ascending'})
    fig_bar.update_coloraxes(showscale=False)  # Remove color scale
    st.plotly_chart(fig_bar)
    top_countries_origins = filtered_data[filtered_data['Country of asylum'].isin(top_countries['Country of asylum'])]
    fig_sunburst = px.sunburst(top_countries_origins, path=['Country of asylum', 'Country of origin'], values='Total decisions',
                               title="Top 10 Countries by Origin Breakdown",
                               color='Total decisions', color_continuous_scale=px.colors.qualitative.Bold)
    fig_sunburst.update_layout(height=600, showlegend=False)
    fig_sunburst.update_coloraxes(showscale=False)  # Remove color scale
    st.plotly_chart(fig_sunburst)
The code explained:
A slider (st.slider) is used to select the year range for filtering the data.
The data is filtered based on the selected year range, and asylum decision counts are calculated.
The top 10 countries by total asylum decisions are determined by sorting the data.
A horizontal bar chart (px.bar) is created to visualize the top 10 countries by total asylum decisions. The legend and color scale are removed using showlegend=False and update_coloraxes(showscale=False).
A sunburst chart (px.sunburst) is created to show the breakdown by country of origin for the top 10 countries. The legend and color scale are also removed using showlegend=False and update_coloraxes(showscale=False).
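The filter-then-rank step is the heart of this page, so here it is pulled out as a small standalone sketch. The top_countries_in_range() function and the toy rows are hypothetical and only illustrate the idea; in the app, the two year values would come from the slider.

```python
import pandas as pd

# Made-up rows for illustration (not real UNHCR figures).
toy = pd.DataFrame({
    "Year": [2019, 2020, 2020, 2021, 2021],
    "Country of asylum": ["Germany", "Germany", "France", "Turkey", "France"],
    "Total decisions": [100, 50, 80, 200, 30],
})


def top_countries_in_range(df, start_year, end_year, n=10):
    """Keep rows within the inclusive year range, then rank countries by total."""
    in_range = df[df["Year"].between(start_year, end_year)]
    totals = in_range.groupby("Country of asylum", as_index=False)["Total decisions"].sum()
    # nlargest returns the rows sorted from largest to smallest total.
    return totals.nlargest(n, "Total decisions")


# Narrowing the range to 2019-2020 drops the 2021 rows before ranking.
top_two = top_countries_in_range(toy, 2019, 2020, n=2)
print(top_two)
```

Keeping this logic in its own function makes it easy to check on a few rows before wiring it to the slider and the charts.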
With the code in place, we can test our progress by saving and running our growing application. The visual result:
You can adjust the slider (on either end) to give a more focused year range. Additionally, with a Plotly sunburst chart, you can actually click on a country (from within the inner circle) to produce a more granular set of numbers:
And if you hover over each country within this view, you can view the actual numbers.
Great! Now let’s add in some country-specific data for our page 2.
Page 2: Country Specific Data Visualizations
Now we can add the data visualization code to the page_country_analysis() function to display the asylum decisions for a selected country.
This involves creating two visualizations: a grouped bar chart showing the number of recognized and rejected asylum decisions over the years, and a horizontal bar chart showing the total recognized, rejected, and total decisions for the selected country:
# Page 2: Country-Specific Analysis with Grouped Bar Chart
def page_country_analysis():
    st.subheader("Country-Specific Analysis")
    country = st.selectbox("Select Country", data['Country of asylum'].unique())
    country_data = data[data['Country of asylum'] == country]
    country_data_long = pd.melt(country_data, id_vars=['Year'],
                                value_vars=['Recognized decisions', 'Rejected decisions'],
                                var_name='Decision Type', value_name='Count')
    fig_grouped_bar = px.bar(country_data_long, x='Year', y='Count', color='Decision Type', barmode='group',
                             title=f"Asylum Decisions for {country} Over the Years",
                             labels={'Count': 'Number of Decisions'},
                             color_discrete_sequence=px.colors.sequential.YlOrRd)
    fig_grouped_bar.update_layout(height=400, showlegend=True)
    st.plotly_chart(fig_grouped_bar)
    total_decisions = country_data[
        ['Recognized decisions', 'Rejected decisions', 'Total decisions']].sum().reset_index()
    total_decisions.columns = ['Decision Type', 'Count']
    fig_horizontal_bar = px.bar(total_decisions, x='Count', y='Decision Type', orientation='h',
                                title=f"Total Asylum Decisions for {country}",
                                color='Decision Type', color_discrete_sequence=px.colors.sequential.YlOrRd)
    fig_horizontal_bar.update_layout(height=300, showlegend=False)
    st.plotly_chart(fig_horizontal_bar)
The code explained:
A dropdown (st.selectbox) is used to select a country from the list of countries in the dataset.
The data is filtered based on the selected country.
The data is transformed from wide to long format using pd.melt() to create a grouped bar chart.
A grouped bar chart (px.bar) is created to visualize the number of recognized and rejected asylum decisions over the years for the selected country. The legend is displayed using showlegend=True.
The total number of recognized, rejected, and total decisions for the selected country is calculated and displayed using a horizontal bar chart (px.bar). The legend is removed using showlegend=False.
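One optional refinement: if the file holds one row per country of origin for each year (which the origin/asylum pairing suggests, though your download may differ), the melted data will contain several rows for the same year and decision type. Plotly Express draws one bar mark per row, so you may prefer to sum to a single row per year before melting. Here is a hedged sketch of that step; yearly_decisions_long() is a hypothetical helper and the rows are made up.

```python
import pandas as pd

# Made-up rows: two countries of origin for Germany in 2020 (not real figures).
toy = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2020],
    "Country of asylum": ["Germany", "Germany", "Germany", "France"],
    "Country of origin": ["Syria", "Iraq", "Syria", "Syria"],
    "Recognized decisions": [10, 5, 4, 7],
    "Rejected decisions": [2, 3, 1, 1],
})


def yearly_decisions_long(df, country):
    """Sum decisions per year for one country of asylum, then reshape to long format."""
    country_rows = df[df["Country of asylum"] == country]
    per_year = country_rows.groupby("Year", as_index=False)[
        ["Recognized decisions", "Rejected decisions"]
    ].sum()
    # After the groupby there is exactly one row per year, so melting yields
    # one row per (year, decision type) pair.
    return per_year.melt(id_vars="Year", var_name="Decision Type", value_name="Count")


long_format = yearly_decisions_long(toy, "Germany")
print(long_format)
```

The resulting frame can be passed to px.bar in place of country_data_long, with the same x, y, and color arguments.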
You can simply copy/paste this code into the shell application for Page 2 and run it from your command prompt. The results:
OK, we are two-thirds of the way there. For our last page, a beautiful global choropleth map.
Page 3: Global Choropleth Map
Now let’s add the data visualization code to the page_choropleth() function to display the global distribution of asylum decisions for a selected year. This involves creating a choropleth map to visualize the total asylum decisions by country.
# Page 3: Choropleth Mapping
def page_choropleth():
    st.subheader("Global Distribution of Asylum Decisions")
    year = st.selectbox("Select Year", sorted(data['Year'].unique()), key='year_select')
    year_data = data[data['Year'] == year]
    asylum_counts = get_asylum_counts(year_data, 'Country of asylum')
    st.subheader(f"Global Distribution of Asylum Decisions in {year}")
    fig = px.choropleth(asylum_counts, locations="Country of asylum", locationmode='country names',
                        color="Total decisions",
                        hover_name="Country of asylum", color_continuous_scale=px.colors.sequential.YlOrBr)
    fig.update_layout(height=500)
    st.plotly_chart(fig)
    st.subheader(f"Asylum Decisions by Country in {year}")
    sorted_asylum_counts = asylum_counts.sort_values(by='Total decisions', ascending=False)
    st.dataframe(sorted_asylum_counts)
The code explained:
A dropdown (st.selectbox) is used to select a year from the list of years in the dataset.
The data is filtered based on the selected year, and asylum decision counts are calculated.
A choropleth map (px.choropleth) is created to visualize the total asylum decisions by country for the selected year. The color scale is set using color_continuous_scale=px.colors.sequential.YlOrBr.
A data table (st.dataframe) is displayed to show the asylum decision counts by country for the selected year, sorted in descending order.
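Since the dataset also carries 3-letter ISO codes, an alternative is to match map locations on those codes rather than on country names, which can be spelled differently from what Plotly expects. The sketch below is an assumption-heavy illustration: the column name "Country of asylum (ISO)" is my guess at the header in the download, so check your own file, and totals_by_iso() and build_choropleth() are hypothetical helpers.

```python
import pandas as pd

# Assumed header for the ISO code column; verify against your CSV.
ISO_COLUMN = "Country of asylum (ISO)"

# Made-up rows for illustration (not real figures).
toy = pd.DataFrame({
    "Year": [2020, 2020, 2020, 2021],
    ISO_COLUMN: ["DEU", "DEU", "FRA", "DEU"],
    "Country of asylum": ["Germany", "Germany", "France", "Germany"],
    "Total decisions": [12, 8, 8, 5],
})


def totals_by_iso(df, year):
    """Sum total decisions per country of asylum for one year, keeping the ISO code."""
    year_rows = df[df["Year"] == year]
    return year_rows.groupby([ISO_COLUMN, "Country of asylum"], as_index=False)[
        "Total decisions"
    ].sum()


def build_choropleth(totals):
    """Build the map figure, matching locations on ISO-3 codes."""
    # Imported here so the aggregation helper can be tried without Plotly installed.
    import plotly.express as px

    return px.choropleth(totals, locations=ISO_COLUMN, locationmode="ISO-3",
                         color="Total decisions", hover_name="Country of asylum",
                         color_continuous_scale=px.colors.sequential.YlOrBr)


totals_2020 = totals_by_iso(toy, 2020)
print(totals_2020)
```

In the app, the figure returned by build_choropleth(totals_2020) would be handed to st.plotly_chart(), just like the name-based version.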
As with the previous page, we only need to copy/paste this code into our page_choropleth() function, Save, and Run. The beautiful results:
Very nice.
Yes, it really is that simple to build multi-page Streamlit applications.
And if you have any issues with the build, the code (and dataset) can be found on GitHub HERE.
In Summary…
Streamlit is a terrific framework for quickly and easily building interactive data visualizations with standard Python libraries like Plotly.
This project showcases a 3-page Streamlit application representing global UNHCR asylum data over a range of years.
We have created 6 different data visualizations to give our users different stories and perspectives on this data.
By first creating the multi-page application shell, you can easily add, remove or modify individual charts and pages. For example, if you want to add on a fourth page, you only need to create a “Page 4” function, add any visualizations to that page, and then add an additional menu item into the page logic (created in the first step).
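As the number of pages grows, one way to keep the menu and the page logic in sync is a dictionary that maps each menu label to its page function. This is a sketch of that alternative to the if/elif chain, not the code from the application: the fourth page (page_trends) is hypothetical, and the page functions here are stand-ins that return a label so the example runs without Streamlit.

```python
# Stand-in page functions. In the real app each one would call
# st.subheader(...) and draw its charts instead of returning a string.
def page_overview():
    return "overview page"


def page_country_analysis():
    return "country analysis page"


def page_choropleth():
    return "choropleth page"


def page_trends():  # hypothetical fourth page
    return "trends page"


# Menu label -> page function. Adding a page means adding one entry here.
PAGES = {
    "Global Asylum Decisions": page_overview,
    "Country Analysis": page_country_analysis,
    "Global Mapping": page_choropleth,
    "Trends": page_trends,
}


def render(menu_choice):
    """Look up the selected label and call the matching page function."""
    return PAGES[menu_choice]()


# In main(), the sidebar would supply the choice, for example:
#   menu_choice = st.sidebar.selectbox("Go to", list(PAGES))
#   render(menu_choice)
print(render("Trends"))
```

Because the menu options come from the dictionary keys, a new page cannot appear in the sidebar without also having a function behind it.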
I hope you found this educational and useful.
GitHub Repository: HERE