GPT-4 and Streamlit: Gold Medal Modular Prompting to Engineer an Interactive App
Olympic medal dashboard image recognition to interactive Streamlit app with GPT-4
The power of GPT-4 extends beyond simple analysis and text generation.
For example, GPT-4 excels at both image recognition and generating functional Streamlit data visualization code.
Recently, while following the Paris 2024 Olympics online, I came across a useful dashboard (on espn.com):
This particular dashboard shows the current top 3 countries by overall medals, the top 3 by medal type, and then displays a table of all countries with medals, organized by total count.
I have a solid grasp of what GPT-4 can do AND a data set with the gold/silver/bronze medal tallies for all countries for all summer Olympiads. So I wanted to know: armed with just a screenshot of this dashboard, could I prompt GPT-4 to create a similar dashboard for all previous Olympiads?
Here’s how it panned out.
The Data Set
The data used to accomplish this task is the “Olympic Medals by Country” dataset. It is available on Kaggle.
In this data set, the data is organized by year, country, and a count of “Gold”, “Silver” and “Bronze” medals. It includes all of the summer Olympiads up to and including Tokyo 2020.
Here is a screenshot of the first 15 rows of the dataset:
Now let’s start to set up our Python Streamlit application code.
Prompting GPT-4 for Image Recognition
To start, I took a screenshot of the webpage (from the espn.com website) and uploaded the image to the GPT-4 chat window. Then I asked the LLM what it could see:
Response from GPT-4: The image is a medal tracker for the 2024 Summer Olympics.
GPT-4 also provides a much more detailed breakdown of what it sees - and its assessment is accurate.
Knowing that GPT-4 has an understanding of the dashboard, the next step is to upload the CSV data file of all of the previous Olympics and prompt for analysis.
Prompt to GPT-4: Give a one sentence overview of the data set in the attached CSV file.
Response from GPT-4: The dataset contains information about the Summer Olympic medals, including the year, host country, host city, country name, country code, and the number of gold, silver, and bronze medals won by each country.
Perfect! Now that we know GPT-4 has an understanding of both the image and the data set, we can put together a prompt to re-create the dashboard from just the image.
Prompting GPT-4 for Streamlit Dashboard Code
I’ve been prompting GPT-4 for code creation for more than a year now. I know that the best approach is to give this LLM very specific instructions for each component in this application.
For this prompt, I am asking it to create 4 components.
Prompt to GPT-4: Please write the Python Streamlit code to recreate the dashboard shown in the image. Specifically:
A dropdown menu allowing the user to select a year
On the first row, display three donut charts showing the top 3 countries (each country broken down by medal type). Use the medal colours appropriately for each donut chart.
On the second row, I want the top 3 countries by medal total for each medal colour. I want a circular icon beside the medal name (gold/silver/bronze)
Below, a table showing all medals by country, including a total, for that year
Response from GPT-4: Here's the Python Streamlit code to create the dashboard:
And GPT-4 generates the entire 80 lines of code.
Let’s break this code down into each component:
Setup - Libraries, CSV to Data Frame, Title:
import streamlit as st
import pandas as pd
import plotly.graph_objects as go
# Load the dataset, preprocess
file_path = 'Summer_olympic_Medals.csv'
data = pd.read_csv(file_path)
data['Total'] = data['Gold'] + data['Silver'] + data['Bronze']
# Streamlit application Title
st.subheader("2024 Summer Olympics - Medal Tracker")
For this application, you need to have three libraries installed:
Streamlit is a library for creating interactive web applications directly from Python scripts. A huge advantage of Streamlit is the simplicity of the code needed to create fully interactive data visualizations. This simplicity comes at the expense of customization.
The pandas library is used to create data frames — which is just a fancy way to say that our data will be organized in a way that can be interpreted by the charting and mapping functions.
The Plotly library is a terrific library for creating charts and maps. For this exercise we will use the Plotly graph_objects module.
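To make the preprocessing step concrete, here is a minimal sketch of the Total calculation on a tiny hand-made data frame. The column names match the ones used in the setup code; the helper name add_total and the medal counts are made up for illustration and do not come from the Kaggle file.

```python
import pandas as pd


def add_total(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a Total column (Gold + Silver + Bronze)."""
    required = ["Gold", "Silver", "Bronze"]
    missing = [col for col in required if col not in df.columns]
    if missing:
        # Fail early with a readable message if the CSV layout is different
        raise KeyError(f"Missing medal columns: {missing}")
    out = df.copy()
    out["Total"] = out[required].sum(axis=1)
    return out


# Made-up rows for illustration only (not real medal counts)
sample = pd.DataFrame({
    "Year": [2000, 2000],
    "Country_Name": ["Country A", "Country B"],
    "Gold": [3, 1],
    "Silver": [2, 0],
    "Bronze": [1, 4],
})
sample_with_total = add_total(sample)
print(sample_with_total[["Country_Name", "Total"]])
```

Checking for the medal columns first is just one possible design choice. It gives a clearer error if a different version of the CSV uses other column names.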
Component 1 - Dropdown Menu:
To create the dropdown menu by year, we first pull the list of unique years from the data and then pass it to the Streamlit selectbox() function:
# Dropdown menu for year selection
years = data['Year'].unique()
selected_year = st.selectbox("Select Year", years)
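One small refinement you might try: unique() returns the years in the order they appear in the file, so the dropdown order depends on how the CSV is sorted. The sketch below, which assumes a hypothetical helper named year_options and made-up rows, sorts the years so the most recent Olympiad comes first.

```python
import pandas as pd


def year_options(df: pd.DataFrame) -> list:
    """Unique years as plain ints, most recent first."""
    return sorted((int(y) for y in df["Year"].unique()), reverse=True)


# Made-up rows for illustration only
sample = pd.DataFrame({"Year": [1996, 2000, 1996, 2004]})
options = year_options(sample)
print(options)

# Assumed usage in the app:
# selected_year = st.selectbox("Select Year", year_options(data))
```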
Component 2 - Donut Charts for selected year
Next, we need to filter for the year selected (from the dropdown) and then display the top 3 countries by total medal count as a donut chart:
# Filter data for the selected year
filtered_data = data[data['Year'] == selected_year]
top_3_countries = filtered_data.nlargest(3, 'Total')
# Create donut charts for each of the top 3 countries
def create_donut_chart(country_data):
    labels = ['Gold', 'Silver', 'Bronze']
    values = [country_data['Gold'].values[0], country_data['Silver'].values[0], country_data['Bronze'].values[0]]
    total = sum(values)
    country_name = country_data['Country_Name'].values[0]
    fig = go.Figure(data=[go.Pie(
        labels=labels,
        values=values,
        hole=0.75,
        marker=dict(colors=['#FFD700', '#C0C0C0', '#CD7F32']),
        textinfo='none'
    )])
    fig.add_annotation(x=0.5, y=0.5, text=f"{country_name}<br>Total: {total}", showarrow=False,
                       font=dict(size=20, color="black"))
    fig.update_layout(showlegend=False, margin=dict(t=0, b=0, l=0, r=0))
    return fig
st.write("#### Top 3 Countries by Medal Distribution")
col1, col2, col3 = st.columns(3)
with col1:
    st.plotly_chart(create_donut_chart(top_3_countries.iloc[[0]]), use_container_width=True)
with col2:
    st.plotly_chart(create_donut_chart(top_3_countries.iloc[[1]]), use_container_width=True)
with col3:
    st.plotly_chart(create_donut_chart(top_3_countries.iloc[[2]]), use_container_width=True)
We first create a function called create_donut_chart(). Within this function, the donut chart is created using the Plotly Pie() function, making sure that we use the hole property (set here to 0.75). Then we add an annotation (using the add_annotation() function) to put the country name and total inside the donut hole.
Then we call our create_donut_chart() function 3 times (for 1st, 2nd and 3rd).
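The three calls assume that every year has at least three medal-winning countries. As a hedged sketch, the selection step could be wrapped in a helper (the name top_countries is my own, and the rows are made up) so the app can loop over however many rows actually come back.

```python
import pandas as pd


def top_countries(df: pd.DataFrame, year: int, n: int = 3) -> pd.DataFrame:
    """Rows for `year`, ranked by Total, limited to at most n countries."""
    year_rows = df[df["Year"] == year]
    return year_rows.nlargest(n, "Total")


# Made-up rows for illustration only
sample = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2000, 2004],
    "Country_Name": ["A", "B", "C", "D", "A"],
    "Total": [10, 25, 7, 18, 30],
})
top = top_countries(sample, 2000)
print(top["Country_Name"].tolist())

# Assumed usage in the app: one column per row actually returned, so a
# year with fewer than three countries does not raise an IndexError the
# way top_3_countries.iloc[[2]] would.
# for col, i in zip(st.columns(3), range(len(top))):
#     with col:
#         st.plotly_chart(create_donut_chart(top.iloc[[i]]), use_container_width=True)
```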
Component 3 - Top 3 Countries by Medal Type
Next, we create the top 3 countries for each medal type (gold/silver/bronze):
# Top 3 countries by each medal type
top_3_gold = filtered_data.nlargest(3, 'Gold')[['Country_Name', 'Gold']]
top_3_silver = filtered_data.nlargest(3, 'Silver')[['Country_Name', 'Silver']]
top_3_bronze = filtered_data.nlargest(3, 'Bronze')[['Country_Name', 'Bronze']]
# Display top 3 countries by medal type with circular icons
st.write("#### Top 3 Countries by Medal Type")
col1, col2, col3 = st.columns(3)
with col1:
    st.write("**Gold** 🥇")
    for i, row in top_3_gold.iterrows():
        st.write(f"{row['Country_Name']}: {row['Gold']}")
with col2:
    st.write("**Silver** 🥈")
    for i, row in top_3_silver.iterrows():
        st.write(f"{row['Country_Name']}: {row['Silver']}")
with col3:
    st.write("**Bronze** 🥉")
    for i, row in top_3_bronze.iterrows():
        st.write(f"{row['Country_Name']}: {row['Bronze']}")
In this code, we set up a filter for the top 3 countries for gold, silver, and bronze (3 data frames). Then we display each in its own column. Note here that GPT-4 even added a simple icon for each medal color. Very cool.
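One thing to keep in mind with nlargest(): by default it returns exactly the number of rows you ask for, so if two countries are tied for third place, only the first one in the data is shown. Here is a small sketch with made-up numbers that compares the default with the keep="all" option, which keeps every row tied for the last spot.

```python
import pandas as pd

# Made-up rows for illustration only
sample = pd.DataFrame({
    "Country_Name": ["A", "B", "C", "D"],
    "Gold": [10, 8, 5, 5],
})

# Default (keep="first"): exactly 3 rows, the second tied country is dropped
default_top = sample.nlargest(3, "Gold")

# keep="all": every country tied for 3rd place is kept, so 4 rows here
with_ties = sample.nlargest(3, "Gold", keep="all")

print(default_top["Country_Name"].tolist())
print(with_ties["Country_Name"].tolist())
```

Whether you want ties shown is a design decision. The original dashboard code uses the default.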
NOTE: When using the Streamlit write() function, you can indicate the HTML header size by the number of hash marks (#) you use. For example, st.write("### Top 3 Countries by Medal Type") is the equivalent of H3, and the four hash marks used in the code above are the equivalent of H4.
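If you want to experiment with header sizes, a tiny helper like the one below can build the string for you. This is only a sketch; md_header is a hypothetical name of my own and not part of Streamlit.

```python
def md_header(text: str, level: int = 4) -> str:
    """Build a Markdown header string; levels 1-6 map to H1-H6."""
    if not 1 <= level <= 6:
        raise ValueError("Markdown header level must be between 1 and 6")
    return f"{'#' * level} {text}"


header = md_header("Top 3 Countries by Medal Type", level=3)
print(header)

# Assumed usage in the app:
# st.write(md_header("Medals by Country"))
```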
Component 4 - Table Display For All Countries (sorted by total)
Lastly, the fourth component is the table. It contains a sorted list (by medal total) of every country that won a medal at the selected Olympiad.
# Sort the filtered data by total medals
sorted_data = filtered_data.sort_values(by='Total', ascending=False)
st.write("#### Medals by Country")
st.dataframe(sorted_data[['Country_Name', 'Gold', 'Silver', 'Bronze', 'Total']])
The table is displayed using the Streamlit dataframe() function.
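By default the table keeps the row numbers from the original CSV as its index. If you would rather show a ranking, here is a minimal sketch that assumes a hypothetical helper named medal_table. It also uses gold medals as a tie-breaker, which is my own assumption and not something the original code does. The rows are made up.

```python
import pandas as pd


def medal_table(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Medal table for one year, sorted by Total, with a 1-based Rank index."""
    cols = ["Country_Name", "Gold", "Silver", "Bronze", "Total"]
    table = (
        df.loc[df["Year"] == year, cols]
        .sort_values(by=["Total", "Gold"], ascending=False)  # Gold breaks ties
        .reset_index(drop=True)
    )
    table.index = table.index + 1  # ranks start at 1 instead of 0
    table.index.name = "Rank"
    return table


# Made-up rows for illustration only
sample = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2004],
    "Country_Name": ["A", "B", "C", "A"],
    "Gold": [1, 2, 0, 4],
    "Silver": [1, 0, 5, 0],
    "Bronze": [1, 1, 0, 0],
    "Total": [3, 3, 5, 4],
})
table = medal_table(sample, 2000)
print(table)

# Assumed usage in the app:
# st.dataframe(medal_table(data, selected_year))
```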
And that’s all the code that is needed. Copy and paste this code in sequence into your favourite Python editor (I use PyCharm) and save it as a .py file. You can then run it as a Streamlit application from a terminal window with the streamlit run command, followed by the name of your file.
Does it actually work? You bet! Here’s the result from this code:
All of this from only 80 lines of code!
NOTE: It’s important to state here that, as with most current LLMs, GPT-4’s ability to respond in a “human-like” fashion results in variable responses. The code generated for you most likely will not be exactly like what is shown here. For example, the first time I prompted GPT-4, it used the matplotlib library to create the charts. Later prompts resulted in the tool generating the charts with the Plotly library.
In Summary…
GPT-4's ability to accurately interpret images and generate corresponding Streamlit code makes it a gold medal winner.
This LLM can be leveraged for practical code-generation tasks - particularly for interactive data visualization prototyping.
This saves significant development time and effort. In this example, armed with an image and a dataset and a modular prompt, I was able to immediately generate a fully functioning prototype.
Give it a try - and add on some additional functionality.
Please leave a comment to let me know how it goes!
And if this article is right up your alley, try my free 5-Day Email Course on Prompting GPT-4 for Data Visuals. Let me know what you think: