Prompt Engineering with GPT-4: Charting and Mapping European Tourism Trends
Multiple data visualizations using GPT-4’s data visualization toolset.
Efficiency is a highly sought-after quality for coders and data analysts.
Transforming CSV data into data visualizations can be boring and tedious with conventional methods: cleaning the data, creating data frames, and mucking around with finicky data frame syntax.
With GPT-4’s Code Interpreter functionality, this process becomes significantly more streamlined.
In this article, we’ll use European tourism data to demonstrate how you can create a custom GPT with the GPT Builder that can swiftly handle data visualization, from initial dataset exploration to detailed mapping.
Let’s get to it.
The Data at a Glance
The dataset provided appears to capture the “International tourism, number of arrivals” for various countries and regions over several decades, from 1960 to 2019 (found HERE).
After downloading the CSV to our computer, we can create a custom GPT to streamline and simplify our data analysis.
Click on My GPTs and GPT Builder will prompt you to “Create a New GPT”.
For this exercise, we will create a GPT that is a data scientist who creates data visualizations from uploaded CSV files.
Now before I save this GPT, I need to adjust it slightly. I am not a fan of GPT-4’s ability to wordsmith anything — it is often purposefully vague and noncommittal.
So for this GPT I have written the rules in my own stronger, more specific language:
From my experience with prompting GPT-4, I know that if I don’t explicitly tell it to ONLY use data from the data set, it will resort to “placeholder” data when it gets confused or stuck.
We can then upload the CSV file to the Custom GPT Builder tool:
And we can make sure that the “Code Interpreter” checkbox is checked.
Now that we are all set up, we can save our GPT (I called mine DataViz Wizard) and make the link accessible to anyone who wants to use this GPT to create data visuals.
The GPT is then available to anyone who has the link and a ChatGPT Plus subscription.
Now we can use the link created to navigate to this new GPT:
Once we paste the link into our browser, we are ready to create our visualizations!
Prompting GPT-4 for Python Data Viz Code
From our brand-new Custom GPT, we can start prompting for our data visualizations.
Prompt to GPT-4: Please provide details on each column in the tourism dataset.
GPT-4 Response:
A Country Name, for the country or region.
An associated Country Code, which can be instrumental when mapping the data. This is an ISO-3 code, so it can be used for mapping.
The Indicator Name and Indicator Code, both of which appear to remain constant for our dataset, indicating the type of data being represented.
Data columns spanning from 1960 to 2019 that indicate the number of international tourist arrivals for each respective year.
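To make that column layout concrete, here is a minimal sketch of how one might inspect it in pandas and reshape the year columns into a long format that is easier to plot. The inline CSV is a hypothetical stand-in for the downloaded file: the country names, codes, and numbers are made up for illustration, and only the column structure follows the description above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV. The layout follows the columns
# described above; the names, codes, and values are made up for illustration.
SAMPLE_CSV = """Country Name,Country Code,Indicator Name,Indicator Code,2018,2019
Country A,AAA,"International tourism, number of arrivals",DEMO.CODE,100,120
Country B,BBB,"International tourism, number of arrivals",DEMO.CODE,200,
"""

travel_data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Separate the descriptive columns from the per-year columns
id_columns = ["Country Name", "Country Code", "Indicator Name", "Indicator Code"]
year_columns = [col for col in travel_data.columns if col.isdigit()]

# Confirm that the indicator really is constant across the file
n_indicators = travel_data["Indicator Name"].nunique()

# Reshape from one column per year to one row per country and year
long_data = travel_data.melt(
    id_vars=id_columns,
    value_vars=year_columns,
    var_name="Year",
    value_name="Arrivals",
)
long_data["Year"] = long_data["Year"].astype(int)

print(year_columns)
print(n_indicators)
print(long_data[["Country Name", "Year", "Arrivals"]])
```

Missing years show up as NaN in the Arrivals column, which is worth checking before trusting any chart built on top of the data.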
Given that our focus is on European travel, we’ll need to filter this dataset to only include European countries. Our custom GPT can help us with this:
Prompt to GPT-4: Please filter the dataset to only include European countries
Response from GPT-4:
# List of European countries (based on common definitions of European countries)
european_countries = [
    "Austria", "Albania", "Belgium", "Bulgaria", "Switzerland", "Cyprus", "Czech Republic", "Germany",
    "Denmark", "Spain", "Estonia", "Finland", "France", "United Kingdom", "Greece", "Croatia", "Hungary",
    "Ireland", "Iceland", "Italy", "Lithuania", "Luxembourg", "Latvia", "Malta", "Netherlands", "Norway",
    "Poland", "Portugal", "Romania", "Russia", "Sweden", "Slovenia", "Slovakia", "Turkey", "Ukraine"
]
# Filter the dataset for European countries
european_travel_data = travel_data[travel_data["Country Name"].isin(european_countries)]
# Display the European countries present in the dataset
european_travel_data["Country Name"].tolist()
The dataset provides travel data for the majority of European countries — definitely good enough for the purposes of this tutorial.
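Filtering by name only keeps rows whose spelling matches the list exactly, so it is worth checking which names found no match. The sketch below is one way to do that, with a fallback that filters on the ISO-3 Country Code instead. The three rows and the "Russian Federation" spelling are assumptions made up for the example, not values taken from the article's file.

```python
import pandas as pd

# Made-up rows for illustration; the real file may spell names differently
travel_data = pd.DataFrame(
    {
        "Country Name": ["France", "Russian Federation", "Brazil"],
        "Country Code": ["FRA", "RUS", "BRA"],
        "2019": [1.0, 2.0, 3.0],  # placeholder values, not real arrivals
    }
)

european_countries = ["France", "Russia"]
european_iso3 = ["FRA", "RUS"]

# Name-based filter, as in the generated code above
by_name = travel_data[travel_data["Country Name"].isin(european_countries)]

# Names from the list that matched nothing in the dataset
missing_names = sorted(set(european_countries) - set(by_name["Country Name"]))

# Code-based filter, which does not depend on how names are spelled
by_code = travel_data[travel_data["Country Code"].isin(european_iso3)]

print(missing_names)
print(by_code["Country Name"].tolist())
```

If the list of missing names is not empty, either adjust the spellings in the list or ask the GPT to filter on the country codes.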
Now let’s get to some data analysis!
1. European Travel Trends: Line Chart
Line charts are an excellent choice to trace data progression over time.
With a line chart, we can display the evolution of international tourist arrivals for a subset of European countries across the decade (2010–2019).
Prompt to GPT-4: Generate a line chart showcasing the international tourist arrivals from 2010 to 2019 for France, Germany, United Kingdom, Spain, and Italy.
Response from GPT-4:
The line chart above illustrates the international tourist arrivals from 2010 to 2019 for a select group of European countries (France, Germany, UK, Spain, Italy).
The most relevant trend to note is that all 5 countries show a steady increase in the overall number of tourists during this decade.
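A claim like "steady increase" is easy to verify numerically rather than by eye. The following sketch assumes a table shaped like the one behind the line chart (years as rows, countries as columns) and uses made-up numbers with hypothetical country names; it computes the total percentage change and checks whether every year-over-year step is positive.

```python
import pandas as pd

# Made-up numbers for illustration: years as rows, countries as columns
linechart_data = pd.DataFrame(
    {
        "Country A": [100.0, 110.0, 125.0],
        "Country B": [50.0, 48.0, 60.0],
    },
    index=["2010", "2011", "2012"],
)

# Total percentage change from the first year to the last
pct_change_total = (linechart_data.iloc[-1] / linechart_data.iloc[0] - 1) * 100

# Year-over-year differences; the first row has no predecessor, so drop it
year_over_year = linechart_data.diff().dropna()

# True only for countries that grew in every single year
grew_every_year = (year_over_year > 0).all()

print(pct_change_total.round(1))
print(grew_every_year)
```

In this toy data both countries end higher than they started, but only Country A grew every year, which is the distinction a line chart can hide at a glance.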
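And the Python code generated:
import matplotlib.pyplot as plt

# Setup assumed from earlier in the session (not shown in the original output);
# it mirrors the extraction pattern used for the later charts
subset_countries_linechart = ["France", "Germany", "United Kingdom", "Spain", "Italy"]
linechart_data = european_travel_data[european_travel_data["Country Name"].isin(subset_countries_linechart)]
linechart_data = linechart_data.set_index("Country Name").loc[:, "2010":"2019"].transpose()
# Generate line chart for the international tourist arrivals from 2010 to 2019 for the selected countries
plt.figure(figsize=(14, 7))
for country in subset_countries_linechart:
    plt.plot(linechart_data.index, linechart_data[country], label=country, marker='o')
plt.title("International Tourist Arrivals (2010–2019) for Selected European Countries")
plt.xlabel("Year")
plt.ylabel("Number of Tourists")
plt.legend()
plt.grid(True, which='both', linestyle='--', linewidth=0.5)
plt.tight_layout()
plt.show()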
NOTE: The default line chart created by GPT-4 for this example initially used different shades of green for each line, making it very difficult to interpret. I asked it to “use contrasting colors” for each line and this quickly fixed it.
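If you would rather fix the colors in the code than in the prompt, one option is to assign each line a color from a qualitative palette. This is a minimal sketch, not the code GPT-4 produced: it uses matplotlib's built-in "tab10" colormap and made-up numbers with hypothetical country names.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers for illustration: years as rows, countries as columns
linechart_data = pd.DataFrame(
    {
        "Country A": [100.0, 110.0, 125.0],
        "Country B": [50.0, 48.0, 60.0],
    },
    index=["2010", "2011", "2012"],
)

# "tab10" is a qualitative palette of ten visually distinct colors
palette = plt.get_cmap("tab10").colors

fig, ax = plt.subplots(figsize=(14, 7))
for color, country in zip(palette, linechart_data.columns):
    ax.plot(
        linechart_data.index,
        linechart_data[country],
        label=country,
        marker="o",
        color=color,
    )
ax.set_xlabel("Year")
ax.set_ylabel("Number of Tourists")
ax.legend()
fig.tight_layout()
```

Because zip stops at the shorter input, this approach covers at most ten countries; beyond that, a larger palette such as "tab20" would be needed.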
2. Tourist Arrivals in 2019: Bar Chart
Bar charts provide a terrific, easy-to-understand visual comparison across categories.
Focusing on the year 2019, we’ll use a bar chart to give an alternative view of the number of tourist arrivals for the same set of European countries.
Prompt to GPT-4: Generate a bar chart comparing international tourist arrivals in 2019 for France, Germany, United Kingdom, Spain, and Italy.
Response from GPT-4:
The bar chart provides a straightforward comparison of international tourist arrivals in 2019 for our selected European countries.
Spain and France stand out as the top tourist destinations, with a significantly higher number of arrivals compared to the other countries.
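And the Python code generated:
import matplotlib.pyplot as plt
import seaborn as sns

# Setup assumed from earlier in the session (not shown in the original output)
subset_countries = ["France", "Germany", "United Kingdom", "Spain", "Italy"]
barchart_data_2019 = european_travel_data[european_travel_data["Country Name"].isin(subset_countries)]
barchart_data_2019 = barchart_data_2019.set_index("Country Name")["2019"]
# Generate the bar chart for international tourist arrivals in 2019 for the selected countries
plt.figure(figsize=(12, 6))
sns.barplot(x=barchart_data_2019.index, y=barchart_data_2019.values, palette="viridis")
plt.title("International Tourist Arrivals in 2019 for Selected European Countries")
plt.xlabel("Country")
plt.ylabel("Number of Tourists")
plt.tight_layout()
plt.show()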
For this type of analysis, though, a few more data points would be useful — and if we put them in sequence, then it is easier to compare, so let’s try creating another bar chart:
Prompt to GPT-4: Generate a bar chart displaying tourist arrivals in 2019 for France, Germany, United Kingdom, Spain, Italy, Switzerland, and Portugal.
Response from GPT-4:
This looks better — with an ordered horizontal bar chart, we can see the variance from most to least much more clearly.
And the Python code generated:
# Selecting a subset of European countries for the bar chart visualization
subset_countries_barchart = ["France", "Germany", "United Kingdom", "Spain", "Italy", "Switzerland", "Portugal"]
# Extracting data for the selected countries for the year 2019
barchart_data = european_travel_data[european_travel_data["Country Name"].isin(subset_countries_barchart)]
barchart_data = barchart_data.set_index("Country Name")["2019"]
# Generate bar chart
plt.figure(figsize=(12, 6))
barchart_data.sort_values().plot(kind='barh', color='skyblue')
plt.title("International Tourist Arrivals in 2019 for Selected European Countries")
plt.xlabel("Number of Tourists")
plt.ylabel("Country")
plt.tight_layout()
plt.show()
3. European Tourist Arrivals Trend: Area Chart
Area charts are effective for visualizing quantitative data over a continuous interval.
Let’s create an area chart to gain insights into the trend of tourist arrivals for 3 countries (Germany, France, and Italy) over the period 2010–2019.
Prompt to GPT-4: Generate an area chart showing cumulative tourist arrivals from 2010 to 2019 for France, Germany, and Italy.
Response from GPT-4:
The area chart above visualizes the combined (stacked) international tourist arrivals for France, Germany, and Italy from 2010 to 2019. The growing shaded area indicates a growth in tourism, especially noticeable in the latter years.
And the Python code generated:
# Selecting a subset of European countries for the area chart visualization
subset_countries_areachart = ["France", "Germany", "Italy"]
# Extracting data for the selected countries from 2010 to 2019
areachart_data = european_travel_data[european_travel_data["Country Name"].isin(subset_countries_areachart)]
areachart_data = areachart_data.set_index("Country Name").loc[:, "2010":"2019"].transpose()
# Generate area chart
# plot.area creates its own figure, so a separate plt.figure() call would only leave an empty one behind
areachart_data.plot.area(stacked=True, figsize=(14,7), alpha=0.6)
plt.title("Cumulative International Tourist Arrivals (2010–2019) for Selected European Countries")
plt.xlabel("Year")
plt.ylabel("Number of Tourists")
plt.tight_layout()
plt.show()
There is just enough variance in color to distinguish each country, but again, you can easily ask GPT-4 to use “contrasting colors” for each country.
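One thing to keep in mind when reading this chart: the prompt asked for "cumulative" arrivals, but a stacked area chart adds the countries together within each year; it does not accumulate arrivals across years. The sketch below shows both calculations side by side on made-up numbers with hypothetical country names, so you can decide which one you actually want to ask for.

```python
import pandas as pd

# Made-up numbers for illustration: years as rows, countries as columns
areachart_data = pd.DataFrame(
    {
        "Country A": [10.0, 12.0, 15.0],
        "Country B": [5.0, 6.0, 6.0],
    },
    index=["2010", "2011", "2012"],
)

# What stacked=True shows: the top edge is the sum across countries per year
stacked_top = areachart_data.sum(axis=1)

# A running total over time, per country, is a different quantity
running_total = areachart_data.cumsum()

print(stacked_top)
print(running_total)
```

Passing running_total to plot.area instead of areachart_data would give a chart of arrivals accumulated over the decade.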
4. Mapping Tourism Popularity Over Time
Now, what are the trends for each country over the decade of 2010-2019? Which countries became more popular to travel to, and which countries became less?
We can prompt GPT-4 to find out!
Prompt to GPT-4: Display a choropleth map illustrating the change in tourist arrivals in all countries from 2010 to 2019.
Countries showing in red are “hot” for tourism - they are increasing in the number of tourists over time. Conversely, those in dark blue have seen a decline in tourism over the period of 2010-2019.
And the Python code generated:
# merged_tourism_total_corrected_3 is a GeoDataFrame built earlier in the session
# (presumably country shapes merged with the tourism data); xlim and ylim hold the
# map extent for Europe. None of these definitions appear in the output shown here.
# Calculating the change in tourist arrivals from 2010 to 2019
merged_tourism_total_corrected_3['change_2010_2019'] = merged_tourism_total_corrected_3['2019'] - merged_tourism_total_corrected_3['2010']
# Plotting the choropleth map showcasing the change in tourist arrivals from 2010 to 2019
fig, ax = plt.subplots(1, 1, figsize=(15, 10))
merged_tourism_total_corrected_3.boundary.plot(ax=ax, linewidth=1)
merged_tourism_total_corrected_3.dropna(subset=['change_2010_2019']).plot(
    column='change_2010_2019', ax=ax, legend=True,
    legend_kwds={'label': "Change in Tourist Arrivals (2010-2019)"},
    cmap='coolwarm', edgecolor='black')
ax.set_title("Change in European Tourist Arrivals from 2010 to 2019")
ax.set_xlim(xlim)
ax.set_ylim(ylim)
plt.show()
This map represents the variation in tourist numbers from 2010 to 2019.
With the coolwarm color map, higher values (in red shades) indicate a larger increase in arrivals, whereas lower values (in blue shades) indicate a smaller increase or a decline.
This visualization can help deduce which countries have experienced a tourism boom and which ones have seen a drop over the decade.
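By default, a diverging color map such as coolwarm is stretched between the smallest and largest value in the data, so its neutral midpoint does not necessarily sit at zero. If you want red to mean "increase" and blue to mean "decline" without exception, one option is to center the color scale on zero. The sketch below does this with matplotlib's TwoSlopeNorm; the three countries and their numbers are hypothetical, and no map is drawn here so that the example stays self-contained.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import TwoSlopeNorm

# Made-up arrivals for three hypothetical countries
arrivals = pd.DataFrame(
    {"2010": [100.0, 80.0, 50.0], "2019": [150.0, 70.0, 50.0]},
    index=["Country A", "Country B", "Country C"],
)

# Change over the decade: positive means growth, negative means decline
change = arrivals["2019"] - arrivals["2010"]

# Symmetric limits around zero so that "no change" maps to the neutral midpoint.
# TwoSlopeNorm needs vmin < vcenter < vmax, so at least one value must be nonzero.
limit = change.abs().max()
norm = TwoSlopeNorm(vmin=-limit, vcenter=0.0, vmax=limit)
cmap = plt.get_cmap("coolwarm")

# RGBA color each country would receive on the map
colors = {name: cmap(norm(value)) for name, value in change.items()}

for name, rgba in colors.items():
    print(name, [round(channel, 2) for channel in rgba])
```

To the best of my knowledge, GeoDataFrame.plot accepts a norm argument alongside cmap, so the same norm object could be passed to the choropleth call; treat that as something to verify against your geopandas version.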
NOTE: Just recently, I have noticed that GPT-4 is having trouble displaying global choropleth maps. I am hoping it is just a glitch in the most recent version and that it will be resolved in the next version.
In Summary…
The Custom GPT Builder in GPT-4 offers a glimpse into the next generation of data visualization: fast, precise, and user-friendly.
Through our detailed analysis of European tourism data, it’s clear that the tool provides significant advantages for coders and data analysts.
By incorporating GPT-4 into your workflow, you can be super-speedy with your analysis, leading to overall higher efficiency.
Thank you for reading.
What is your experience using the GPT-4 Custom Builder tool?
I would love to hear your comments!