How I Use GPT-4 to Easily Create Awesome Data Visualizations in Minutes
Some useful tips on using GPT-4 to optimize your data visual workflow
For the past 2 years now I have been a heavy daily GPT-4 user. I pay for the GPT-4o version and I now find it indispensable.
I know it makes me sound like an addict, but I have found many good reasons and many excellent use cases where this tool is an invaluable resource for improving my data workflow.
Let me show you why I am using this awesome tool and then let me provide some specific examples on how I use GPT-4 to improve my productivity.
WHY Do I Use GPT-4 Every Day?
For me, this is an easy question to answer as there are a number of reasons why I am constantly conversing in a GPT-4 chat window.
I most often prompt this LLM for data visualization code and on-the-fly visuals, because it handles data preparation, charting, and code generation very efficiently.
Automated Data Preparation
GPT-4 automates the tedious process of data preparation and visualization, which traditionally requires extensive coding and debugging.
This saves me a massive amount of time and effort.
A side-effect advantage to this is that I can use this time to do more thorough and iterative data explorations.
As a recent example, I was working with a UN dataset called the Global Peace Index (GPI). The dataset can be downloaded from my Github HERE.
The original dataset looks like this:
What is missing from this data is a categorization of each country by continent, and also by sub-region.
I searched online and found a simple CSV file with this data. Now I can prompt GPT-4 to create a new CSV file with these fields added:
GPT-4 responds by creating a new file (yes, it does this on the fly) for download:
Now if I open this new file, I can see the 2 new fields added:
By adding in these fields, I have enriched the possibilities of data analysis by providing the ability to categorize by continent and also by sub-region.
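If you would rather do this enrichment step yourself, or just want to check what GPT-4 did, here is a minimal pandas sketch of the same idea. The column names (country, year, score, continent, sub_region) are my assumptions rather than the actual headers in the files, and the scores are made-up placeholders, not real GPI values.

```python
import pandas as pd

# Toy stand-ins for the two CSV files. Column names are assumptions and
# the scores are made-up placeholders, not real GPI values.
gpi = pd.DataFrame({
    "country": ["Iceland", "Japan", "Kenya"],
    "year": [2022, 2022, 2022],
    "score": [1.0, 1.5, 2.5],
})
lookup = pd.DataFrame({
    "country": ["Iceland", "Japan"],
    "continent": ["Europe", "Asia"],
    "sub_region": ["Northern Europe", "Eastern Asia"],
})

# A left join keeps every GPI row, even when the lookup has no match.
# validate= raises an error if a country appears twice in the lookup.
enriched = gpi.merge(lookup, on="country", how="left", validate="many_to_one")

# Names that failed to match usually point to spelling differences
# between the two files and are worth fixing by hand.
unmatched = sorted(enriched.loc[enriched["continent"].isna(), "country"].unique())
print("No continent found for:", unmatched)

# enriched.to_csv("gpi_enriched.csv", index=False)  # uncomment to write the new file
```

The list of unmatched names is the part I would always look at, whether the merge was done by hand or by GPT-4, because country names are rarely spelled the same way in two different sources.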
Awesome! Now we can use GPT-4o to do some exploratory data analysis on this comprehensive data set.
Automated Data Visualization
A recent development with GPT-4 is its ability to create on-the-fly data visualizations. If I feed GPT-4 a data set (i.e., a CSV file) and prompt it to create a data visual to answer my question, it can do it.
I give some great examples of this in the next section (the HOW).
As a Learning Tool
For me, GPT-4 is indispensable as a learning tool. It can generate code for most popular Python visualization frameworks.
I can ask GPT-4 to step me through the code needed to create a visual using a specific framework.
For example, recently I have been using it to assist me in Streamlit code generation. Streamlit is an awesome framework that generates interactive data visualizations.
HOW Do I Use GPT-4 for Data Visualization?
To get the best out of GPT-4 for data visualization, you need to have an understanding of GPT-4’s capabilities.
Then, you can provide clear prompts to get the best results.
Let me guide you through ALL the steps I take to get from raw data to visualization.
What follows is a detailed guide to achieving meaningful visualizations with GPT-4, using the Global Peace Index dataset as a basis.
Step 1: Organizing and Preparing Your Dataset
Ensure your dataset is structured and clean. I have a dataset from a UN website showing global security trends from 2008–2022. I can upload this dataset to GPT-4 and ask it to take a look:
Now recently, GPT-4o has been responding to these types of queries with a more comprehensive analysis. For example, it now creates a description of each field in tabular format:
Now I know (AND GPT-4 knows) what I have available for analysis.
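A similar field-by-field overview can be produced locally as a sanity check. This is only a sketch: describe_fields is a helper name I made up, the column names are assumptions, and the sample rows are placeholders rather than real data.

```python
import pandas as pd

def describe_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize each column: dtype, missing values, distinct values, one example."""
    examples = {
        col: (df[col].dropna().iloc[0] if df[col].notna().any() else None)
        for col in df.columns
    }
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "distinct": df.nunique(),
        "example": pd.Series(examples),
    })

# Placeholder rows; in practice this would be pd.read_csv(...) on the real file.
sample = pd.DataFrame({
    "country": ["Iceland", "Iceland", "Japan"],
    "year": [2021, 2022, 2022],
    "score": [1.0, None, 1.5],
})
overview = describe_fields(sample)
print(overview)
```

Comparing a table like this with the description GPT-4 gives is a quick way to confirm that it read the file the way you expected.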
Step 2: Identify Your Visualization Goals
Decide on the types of exploratory analysis that you want to undertake. You should put these in a list. The more specific you can be, the better.
For example, with this dataset, we can analyze safety trends by country and by region. Some example questions I want to ask:
Is the world more or less safe over this period of time?
What are the safest countries? What are the least safe countries?
How safe is Africa? How safe is the Middle East?
How safe is country [x] ? Is it becoming more/less safe?
How safe is country [x] compared to country [y]?
Once we have a clearer idea of what we want to visualize, then we can create more effective prompts to visualize each insight.
Step 3: Crafting Clear, Specific, and Effective Prompts
For GPT-4 to generate accurate visualizations, your prompts need to be detailed. For example, we need to specify such things as specific data fields, specific time ranges, etc.
For our first query, we can prompt GPT-4 specifically for a time-series line graph:
And yes, it really is that easy.
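For anyone curious about what sits behind a chart like this, here is a rough matplotlib sketch of a global time-series line graph. I am assuming a long-format table with country, year, and score columns, and that a lower score means more peaceful, as in the published GPI. The rows are made-up placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up placeholder rows in long format: one row per country per year.
gpi = pd.DataFrame({
    "country": ["A", "B", "A", "B", "A", "B"],
    "year": [2008, 2008, 2015, 2015, 2022, 2022],
    "score": [2.0, 3.0, 2.2, 3.2, 2.1, 3.5],
})

# Average score across all countries for each year.
yearly = gpi.groupby("year")["score"].mean()

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(yearly.index, yearly.values, marker="o")
ax.set_title("Average score by year")
ax.set_xlabel("Year")
ax.set_ylabel("Average score (lower = more peaceful)")
fig.tight_layout()
# fig.savefig("global_trend.png", dpi=150)  # uncomment to save the chart
```

Note that this is a simple unweighted mean over countries. If you want something different, such as a population-weighted average, that is exactly the kind of detail worth stating in the prompt.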
For our second query, we need to be specific to GPT-4 on how many countries we want to display, to ensure it doesn’t create a cluttered chart of all countries.
Again, the response is quick and accurate.
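The "how many countries" part of the prompt maps to a single parameter in code. Below is a hedged sketch with a helper I named rank_countries (a hypothetical name), again assuming country, year, and score columns, lower scores meaning safer, and placeholder values.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

def rank_countries(df, year, n=10, safest=True):
    """Return the n lowest-scoring (safest) or highest-scoring countries for one year."""
    one_year = df[df["year"] == year]
    picked = one_year.nsmallest(n, "score") if safest else one_year.nlargest(n, "score")
    return picked[["country", "score"]].reset_index(drop=True)

# Made-up placeholder rows, not real GPI scores.
gpi = pd.DataFrame({
    "country": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "year": [2021] * 4 + [2022] * 4,
    "score": [1.2, 3.1, 2.0, 2.6, 1.3, 3.0, 2.2, 2.5],
})
top = rank_countries(gpi, year=2022, n=2)

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(top["country"], top["score"])
ax.invert_yaxis()  # put the safest country at the top
ax.set_xlabel("Score (lower = more peaceful)")
ax.set_title("Safest countries, 2022")
fig.tight_layout()
```

Calling rank_countries(gpi, year=2022, n=2, safest=False) gives the other end of the ranking, which answers the "least safe" question from the list in Step 2.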
This is so fun that we should create another chart — how safe is each continent?
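This is where the continent field added earlier pays off. As a sketch of the aggregation behind a per-continent chart, assuming continent, year, and score columns and using placeholder values:

```python
import pandas as pd

# Made-up placeholder rows, not real GPI scores.
gpi = pd.DataFrame({
    "country": ["A", "B", "C", "A", "B", "C"],
    "continent": ["Africa", "Africa", "Europe", "Africa", "Africa", "Europe"],
    "year": [2021, 2021, 2021, 2022, 2022, 2022],
    "score": [2.0, 3.0, 1.5, 2.5, 3.5, 1.0],
})

# One row per year, one column per continent, each cell an average score.
by_continent = gpi.pivot_table(
    index="year", columns="continent", values="score", aggfunc="mean"
)
print(by_continent.round(2))
```

With matplotlib installed, by_continent.plot() would draw one line per continent. Keep in mind that each country counts equally in this average, regardless of its size.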
And then lastly, where GPT-4 really is terrific — we can ask it to create a choropleth (heat) map showing areas that are trending towards being more/less safe for the period of 2010–2020:
You can click on the blue hyperlink to download the file to your computer. The resulting map is terrific:
We can see that countries like Libya, Venezuela, Afghanistan, Yemen, and Ukraine stand out as having become less peaceful during this particular decade of time.
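I do not know exactly what code GPT-4 ran to build this map, but a comparable one can be sketched with Plotly Express. The helpers score_change and build_map are names I made up, the iso3 column (three-letter country codes) is an assumption about the data, and the scores are placeholders.

```python
import pandas as pd

def score_change(df, start, end):
    """Per-country change in score between two years (positive = less peaceful)."""
    wide = df[df["year"].isin([start, end])].pivot_table(
        index=["iso3", "country"], columns="year", values="score"
    )
    wide = wide.dropna(subset=[start, end])  # keep countries with both years
    change = (wide[end] - wide[start]).rename("change")
    return change.reset_index()

def build_map(change):
    """Build a world choropleth of the change column with Plotly Express."""
    import plotly.express as px  # imported here so score_change works without plotly
    return px.choropleth(
        change,
        locations="iso3",                   # ISO 3166-1 alpha-3 country codes
        color="change",
        hover_name="country",
        color_continuous_scale="RdYlGn_r",  # red = score rose (less peaceful)
        color_continuous_midpoint=0,        # center the color scale on "no change"
    )

# Made-up placeholder rows, not real GPI scores.
gpi = pd.DataFrame({
    "iso3": ["ISL", "ISL", "KEN", "KEN", "JPN"],
    "country": ["Iceland", "Iceland", "Kenya", "Kenya", "Japan"],
    "year": [2010, 2020, 2010, 2020, 2020],
    "score": [1.0, 1.25, 2.5, 2.0, 1.5],
})
change = score_change(gpi, 2010, 2020)
print(change)
```

Calling build_map(change).write_html("gpi_change_map.html") would write a standalone interactive HTML file, similar in spirit to the downloadable file GPT-4 provides. Countries missing either year are dropped instead of being shown with a misleading change value.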
Yes, it is amazing what GPT-4 can do with a well-crafted query.
NOTE: There was a time (about a year ago now) when GPT-4 was able to display choropleth maps in the chat window. Currently it cannot do this but I am hoping this functionality returns in a future update.
In Summary…
Compared to manual exploratory research methods, GPT-4’s big advantage is in its ability to integrate data processing, analysis, and visualization into a seamless and efficient set of 1-step processes.
Not only does this save oodles of time, but it also opens up data analysis to those without extensive technical backgrounds. You want to create a data visualization from a dataset? Just ask this LLM to show it to you. The key to creating these professional data visuals is in asking the right questions.
Think of GPT-4 as your skilled intern. It can do all the grunt work for you, and produce the visuals you need to tell a great story.
The downside of GPT-4 as an intern is that it cannot perform any sort of heavy-duty analysis. Not yet, anyway.
It can’t tell the whole story FOR you — your job is still safe in this realm.
At least for now.
NOTE: All of the data files for this article are on my Github: HERE.