Using “Data Analysis” in ChatGPT 4
Guide written by Dan Alexander, Jing Liu & Ken Reid
This short guide covers the basics of using the “Data Analysis” feature in ChatGPT 4o to organize your data, perform statistical analyses, and visualize results. Topics include data cleaning, exploratory data analysis, statistical testing, and data visualization.
What is ChatGPT 4’s Data Analysis feature?
The Data Analysis feature in ChatGPT 4o allows researchers to analyze data directly within the chat interface through simple conversational prompts, without requiring programming knowledge.
Key benefits for researchers:
- Quick exploratory data analysis
- Automated statistical testing suggestions
- Easy data visualization
- Assistance with data cleaning and preprocessing
Access:
You do not need the ChatGPT Plus plan in order to use this feature.
Data visualization and exploration
“Data Analysis” can generate summary statistics, visualize data, and even recommend suitable visualizations.
- Upload your dataset using the file attachment function in “Data Analysis”.
- Tell “Data Analysis” what you would like it to do. For example:
  - “Please clean up the data, remove duplicates, highlight missing values, and so on. And please tell me what you have done after you finish.”
  - “Please provide summary statistics of the dataset, including mean, median, range, and others.”
  - “Please plot the data as a histogram.”
- Tell “Data Analysis” to recommend statistical tests or visualizations for you. For example:
  - “Please plot the data in two ways and explain to me why you choose these two ways.”
  - “What statistical analyses do you recommend if I want to understand the effect of online bullying on students’ academic performance?”
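If you want to check the results that “Data Analysis” reports for prompts like those above, the same steps can be reproduced by hand. The following is a minimal sketch using only the Python standard library. The sample rows, the column names `student` and `score`, and the helper functions `remove_duplicates`, `summarize`, and `histogram` are all hypothetical and chosen purely for illustration; the code that ChatGPT generates for your own dataset will differ.

```python
import statistics
from collections import Counter

# Hypothetical sample: exam scores with one duplicate row and one missing value.
rows = [
    {"student": "A", "score": 72.0},
    {"student": "B", "score": 85.0},
    {"student": "B", "score": 85.0},  # exact duplicate of the previous row
    {"student": "C", "score": None},  # missing value
    {"student": "D", "score": 91.0},
    {"student": "E", "score": 64.0},
]


def remove_duplicates(records):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def summarize(values):
    """Return the count, mean, median, and range of a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }


def histogram(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    return Counter(int(v // bin_width) * bin_width for v in values)


clean = remove_duplicates(rows)
missing = [r["student"] for r in clean if r["score"] is None]
scores = [r["score"] for r in clean if r["score"] is not None]
summary = summarize(scores)
bins = histogram(scores, 10)

print("Rows after removing duplicates:", len(clean))
print("Students with a missing score:", missing)
print("Summary statistics:", summary)
for lower_edge in sorted(bins):
    print(f"{lower_edge}-{lower_edge + 9}: {'#' * bins[lower_edge]}")
```

Comparing a hand-computed summary like this against the numbers ChatGPT reports is a quick way to confirm that it handled duplicates and missing values the way you intended.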
1. Upload a file.
2. Conduct simple analyses.
3. Conduct and plot a specific analysis.
4. Request suggestions for analysis.
5. Import a spreadsheet for viewing and editing.
This can be done by connecting to a Google Drive sheet, uploading an XLSX or CSV file, or copying and pasting data directly into the prompt window.
You can then view the sheet and ask ChatGPT to modify its data or to run other calculations on the data it contains.
6. Direct ChatGPT to create specific plots.
The plots are interactive and show additional details on mouseover. You can also request static plots that can be downloaded as images, and you can customize the plots, for example by color.
Data Cleaning and Preprocessing
ChatGPT 4’s Data Analysis feature can assist with various data cleaning and preprocessing tasks. Typical tasks include:
1. Handling missing values
2. Removing rows with missing data
3. Detecting and addressing outliers
4. Data consistency
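To make these four tasks concrete, here is a rough sketch of what each one might look like in code, again using only the Python standard library. Every name here (`impute_median`, `drop_missing`, `iqr_outliers`, `normalize_label`) and all sample values are hypothetical. Median imputation and the 1.5 × IQR rule are common default choices, not necessarily the ones ChatGPT will pick, so it is worth asking it which method it used.

```python
import statistics


def impute_median(values):
    """Handling missing values: replace None with the median of observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def drop_missing(rows):
    """Removing rows with missing data: drop any row containing a None value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def iqr_outliers(values, k=1.5):
    """Detecting outliers: return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def normalize_label(text):
    """Data consistency: trim whitespace and lowercase so variants match."""
    return text.strip().lower()


# Hypothetical survey data: one missing age and one implausible age.
ages = [23, None, 25, 24, 26, 25, 120]
records = [{"id": 1, "age": 23}, {"id": 2, "age": None}]
labels = [" Female", "female ", "FEMALE"]

imputed_ages = impute_median(ages)
complete_records = drop_missing(records)
outliers = iqr_outliers(imputed_ages)
unique_labels = {normalize_label(label) for label in labels}

print("Ages after imputation:", imputed_ages)
print("Complete records:", complete_records)
print("Flagged outliers:", outliers)
print("Distinct labels after normalization:", unique_labels)
```

Whether to impute or drop missing rows, and whether a flagged outlier is an error or a genuine observation, remains a research decision; the code only makes the candidates visible.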
Advanced Statistical Analysis
ChatGPT 4 can perform or recommend a wide range of statistical analyses. Some examples include:
1. Deciding on statistical techniques
2. Interpreting results
3. Multiple regression analysis
4. Time series forecasting
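As an illustration of the last two items, the sketch below fits a multiple regression by ordinary least squares and produces a one-step forecast with simple exponential smoothing. It is written in plain Python so that it runs without extra packages. The function names, the two predictors, and the data are hypothetical, and simple exponential smoothing stands in for whichever forecasting method ChatGPT would actually select for your data.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_val = aug[col][col]
        aug[col] = [v / pivot_val for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def fit_multiple_regression(features, target):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, coef_2, ...].
    """
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical data constructed so that y = 1 + 2*x1 + 3*x2 exactly.
features = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
target = [9, 8, 19, 18, 26]
coefficients = fit_multiple_regression(features, target)
print("Intercept and coefficients:", [round(c, 6) for c in coefficients])

# Hypothetical monthly series; alpha controls how quickly old values are discounted.
forecast = exponential_smoothing_forecast([10, 12, 14], alpha=0.5)
print("Next-period forecast:", forecast)
```

Real analyses also need standard errors, p-values, and diagnostic checks, which this sketch omits; asking ChatGPT to report and explain them is part of interpreting the results.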
Additional Resources
For guidance on using GenAI for programming in research, see the tutorial “Code Smarter, Not Harder: Harnessing Generative AI for Research Programming Efficiency”.
For further resources, including selected papers that use GenAI, opinion pieces on using generative AI in research, and more, see our Generative AI resource page.