Using “Data Analysis” in ChatGPT 4

Guide written by Dan Alexander, Jing Liu & Ken Reid

This short guide contains the basics of using the “Data Analysis” feature in ChatGPT 4o to organize your data, perform statistical analyses, and visualize results. It covers data cleaning, exploratory data analysis, statistical testing, and data visualization.

What is ChatGPT 4’s Data Analysis feature?

The Data Analysis feature in ChatGPT 4o allows researchers to analyze data directly within the chat interface through simple conversational prompts, without requiring programming knowledge.

Key benefits for researchers:

  • Quick exploratory data analysis
  • Automated statistical testing suggestions
  • Easy data visualization
  • Assistance with data cleaning and preprocessing

Access:

You do not need the ChatGPT Plus plan in order to use this feature.

Data visualization and exploration

“Data Analysis” can generate summary statistics, visualize data, and even recommend suitable visualizations.

  1. Upload your dataset using the file attachment function in “Data Analysis”.
  2. Tell “Data Analysis” what you would like it to do. For example:
    • “Please clean up the data, remove duplicates, highlight missing values, and so on. And please tell me what you have done after you finish.”
    • “Please provide summary statistics of the dataset, including mean, median, range, and others.”
    • “Please plot the data as a histogram.”
  3. Tell “Data Analysis” to recommend statistical tests or visualizations for you. For example:
    • “Please plot the data in two ways and explain to me why you choose these two ways.”
    • “What statistical analyses do you recommend if I want to understand the effect of online bullying on students’ academic performance?”
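For readers who want to check the results of such prompts by hand, the following is a minimal sketch of the cleaning and summary steps requested above, written with only the Python standard library. It is not the code that “Data Analysis” itself runs. The sample data, the column names (`student_id`, `hours_online`, `exam_score`), and all function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for an uploaded CSV file.
# It contains one exact duplicate row and one empty cell.
RAW_CSV = """student_id,hours_online,exam_score
1,2.5,78
2,4.0,65
2,4.0,65
3,,82
4,1.0,91
5,6.5,54
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def remove_duplicates(rows):
    """Drop exact duplicate rows while keeping the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def missing_cells(rows):
    """Return (row_index, column_name) pairs for every empty cell."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column, value in row.items()
        if value.strip() == ""
    ]


def summarize(rows, column):
    """Compute basic summary statistics, skipping empty cells."""
    values = [float(row[column]) for row in rows if row[column].strip() != ""]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


rows = remove_duplicates(load_rows(RAW_CSV))
missing = missing_cells(rows)
summary = summarize(rows, "exam_score")

print("Rows after removing duplicates:", len(rows))
print("Missing cells:", missing)
print("Summary of exam_score:", summary)
```

Comparing numbers like these with what the chat reports is a quick way to confirm that the tool interpreted the prompt as intended.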

1. Simple file uploading

2. Conduct simple analyses

3. Conduct and plot specific analysis

4. Request suggestions for analysis

5. Import a spreadsheet for viewing and editing

This can be done by connecting to a Google Drive sheet, uploading an .xlsx or CSV file, or copying and pasting data directly into the prompt window.

You can then view the sheet and ask ChatGPT to modify its data or to run other calculations on it.

6. Direct ChatGPT to create specific plots

The plots are interactive and provide additional details on mouseover. You can also request static plots that can be downloaded as images, and you can customize the plots, e.g. by color.

Data Cleaning and Preprocessing

ChatGPT 4’s Data Analysis feature can assist with various data cleaning and preprocessing tasks. Here are some example prompts:

1. Handling missing values

2. Removing rows with missing data

3. Detecting and Addressing Outliers

4. Data consistency
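The four tasks above can be sketched in a few lines of Python, which may help when deciding what to ask for or when checking what the tool reports. This is an illustrative sketch under stated assumptions, not the tool's own method: the data are made up, the function names are hypothetical, median imputation is only one of several ways to handle missing values, and the 1.5 × IQR rule is one common convention for flagging outliers.

```python
import statistics

# Hypothetical measurements; None marks a missing value.
scores = [72, 75, None, 71, 74, 140, 73, None, 76]

# Hypothetical category labels with inconsistent case and spacing.
labels = [" Female", "female", "FEMALE ", "Male"]


def fill_missing_with_median(values):
    """Handle missing values by replacing each None with the median."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def drop_missing(values):
    """Remove entries with missing data instead of imputing them."""
    return [v for v in values if v is not None]


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def normalize_labels(values):
    """Make labels consistent by trimming spaces and lowercasing."""
    return [v.strip().lower() for v in values]


filled = fill_missing_with_median(scores)
complete = drop_missing(scores)
outliers = iqr_outliers(complete)
clean_labels = normalize_labels(labels)

print("Filled:", filled)
print("Complete cases:", complete)
print("Outliers:", outliers)
print("Labels:", clean_labels)
```

Whether to impute, drop, or keep a flagged value is a research decision, so it is worth asking the tool to explain which approach it took and why.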

Advanced Statistical Analysis

ChatGPT 4 can perform or recommend a wide range of statistical analyses. Some examples include:

1. Deciding on statistical techniques

2. Interpreting Results

3. Multiple Regression Analysis

4. Time Series Forecasting
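To make the last two items more concrete, here is a deliberately simplified sketch using only the Python standard library. It fits an ordinary least squares line with a single predictor (a simpler stand-in for multiple regression, which uses several predictors) and produces a naive moving-average forecast (a simpler stand-in for dedicated forecasting models). All data and names are hypothetical, and the tool may well choose different methods.

```python
import statistics

# Hypothetical paired observations: one predictor, one outcome.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Hypothetical evenly spaced time series.
series = [10, 12, 13, 15, 14, 16]


def fit_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = statistics.mean(ys)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(xs, ys))
    ss_tot = sum((b - mean_y) ** 2 for b in ys)
    return 1 - ss_res / ss_tot


def moving_average_forecast(values, window=3):
    """Forecast the next point as the mean of the last `window` points."""
    return statistics.mean(values[-window:])


slope, intercept = fit_line(x, y)
r2 = r_squared(x, y, slope, intercept)
forecast = moving_average_forecast(series)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")
print(f"next-step forecast={forecast:.2f}")
```

When the tool reports a coefficient, an R² value, or a forecast, asking it to show the underlying calculation makes results like these easier to interpret and verify.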

Additional Resources

For a tutorial on using GenAI for programming in research, see the tutorial “Code Smarter, Not Harder: Harnessing Generative AI for Research Programming Efficiency”.

For a list of resources, including selected papers using GenAI, opinion pieces on using Generative AI in research, and more, see our Generative AI resource page.