ChatGPT_CodeInterpreter_Sample_Prompts_DataAnylsis

Interactive Data Analysis: A collection of intuitive prompts for data exploration using ChatGPT Code Interpreter.


Project maintained by Sven-Bo. Hosted on GitHub Pages. Theme by mattgraham.

Sample Prompts for ChatGPT Code Interpreter

This document provides a set of generic prompts for data analysis tasks with the ChatGPT Code Interpreter tool. The prompts cover the main stages of an analysis, from a basic examination of a dataset to advanced analytics and recommendations.

Basic Analysis

  1. Give me a brief overview of the dataset.
  2. What is the shape of the dataset?
  3. Show me the first few records.
  4. Are there any missing values?
  5. What types of data does the dataset contain?
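
For orientation, the prompts above roughly correspond to a handful of pandas calls. The following is a minimal sketch, not a record of what the tool actually runs; the toy DataFrame and its column names (`region`, `units`, `price`) are made up for illustration and stand in for an uploaded file.

```python
import pandas as pd

# Hypothetical toy data standing in for an uploaded file; in practice
# the data would usually come from something like pd.read_csv(...).
df = pd.DataFrame({
    "region": ["North", "South", "East", None],
    "units": [10, 15, None, 8],
    "price": [2.5, 3.0, 2.75, 4.0],
})

shape = df.shape            # (number of rows, number of columns)
first_rows = df.head(3)     # the first few records
missing = df.isna().sum()   # count of missing values per column
dtypes = df.dtypes          # data type of each column

print(shape)
print(first_rows)
print(missing)
print(dtypes)
```

Comparing the printed output with the tool's answer is a quick way to check that the dataset was read as expected.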

Descriptive Statistics

  1. Provide a summary of statistics for all numerical columns.
  2. Which columns have the most variation?
  3. Are there any correlations in the data?
  4. Which columns have the highest/lowest values?
  5. What are the unique values in each non-numerical column?
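
As a rough illustration of what these prompts ask for, the sketch below computes the corresponding statistics with pandas. The data and column names are hypothetical, and using the standard deviation as the measure of "variation" is an assumption; the tool may choose a different measure.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "category": ["a", "b", "a", "c"],
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [5.0, 5.0, 5.0, 6.0],
})

numeric = df.select_dtypes(include="number")

summary = numeric.describe()                         # count, mean, std, quartiles, ...
spread = numeric.std().sort_values(ascending=False)  # most variable column first
corr = numeric.corr()                                # pairwise Pearson correlations
extremes = numeric.agg(["min", "max"])               # lowest and highest value per column

# Unique values of every non-numerical column.
uniques = {
    col: df[col].unique().tolist()
    for col in df.select_dtypes(exclude="number")
}

print(summary)
print(spread)
print(corr)
print(extremes)
print(uniques)
```

The standard deviation depends on each column's scale, so when columns use different units, dividing it by the mean (the coefficient of variation) may give a fairer comparison.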

Data Cleaning

  1. Handle any missing or null values.
  2. Normalize the numerical data.
  3. Remove any duplicate records.
  4. Reformat inconsistent data entries.
  5. Are there any columns that can be dropped due to lack of data?
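
The cleaning steps above could look something like the following sketch. Everything specific here is an assumption made for illustration: the toy data, the 50% threshold for dropping sparse columns, median imputation, and min-max scaling as the form of normalization. A real dataset may call for different choices.

```python
import pandas as pd

# Hypothetical toy data with typical problems: inconsistent spelling,
# a duplicate record, a missing value, and a mostly empty column.
df = pd.DataFrame({
    "city": [" Berlin", "berlin", "Paris", "Paris", "Rome"],
    "sales": [100.0, 100.0, None, 300.0, 200.0],
    "notes": [None, None, None, None, "check"],
})

# Reformat inconsistent entries: trim whitespace, unify capitalization.
df["city"] = df["city"].str.strip().str.title()

# Drop columns that are mostly empty (more than 50% missing; arbitrary threshold).
missing_share = df.isna().mean()
df = df.drop(columns=missing_share[missing_share > 0.5].index)

# Remove exact duplicate records.
df = df.drop_duplicates()

# Fill remaining missing numeric values with the column median.
df["sales"] = df["sales"].fillna(df["sales"].median())

# Min-max normalization: rescale the column to the range [0, 1].
value_range = df["sales"].max() - df["sales"].min()
df["sales_scaled"] = (df["sales"] - df["sales"].min()) / value_range

print(df)
```

The order of the steps matters: normalizing the text first lets `drop_duplicates` catch records that differ only in spelling.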

Exploratory Data Analysis

  1. Show me the distribution of data for each column.
  2. Which columns are correlated with each other?
  3. Visualize the data in a suitable plot.
  4. Can you cluster similar records?
  5. How does variable X affect variable Y over time?
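
A few of these exploratory prompts can be approximated without any plotting library, as in the sketch below. The synthetic data, the column names `x`, `y`, `w`, the 0.8 correlation threshold, and the 14-day window are all assumptions chosen for illustration. Clustering and actual charts are left out because they need extra libraries.

```python
import numpy as np
import pandas as pd

# Synthetic, made-up data: y depends strongly on x, w is unrelated noise.
rng = np.random.default_rng(0)
x = rng.normal(size=60)
df = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=60, freq="D"),
    "x": x,
    "y": 2 * x + rng.normal(scale=0.1, size=60),
    "w": rng.normal(size=60),
})

# Distribution of one column as histogram counts (what a bar plot would draw).
counts, bin_edges = np.histogram(df["x"], bins=5)

# Pairs of columns whose absolute correlation exceeds an arbitrary threshold.
corr = df[["x", "y", "w"]].corr()
strong_pairs = [
    (a, b)
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > 0.8
]

# Relationship between x and y over time: correlation in a 14-day rolling window.
rolling_corr = df["x"].rolling(window=14).corr(df["y"])

print(counts)
print(strong_pairs)
print(rolling_corr.dropna().describe())
```

A rolling correlation only shows how closely two variables move together in each window. It does not establish that X causes changes in Y.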

Advanced Analysis

  1. Predict the outcome based on the available data.
  2. Identify any patterns or trends within the data.
  3. Group similar data points together.
  4. Detect any anomalies or outliers in the data.
  5. Determine the key influencing factors for a specific outcome.
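
Two of the prompts above, outlier detection and trend identification, can be sketched with pandas and NumPy alone. The series `daily_orders` is invented for illustration, and the 1.5 × IQR rule and the straight-line fit are just one common choice each, not necessarily what the tool would pick. Prediction, grouping, and factor analysis usually need a modeling library and are not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one obvious anomaly at position 5.
values = pd.Series([10, 12, 11, 13, 12, 95, 11, 14, 13, 12], name="daily_orders")

# Outlier detection with the 1.5 * IQR rule (a common rule of thumb).
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]

# Trend: least-squares straight line through the remaining points,
# using the position in the series as the time axis.
clean = values.drop(outliers.index)
slope, intercept = np.polyfit(clean.index.to_numpy(), clean.to_numpy(), 1)

print(outliers)
print(f"slope per step: {slope:.3f}")
```

Removing the outlier before fitting keeps a single extreme value from dominating the estimated trend. Whether that is appropriate depends on whether the anomaly is an error or a real event.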

Recommendations and Insights

  1. What insights can be drawn from the dataset?
  2. What recommendations would you suggest based on the analysis?
  3. Are there any potential issues with the data quality?
  4. What additional data might improve the analysis?
  5. How can the data be used to make informed decisions?