Generative AI Research Resources

NEW: Institutional Efforts to Help Academic Researchers Implement Generative AI in Research by Jing Liu and H. V. Jagadish

Welcome to the MIDAS generative AI resource hub. This page is curated to serve researchers looking to integrate generative AI into their work.

Generative AI models, capable of creating novel, diverse, and coherent content, are revolutionizing numerous domains. They have demonstrated their capabilities in fields as diverse as art, music, chemistry, and drug discovery. This hub is in its initial stage and will be updated frequently. Given the rapid advance of generative AI, it is not possible for us to build a comprehensive collection. In addition to generative AI overviews and a list of the most common models, we will focus on featuring examples of how generative AI is used in research; specialized, “research-use” generative AI models; and models and studies developed by U-M researchers. Please get in touch if you’d like to have your model or research study using generative AI included in our collection.

Last Updated: 9/12/2024

Generative AI Overview

Generative AI in Plain English

A Brief Introduction to GenAI
K. Reid & J. Liu, Michigan Institute for Data & AI in Society.

What are Generative AI models?
IBM Technology

What are Large Language Models (LLMs)?
Google for Developers

What Is ChatGPT Doing … and Why Does It Work?
Stephen Wolfram (2023). Stephen Wolfram Writings.

A Very Gentle Introduction to Large Language Models Without the Hype
Mark Riedl (2023). Medium.

A Comprehensive Survey of Large Language Models
Cobus Greyling (2023). Medium.

Generative AI in Technical Terms

On the opportunities and risks of foundation models.
Rishi Bommasani, et al., arXiv preprint arXiv:2108.07258 (2021).

ChatGPT is not all you need. A State of the Art Review of large Generative AI models
Roberto Gozalo-Brizuela and Eduardo C. Garrido-Merchán. arXiv preprint (2023)

Natural Language Processing with Transformers, Revised Edition.
Lewis Tunstall, Leandro von Werra, Thomas Wolf (2022).

Attention is all you need.
Ashish Vaswani, et al., Advances in neural information processing systems 30 (2017).

Fine-tuning language models from human preferences.
Daniel M. Ziegler, et al., arXiv preprint (2019).

Examples of Research Use of Generative AI

Overview

60 ChatGPT Prompts for Data Science (Tried, tested, and rated).
Travis Tang. Medium (2023)

Scientific Discovery in the Age of Artificial Intelligence.
Hanchen Wang, et al. Nature (2023)

“So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy.
Yogesh K. Dwivedi, et al. International Journal of Information Management (2023)

Scientists’ Perspectives on the Potential for Generative AI in their Fields.
Meredith Morris. arXiv (2023)

Generative AI: Perspectives from Stanford HAI.
Russ Altman, HAI, Stanford University (2023)

The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4.
Microsoft Research AI4Science (2023).

AI in academia: An overview of selected tools and their areas of application.
Robert F. J. Pinzolits, Fachhochschule Burgenland GmbH (2024).

Research Using Generative AI

Emergent autonomous scientific research capabilities of large language models.
Daniil Boiko et al. arXiv (2023)

CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments
Y Qu, et al. bioRxiv (2024)

Automatic Generation of Programming Exercises and Code Explanations Using Large Language Models
Sami Sarsa, et al. ICER ’22 (2022)

Automating Human Tutor-Style Programming Feedback: Leveraging GPT-4 Tutor Model for Hint Generation and GPT-3.5 Student Model for Hint Validation
Tung Phung, et al. Artificial Intelligence (2023)

CodeAgent: Collaborative Agents for Software Engineering
Daniel Tang, et al. arXiv (2024)

SWE-agent
John Yang, et al. (Paper coming April 2024)

Opportunities and Challenges for AI-Assisted Qualitative Data Analysis: An Example from Collaborative Problem-Solving Discourse Data
Leo A. Siiman, et al. Innovative Technologies and Learning (2023)

Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data
Yubin Kim et al. arXiv (2024)

“HOT” ChatGPT: The promise of ChatGPT in detecting and discriminating hateful, offensive, and toxic comments on social media
Lingyao Li et al. ACM Transactions on the Web (2024)

Exploring the Use of Artificial Intelligence for Qualitative Data Analysis: The Case of ChatGPT
David L. Morgan. International Journal of Qualitative Methods (2023)

The Future of Natural History Transcription: Navigating AI advancements with VoucherVision and the Specimen Label Transcription Project (SLTP)
William N. Weaver, et al. Biodiversity Information Science and Standards (2023)

Attention-stacked Generative Adversarial Network (AS-GAN)-empowered Sensor Data Augmentation for Online Monitoring of Manufacturing System.
Yuxuan Li, et al. arXiv (2023)

GAN Based Noise Generation to Aid Activity Recognition when Augmenting Measured WiFi Radar Data with Simulations
Shelly Vishwakarma, et al. IEEE International Conference (2021)

Comparing the Quality of ChatGPT- and Physician-Generated Responses to Patients’ Dermatologic Questions in the Electronic Medical Record
Kelly Reynolds, et al. Clinical and Experimental Dermatology (2024)

Efficient evolution of human antibodies from general protein language models.
Brian Hie, et al. Nature Biotechnology (2023)

Large language models generate functional protein sequences across diverse families.
Ali Madani, et al. Nature Biotechnology (2023)

Atomic structure generation from reconstructing structural fingerprints.
Victor Fung, et al., Machine Learning: Science and Technology (2022)

Deep generative molecular design reshapes drug discovery.
Xiangxiang Zeng, et al., Cell Reports Medicine (2022)

Accelerating drug target inhibitor discovery with a deep generative foundation model.
Vijil Chenthamarakshan et al., Science Advances (2023)

Combining generative artificial intelligence and on-chip synthesis for de novo drug design.
Francesca Grisoni, et al. Science Advances (2021)

Designing Chemical Reaction Arrays using phactor and ChatGPT.
Babak Mahjour et al. ChemRxiv (2023)

polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics.
Christopher Kuenneth and Rampi Ramprasad, Nature Communications (2023)

Galactic ChitChat: Using Large Language Models to Converse with Astronomy Literature.
Ioana Ciucă et al. arXiv (2023)

Structured information extraction from complex scientific text with fine-tuned large language models.
Alexander Dunn, et al. arXiv (2022)

Leveraging LLMs for KPIs Retrieval from Hybrid Long-Document: A Comprehensive Framework and Dataset
Chongjian Yue et al. arXiv (2023)

WaterGAN: Unsupervised Generative Network to Enable Real-Time Color Correction of Monocular Underwater Images.
Jie Li et al. IEEE Robotics and Automation Letters (2017)

Medical Image Reconstruction Using Generative Adversarial Network for Alzheimer Disease Assessment with Class-Imbalance Problem
Shengye Hu et al. International Conference on Computer and Communications (ICCC) (2020)

Coupled Adversarial Training for Remote Sensing Image Super-Resolution.
Sen Lei et al. IEEE Transactions on Geoscience and Remote Sensing (2020)

Deep learning synthetic angiograms for individuals unable to undergo contrast-guided laser treatment in aggressive retinopathy of prematurity
Joshua Ong, et al. Correspondence (2023)

Use of AI Language Engine ChatGPT 4.0 to Write a Scientific Review Article Examining the Intersection of Alzheimer’s Disease and Bone
Tyler J. Margetts, et al. Current Osteoporosis Reports (2024)

Techniques for supercharging academic writing with generative AI
Zhicheng Lin. Nature Biomedical Engineering (2024)

Signal Copilot: Leveraging the power of LLMs in drafting reports for biomedical signals
Chunyu Liu, et al. medRxiv (2023)

Translating radiology reports into plain language using ChatGPT and GPT-4 with prompt learning: Promising results, limitations, and potential.
Qing Lyu, et al. arXiv preprint (2023)

New frontiers in health literacy: using ChatGPT to simplify health information for people in the community
Julie Ayre, et al. Journal of General Internal Medicine (2023)

Generative AI as a tool for environmental health research translation
Lauren B. Anderson, et al. GeoHealth (2023)

A Selection of Other Online Resources

AI Tools – A curated list by the University of Michigan of popular AI tools for a variety of applications, including: content, images, video, programming, productivity, and research.

AwesomeLLM – A curated list of LLMs from a technical and AI-focused perspective.

Generative AI Essentials Course – GenAI course by the University of Michigan Center for Academic Innovation.

HuggingFace LLM Leaderboard – Generative AI, and especially LLMs, are evolving rapidly. A curated list of top LLMs on a set of benchmarks can be found here.

Papers with Code – A collection of Generative AI (and non-Generative AI) papers with examples of code, datasets and more. A useful resource for seeing Generative AI applications in research.

Prompt Engineering for ChatGPT – How to apply prompt engineering effectively with ChatGPT.

Stanford HELM – “Holistic Evaluation of Language Models (HELM) is a living benchmark that aims to improve the transparency of language models.”

thepi.pe – “The thepi.pe platform provides a user-friendly interface for scraping and extracting data from various sources” – this page compares LLM evaluations.