Using Generative AI for Scientific Research
A Quick User’s Guide
(Updated on 08/16/2023)
This guide explains how Generative AI can be used in multiple aspects of your research, based on guidelines published by journals, funding agencies and professional societies, as well as our own assessment of Generative AI’s benefits and risks. Generative AI is a rapidly evolving technology, and as a society we are all learning to cope with it. We will update this guide as new information becomes available.
If you have thoughts about what to add to this guide or how to improve it, please email midas-research@umich.edu. We look forward to collaborating with our research community to develop this guide.
For information on the instructional use of Generative AI, please see U-M guidelines: Instruction in an AI-Augmented World.
The default should be NO for creative contributions, but can be YES for editorial assistance. In all cases, please make sure you know what is acceptable at your target publication venue.
Generating text and images for publication in scientific journals raises issues of authorship, copyright and plagiarism, many of which are still unresolved. This is therefore a very controversial area, and many journals and research conferences are updating their policies. If you want to do this, please read the author guidelines of your target journal very carefully.
Here are a few examples of new authorship guidelines.
While submitting final output generated by Generative AI is problematic, as described above, Generative AI can play a useful role earlier in the writing process. For example, non-native English speakers can improve the language of their writing with Generative AI. As long as the human author takes full responsibility for the resulting content, such “editing help” from Generative AI is likely to become acceptable in most disciplines where the specific expressed language is not part of the scholarly contribution. However, use of such techniques may be limited in the short run on account of conservative editorial policies at some publication venues.
You used Generative AI in the course of writing a research paper. How do you give it credit? And how do you inform the reader of your paper about its use?
Generative AI should not be listed as a co-author, but its use must be noted in the paper, including appropriate detail, e.g. about specific prompts and responses. The Committee on Publication Ethics has a succinct and incisive analysis.
The use of Generative AI should be disclosed in the paper, along with a description of where and how it was used. Typically, such disclosures belong in a “Methods” section, if the paper has one. If you rely on Generative AI output, you should cite it, just as you would cite a web-page lookup or a personal communication. Keep in mind that some conversation identifiers may be local to your account, and hence not useful to your reader. Good citation style recommendations have been published by the American Psychological Association (APA) and the Chicago Manual of Style.
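For illustration, a citation following APA’s suggested format looks roughly like this (the version date here is hypothetical):

OpenAI. (2023). ChatGPT (Aug 3 version) [Large language model]. https://chat.openai.com/chat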
This should be undertaken only with an understanding of the risks involved. The bottom line is that the investigator is signing off on the proposal and is promising to do the work if funded, and so has to take responsibility for every part of the proposal content, even if Generative AI assisted in some parts.
The reasoning is similar to that for writing papers, as discussed above, except that there usually will not be copyright and plagiarism issues. Also, not many funding agencies have well-developed policies as yet in this regard.
For example, although the National Institutes of Health (NIH) does not specifically prohibit the use of Generative AI to write grants (they do prohibit use of Generative AI technology in the peer review process), they state that an author assumes the risk of using an AI tool to help write an application, noting “[…] when we receive a grant application, it is our understanding that it is the original idea proposed by the institution and their affiliated research team.” If AI generated text includes plagiarism, fabricated citations or falsified information, the NIH “will take appropriate actions to address the non-compliance.” (Source.)
No, you should not do this. The National Institutes of Health recently announced that it prohibits the use of Generative AI to analyze and formulate critiques of grant proposals. This applies not only to Generative AI systems that are publicly available, but also to systems hosted locally (such as a university’s own Generative AI), as long as data may be shared with multiple individuals. The main rationale is that this would constitute a breach of confidentiality, which is essential in the grant review process. To use Generative AI tools to evaluate and summarize grant proposals, or even to edit critiques, one would need to feed the AI system “substantial, privileged, and detailed information.” When we don’t know how an AI system will save, share or use the information it is fed, we should not feed it such information.
Furthermore, expert review relies upon subject matter expertise, which a Generative AI system could not be relied upon to have. So, it is unlikely that Generative AI will produce a reliable and high-quality review.
For these reasons, we don’t recommend using Generative AI for reviewing grant proposals or papers, even when the relevant publication venue or funding agency, unlike NIH, has not issued explicit guidance.
Generative AI can offer multiple advantages here. It can summarize a particular paper, saving you time and enabling you to cover a much larger number of publications in the limited time you have. It can also summarize the literature around certain research questions by searching through many papers.
However, you should consider a number of factors that may impact how much you can trust such reviews.
Also, please do keep in mind all the limitations discussed above regarding the use of Generative AI to assist in writing research papers. Subject to those limitations, this seems to be a reasonable thing to do.
Generative AI can, in some situations, help you draft a letter, edit your draft, or adopt a certain tone. We are not aware of any explicit rules against this. However, please keep in mind the caveats that apply throughout this guide, particularly regarding confidentiality and accuracy.
Yes, provided you can read code! Generative AI can indeed output computer programs. But, just as with text, you may get code that looks plausible but is erroneous. Since it is often easier to read code than to write it, you may still come out ahead by having Generative AI write code that you then review carefully.
This applies not just to computer programs, but also to databases: you can have Generative AI write SQL for you to manage and query databases. In many cases, you can even do some minimal debugging just by running the code or query on known instances and checking that you get the right answers, as in the sketch below. While basic tests like these catch many errors, remember that there is no guarantee your program will work on complex examples just because it worked on simple ones.
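As a minimal sketch of such a sanity check (the query and the tiny table below are hypothetical, not from any particular AI system), you might run an AI-generated SQL query against an in-memory database whose correct answer you know by hand:

import sqlite3

# Hypothetical query produced by a Generative AI that we want to sanity-check.
AI_GENERATED_QUERY = """
SELECT department, COUNT(*) AS n_authors
FROM authors
GROUP BY department
ORDER BY department;
"""

# A tiny in-memory database whose correct answer we know by hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [("Ada", "CS"), ("Grace", "CS"), ("Marie", "Chemistry")],
)

result = conn.execute(AI_GENERATED_QUERY).fetchall()
expected = [("CS", 2), ("Chemistry", 1)]  # "CS" sorts before "Chemistry" ("S" < "h" in ASCII)
print("query passed" if result == expected else f"unexpected result: {result}")

A test like this is cheap and catches many obvious mistakes, but, as noted above, passing on a toy instance does not prove correctness in general.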
Generative AI can be beneficial for summarizing or translating your work, especially with its ability to adjust the tone of a text, making it easier to create brief but complete summaries that suit different types of readers. Several advanced Generative AI models are designed specifically to transform scientific manuscripts into presentations.
However, when using Generative AI to summarize, present, or translate your work, be sure not to input confidential information. Always verify that the summaries, presentations and translations it creates accurately represent your work. Translation can be particularly tricky if you are not proficient in both languages involved; in that case, consult a fluent speaker for verification. Also note that not all Generative AI models are explicitly designed for translation tasks, so you should explore and identify the model that best suits your specific translation needs.
The most important factor is which Generative AI system (what data, what model, what computing requirements) fits well with your research questions. In addition, there are some general considerations.
Open source. “Open source” describes software that is published alongside its source code for anyone to use and explore. This matters because most Generative AI models, unlike typical machine learning models, are not developed locally by the researchers themselves. For researchers who would like to fine-tune Generative AI models, scrutinize the security and functionality of the system, and improve the explainability and interpretability of the models, open-source Generative AI systems, as well as systems trained on publicly accessible data, can be advantageous.
Accuracy and precision. When the outputs of a Generative AI can be verified (for example, when it is used in data analytics), you can gauge its efficacy by its precision and accuracy, as in the sketch below.
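As a small illustrative sketch (the labels below are made up), once you have manually verified a sample of outputs you can compute both metrics directly:

# Hypothetical model outputs and your own manual verification of each one.
predictions = ["relevant", "relevant", "irrelevant", "relevant"]
truth       = ["relevant", "irrelevant", "irrelevant", "relevant"]

# Accuracy: fraction of all outputs that were correct.
accuracy = sum(p == t for p, t in zip(predictions, truth)) / len(truth)

# Precision: fraction of "relevant" calls that were actually relevant.
positives = [(p, t) for p, t in zip(predictions, truth) if p == "relevant"]
precision = sum(p == t for p, t in positives) / len(positives)

print(f"accuracy = {accuracy:.2f}, precision = {precision:.2f}")  # 0.75 and 0.67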
Cost. Some models require subscriptions to APIs (application programming interfaces) for research use. Other models can be integrated locally, but come with integration costs and potentially ongoing costs for maintenance and updates. Even for otherwise free models, you may need to cover the cost of an expert to set up and maintain the model.
The nature of Generative AI gives rise to a number of considerations that the entire research community is trying to grapple with. We invite you to think about the following carefully, and be aware that many other issues might arise.
Ethical issues. Data privacy is harder to protect with Generative AI when you don’t know for sure what the system does with your input data. Transparency and accountability about a Generative AI’s operations and decision-making processes are difficult to establish when you operate a closed-source system.
Bias in data. Bias in the training data, and consequently in the AI system’s output, can be a major issue: Generative AI is trained on large datasets that you usually can’t access or assess, and it may inadvertently learn and reproduce biases, stereotypes, and majority views present in those data. Moreover, most Generative AI models are trained on overwhelmingly English texts, Western images and other Western data; non-Western and non-English-speaking cultures, as well as work by minorities and non-English speakers, are seriously underrepresented in the training data. The results created by Generative AI are therefore culturally biased. This should be a major consideration when assessing whether Generative AI is suitable for your research.
AI hallucination. Generative AI can produce outputs that are factually inaccurate or entirely incorrect, uncorroborated, nonsensical or fabricated. These phenomena are dubbed “hallucinations”. Therefore, it is essential for you to verify Generative AI-generated output with reliable and credible sources.
Plagiarism. Generative AI can only generate new content based on, or drawn from, the data it was trained on. There is therefore a real chance that it will produce outputs similar to its training data, even to the point of being regarded as plagiarism if the similarity is too high. As such, you should confirm (e.g., by using plagiarism detection tools) that Generative AI outputs are not plagiarized but instead “learned” from various sources in the manner humans learn without plagiarizing; a very rough check is sketched below.
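As a very rough sketch (no substitute for a proper plagiarism detection service, which compares against large corpora), you can at least measure the overlap between an AI output and a specific source you suspect it mirrors:

from difflib import SequenceMatcher

ai_output = "Generative models learn statistical patterns from their training data."
suspected_source = "Generative models learn statistical patterns from training data."

# Ratio in [0, 1]; values near 1 indicate near-verbatim overlap.
overlap = SequenceMatcher(None, ai_output, suspected_source).ratio()
if overlap > 0.8:  # the threshold here is arbitrary and task-dependent
    print(f"High overlap ({overlap:.2f}): inspect for possible plagiarism")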
Prompt Engineering. The advent of Generative AI has created a new human activity – prompt engineering – because the quality of Generative AI responses is heavily influenced by the user input, or “prompt”. There are courses dedicated to this skill (see our “other training” page). You will still need to experiment with crafting prompts that are clear, specific and appropriately structured so that the Generative AI produces output with the desired style, quality and purpose; one possible structure is sketched below.
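As one hypothetical illustration (not an official recipe), a structured prompt might spell out the role, the task, the constraints and the desired output format explicitly:

# A hypothetical prompt template: role, task, constraints, output format.
PROMPT_TEMPLATE = """\
You are an experienced scientific copy editor.

Task: improve the clarity and grammar of the paragraph below without
changing its technical meaning or adding new claims.

Constraints:
- Keep all citations and numbers exactly as written.
- Use at most {max_words} words.

Output format: the revised paragraph only, with no commentary.

Paragraph:
{paragraph}
"""

prompt = PROMPT_TEMPLATE.format(max_words=150, paragraph="Our results is significant ...")
# Send `prompt` to your Generative AI system of choice, then iterate on the wording.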
Knowledge Cutoff Date. Many Generative AI models are trained on data up to a specific date, and are therefore unaware of events or information produced after that date. For example, a Generative AI trained on data up to March 2019 would be unaware of COVID-19 and its impact on humanity, or of who the current monarch of Britain is. You need to know the cutoff date of the Generative AI model you use in order to assess which research questions are appropriate for it.
Model Continuity. When you use Generative AI models developed by external vendors, you need to consider the possibility that the vendor might one day discontinue the model. This could have a big impact on the reproducibility of your research.
Security. As with any computer or online system, a Generative AI system is susceptible to security breaches and attacks. We have already mentioned the issues of confidentiality and privacy when you input information or give prompts to the system. But malicious attacks could be a bigger threat. For example, a new type of attack, prompt injection, deliberately feeds harmful or malicious content into the system to manipulate the results it generates for users. Generative AI developers are designing processes and technical safeguards against such risks (for example, see OpenAI’s GPT-4 System Card and disallowed usage policy). But as a user, you also need to be aware of what is at risk, follow the guidelines of your local IT providers, and do due diligence on the results that a Generative AI creates for you.
Many recommendations, guidelines and comments are out there regarding the use of Generative AI in research and in other lines of work. Here are a few examples.
Best Practices for Using AI When Writing Scientific Manuscripts: Caution, Care, and Consideration: Creative Science Depends on It. Jillian M. Buriak et al. ACS Nano (2023)
Science journals set new authorship guidelines for AI-generated text. Jennifer Harker. National Institute of Environmental Health Sciences (2023)
NIH prohibits the use of Generative AI in peer review. NIH (2023)
Fighting reviewer fatigue or amplifying bias? Considerations and recommendations for use of ChatGPT and other large language models in scholarly peer review. Mohammad Hosseini and Serge P. J. M. Horbach. Research Integrity and Peer Review (2023)
Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge. Annette Flanagin et al. JAMA (2023)