Are researchers turning to ChatGPT and other chatbots to write entire papers in order to survive the publish-or-perish culture of academia? A recent Scientific American article highlights how widespread the practice may be, drawing on publication data from databases such as Dimensions.

What could the overuse of the words “intricate,” “meticulous” and “commendable” in a scientific paper signal? The possible misuse of ChatGPT and other artificial intelligence chatbots to produce the paper, according to a recent Scientific American article, AI Chatbots Have Thoroughly Infiltrated Scientific Publishing. Author Chris Stokel-Walker writes that obvious telltale signs, such as exact phrasing copied from ChatGPT output appearing in science papers, are easy to spot, but a closer analysis reveals a sudden surge of particular words and turns of phrase in the year since ChatGPT went mainstream.

A chatbot attack?

Andrew Gray, a librarian and researcher at University College London, used Dimensions to hunt for “AI Buzzwords” and found that “at least 60,000 papers—slightly more than 1 percent of all scientific articles published globally last year—may have used an LLM.” These buzzwords are words that appear more often in AI-generated text than in typical human writing. One such phrase, according to the article, is “complex and multifaceted,” and a quick search in Dimensions reveals a definite uptick in its occurrence across different fields of research (see figure below).
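
To illustrate the kind of frequency analysis involved, here is a minimal sketch of how one might tally buzzword occurrences by publication year. This is not Gray's actual methodology or the Dimensions API; the buzzword list and the toy corpus are hypothetical:

```python
from collections import Counter

# Hypothetical buzzwords flagged as appearing more often in
# LLM-generated text than in typical human scientific writing.
BUZZWORDS = ["intricate", "meticulous", "commendable", "complex and multifaceted"]

def buzzword_counts_by_year(papers):
    """Count buzzword occurrences per publication year.

    `papers` is an iterable of (year, abstract) pairs, e.g. rows
    exported from a bibliographic database.
    """
    counts = Counter()
    for year, abstract in papers:
        text = abstract.lower()
        counts[year] += sum(text.count(word) for word in BUZZWORDS)
    return counts

# Toy corpus standing in for real publication data.
sample = [
    (2022, "We present a study of protein folding dynamics."),
    (2023, "This commendable work offers an intricate and meticulous analysis."),
    (2023, "The results reveal a complex and multifaceted relationship."),
]
print(buzzword_counts_by_year(sample))  # Counter({2023: 4, 2022: 0})
```

A real analysis would, of course, normalize these counts against the total number of papers per year so that growth in publishing volume is not mistaken for growth in buzzword use.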

So what is the problem with using large language models (LLMs) to generate scientific literature? Scientific integrity consultant Elisabeth Bik, quoted in the Scientific American article, explained that AI chatbots are not sufficiently advanced to provide trustworthy outputs and are prone to what are termed hallucinations: they “make up” text, including citations that simply do not exist. The article also points out that the problem is not just AI-generated text, but also AI-generated judgements creeping into scientific publications.

Transparency to ensure trustworthy research

But many within the research community recognize that using LLMs to support some aspects of writing a paper is, perhaps, inevitable. Brent Sinclair, in his article Letting ChatGPT do your science is fraudulent (and a bad idea), but AI-generated text can enhance inclusiveness in publishing, argues that “AI-generated text has the potential to make science publishing more inclusive by reducing the language barrier.” However, he emphasizes that a “scientist still needs to check [the AI-generated text], add references and context, and be fully accountable for the contents.”

This is the sentiment underlined in the 2023 statement put out by Nature and echoed by the publishing guidelines of many scientific publishers: “No LLM tool will be accepted as a credited author on a research paper. That is because any attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.” And if an LLM tool has been used to support the writing, authors must be transparent about that use: “researchers using LLM tools should document this use in the methods or acknowledgements sections. If a paper does not include these sections, the introduction or another appropriate section can be used to document the use of the LLM.”

The newly launched Dimensions Research GPT / Enterprise and the fully integrated summarization feature have been developed with the need to ensure trust in research in mind. These solutions combine the power of AI technologies with the robust scientific data available through Dimensions – the world’s largest collection of linked research data – to provide answers that are grounded in evidence. Authors gain an advanced literature discovery workflow that merges the scientific evidence base of Dimensions with the generative AI functionality of ChatGPT, reducing the likelihood of the hallucinations mentioned above and providing click-through scholarly references for each statement, enabling quick and easy validation and further discovery.
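
In spirit, this grounding approach resembles retrieval-augmented generation: relevant records are retrieved from a trusted corpus first, and the model is asked to answer only from those records, citing each one. The sketch below shows the general pattern; the corpus, the naive overlap scoring and the prompt format are illustrative assumptions, not the actual Dimensions Research GPT implementation:

```python
# Minimal retrieval-augmented generation (RAG) sketch. The corpus,
# scoring, and prompt format are illustrative assumptions only.

CORPUS = [  # stand-in for a linked research database
    {"id": "pub.001", "title": "LLM buzzwords in scientific abstracts",
     "text": "Certain adjectives surged in abstracts after ChatGPT went mainstream"},
    {"id": "pub.002", "title": "Hallucinated citations in chatbot output",
     "text": "Chatbots can fabricate plausible-looking but nonexistent references"},
]

def retrieve(query, corpus, k=2):
    """Rank records by naive word overlap with the query (production
    systems use semantic search) and return the top k."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda r: len(q & set(r["text"].lower().split())),
                    reverse=True)
    return scored[:k]

def build_grounded_prompt(query, records):
    """Ask the model to answer only from the retrieved evidence and to
    cite record IDs, so every statement can be verified by click-through."""
    evidence = "\n".join(f"[{r['id']}] {r['title']}: {r['text']}" for r in records)
    return (f"Answer using ONLY the evidence below, citing IDs in brackets.\n"
            f"Evidence:\n{evidence}\n\nQuestion: {query}")

hits = retrieve("do chatbots invent references", CORPUS)
prompt = build_grounded_prompt("Do chatbots invent references?", hits)
print(prompt)  # this prompt would then be sent to the LLM
```

Because the model answers from retrieved evidence rather than from memory alone, a fabricated citation has nowhere to hide: every reference in the answer points back to a real record.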
If you want more information on how Dimensions data can be used to support publishers and authors, contact the Dimensions team.