Oxford study says LLMs pose threat to science

  • A study by the Oxford Internet Institute details the threat that AI hallucinations pose to science.
  • Generative AI platforms are designed to yield answers, regardless of whether they are accurate.
  • Researchers argue that LLMs should not be treated as knowledge sources within academic fields.

The tendency for generative AI platforms or large language models (LLMs) to hallucinate is something that simply is not getting enough attention at the moment. This is according to a study published in the Nature Human Behaviour journal (paywall) by researchers at the Oxford Internet Institute.

For those unfamiliar with AI hallucinations, they are described by IBM as, “a phenomenon wherein a large language model (LLM)—often a generative AI chatbot or computer vision tool—perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.”

“Generally, if a user makes a request of a generative AI tool, they desire an output that appropriately addresses the prompt (i.e., a correct answer to a question). However, sometimes AI algorithms produce outputs that are not based on training data, are incorrectly decoded by the transformer, or do not follow any identifiable pattern. In other words, it ‘hallucinates’ the response,” the company’s explanation continues.

We have already seen instances where LLM hallucinations have had real-world consequences, such as a lawyer in the US who utilised ChatGPT to search for and collate previous cases in a brief. As it turns out, the vast majority of cases cited in the brief were in fact fabricated by the generative AI platform, resulting in punitive action against said lawyer.

The study is specifically concerned with the manner in which LLMs are used as a knowledge source in academia. Given the propensity of generative AI platforms to hallucinate, this poses a real problem, with the researchers going so far as to call it a threat to science.

“People using LLMs often anthropomorphise the technology, where they trust it as a human-like information source,” noted Professor Brent Mittelstadt, a co-author on the paper.

“This is, in part, due to the design of LLMs as helpful, human-sounding agents that converse with users and answer seemingly any question with confident sounding, well-written text. The result of this is that users can easily be convinced that responses are accurate even when they have no basis in fact or present a biased or partial version of the truth,” he added per The Next Web.

While the authors are quick to acknowledge the role that LLMs will play in research within the scientific community, they stress that these platforms need to be used ethically and with care.

To that end, the authors call for LLMs to be used as “zero-shot translators”. Here the platform is given appropriate data by the researcher and asked to transform it into a conclusion or some form of code, rather than being asked to supply the facts itself. As such, the LLM does some of the heavy lifting, not all of the work.
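As a rough illustration of that idea, and not the authors' own method, the Python sketch below wraps researcher-supplied text in instructions that confine a model to that text. The function name, the prompt wording and the sample data are all hypothetical, and the sketch stops short of calling any particular LLM service.

```python
def build_translation_prompt(source_text: str, task: str) -> str:
    """Wrap vetted, researcher-supplied text in instructions that confine the model to it.

    Hypothetical helper for illustration only; the wording of the
    instructions is an assumption, not taken from the study.
    """
    if not source_text.strip():
        raise ValueError("source_text must contain the vetted input data")
    return (
        "Use only the material between the <source> tags. "
        "Do not add facts, figures, or citations that are not in it. "
        "If the material is insufficient, say so.\n"
        f"Task: {task}\n"
        f"<source>\n{source_text}\n</source>"
    )


# Made-up sample data: the researcher supplies the facts, the model only rewrites them.
prompt = build_translation_prompt(
    source_text="Trial A: 120 participants, 14 adverse events.",
    task="Rewrite these results as one plain-English sentence.",
)
print(prompt)
```

The point of the design is that the facts come from the researcher and can be checked against the input, while the model is limited to reshaping them. The output still needs to be verified by a person.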

As generative AI becomes more pervasive across a number of fields, determining accuracy will become both more difficult and more critical.

[Image – Photo by Elimende Inagella on Unsplash]

