GPT4: the hallucinations continue

OpenAI released the new version of its artificial intelligence. Its power is immense and its capabilities are very attractive. But the mistakes it makes are fundamentally the same. And its creators know it, explains Luca De Biase.
A shorter version in Italian of this article has been published on La Svolta. Photo: "The four capital mistakes of open source" by opensourceway is licensed under CC BY-SA 2.0.
The Forum Network is a space for experts and thought leaders—from around the world and all parts of society— to discuss and develop solutions now and for the future. It aims to foster the fruitful exchange of expertise and perspectives across fields, and opinions expressed do not necessarily represent the views of the OECD.

Announcing the release of GPT4, OpenAI boasted of the extraordinary potential of the new version of the artificial intelligence that became wildly popular with ChatGPT, which gained one hundred million users within two months. The eloquence of artificial intelligence, however, could not prevent its most attentive users from noticing that the technology made very serious errors in the information it generated. This aspect has become so important that, in the very announcement of the new version, OpenAI was keen to point out that GPT4 also tends to have “hallucinations”. And it has some other problems, too.

The giant leap in the quality of this new GPT version didn’t change the substance of its epistemological flaws.

On the bright side, GPT4 will increase the dialogic capabilities of an app like Duolingo, which lets people who want to learn foreign languages practise conversing in them. It will be found in Khan Academy math courses. And, according to OpenAI, it will be able to help humans in many tasks, from text production to software generation. It will be able to recognise input not only in text form but also in image form. And it is so strong that it can tempt many people who do not care about the quality of information.

As the New York Times reports, a cardiologist at the University of North Carolina at Chapel Hill tried to see whether the new artificial intelligence could make diagnoses and prognoses, and he found that it makes them with great accuracy. Greg Brockman, president of OpenAI, also showed how the technology can synthesise complex legal text and even provide a layman-friendly version of it. But on more than one occasion, user queries produced errors, hallucinations, made-up data, and references to completely nonexistent sources. The explanation is clear: the system has an enormous capacity to use existing documentation, but it has no ability to distinguish between true and false. It is not built for that.

In general, OpenAI says GPT4's shortcomings are: it may hallucinate, it may be subject to attacks and adversarial prompts that cause it to say socially unacceptable things, and it may contain social biases because of the informational distortions in the documents it processes.

This reality should guide how we use this technology. Will someone be tempted to use artificial intelligence to get medical diagnoses when it is not easy to reach doctors? That someone must know that the diagnoses can be wrong. Will someone be tempted to have artificial intelligence write critical software for an organization without paying a professional programmer? Again, that someone must know that the result could be vulnerable to all kinds of cybersecurity attacks. Will someone attempt to replace journalists with artificial intelligence? Finally, that someone must know that the resulting articles could contain fabricated facts. In short, the power of the technology is enormous, and the risk of hallucinations is present and, for now, cannot be eliminated. This means that artificial intelligence should be thought of as a system to increase the productivity of professionals, not to replace them. From this perspective, it can have a gigantic impact.

Generative Artificial Intelligence tools such as ChatGPT have taken the world by storm. They have passed graduate admission tests, been used to support court rulings, and won art competitions. As technology advances and new tools emerge, it is essential that governments, education institutions and businesses understand how to leverage and adapt to these technologies and how to govern them to ensure they are beneficial for humanity and the environment. To learn more, attend OECD's 2023 International Conference on AI in Work, Innovation, Productivity, and Skills.

André VIEIRA
6 months ago

While OpenAI is optimising the algorithm, it could add a disclosure alert warning that the results could be false; it is up to humans to perform a critical analysis of the results. Search engines like Google and Bing don't distinguish between true and false either.