Illustration: Created with Dall-e according to the prompts ChatGPT had written as a result of the dialogues with the Binaire designer: “Title: ‘The Creative Fusion’. Create a picture that represents the creative fusion of man and machine by combining the passionate soul of an artist and the innovative digital nature of Dall-e in a unique visual composition.”

Natural Language Processing

ChatGPT: Technological Breakthrough, Social Upheaval?

Artificial intelligence (AI) has arrived at the centre of society, certainly since ChatGPT appeared on the scene at the end of last year. Within a very short time, ChatGPT had 100 million users, far eclipsing even the initial growth of leading social media platforms such as Instagram. ChatGPT is an application of research in the field of Natural Language Processing (NLP), the machine understanding and generation of natural language. For a long time, NLP eked out a rather niche existence within AI. In practice, its methods were used for only a few problems, for example to improve the ranking of search engine results or to convert unstructured language data into structured databases. This was also because NLP made mistakes too often to enable effective widespread use. Now, however, it represents, at least in public perception, the spearhead of AI. So, what has changed?

ChatGPT uses a generative AI model, or more precisely, a so-called large language model. Large language models are based on machine learning algorithms that have independently learned from huge amounts of data from the internet, social media, forums, articles, and books to quickly generate new content, in this case text, with minimal input. One figure: GPT-3 is one of the world’s largest NLP models, with 175 billion parameters. Its successor GPT-4 is even more versatile, precise, and reliable. Large language models can produce language as convincingly as if it had been written by a human, mainly because of the sheer number of language examples they have seen. In the case of ChatGPT, it has additionally been trained on many sample dialogues to give the most suitable answers possible, and the quality of those answers has been continuously optimised through human feedback.
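The core idea behind such models can be illustrated in a few lines: a language model learns from text which words tend to follow which, and then generates new text by repeatedly predicting a plausible next word. The toy sketch below uses simple bigram counts; real large language models instead use neural networks with billions of parameters, but the learn-then-generate loop is the same in spirit.

```python
import random
from collections import defaultdict

# Toy illustration of "learning from text, then generating new text".
# This bigram counter stands in for a neural language model; it is an
# assumption-free teaching sketch, not how GPT actually works internally.

def train_bigram_model(corpus: str):
    """Count which words follow which words in the training text."""
    counts = defaultdict(list)
    words = corpus.split()
    for current, nxt in zip(words, words[1:]):
        counts[current].append(nxt)
    return counts

def generate(model, start: str, length: int = 5, seed: int = 0) -> str:
    """Generate text by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break  # no continuation was ever observed for this word
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the model learns from text and the model generates new text"
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

Every generated word was observed in the training data; the “human feedback” step mentioned above would, in a real system, further steer which continuations the model prefers.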

Such language models can also be coupled with search engines. In Bing Chat, GPT-4 supplements the Bing search engine with the ability to answer in natural language and to ask for details or further aspects, just as we know it from human communication. Through the coupling with Bing, the language model has access to the latest web content and can therefore also answer current questions correctly.
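The coupling described above is often called retrieval-augmented generation: relevant documents are first retrieved and then placed into the model’s prompt, so the answer can be grounded in current content. The sketch below is a hypothetical, simplified pipeline (keyword-overlap retrieval over an invented document list), not Bing’s actual architecture.

```python
# Hypothetical sketch of coupling a language model with a search step:
# retrieve relevant documents, then build a prompt that asks the model
# to answer from the retrieved text. Documents and scoring are illustrative.

def score(query: str, document: str) -> int:
    """Rank a document by simple keyword overlap with the query."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents that overlap most with the query."""
    return sorted(documents, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble a prompt that grounds the model's answer in retrieved text."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The train to Hannover departs at 9:15 from platform 4.",
    "Apples and pears are pome fruits.",
    "L3S is a research centre in Hannover.",
]
print(build_prompt("When does the train to Hannover depart?", docs))
```

Production systems replace the keyword overlap with a full search engine and pass the assembled prompt to the language model, which is why such a system can answer questions about web content newer than its training data.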

ChatGPT is just one of many large language models, alongside Google’s PaLM, Baidu’s ERNIE or BioGPT, which specialises in medical texts. In Germany, too, researchers are working on generative AI and its use to solve many practical problems, for example in the Open GPT-X project, at Aleph Alpha and, of course, at academic institutions such as the L3S (examples can be found in this issue). The areas of application for generative AI are almost limitless. Wherever human language plays a role, in customer advice, in education, but also in the healthcare system, it can be effective, sometimes even groundbreaking. It is not just a matter of formulating texts; language models can solve very different problems (often without being designed to do so) because they represent all information in the same way and can thus draw almost arbitrary cross-connections.

For many, ChatGPT represents an iPhone moment, a leap innovation that changes the way we work and live within a very short time. This is true even though the results of ChatGPT are far from perfect. In computer science, we speak of hallucination when these models fill gaps in their knowledge with plausible-sounding but invented content, so that seemingly perfect, polished texts turn out to be incorrect or imprecise in places. This can have fatal consequences. Nevertheless, the enormous attention paid to large language models is justified. The programmes and the large models underlying them represent an immense leap in the sophistication, refinement, speed of innovation and capabilities of NLP technology.

But the problem of hallucination remains; we encounter it again and again with generative AI models. An inaccurate route recommendation by a train company’s chatbot may end up being unpleasant for the traveller, but it would be a tolerable hallucination compared to an invented, or at least insufficiently validated, therapy recommendation for the treatment of an infection with multi-resistant germs. What do we conclude from this? Apple’s iPhone was ridiculed in the mid-2000s because of its (then) shortcomings and its capacitive display (touchscreen). We know how that development continued. So, let’s not underestimate the breakthrough we are currently experiencing. But let’s also not overestimate it when we look at what lies ahead of us in socio-technical terms. And: Europe continues to stand for a value-oriented approach in dealing with new technologies and their use in sensitive economic and social areas. Our ethical, legal and technical standards are among the most demanding in the world. This can be a disadvantage, because other economic regions bring AI-based products and services to market faster and more pragmatically. But it can also be an opportunity to make “AI made in Europe” a global standard of quality and responsibility, based on open European language models such as Open GPT-X.


Wolfgang Nejdl is Managing Director of the L3S Research Centre.

Henning Wachsmuth is a member of L3S and head of the NLP department at the Institute for Artificial Intelligence at Leibniz Universität Hannover.