Can an artificial intelligence (AI) recognise whether someone is happy, stressed or embarrassed – and react empathetically? This is precisely the aim of the EmoNet project, an open platform for the development of emotional AI that better understands human emotions and incorporates them into its reactions.
The TIB – Leibniz Information Centre for Science and Technology and the L3S Research Center are supporting this pioneering project of the non-profit association LAION e. V. and the technology company Intel with their scientific expertise. Their contribution: analysing and evaluating the models, as well as expertise in neuro-symbolic AI, a technology that combines machine learning with logical reasoning.
“With EmoNet, we are making an important contribution to more human, more empathetic and more accessible artificial intelligence. The combination of our research on symbolic AI with the innovative methods of LAION shows how important open science and interdisciplinary cooperation are,” explains Professor Sören Auer, Director of the TIB and member of the Board of Directors of the L3S Research Center.
What is emotional AI?
Emotional AI – also known as affective computing – comprises systems that can recognise and react to emotions in speech, facial expressions and behaviour. The aim is not only to recognise simple emotions such as anger or joy, but also to gain a deep understanding of states such as embarrassment, tiredness or concentration.
Possible applications: Where emotional AI can help
Emotional AI enables systems to recognise human emotions and respond to them empathetically, significantly improving interaction between people and machines. The possible applications are diverse, as the following examples show:
- Learning apps that recognise when students are overwhelmed and adapt accordingly.
- Health applications that recognise early signs of stress or depression.
- Virtual assistants that speak reassuringly when they detect anxiety or insecurity.
EmoNet: the technical framework
The EmoNet project is an open research initiative that provides an extensive collection of data sets and models that enable AI systems to recognise human emotions in speech and facial expressions and respond to them empathetically. Central components of the project are:
- over 200,000 synthetically generated faces with emotional expressions,
- more than 5,000 hours of computer-generated speech in four languages and numerous dialects,
- a finely graded classification of 40 emotional states, from shame to pride to pain,
- careful emotional categorisation by experts from the field of psychology – in both image and sound.
The models developed – Empathic Insight-Face and Empathic Insight-Voice – already achieve emotion recognition that outperforms many commercial systems. Challenges remain in the differentiated recognition of culturally shaped or mixed emotional states.
“The cooperation with the TIB was a great benefit for EmoNet. Thanks to their expertise, we were able to reliably evaluate complex emotional contexts. We look forward to continuing this fruitful collaboration – for more empathic AI research that is open, safe and usable for everyone,” says Christoph Schuhmann, Chairman of LAION e. V.
Hands-on research: EmoNet is openly accessible
The EmoNet models are made freely available via the Hugging Face platform and can be tested directly in the browser or using a Colab notebook. Developers, researchers and interested parties are invited to join in and further develop the systems.
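For readers who want to experiment locally rather than in the browser, the following minimal sketch shows how such a model could be queried, assuming it follows the standard Hugging Face `transformers` image-classification interface. The repository ID and label names are illustrative assumptions, not the project's confirmed identifiers; check the LAION pages on Hugging Face for the actual ones.

```python
# Sketch: querying an Empathic Insight-style face-emotion model.
# The repo ID below is a placeholder assumption, not a confirmed identifier.
from typing import Dict, List


def top_emotions(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k highest-scoring emotion labels, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def classify_face(image_path: str,
                  repo_id: str = "laion/Empathic-Insight-Face") -> Dict[str, float]:
    """Run face-emotion classification via the generic transformers pipeline.

    Requires network access and the `transformers` package installed;
    assumes the model is compatible with the image-classification task.
    """
    from transformers import pipeline
    clf = pipeline("image-classification", model=repo_id)
    results = clf(image_path)  # list of {"label": ..., "score": ...} dicts
    return {r["label"]: r["score"] for r in results}


if __name__ == "__main__":
    # Hypothetical usage with an illustrative score dictionary:
    scores = {"joy": 0.7, "shame": 0.2, "pride": 0.1}
    print(top_emotions(scores, k=2))
```

The fine-grained taxonomy of 40 emotional states means the interesting output is usually the ranked list of labels rather than a single top class, which is why the sketch separates the ranking helper from the model call.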
Outlook: Emotional AI for a more empathetic digital future
The joint project between LAION, TIB, L3S and Intel shows how open science and interdisciplinary research can lead to tangible progress. The Joint Lab of TIB and L3S supported LAION by providing scientific advice, analysing experiments and applying neuro-symbolic methods.
LAION, TIB and L3S are working together with partners to create multimodal AI systems that understand speech, facial expressions and context holistically. Digital systems that not only function but also empathise, responding better to human needs and emotions, form the basis for a more empathetic digital future.
One of the goals of the next phase is to build an open, multilingual data collection with over 500,000 hours of synthetic speech, freely licensed for use in research, education and society.
More information on the LAION blog
Source: TIB