SisWiss
Secure Language Models for Knowledge Management
Companies and organisations often have knowledge bases spread across large document repositories, making it difficult and time-consuming to search for and combine information. Large Language Models (LLMs) such as ChatGPT have the potential to give organisations deep and intuitive insight into their own business processes by providing a language-based interface to such knowledge bases. This ‘communicating with your own data’ offers companies, especially those without prior expertise in data analysis, the opportunity to keep pace with digitalisation and remain internationally competitive. However, because language models can allow all users inside and outside an organisation to access potentially sensitive and security-critical information, the use of these technologies must be strictly regulated and monitored; for example, sensitive internal company information should only be accessible via the model to selected employees. Intuitively, the language model as an ‘omniscient employee’ therefore requires a corresponding kind of non-disclosure agreement that controls and enforces access to information.
However, since the models have no a priori understanding of which information is sensitive and which is not, this knowledge must be supplied externally. For this reason, the SisWiss project develops language models that combine several user- and security-level-dependent personalities within the same model instance, so that differentiated access to sensitive information can be controlled.
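The access-control idea behind such security-level-dependent personalities can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the clearance levels, the `PERSONA_PROMPTS` texts, and the `build_context` helper are hypothetical and do not describe the actual SisWiss implementation; the sketch only shows the general pattern of selecting a persona prompt and filtering documents by clearance before anything reaches the model.

```python
from dataclasses import dataclass

# Hypothetical clearance levels, ordered from least to most privileged.
CLEARANCE_ORDER = ["public", "internal", "confidential"]

@dataclass
class Document:
    text: str
    level: str   # clearance required to read this document

@dataclass
class User:
    name: str
    clearance: str

# Hypothetical persona prompts: one model instance, several
# security-level-dependent 'personalities'.
PERSONA_PROMPTS = {
    "public": "Answer using only publicly available information.",
    "internal": "You may use internal documents; never reveal confidential data.",
    "confidential": "You may use all documents available to this user.",
}

def accessible(user: User, doc: Document) -> bool:
    """A user may read a document iff their clearance is at least the
    document's required level."""
    return CLEARANCE_ORDER.index(user.clearance) >= CLEARANCE_ORDER.index(doc.level)

def build_context(user: User, corpus: list) -> tuple:
    """Select the persona prompt matching the user's clearance and filter
    the retrievable documents accordingly. The access-control decision is
    made *before* any text is handed to the language model."""
    persona = PERSONA_PROMPTS[user.clearance]
    docs = [d.text for d in corpus if accessible(user, d)]
    return persona, docs
```

For example, an ‘internal’ user querying a corpus containing a public press release and a confidential salary table would receive the internal persona prompt and only the press release as context; the confidential document is filtered out before the model ever sees it.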
The SisWiss project aims to combine the development of theoretical approaches for the certification of an LLM with concrete security tests, so that companies can use language models securely and in compliance with existing and future EU legislation (EU AI Act, EU Data Governance Act).
Federal Ministry of Research, Technology and Space (BMFTR), “Sichere Zukunftstechnologien in einer hypervernetzten Welt: Künstliche Intelligenz” (Secure Future Technologies in a Hyperconnected World: Artificial Intelligence)
- Laverana