L3S Best Publication of the Quarter (Q4/2024 – Q1/2025)
Category: Explainable AI, RL
Explainable Reinforcement Learning via Dynamic Mixture Policies
Authors: Maximilian Schier, Frederik Schubert, Bodo Rosenhahn
Published in: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2025
The paper in a nutshell:
Our paper introduces a novel approach to reinforcement learning (RL), a form of machine learning in which agents learn to make decisions from environmental feedback. RL is of great interest for fields such as autonomous driving and robotics, because it can learn complex decision-making and control strategies from nothing more than a specified goal. Our method makes RL policies explainable by design: it employs a mixture of distributions, breaking the observation down into sub-spaces, each handled by its own sub-policy. This yields clear, faithful, and easy-to-read explanations with low cognitive load. The method maintains competitive performance and, in some scenarios such as automotive applications, even surpasses standard policies, supporting a transparent and trustworthy decision-making process.
Which problem do you solve with your research?
Our research addresses the lack of transparency and explainability in reinforcement learning (RL) policies, which hinders trust in, and understanding of, their decision-making in real-world applications.
What is new about your research?
We introduce a new design for stochastic policies in reinforcement learning that uses mixture distributions to make policies explainable by design: the observation is broken down into sub-spaces, and each mixture component provides a clear, component-level explanation. This innovation substantially enhances the transparency of RL decision-making.
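The idea above can be sketched in a few lines (a minimal illustration only, assuming Gaussian action components with softmax mixing weights per observation sub-space; the function names, shapes, and the fixed linear map standing in for learned networks are all hypothetical, not the paper's actual architecture):

```python
import numpy as np

def component(obs_subspace):
    """Map one observation sub-space to (weight logit, action mean, action std)."""
    # Stand-in for a small learned per-component network.
    w_logit = float(np.sum(obs_subspace))  # how relevant this sub-space is
    mean = np.tanh(obs_subspace[:2])       # 2-D action mean from this sub-space
    std = np.full(2, 0.1)
    return w_logit, mean, std

def mixture_policy(obs_subspaces, rng):
    """Combine per-sub-space components into one stochastic action."""
    logits, means, stds = zip(*(component(o) for o in obs_subspaces))
    weights = np.exp(logits) / np.sum(np.exp(logits))  # softmax mixing weights
    # Sample a component, then an action from it. The sampled index and the
    # mixing weights themselves act as a component-level explanation: they
    # show which part of the observation drove the action.
    k = rng.choice(len(weights), p=weights)
    action = rng.normal(means[k], stds[k])
    return action, k, weights

rng = np.random.default_rng(0)
# Two observation sub-spaces, e.g. features of two nearby vehicles.
obs = [np.array([0.5, -0.2, 1.0]), np.array([2.0, 0.3, 0.1])]
action, k, weights = mixture_policy(obs, rng)
print("action:", action, "| explained by sub-space", k, "| weights:", weights)
```

Because each mixture component sees only its own sub-space, reading off the mixing weights tells the user which input (here: which vehicle) the action is attributed to, without any post-hoc explanation method.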
What is the potential impact of your findings?
Our findings could enable wider real-world adoption of reinforcement learning by enhancing trust and accountability through transparent, explainable decision-making. In safety-critical applications such as autonomous vehicles and robotics, where understanding the decision process is crucial, this could improve both safety and regulatory compliance.
Paper link: tnt.uni-hannover.de/papers/data/1769/ICRA_2025-4.pdf