Multi-armed bandits for the recommender system for returns reduction

Development of a contextual multi-armed bandit (MAB) to exploit the potential of the recommender system by means of reinforcement learning

In Germany, e-commerce sales of clothing are increasing, driven by fit, comfort and value for money. The coronavirus pandemic has boosted online orders, but returns cause high costs and CO2 emissions. The fashion industry contributes over 10% of global CO2 emissions and must reduce these by over 50% by 2030 to meet the 1.5°C target. With return rates of up to 50%, Germany leads the European comparison. Laws such as the “duty of care” are intended to counteract the destruction of returned goods, but could impose a bureaucratic burden on retailers. Data-driven solutions, in particular the use of reinforcement learning, offer a preventive alternative for reducing returns. A multi-armed bandit (MAB) concept is being developed to optimize the recommender system using reinforcement learning. The aim is for the system to adapt automatically to changing market situations in order to increase revenue per user by 8 to 20% and reduce the returns rate by 10%. The cooperation with the Institute for Information Processing (TNT) at Leibniz Universität Hannover promotes innovation and knowledge transfer. In the future, the recommender system is to be extended to other e-commerce sectors in order to create jobs and strengthen the regional economy in Lower Saxony.

Funding program