Corona: Collecting data for predictive models

Politicians around the world are currently trying to contain the coronavirus by, among other things, severely restricting freedom of movement. Because the progression of an epidemic is influenced by numerous factors, scientists are called upon to develop new prediction models for the spread of the virus. These models are based on data, for example case and test numbers or the mobility of the population. SoBigData, the European research infrastructure for big data and social mining, has launched numerous activities to collect and analyse recently published data on the epidemic, including news, tweets and publicly available data sets.

One of the main efforts focuses on collecting data for the "Epidemic Datathon", which colleagues from ETH Zurich are currently organising. Participants from all over the world are invited to develop models based on publicly available data. The aim of the Datathon is to better understand the dynamics of the epidemic. The term "Datathon" is derived from "Hackathon" and refers to a challenge in which participants use data to find new solutions to existing problems within a short period of time.

Participants can check the accuracy of their predictions repeatedly, after a few days and again after a few weeks, against real data, for example by comparing the predicted number of infected people in a country with the actual number of cases. This enables them to determine which of their models are most effective.
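As a rough illustration of such a comparison, the sketch below scores two models by their mean absolute error against the reported case numbers and ranks them. The model names, the case figures and the choice of error measure are all assumptions made for this example; the article does not say which metric the Datathon uses, and the numbers are placeholders rather than real data.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual case counts."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rank_models(predictions, actual):
    """Return (model_name, error) pairs sorted from lowest to highest error."""
    scores = {
        name: mean_absolute_error(predicted, actual)
        for name, predicted in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder numbers for illustration only, not real case data.
actual_cases = [100, 120, 150]
model_predictions = {
    "model_a": [110, 125, 140],  # hypothetical model
    "model_b": [90, 100, 200],   # hypothetical model
}

ranking = rank_models(model_predictions, actual_cases)
best_model, best_error = ranking[0]
print(f"Best model: {best_model} (mean absolute error {best_error:.2f})")
```

Repeating this check as new case numbers are published shows whether a model that looked good after a few days still holds up after a few weeks.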



Prof. Dr. Avishek Anand

L3S member Avishek Anand is project manager for SoBigData at L3S.