According to estimates from the World Biodiversity Council, nearly a million plant and animal species are threatened with extinction. The situation may be even more dire, since exact statistics are not available. However, the scientific community has long documented plant species occurrences in different regions and published them in scientific journals. In some cases these efforts date back to the 15th century, and for a large base of plant species, records are available from the 19th century onwards. In this project, we plan to digitize such scientific articles and extract the tables that record plant species occurrences at different locations and points in time. From a technical point of view, we aim to address several issues: extracting and interpreting the diverse tables present in such articles, aligning related tables, and exposing the extracted data in knowledge graphs for further use and alignment with existing efforts from the botanical community. Finally, we investigate research questions that analyze plant species occurrences and the corresponding trends along the longitudinal and spatial axes.
Preserving and enriching information about biodiversity, published in scientific articles or other relevant sources, is crucial for devising informed policies and decisions on issues ranging from biodiversity protection to detecting declining trends in species occurrences that might be correlated with climate and other environmental factors. Our main aim is to make such scientific output on plant species occurrences accessible, enrich it with domain knowledge, and interlink it so that it can be used by a wide group of users who do not necessarily possess the domain knowledge required to parse such content. We will do so according to the FAIR principles (findable, accessible, interoperable, reusable), a common standard for data publishing. Finally, using the extracted data, we will conduct a longitudinal and spatial analysis of plant species occurrences, identifying factors that may influence the increase or decline of specific plants across time and regions.