As part of her project work in the Master's program in Computer Science at Heinrich Heine University Düsseldorf, Suzan Padjman worked on methods for automatically recognizing textually described location information in public participation procedures.
In the context of the mobility transition, consultative processes are a popular tool for giving citizens the opportunity to voice their interests and concerns. For mobility-related issues in particular, an important question when analyzing the collected contributions is which locations (e.g. roads, intersections, cycle paths or footpaths) are problematic and in need of improvement in order to promote sustainable mobility. Automated identification of such locations has the potential to support the resource-intensive manual evaluation.
The aim of this work was therefore to find an automated solution for identifying locations using methods from natural language processing (NLP). For this purpose, a location was defined as the description of a specific place that a proposal refers to and that could be marked on a map. Examples of locations are street names, city districts and clearly assignable places, such as “in the city center” or “at the exit of the main train station”. Descriptions without reference to a specific place were not counted as locations. Methodologically, the task was treated as a sequence labeling task, because locations often span several consecutive tokens (word sequences).
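To illustrate what sequence labeling means here, the following is a minimal sketch that turns a marked location span into per-token labels. It assumes a BIO scheme (`B-LOC`, `I-LOC`, `O`) and whitespace tokenization. Both are assumptions made for illustration, as are the function name `bio_tags` and the example sentence; the project's actual label set and tokenizer are not described here.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Assign B-LOC / I-LOC / O labels to tokens.

    `spans` holds (start, end) token indices, end exclusive, one per location.
    The BIO scheme is an assumption for this sketch, not the project's
    documented annotation format.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = "B-LOC"           # first token of the location
        for i in range(start + 1, end):
            tags[i] = "I-LOC"           # remaining tokens of the same location
    return tags


# Hypothetical contribution containing one of the location examples above.
tokens = "Cycle path missing at the exit of the main train station".split()
spans = [(3, 11)]  # "at the exit of the main train station"
tags = bio_tags(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence labeling model is then trained to predict such a label for every token, so that a multi-token location can be recovered as one contiguous phrase.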
A comparison of different models (spaCy NER, GermanBERT, GBERT, dbmdz BERT, GELECTRA, multilingual BERT, multilingual XLM-RoBERTa) on two German-language participation datasets on cycling infrastructure in Bonn and Cologne-Ehrenfeld showed that GermanBERT achieves the best results. This model recognizes tokens that are part of a textual location description with a promising macro F1 score of 0.945. In future work, it is planned to convert the recognized text phrases into geocoordinates in order to display the locations of citizens’ proposals on a map.
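For readers unfamiliar with the metric, the sketch below shows how a token-level macro F1 score can be computed: F1 is calculated separately for each label and the per-label scores are averaged with equal weight. The two-label setup (`LOC` vs. `O`), the function name `macro_f1` and the toy data are assumptions for illustration only; they are not the project's data or evaluation code.

```python
from typing import List


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of per-label F1 scores over all labels seen."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for this example.
gold = ["O", "O", "LOC", "LOC", "O", "LOC"]
pred = ["O", "LOC", "LOC", "LOC", "O", "O"]
print(round(macro_f1(gold, pred), 3))  # 0.667
```

Because every label counts equally, macro F1 does not let the typically much more frequent non-location tokens dominate the score.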
Padjman, Suzan (2021): Unterstützung der Auswertung von verkehrsbezogenen Bürger*innenbeteiligungsverfahren durch die automatisierte Erkennung von Verortungen. Projektarbeit am Institut für Informatik, Lehrstuhl für Datenbanken und Informationssysteme, der Heinrich-Heine-Universität Düsseldorf.