Automated Topic Categorization of Citizens’ Contributions: Reducing Manual Labeling Efforts Through Active Learning

In this publication in Electronic Government, Julia Romberg and Tobias Escher investigate the potential of active learning for reducing the manual labeling efforts in categorizing public participation contributions thematically.

Abstract

Political authorities in democratic countries regularly consult the public on specific issues, but subsequently evaluating the contributions requires substantial human resources, often leading to inefficiencies and delays in the decision-making process. Among the solutions proposed is to support human analysts by thematically grouping the contributions through automated means.

While supervised machine learning would naturally lend itself to the task of classifying citizens’ proposals according to certain predefined topics, the amount of training data required is often prohibitive given the idiosyncratic nature of most public participation processes. One potential solution to minimize the amount of training data is the use of active learning. While this semi-supervised procedure has proliferated in recent years, these promising approaches have never been applied to the evaluation of participation contributions.

Therefore, we utilize data from online participation processes in three German cities, provide classification baselines and subsequently assess how different active learning strategies can reduce manual labeling efforts while maintaining good model performance. Our results show not only that supervised machine learning models can reliably classify topic categories for public participation contributions, but also that active learning significantly reduces the amount of training data required. This has important implications for the practice of public participation because it dramatically cuts the time required for evaluation, which benefits in particular processes with a larger number of contributions.

Key findings

  • We compare a variety of state-of-the-art approaches for text classification and active learning on a case study of three nearly identical participation processes for cycling infrastructure in the German municipalities of Bonn, Ehrenfeld (a district of Cologne) and Moers.
  • We find that BERT can predict the correct topic(s) for about 77% of the cases.
  • Active learning significantly reduces manual labeling efforts: it was sufficient to manually label 20% to 50% of the datasets to maintain the level of accuracy. Efficiency improvements grow with the size of the dataset.
  • At the same time, the models operate within an efficient runtime.
  • We therefore hypothesize that active learning should significantly reduce human efforts in most use cases.
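
The idea behind active learning can be illustrated with a minimal sketch of pool-based uncertainty sampling. This is not the setup used in the publication: a tiny Naive Bayes classifier stands in for BERT, "least confident" is only one of several possible query strategies, and the example contributions, topic labels and all function names are hypothetical. The point is the loop itself: train on the few labeled contributions, ask a human to label the contribution the model is least sure about, and repeat until the labeling budget is used up.

```python
import math
from collections import Counter


def train(texts, labels):
    """Fit a tiny multinomial Naive Bayes model (toy stand-in for BERT)."""
    word_counts = {}
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, Counter(labels), vocab


def predict_proba(model, text):
    """Return a dict mapping each known label to its posterior probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        counts = word_counts[label]
        # Laplace smoothing; the extra +1 reserves mass for unseen tokens.
        denom = sum(counts.values()) + len(vocab) + 1
        score = math.log(n / total)
        for token in text.lower().split():
            score += math.log((counts[token] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp.values())
    return {label: v / norm for label, v in exp.items()}


def least_confident(model, pool, texts):
    """Pick the pool index whose most likely label has the lowest probability."""
    return min(pool, key=lambda i: max(predict_proba(model, texts[i]).values()))


def active_learning(texts, oracle, seed_ids, budget):
    """Label contributions one at a time until `budget` items are labeled."""
    labeled = list(seed_ids)
    pool = [i for i in range(len(texts)) if i not in labeled]
    while len(labeled) < budget and pool:
        model = train([texts[i] for i in labeled], [oracle[i] for i in labeled])
        pick = least_confident(model, pool, texts)
        pool.remove(pick)
        labeled.append(pick)  # here a human annotator would supply the label
    model = train([texts[i] for i in labeled], [oracle[i] for i in labeled])
    return model, labeled


# Hypothetical contributions and topic labels, for illustration only.
texts = [
    "bike lane too narrow",
    "cars park on the sidewalk",
    "bike path needs repair",
    "no parking spots for cars",
    "bike lane blocked by parked cars",
    "more bike racks please",
]
oracle = ["cycling", "parking", "cycling", "parking", "parking", "cycling"]

model, labeled = active_learning(texts, oracle, seed_ids=[0, 1], budget=4)
probs = predict_proba(model, "bike lane")
```

In this sketch `oracle` plays the role of the human annotator, and `budget` corresponds to the share of the dataset that is labeled manually.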

Publication

J. Romberg and T. Escher. Automated topic categorisation of citizens’ contributions: Reducing manual labelling efforts through active learning. In M. Janssen, C. Csáki, I. Lindgren, E. Loukis, U. Melin, G. Viale Pereira, M. P. Rodríguez Bolívar, and E. Tambouris, editors, Electronic Government, pages 369–385, Cham, 2022. Springer International Publishing. ISBN 978-3-031-15086-9

Socio-spatial justice through public participation?

In this presentation at the AESOP (Association of European Schools of Planning) Annual Congress in 2022, Laura Mark, Katharina Huseljić and Tobias Escher introduced a framework of distributive socio-spatial justice and the ways in which consultation procedures can contribute to it. They then evaluated the Elbchaussee case study in Hamburg with regard to socio-spatial justice, using qualitative and quantitative results.

Abstract

Our current transport system exhibits significant socio-spatial injustices as it has both major negative environmental effects and structurally disadvantages certain socio-economic groups. Planning processes increasingly include elements of public participation, often linked to the hope of better understanding and integrating different mobility needs into the planning process. However, so far there is little knowledge on whether public participation results indeed in more socio-spatial justice.

To approach this question, we focus on socio-spatial justice as distributive justice and investigate how well consultative planning procedures do actually lead to measures that both contribute to sustainability (i.e. reduce or redistribute negative external effects) and cater for the needs of disadvantaged groups (e.g. those with low income or education, women and disabled people). To this end, we have investigated in detail the case study of the reconstruction of the Elbchaussee, a representative main road of citywide importance in the district of Altona in Hamburg, Germany. We are drawing on both qualitative and quantitative data including expert interviews and public surveys.  

We first show that the process did result in planning measures that contribute slightly to ecological sustainability. Second, in particular through improving the situation for pedestrians and cyclists as well as the quality of stay, the measures should contribute to more justice for some groups but this is recognized only by non-male groups. Beyond this there are no effects for people with low income, low education, those with mobility restrictions or with particular mobility needs often associated with these groups. Overall, we conclude that the consultative planning process provides only a small contribution to socio-spatial justice and we discuss potential explanations.

Key Findings

  • The consultative planning process as a whole resulted in measures that contribute slightly to socio-spatial justice, since they support the transition to more sustainable mobility and will benefit some disadvantaged groups, though both to a limited degree.
  • We find that the consultation procedure had no significant influence on the policy. In terms of socio-spatial justice, no positive effects can be traced back to the consultation procedure. Notably, those that participated in the consultation did indeed report less satisfaction with the measures.
  • We trace those limited contributions back to some general features of consultation and the current planning system, but also find that in the case study the scope of possible influence was very limited due to external restrictions and power imbalances.

Publication

We are working on a publication for a peer-reviewed journal. The publication will be linked here as soon as it is published.

A Corpus of German Citizen Contributions in Mobility Planning: Supporting Evaluation Through Multidimensional Classification

In this publication in the Conference on Language Resources and Evaluation, Julia Romberg, Laura Mark and Tobias Escher introduce a collection of annotated datasets that promotes the development of machine learning approaches to support the evaluation of public participation contributions.

Abstract

Political authorities in democratic countries regularly consult the public in order to allow citizens to voice their ideas and concerns on specific issues. When trying to evaluate the (often large number of) contributions by the public in order to inform decision-making, authorities regularly face challenges due to restricted resources.

We identify several tasks whose automated support can help in the evaluation of public participation. These are i) the recognition of arguments, more precisely premises and their conclusions, ii) the assessment of the concreteness of arguments, iii) the detection of textual descriptions of locations in order to assign citizens’ ideas to a spatial location, and iv) the thematic categorization of contributions. To enable future research efforts to develop techniques addressing these four tasks, we introduce the CIMT PartEval Corpus, a new publicly-available German-language corpus that includes several thousand citizen contributions from six mobility-related planning processes in five German municipalities. The corpus provides annotations for each of these tasks which have not been available in German for the domain of public participation before either at all or in this scope and variety.

Key findings

  • The CIMT PartEval Argument Component Corpus comprises 17,852 sentences from German public participation processes annotated as non-argumentative, premise, or major position.
  • The CIMT PartEval Argument Concreteness Corpus consists of 1,127 argumentative text spans that are annotated according to three levels of concreteness: low, intermediate, and high.
  • The CIMT PartEval Geographic Location Corpus consists of 4,830 locations and the GPS coordinates for 2,529 proposals from public consultations.
  • The CIMT PartEval Thematic Categorization Corpus relies on a new hierarchical categorization scheme for mobility that captures modes of transport (non-motorized transport: cycling, walking, scooters; motorized transport: local public transport, long-distance public transport, commercial transport) and a number of specifications, such as moving or stationary traffic, new services, and inter- and multimodality. In total, 697 documents have been annotated according to this scheme.

Publication

Romberg, Julia; Mark, Laura; Escher, Tobias (2022, June). A Corpus of German Citizen Contributions in Mobility Planning: Supporting Evaluation Through Multidimensional Classification. In Proceedings of the Language Resources and Evaluation Conference (pp. 2874–2883), Marseille, France. European Language Resources Association. https://aclanthology.org/2022.lrec-1.308

Corpus available under

https://github.com/juliaromberg/cimt-argument-mining-dataset

https://github.com/juliaromberg/cimt-argument-concreteness-dataset

https://github.com/juliaromberg/cimt-geographic-location-dataset

https://github.com/juliaromberg/cimt-thematic-categorization-dataset

Robust Methods for Classifying Argument Components in Public Participation Processes for Mobility Planning

In this publication in the Workshop on Argument Mining, Julia Romberg and Stefan Conrad address the robustness of classification algorithms for argument mining to build reliable models that generalize across datasets.

Abstract

Public participation processes allow citizens to engage in municipal decision-making processes by expressing their opinions on specific issues. Municipalities often only have limited resources to analyze a possibly large amount of textual contributions that need to be evaluated in a timely and detailed manner. Automated support for the evaluation is therefore essential, e.g. to analyze arguments.

In this paper, we address (A) the identification of argumentative discourse units and (B) their classification as major position or premise in German public participation processes. The objective of our work is to make argument mining viable for use in municipalities. We compare different argument mining approaches and develop a generic model that can successfully detect argument structures in different datasets of mobility-related urban planning. We introduce a new data corpus comprising five public participation processes. In our evaluation, we achieve high macro F1 scores (0.76 – 0.80 for the identification of argumentative units; 0.86 – 0.93 for their classification) on all datasets. Additionally, we improve previous results for the classification of argumentative units on a similar German online participation dataset.

Key findings

  • We conducted a comprehensive evaluation of machine learning methods across five public participation processes in German municipalities that differ in format (online participation platforms and questionnaires) and process subject.
  • BERT surpasses previously published argument mining approaches for public participation processes on German data for both tasks, reaching macro F1 scores of 0.76 – 0.80 for the identification of argumentative units and macro F1 scores of 0.86 – 0.93 for their classification.
  • In a cross-dataset evaluation, BERT models trained on one dataset can recognize argument structures in other public participation datasets (which were not part of the training) with comparable goodness of fit.
  • Such model robustness across datasets is an important step towards the practical application of argument mining in municipalities.
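
For readers unfamiliar with the metric, the macro F1 score reported above is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal sketch of that computation, not the evaluation code of the publication; the function name `macro_f1` and the example labels and predictions are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); guard against an empty denominator.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold labels and model predictions for four sentences.
y_true = ["premise", "premise", "major position", "non-argumentative"]
y_pred = ["premise", "major position", "major position", "non-argumentative"]

score = macro_f1(y_true, y_pred)
```

In a cross-dataset evaluation, the same function would be applied to predictions on a dataset that was not part of the training data.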

Publication

Romberg, Julia; Conrad, Stefan (2021, November). Citizen Involvement in Urban Planning – How Can Municipalities Be Supported in Evaluating Public Participation Processes for Mobility Transitions?. In Proceedings of the 8th Workshop on Argument Mining (pp. 89-99), Punta Cana, Dominican Republic. Association for Computational Linguistics. https://aclanthology.org/2021.argmining-1.9

New working group on mobility, accessibility and social participation at the ARL – Academy for Spatial Development in the Leibniz Association

We are pleased that Laura Mark is participating in this working group, where she discusses our research with colleagues. Practitioners and academics meet regularly in the working group to work on various topics related to mobility and social participation. The working group started in mid-2021 and its substantive work is gradually taking shape. Intersections with our research include the question of procedural justice in planning procedures for the mobility transition: who participates and whose voices are heard? How should planning and participation processes be designed in the future for a sustainable mobility transition that includes everyone? We will report on the ongoing work, publications and events that emerge from this working group.

Results of the first practical workshop of the junior research group CIMT

Our first practical workshop in summer 2020 focused on the question of how the evaluation of citizen contributions can be technically supported and what requirements practitioners have for a software solution designed to (partially) automate the evaluation.

More information can be found in the working paper (German version only!):