Enabling Researchers to Automatically Code their Text Responses from Assessments
Exchange Your Ideas on Text Classification


Fabian Zehner, DIPF Frankfurt
Nico Andersen, DIPF Frankfurt


When test takers respond to test questions in natural language, the responses typically need to be scored by humans. We use natural language processing (NLP) techniques, including embeddings that represent word semantics, to group (i.e., cluster) and classify responses. This pipeline is available through the graphical user interface of shinyReCoR, the R-based Shiny app of ReCo, so that people without knowledge of NLP or machine learning (ML) coding frameworks can use it. For this reason, our app focuses on making the classification process transparent and offers many interactive ways to diagnose the resulting models and classifiers, for example by visualizing the semantic space.
At the Coding.Waterkant, we plan to develop two new diagnostic and visualization features.
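To make the general idea of the pipeline concrete, here is a minimal sketch in Python. It is not the shinyReCoR implementation (which is written in R); the toy two-dimensional embeddings, the example responses, the human codes, and all function names are made up for illustration. It represents each response as the mean of its word vectors, groups the responses with a plain k-means, and then classifies a new response by the majority human code of its nearest cluster.

```python
import math
from collections import Counter

# Toy 2-d word embeddings (hypothetical values, for illustration only).
EMBEDDINGS = {
    "sun": (1.0, 0.1), "light": (0.9, 0.2), "warm": (0.8, 0.0),
    "rain": (0.1, 1.0), "water": (0.0, 0.9), "wet": (0.2, 0.8),
}

def embed(response):
    """Represent a response as the mean vector of its known words."""
    vectors = [EMBEDDINGS[w] for w in response.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))

def nearest(vector, centroids):
    """Index of the centroid closest to the vector (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vector, centroids[i]))

def kmeans(vectors, centroids, iterations=10):
    """Plain k-means, started from the given centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Hypothetical training responses with human-assigned codes.
responses = ["the sun is warm", "warm light", "rain is wet", "wet water"]
human_codes = ["correct", "correct", "incorrect", "incorrect"]

vectors = [embed(r) for r in responses]
centroids = kmeans(vectors, centroids=[vectors[0], vectors[2]])
clusters = [nearest(v, centroids) for v in vectors]

# Each cluster takes the most frequent human code among its members.
cluster_code = {
    c: Counter(code for code, k in zip(human_codes, clusters) if k == c).most_common(1)[0][0]
    for c in set(clusters)
}

def classify(response):
    """Code a new response via the cluster nearest to its embedding."""
    return cluster_code[nearest(embed(response), centroids)]

print(clusters)
print(classify("sun light"), classify("rain water"))
```

In a real setting, the embeddings would come from a trained semantic space rather than a hand-written dictionary, and the cluster structure is exactly what diagnostic visualizations of the semantic space would aim to make inspectable.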
Literature

Andersen, N., & Zehner, F. (2021). shinyReCoR: A Shiny Application for Automatically Coding Text Responses Using R. Psych, 3(3), 422–446. doi: 10.3390/psych3030030
Zehner, F., Sälzer, C., & Goldhammer, F. (2016). Automatic Coding of Short Text Responses via Clustering in Educational Assessment. Educational and Psychological Measurement, 76(2), 280–303. doi: 10.1177/0013164415590022


We use a publicly released demo data set from ReCo with about 4,000 text responses to a demo item. For internal purposes, we work with confidential text responses from the PISA assessment.

How you can contribute

We are happy to exchange ideas on text classification and how to provide easy access to current NLP and ML methods.