# Enabling Researchers to Automatically Code their Text Responses from Assessments

### Contact

Fabian Zehner, DIPF Frankfurt\
Nico Andersen, DIPF Frankfurt

### Description

When test takers answer test questions in natural language, their responses typically need to be scored by humans. We use natural language processing techniques, in particular embeddings that represent word semantics, to group (i.e., cluster) and classify responses. This pipeline is available through the graphical user interface of shinyReCoR, the R-based Shiny app of [ReCo](https://www.reco.science/), so that people without knowledge of NLP or ML coding frameworks can use it. The app therefore focuses on making the classification process transparent and offers many interactive ways to diagnose the resulting models and classifiers, among other things by visualizing the semantic space.
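As a rough illustration of the idea of embedding, clustering, and classifying responses, here is a minimal Python sketch. It is not shinyReCoR's actual R code: the toy word vectors, the example responses and codes, and the function names `embed`, `kmeans`, `fit`, and `classify` are all assumptions made up for this example. It averages word vectors per response, clusters the responses with a simple k-means, gives each cluster the majority human code of its members, and classifies a new response by its nearest cluster centroid.

```python
import math
from collections import Counter

# Toy 2-d word vectors, invented for illustration; a real semantic space
# would have many more dimensions and be learned from a text corpus.
TOY_VECTORS = {
    "sun": (1.0, 0.1),
    "heat": (0.9, 0.2),
    "warm": (0.8, 0.0),
    "moon": (0.1, 1.0),
    "night": (0.0, 0.9),
    "dark": (0.2, 0.8),
}


def embed(response):
    """Represent a response as the mean vector of its known words."""
    vectors = [TOY_VECTORS[w] for w in response.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return (0.0, 0.0)  # no known words: fall back to the origin
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=10):
    """Plain k-means, initialised deterministically with the first k points."""
    centroids = list(points[:k])
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]


def fit(coded_responses, k):
    """Cluster human-coded responses and give each cluster its majority code."""
    points = [embed(text) for text, _ in coded_responses]
    centroids, assignments = kmeans(points, k)
    code_by_cluster = {}
    for cluster in set(assignments):
        codes = [code for (_, code), a in zip(coded_responses, assignments) if a == cluster]
        code_by_cluster[cluster] = Counter(codes).most_common(1)[0][0]
    return centroids, code_by_cluster


def classify(response, centroids, code_by_cluster):
    """Assign the code of the cluster whose centroid is nearest to the response."""
    return code_by_cluster.get(nearest(embed(response), centroids))


# Hypothetical human-coded training responses.
coded_responses = [
    ("the sun gives heat", "correct"),
    ("it is dark at night", "incorrect"),
    ("warm sun", "correct"),
    ("moon at night", "incorrect"),
]

centroids, code_by_cluster = fit(coded_responses, k=2)
print(classify("heat from the sun", centroids, code_by_cluster))  # correct
print(classify("dark moon", centroids, code_by_cluster))          # incorrect
```

Because every step is an explicit, inspectable object (response vectors, centroids, cluster codes), a pipeline of this shape lends itself to the kind of diagnostics and semantic-space visualizations mentioned above.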

At Coding.Waterkant, we plan to develop two new diagnostic and visualization features.

**Literature**\
Andersen, N., & Zehner, F. (2021). shinyReCoR: A Shiny Application for Automatically Coding Text Responses Using R. *Psych, 3*(3), 422–446. doi: 10.3390/psych3030030\
Zehner, F., Sälzer, C., & Goldhammer, F. (2016). Automatic Coding of Short Text Responses via Clustering in Educational Assessment. *Educational and Psychological Measurement, 76*(2), 280–303. doi: 10.1177/0013164415590022

### Dataset

We use a publicly released demo data set from ReCo containing about 4,000 text responses to a demo item. For internal purposes, we work with confidential text responses from the PISA assessment.

### How you can contribute

We are happy to exchange ideas on text classification and how to provide easy access to current NLP and ML methods.