Methods for trustworthy application of Large Language Models in PER

Author

Rebeckah Fussell
Megan Flynn
Anil Damle
Natasha Holmes

Abstract

Within physics education research (PER), a growing body of literature investigates the use of machine learning algorithms for natural language processing to apply coding schemes to student writing. The aspiration is that this form of measurement may be more efficient and consistent than similar measurements made through human analysis, allowing larger and broader data sets to be analyzed. In our work, we harness recent innovations in Large Language Models (LLMs), such as BERT and LLaMA, to learn complex coding-scheme rules. Furthermore, we leverage methods from uncertainty quantification to help understand the trustworthiness of these measurements. In this talk, I will demonstrate a successful application of LLMs to measuring experimental skills in lab notes and apply our methodology to evaluate the statistical and systematic uncertainty in this form of algorithmic measurement. This work is supported by NSF Grants #2000739 and #1808945.
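For readers unfamiliar with this workflow, the sketch below illustrates one way it can look in practice; it is an assumption-laden example, not the authors' pipeline. It fine-tunes a BERT-style classifier to apply a binary coding scheme to student lab notes, then repeats training over several random seeds to estimate run-to-run (statistical) variation in the measured code frequency. The model choice, hyperparameters, and the load_coded_lab_notes() data-loading helper are all hypothetical.

```python
# Minimal sketch (not the authors' implementation): fine-tune a BERT-style
# classifier to apply a binary coding scheme to student lab notes, then repeat
# training over several random seeds to estimate statistical uncertainty in the
# measured code frequency. Model choice, hyperparameters, and the data-loading
# helper are illustrative assumptions.
import numpy as np
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Hypothetical helper: returns lab-note excerpts and human-applied codes (0/1).
texts, labels = load_coded_lab_notes()

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = Dataset.from_dict({"text": texts, "label": labels}).map(tokenize, batched=True)
split = dataset.train_test_split(test_size=0.2, seed=0)

code_fractions = []
for seed in range(5):  # rerun training to probe run-to-run (statistical) variation
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    args = TrainingArguments(output_dir=f"run_{seed}", num_train_epochs=3,
                             seed=seed, report_to="none")
    trainer = Trainer(model=model, args=args,
                      train_dataset=split["train"], eval_dataset=split["test"])
    trainer.train()
    # Fraction of held-out notes to which the model assigns the code.
    preds = trainer.predict(split["test"]).predictions.argmax(axis=-1)
    code_fractions.append(preds.mean())

print(f"code fraction: {np.mean(code_fractions):.3f} "
      f"+/- {np.std(code_fractions, ddof=1):.3f}")
```

Systematic uncertainty could be probed analogously, for example by varying the base model or the composition of the human-coded training set and observing how the measured code frequency shifts.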

Conference Name

APS April Meeting 2024

URL

https://ui.adsabs.harvard.edu/abs/2024APS..APRB16004F

Group (Lab)

Natasha Holmes Group
