Generative Grading: Near Human-level Accuracy for Automated Feedback on Richly Structured Problems

Malik, Ali; Wu, Mike; Vasavada, Vrinda; Song, Jinpeng; Coots, Madison; Mitchell, John; Goodman, Noah; Piech, Chris

arXiv:1905.09916 (cs)

[Submitted on 23 May 2019 (v1), last revised 23 Mar 2021 (this version, v5)]

Abstract: Access to high-quality education at scale is limited by the difficulty of providing student feedback on open-ended assignments in structured domains like computer programming, graphics, and short response questions. This problem has proven to be exceptionally difficult: for humans, it requires large amounts of manual work, and for computers, until recently, achieving anything near human-level accuracy has been unattainable. In this paper, we present generative grading: a novel computational approach for providing feedback at scale that is capable of accurately grading student work and providing nuanced, interpretable feedback. Our approach uses generative descriptions of student cognition, written as probabilistic programs, to synthesise millions of labelled example solutions to a problem; we then learn to infer feedback for real student solutions based on this cognitive model.
We apply our methods to three settings. In block-based coding, we achieve a 50% improvement upon the previous best results for feedback, achieving super-human accuracy. In two other widely different domains -- graphical tasks and short text answers -- we achieve major improvement over the previous state of the art by about 4x and 1.5x respectively, approaching human accuracy. In a real classroom, we ran an experiment where we used our system to augment human graders, yielding doubled grading accuracy while halving grading time.
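
As a rough illustration of the synthesis step described in the abstract, the sketch below samples labelled "student solutions" from a tiny hand-written probabilistic program. It is not the authors' implementation: the task (drawing a square in a block-style language), the decision probabilities, the label names, and the functions `sample_solution` and `synthesise` are all hypothetical, chosen only to show how each random choice in the generative model can emit both a piece of the solution and the feedback label that explains it.

```python
import random
from collections import Counter


def sample_solution(rng):
    """Sample one (program_text, labels) pair from a toy model of a student.

    Every probability and label below is an illustrative assumption.
    """
    labels = []
    # Decision 1: does the student use a loop at all?
    if rng.random() < 0.8:
        # Decision 2: is the loop count correct (a square has 4 sides)?
        if rng.random() < 0.7:
            count = 4
        else:
            count = rng.choice([3, 5])
            labels.append("wrong_loop_count")
        # Decision 3: is the turn angle correct?
        if rng.random() < 0.75:
            angle = 90
        else:
            angle = rng.choice([45, 60, 120])
            labels.append("wrong_angle")
        program = f"repeat {count}: move 50; turn {angle}"
    else:
        labels.append("no_loop")
        program = "move 50; turn 90; move 50"
    return program, tuple(sorted(labels))


def synthesise(n, seed=0):
    """Draw n labelled examples; a fixed seed makes the dataset reproducible."""
    rng = random.Random(seed)
    return [sample_solution(rng) for _ in range(n)]


dataset = synthesise(1000)
label_counts = Counter(label for _, labels in dataset for label in labels)
```

Under these assumptions, `dataset` holds pairs of solution text and feedback labels that a supervised model could be trained on, so that it predicts labels for real student submissions. The inference step is deliberately left out of this sketch.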

Comments:	10 pages of content
Subjects:	Machine Learning (cs.LG); Computers and Society (cs.CY); Machine Learning (stat.ML)
Cite as:	arXiv:1905.09916 [cs.LG]
	(or arXiv:1905.09916v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1905.09916

Submission history

From: Ali Malik
[v1] Thu, 23 May 2019 20:47:22 UTC (4,411 KB)
[v2] Fri, 13 Sep 2019 08:39:18 UTC (4,489 KB)
[v3] Fri, 25 Sep 2020 05:38:42 UTC (4,494 KB)
[v4] Fri, 19 Mar 2021 20:08:08 UTC (4,666 KB)
[v5] Tue, 23 Mar 2021 18:07:20 UTC (4,666 KB)
