ExCalibr: Pascal Wallisch’s Award-Winning System for Educators to Calibrate Exams

NYU Center for Data Science
3 min read · Apr 26, 2022
Pascal Wallisch, CDS Clinical Associate Professor

Congratulations to CDS Clinical Faculty Member Pascal Wallisch for winning NYU’s Arts and Science Teaching Innovation Award, which honors faculty members who provide dynamic learning experiences and leverage technology-enhanced solutions to teach students. Award winners are invited to participate in the Interactive Teaching Innovation Conference to present their findings to the NYU community.

This year, Wallisch introduces ExCalibr, his algorithmic exam calibration system for measuring student understanding fairly. He describes the problem as follows: “the problem with all conventional exams is that we don’t know how fair any given question is, i.e. how students understand any given question. The question is made in the mind of an expert — the professor — but the professor is afflicted by a terrible curse — the curse of knowledge, so the professor might not be able to intuit how students understand the question.” Yet, in the absence of students with known abilities to test questions against, exam questions remain uncalibrated.

In most circumstances, questions are assumed to be fair until a student raises a concern and argues with the professor about the exact phrasing. That process is often reminiscent of attrition warfare ending in a negotiated settlement: it yields convoluted, carefully hedged, and bloated questions, which further compromises the exam’s purpose of measuring student understanding. To overcome this dilemma, Wallisch created a third alternative: algorithmically and iteratively recalculating each student’s score based on each individual question’s ability to predict success on the exam as a whole, until the process converges.

What exactly does this mean? The algorithm starts with an equal weighting of the questions, as in a regular exam. For each question, it calculates each student’s score on the rest of the exam, with that question left out. The question’s weight is then set to its discriminability index — its ability to distinguish strong from weak students on the rest of the exam. This is done for all questions, and the exam is regraded with the new weighted scores, iterating until the student scores stabilize. As a result of this process, each question is calibrated: truly unfair questions receive a weight of zero and are thus automatically eliminated.
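The iterative reweighting described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not Wallisch’s actual implementation: the function name, the use of NumPy, and the choice of the point-biserial correlation as the discriminability index are all assumptions for the sake of the example.

```python
import numpy as np

def calibrate_exam(responses, tol=1e-6, max_iter=100):
    """Sketch of iterative exam calibration (illustrative, not ExCalibr itself).

    responses: (n_students, n_questions) array, 1 = correct, 0 = incorrect.
    Returns (scores, weights) after the weights stop changing.
    """
    n_students, n_questions = responses.shape
    weights = np.ones(n_questions)  # start with equal weighting, as in a regular exam
    for _ in range(max_iter):
        new_weights = np.empty(n_questions)
        for j in range(n_questions):
            # Each student's weighted score on the rest of the exam,
            # with question j left out
            rest = np.delete(np.arange(n_questions), j)
            rest_scores = responses[:, rest] @ weights[rest]
            # Discriminability index (here: correlation between answering
            # question j correctly and performance on the rest of the exam)
            r = np.corrcoef(responses[:, j], rest_scores)[0, 1]
            # A question that cannot separate strong from weak students
            # (r <= 0, or undefined) receives a weight of zero
            new_weights[j] = r if np.isfinite(r) and r > 0 else 0.0
        if np.allclose(new_weights, weights, atol=tol):
            weights = new_weights
            break  # scores have stabilized
        weights = new_weights
    return responses @ weights, weights
```

On a toy exam where one question is answered correctly mostly by the weakest students, that question’s weight drops to zero in the first pass and stays there, while questions that track overall performance keep positive weights.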

His paper will be released in the next few months.

About Professor Wallisch

Pascal Wallisch received his PhD from the University of Chicago in late 2007. Subsequently, he worked as a research scientist at the Center for Neural Science at New York University, and now serves as a Clinical Associate Professor in the Department of Psychology and the Center for Data Science. He has received the Golden Dozen and Teach/Tech Awards during his tenure at NYU and a Booth Prize for Excellence in Teaching during his time at UChicago. His current research involves understanding music preference (specifically, emotional response to certain genres) in relation to seasonality. Learn more about his work from his laboratory, the Fox Lab.


NYU Center for Data Science

Official account of the Center for Data Science at NYU, home of the Undergraduate, Master’s, and Ph.D. programs in Data Science.