Unlocking the Truth in Social Science Data: A New Approach to Detecting Real Changes

NYU Center for Data Science
Feb 23, 2024

Under the imperfect data collection conditions of real-world social science, separating genuine insights from noise can be daunting. Because controlled experiments are often not possible, social scientists frequently rely on observational data to infer causality. The regression discontinuity design (RDD) has become a standard method for this purpose, but noisy data makes it difficult to apply with confidence. Bryant J. Moy, a CDS Faculty Fellow and Visiting Assistant Professor of Politics, and Jiaxu Ren, an alumnus of NYU Tandon Engineering & the Courant Institute of Mathematics, argue that Change Point Analysis (CPA) can help applied researchers in these challenging settings. Their recent poster on this topic, “Improving Confidence in RDDs: Discontinuity Detection through Bayesian Change Point Analysis,” received the Best Poster Prize at Asian & MENA PolMeth.

Moy and Ren’s work introduces CPA as a tool within the RDD framework for assessing the evidence that a discontinuity actually exists at the point of interest. “The core challenge in social science is trying to estimate small treatment effects with very noisy data,” Moy said. Their methodology seeks not only to scrutinize the traditional application of RDDs but also to pave the way for more credible interpretations of the findings.

Moy and Ren use two case studies to illustrate the benefits of the CPA approach to RDD. The first revisits a 2021 paper that finds a surprisingly large 5-to-10-year treatment effect of narrowly winning an election. While others have raised general skepticism about the paper’s findings because of statistical power concerns, the NYU researchers provide a more helpful framework for analyzing the evidence of discontinuities in the data. Examining the raw data with a Bayesian Change Point Analysis, Moy and Ren find evidence of multiple discontinuities and a low probability that a discontinuity exists at the theorized location, which points toward noisy data.

In their second case study, by contrast, they analyzed data from a paper on the effect of George Floyd’s murder on public opinion. The paper finds that George Floyd’s murder and the protests it sparked increased the belief that discrimination against Black people is a problem. Analyzing a subset of that paper’s data, Moy and Ren find clear and convincing evidence of a single discontinuity in time, coinciding with George Floyd’s murder, at which opinions about discrimination changed.
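To make the idea of "the probability that a discontinuity exists at a given location" concrete, here is a minimal sketch of a Bayesian change point calculation. It is not Moy and Ren's implementation: it assumes at most one shift in the mean, Gaussian noise with a known standard deviation, flat priors on the two segment means, and a uniform prior over candidate locations. The function name `change_point_posterior` and the simulated data are hypothetical.

```python
import numpy as np

def change_point_posterior(y, sigma=1.0):
    """Posterior over the location of a single mean shift in y.

    Illustrative assumptions: Gaussian noise with known sd `sigma`,
    flat priors on the two segment means, uniform prior on location.
    Entry k is the posterior probability that y[k] is the first
    observation after the shift.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    log_post = np.full(n, -np.inf)
    for k in range(1, n):  # keep at least one point on each side
        left, right = y[:k], y[k:]
        ss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        # Log marginal likelihood with both segment means integrated out,
        # up to a constant that does not depend on k.
        log_post[k] = -0.5 * np.log(k * (n - k)) - ss / (2 * sigma ** 2)
    log_post -= log_post.max()  # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Simulated example: a clear jump at index 50 versus pure noise.
rng = np.random.default_rng(0)
jump = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])
noise = rng.normal(0, 1, 100)
print("P(change at 50 | data with jump):", change_point_posterior(jump)[50])
print("P(change at 50 | pure noise):   ", change_point_posterior(noise)[50])
```

In a sketch like this, a posterior concentrated on the theorized location corresponds to the pattern described in the second case study, while a posterior with little mass there corresponds to the first. A realistic analysis would also allow for multiple change points and an unknown noise level.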

By employing CPA to detect significant breakpoints without assuming in advance where they occur, Moy and Ren challenge the conventional RDD methodology. This approach promises to peel back the layers of ambiguity, revealing clearer, more accurate patterns within the data. “This is me attempting to try to answer what [Columbia Professor of Statistics and Political Science] Andrew Gelman is talking about when he critiques analysis with noisy data,” Moy explained, signaling a move towards a more rigorous examination of evidence in social science research.

Their work calls for a paradigm shift in how researchers approach RDD analysis. By demonstrating CPA’s effectiveness in identifying meaningful discontinuities, Moy and Ren not only enhance the validity of RDD findings but also ignite a broader discussion on evidentiary standards in research. Advocating for a Bayesian approach, their study encourages a meticulous, evidence-based analysis, marking a significant advancement in the pursuit of credible social science research.

Moy and Ren advocate for a comprehensive strategy: visualizing raw data, conducting CPA, and then proceeding with RDD, albeit with an eye towards the descriptive evidence found in the CPA. If adopted, their techniques would increase the trustworthiness of social science, leading to more dependable findings. It’s an ambitious and worthy vision.
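As a rough illustration of the final step in that sequence, the sketch below estimates a sharp RDD effect by fitting a separate straight line on each side of the cutoff and comparing the two fitted values at the cutoff. This is a generic textbook-style estimator, not the authors' code; the function name `rdd_local_linear`, the fixed bandwidth, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def rdd_local_linear(x, y, cutoff=0.0, bandwidth=1.0):
    """Estimate the jump in y at `cutoff` with separate linear fits.

    Only observations within `bandwidth` of the cutoff are used.
    Uniform weighting and a hand-picked bandwidth are simplifying
    assumptions for illustration.
    """
    x = np.asarray(x, dtype=float) - cutoff  # center so the intercept is the value at the cutoff
    y = np.asarray(y, dtype=float)
    left = (x < 0) & (x >= -bandwidth)
    right = (x >= 0) & (x <= bandwidth)
    _, below = np.polyfit(x[left], y[left], 1)    # returns (slope, intercept)
    _, above = np.polyfit(x[right], y[right], 1)
    return above - below

# Simulated running variable with a true jump of 1.0 at the cutoff.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 400)
y = 0.5 * x + 1.0 * (x >= 0) + rng.normal(0, 0.3, 400)
print("Estimated discontinuity:", rdd_local_linear(x, y, cutoff=0.0, bandwidth=0.5))
```

Under the workflow described above, an estimate like this would be read alongside a plot of the raw data and the change point evidence, rather than on its own.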

By Stephen Thomas
