Incredible Alumni: Orion Taylor leads NYU Public Safety Lab’s launch of US jail statistics database

The Jail Data Initiative offers a new resource for up-to-date data on incarceration trends

Orion Taylor, CDS Alumnus

New York University’s Public Safety Lab recently launched the Jail Data Initiative (JDI), a dashboard that displays jail statistics collected from more than 1,300 publicly available county jail rosters in the United States, about a third of all county jails nationally. The platform fills in gaps in the existing Bureau of Justice Statistics (BJS) data sets by providing information on daily changes in jail populations, bail amounts, categories of criminal charges, and other data groupings. No other resource has been able to provide daily data on county jail populations since the pandemic, making the NYU initiative an invaluable resource for policymakers, reporters, and researchers within the field of public safety.

Throughout the project’s development, the JDI has employed several CDS students, including Orion Junius Taylor, now the Lead Data Scientist at NYU Public Safety Lab and a CDS MS program alumnus. To learn more, CDS asked Orion how the initiative got started and how the data has been used since.

What got you interested in data science?

I was working at a company called Medidata Solutions, which sold a platform for clinical trials. I would hear peripherally about the work being undertaken by the data science team there, e.g., on analyzing sensor data, genomics, etc., and how it was being used to improve trial quality. I realized I wanted to be able to contribute to the technical side of meaningful work like that, so I started taking some math and programming courses at City Tech, then applied to the CDS MS program.

How did the JDI get started and how did you get involved?

The JDI started as an idea Professor Anna Harvey (CDS affiliated faculty) had about how to create better jail data for researchers and advocates. In 2019, there was no daily individual-level data on jail incarceration in the United States, but in about one-third of counties, jails posted daily online rosters in the public record. Prof. Harvey proposed a large-scale web-scraping operation to collect, structure, and analyze this data. She came to the CDS research fair in 2019 to look for students interested in helping write web scrapers, and I applied to join for the summer (through a CDS Moore-Sloan grant!). I was hoping to find something with potential for “social good,” and it fit the bill.

Seeing this project through development and being involved in it over a couple of years, what has the real-world impact of this work been and how has the data been used since?

The most rewarding aspect of this project has been seeing our data used to support research and advocacy endeavors. Our population data is used by organizations like Prison Policy Initiative and the Vera Institute, and bail funds regularly use our booking records to find people across the country who are sitting in jail, unable to afford bail. Our work has also been used to assess the impacts of drug law reform, and it has been used by immigration advocacy groups and by larger organizations like the ACLU.

When COVID was beginning to spread to overcrowded detention facilities, our data helped people ensure local accountability in decarceration efforts. The Council on Criminal Justice reported our work in “Report: COVID-19 and Jails,” and various news outlets have used our data, such as the New York Times in “The Coronavirus Has Found a Safe Harbor.” We’ve also been able to devote resources to some side projects along the way, e.g., analyzing NYPD misconduct complaints by precinct.

Where do you see this project going next?

The bulk of the work for this project was creating a scalable scraping, cleaning, and databasing architecture. We’ve also had the opportunity to work on frontend development, creating the Jail Data Initiative site with interactive dashboards and other features. Now that our data is accumulating and our engineering is relatively stable, we can support more research endeavors. We just undertook a probabilistic record linkage program between our data and voter records for the 2020 general election, allowing us to assess causal relationships between jail incarceration and voting, and put out a working paper called “Voting From Jail” on this.
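
For readers unfamiliar with probabilistic record linkage, here is a minimal generic sketch in the Fellegi-Sunter style. It is not the JDI's actual pipeline: the field names, the m and u probabilities, and the threshold are all hypothetical values chosen for illustration. Each compared field adds a log-likelihood-ratio weight (positive when the values agree, negative when they disagree), and a pair is treated as a probable match when the summed weight clears a threshold.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | the two records are the same person)
#   u = P(field agrees | the two records are different people)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_year": (0.98, 0.05),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 likelihood ratios over the compared fields."""
    total = 0.0
    for field, (m, u) in params.items():
        a = rec_a.get(field)
        b = rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value adds no evidence either way
        # Simple normalization: ignore case and surrounding whitespace.
        agree = str(a).strip().lower() == str(b).strip().lower()
        if agree:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def is_probable_match(rec_a, rec_b, threshold=8.0):
    """Hypothetical decision rule: accept pairs at or above the threshold."""
    return match_weight(rec_a, rec_b) >= threshold


# Made-up records for illustration.
booking = {"last_name": "Doe", "first_name": "Jane", "birth_year": 1985}
voter_same = {"last_name": "DOE ", "first_name": "jane", "birth_year": 1985}
voter_other = {"last_name": "Doe", "first_name": "Jane", "birth_year": 1990}

print(is_probable_match(booking, voter_same))
print(is_probable_match(booking, voter_other))
```

In practice, the m and u values would be estimated from the data rather than set by hand, and exact string comparison would usually give way to fuzzier similarity measures.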

Do you have any advice for current CDS students who might be thinking about taking on projects like this?

It can be daunting to jump into a project like this, but you may wind up with a unique breadth of experience! I ultimately decided to work on the Jail Data Initiative full-time and finish my CDS coursework part-time, and I’ve felt that doing so gave me a somewhat unique (and valuable) experience among my classmates. This project and my CDS coursework complemented each other and made me a more well-rounded data scientist than I otherwise would have been.

If you are interested in learning more about the site, one of the JDI’s funders, The Pew Charitable Trusts, will be hosting a virtual tutorial of the dashboard followed by a Q&A session on Thursday, November 10, at 12:00 pm EST. Sign up for the webinar through the Pew Showcase Novel Jail Data Dashboard Registration page.

By Meryl Phair



Official account of the Center for Data Science at NYU, home of the Undergraduate, Master’s, and Ph.D. programs in Data Science.
