Neural Data Science: A CSHL Summer School Helping Neuroscientists Find The Signal In The Noise
Neuroscientists today can record more data in a single second than a typical neuroscientist thirty years ago would record in an entire lifetime. But data does not interpret itself, and without a proper toolkit, all these petabytes of data are useless. A summer school named “Neural Data Science,” led by Pascal Wallisch, CDS Clinical Associate Professor of Data Science and Psychology, and Mark Reimers, an associate professor at Michigan State University, teaches that toolkit.
The Neural Data Science summer school, held at the Cold Spring Harbor Laboratory every other year since 2015, is an intensive two-week course that aims to equip neuroscience PhDs and postdocs with the skills to handle large neural datasets. The course is designed to address the growing need for data science skills in neuroscience, a field that is increasingly dealing with large and high-dimensional datasets. As Wallisch puts it, “you will not understand the brain one neuron at a time. You need to look at more neurons. And once you do, you have to find a way to handle that. In a way, ‘more neurons, more problems’.”
The course is designed to handle this deluge of data. It is structured around lectures, coding sessions, and one-on-one mentoring, with a strong emphasis on community building and continuous learning. It also draws some of the world’s most talented PhD students and postdocs, and pairs them with some of the world’s most accomplished faculty.
Kate Hong, now an assistant professor at Carnegie Mellon University, took the course in 2015 when she was a postdoc at Columbia. “It was an ideal time for me to take the course because I was just starting to collect high-speed imaging and video tracking data that required analytical tools outside of my area of expertise.”
In an interview, Wallisch laid out how the course bridges the gap between data science and neuroscience. For eighty years, he explained, neuroscientists predominantly recorded one neuron at a time and tried to infer insights from its spike times. Within the last ten years, however, there’s been a revolution in data recording, and it’s now common to get data from thousands of neurons concurrently. But while the ability to record from more neurons is advancing rapidly, the ability to extract insights from these large datasets is lagging. This is where data science becomes crucial.
Wallisch’s approach to teaching is rooted in his belief that data does not speak for itself; one must have the skills and knowledge to interpret it correctly. This philosophy is reflected in the course’s focus on teaching students how to handle big neural data and extract meaningful insights from it. To do this, the instructors provide students with a set of real data from 1,300 neurons and teach them to interpret it using a range of cutting-edge tools and methods, including Kilosort, DeepLabCut and UMAP.
The days are long and intense, often stretching late into the night, but the immersive nature of the course allows for deep discussions and a strong sense of camaraderie among the participants. Stefanos Stagkourakis, a K99-funded postdoc now at Caltech who attended in 2019, spoke to the course’s intensity: “The inspiring lectures given by scientists who cared deeply for content transmission made it possible to maintain focus over long sessions.”
The Neural Data Science summer school is the beginning of a larger conversation. Wallisch and Reimers and their team — which this year included co-instructors Hadas Ben-Esti and Jennifer Sun, and Scholar-in-Residence Michael X. Cohen — stay in touch with students, tracking their progress and meeting up at conferences like the Society for Neuroscience meeting. This ongoing engagement fosters a sense of community and allows for continuous learning and collaboration.
In fact, this year the students visited the lab of a Neural Data Science alumna, Helen Hou (class of 2017), who is now an assistant professor at Cold Spring Harbor Laboratory. “They got to see all the cutting edge analysis methods they just learned in class in action in a new research lab,” said Hou, “and how we use these methods to make sense of complex behavior and neural data.”
Wallisch noted that the course aligns closely with CDS’s goal of advancing the methods of data science and their application to all areas of society and knowledge. “So I’m actually fulfilling CDS’s founding mission of ‘arming researchers and professionals with tools to harness the power of big data’,” said Wallisch, “even while I’m not here at NYU.”
It’s an exciting time to be at the intersection of data science and neuroscience, and we look forward to seeing the impact of Wallisch and Reimers’ course on real-world discoveries.
By Stephen Thomas