CDS at ICLR 2024: A Showcase of Cutting-Edge Research

NYU Center for Data Science
5 min read · Jun 12, 2024


New techniques in deep learning are revolutionizing how we understand and interpret the world around us. This year’s International Conference on Learning Representations (ICLR) in Vienna featured cutting-edge research from experts across the globe, including several CDS members. The conference gave these researchers a platform to present their work, learn from their peers, and contribute to the advancement of deep learning.

CDS Professor of Computer Science and Data Science Kyunghyun Cho

One of the keynote speakers at the conference was Kyunghyun Cho, who delivered a talk titled “Machine Learning in Prescient Design’s Lab-in-the-Loop Antibody Design.” Cho, affiliated with both CDS and Genentech (Prescient Design), also co-authored a paper that won one of the conference’s five Outstanding Paper Awards, “Protein Discovery with Discrete Walk-Jump Sampling.” The paper addresses challenges in training and sampling from discrete generative models, which are used to design new proteins. The authors introduced a method called Discrete Walk-Jump Sampling, which combines two existing techniques: sampling noise-smoothed sequences with Langevin dynamics (the “walk”) and mapping those samples back to clean, discrete sequences with a one-step denoiser (the “jump”). They tested the method on antibody proteins, showing that it produced high-quality, functional designs that performed well in laboratory experiments.
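
For readers curious about the mechanics, here is a minimal sketch of the walk-jump idea on toy one-hot “sequences.” The closed-form score function below is an illustrative stand-in for the paper’s learned denoising network, and all sizes, noise levels, and step counts are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
V, L, SIGMA = 4, 8, 0.5  # vocab size, sequence length, smoothing noise

def score_fn(y):
    # Score of a Gaussian-smoothed mixture over the V one-hot vectors,
    # applied independently per position (a stand-in for a trained model).
    eye = np.eye(V)                            # (V, V) candidate one-hots
    diffs = eye[None, :, :] - y[:, None, :]    # (L, V, V)
    logw = -np.sum(diffs ** 2, axis=-1) / (2 * SIGMA ** 2)
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("lv,lvk->lk", w, diffs) / SIGMA ** 2

def denoise(y):
    # "Jump": one-step denoising via Tweedie's formula, then argmax to
    # recover a clean discrete sequence.
    xhat = y + SIGMA ** 2 * score_fn(y)
    return xhat.argmax(axis=-1)

# "Walk": Langevin MCMC in the noise-smoothed continuous space.
y = rng.normal(size=(L, V))
step = 0.05
for _ in range(500):
    y += step * score_fn(y) + np.sqrt(2 * step) * rng.normal(size=y.shape)

print("sampled sequence:", denoise(y))
```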

Angelica Chen, a CDS PhD student, presented her Spotlight (top 5%) paper “Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs.” Chen’s work, a collaboration with CDS Faculty Fellow Ravid Shwartz-Ziv, CDS Professor of Computer Science and Data Science Kyunghyun Cho, and CDS alumna Naomi Saphra, among others, explores the training dynamics of interpretability artifacts in language models. “I loved hearing about the different approaches that other researchers had taken towards the same general research question,” Chen said. She noted the growing interest in understanding and analyzing the safety training of LLMs, but expressed a desire for more robust evaluation systems grounded in real-world usage.

CDS Alumna Naomi Saphra (left) and CDS PhD Student Angelica Chen (right)

CDS PhD student William Merrill presented his paper “The Expressive Power of Transformers with Chain of Thought.” Merrill noted the interesting discussions around state-space models vs. transformers, and how communication complexity, a framework in computer science, can be used to understand the limitations of these models. He found the interdisciplinary nature of the “Mathematical and Empirical Understanding of Foundation Models” workshop particularly valuable, connecting with students who share similar research interests. He also highlighted CDS Associate Professor of Computer Science and Data Science Andrew Gordon Wilson’s talk at the “How Far are We From AGI?” workshop as a standout.
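
The core intuition, that emitting and re-reading intermediate tokens lets a fixed model compute iterated functions it cannot compute in one pass, can be illustrated with a toy decoding loop. Here `one_step` is a hypothetical stand-in for a single transformer forward pass, and the task (running parity) is purely illustrative:

```python
def one_step(prompt: str) -> str:
    """Stand-in for one decoding step: reads a bounded amount of the
    context and emits a single token (the updated running parity)."""
    bits, _, scratch = prompt.partition("|")
    i = len(scratch)                      # number of bits processed so far
    prev = scratch[-1] if scratch else "0"
    return str(int(prev) ^ int(bits[i]))

def decode_with_cot(bits: str) -> str:
    # Autoregressive loop: each emitted token is appended to the prompt
    # and read back, threading state through the scratchpad.
    prompt = bits + "|"
    for _ in range(len(bits)):
        prompt += one_step(prompt)
    return prompt[-1]                     # final token = parity of all bits

print(decode_with_cot("1101"))            # -> "1" (three ones, odd parity)
```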

CDS PhD Student William Merrill

Vishakh Padmakumar, another CDS PhD student, presented his work “Does Writing With Language Models Reduce Content Diversity?”, co-authored with his advisor He He. Padmakumar’s research centers on natural language processing and human-AI collaboration, with a focus on collaborative text generation for creative writing tasks.
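
As a flavor of how homogenization can be quantified in studies like this one, a simple distinct-n metric counts the fraction of unique n-grams across a collection of texts. This is a generic illustration, not the paper’s exact evaluation:

```python
def distinct_n(texts, n=2):
    # Fraction of unique n-grams; higher values indicate more diversity.
    grams, total = set(), 0
    for text in texts:
        tokens = text.split()
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i : i + n]))
            total += 1
    return len(grams) / max(total, 1)

varied = ["the cat sat on the mat", "a storm rolled over the quiet harbor"]
repetitive = ["the cat sat on the mat", "the cat sat near the mat"]
print(distinct_n(varied), distinct_n(repetitive))
```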

CDS PhD Student Vishakh Padmakumar

Zahra Kadkhodaie, also a CDS PhD student, gave a talk on her paper “Generalization in diffusion models arises from geometry-adaptive harmonic representations” (a top 1.2% submission), which was another of the conference’s five Outstanding Paper Award winners. The paper was co-authored with CDS Faculty Fellow Florentin Guth; CDS Professor of Neural Science, Mathematics, Data Science, and Psychology Eero Simoncelli; and Stéphane Mallat, professor of mathematics at ENS. Kadkhodaie highlighted several papers by others at the conference that she found intriguing, including “Idempotent Generative Network” from Alexei A. Efros’ lab at UC Berkeley, “What’s in a Prior? Learned Proximal Networks for Inverse Problems” by Zhenghan Fang et al., and Xiaosen Zheng et al.’s “Intriguing Properties of Data Attribution on Diffusion Models.”
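
A simplified version of the paper’s central probe, checking whether denoisers trained on disjoint halves of a dataset converge to nearly the same function, can be sketched with closed-form linear (Wiener) denoisers standing in for the paper’s trained neural networks. The data and noise level here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, sigma = 16, 5000, 0.5

# Toy correlated Gaussian "images".
A = rng.normal(size=(d, d)) / np.sqrt(d)
x = rng.normal(size=(n, d)) @ A.T

def wiener(xs):
    # Optimal linear denoiser y -> E[x | y], estimated from one data split.
    C = np.cov(xs, rowvar=False)
    return C @ np.linalg.inv(C + sigma ** 2 * np.eye(d))

W1 = wiener(x[: n // 2])          # "trained" on the first half
W2 = wiener(x[n // 2 :])          # "trained" on the disjoint second half

x_test = rng.normal(size=(100, d)) @ A.T          # fresh test signals
y = x_test + sigma * rng.normal(size=(100, d))    # noisy observations
out1, out2 = y @ W1.T, y @ W2.T
gap = np.linalg.norm(out1 - out2) / np.linalg.norm(out1)
print(f"relative disagreement between the two denoisers: {gap:.3f}")
```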

CDS PhD Student Zahra Kadkhodaie

Adding to the impressive lineup of CDS presenters, CDS PhD student Yunzhen Feng showcased his work at the main conference. With co-authors Shanmukha Ramakrishna Vedantam and CDS Silver Professor of Computer Science, Mathematics, and Data Science Julia Kempe, Feng presented their paper “Embarrassingly Simple Dataset Distillation.”
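
Dataset distillation is commonly framed as a bilevel problem: a small synthetic dataset is treated as learnable parameters, a model is trained on it for a few differentiable steps, and the synthetic data is then updated so that the resulting model performs well on real data. The sketch below illustrates that general recipe with a tiny linear model and toy data; it is not the paper’s exact algorithm:

```python
import torch

torch.manual_seed(0)
d, n_real, n_syn = 10, 256, 8

# Toy "real" task: labels from a fixed linear teacher.
w_true = torch.randn(d)
X_real = torch.randn(n_real, d)
y_real = X_real @ w_true

# The distilled dataset: 8 learnable synthetic points.
X_syn = torch.randn(n_syn, d, requires_grad=True)
y_syn = torch.randn(n_syn, requires_grad=True)
outer_opt = torch.optim.Adam([X_syn, y_syn], lr=0.05)

for it in range(300):
    w = torch.zeros(d, requires_grad=True)       # fresh inner model
    for _ in range(10):                          # differentiable inner loop
        inner_loss = ((X_syn @ w - y_syn) ** 2).mean()
        (g,) = torch.autograd.grad(inner_loss, w, create_graph=True)
        w = w - 0.1 * g                          # unrolled SGD step
    outer_loss = ((X_real @ w - y_real) ** 2).mean()
    outer_opt.zero_grad()
    outer_loss.backward()                        # backprop through training
    outer_opt.step()

print(f"real-data loss of a model trained only on the 8 synthetic "
      f"points: {outer_loss.item():.4f}")
```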

The conference’s workshops also saw significant contributions from CDS members. Elvis Dohmatob, Yunzhen Feng, Pu Yang, François Charton, and Julia Kempe presented “A Tale of Tails: Model Collapse as a Change of Scaling Laws” at the Workshop on Navigating and Addressing Data Problems for Foundation Models (DPFM).
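
Model collapse is easy to reproduce in miniature: repeatedly fit a generative model to samples drawn from the previous generation’s model rather than from real data, and the distribution degenerates. A purely illustrative Gaussian toy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                    # samples per generation
data = rng.normal(0.0, 1.0, size=n)       # generation 0: real data

for gen in range(1, 11):
    mu, std = data.mean(), data.std()      # "train" the generative model
    data = rng.normal(mu, std, size=n)     # next generation: synthetic only
    print(f"generation {gen}: fitted std = {std:.3f}")
```

Because each fit is estimated from a finite synthetic sample, the estimated spread tends to drift downward across generations, and the tails of the distribution are lost first.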

At the Workshop on Bridging the Gap Between Practice and Theory in Deep Learning (BGPT), CDS was well-represented with two presentations. CDS PhD student Nikolaos Tsilivis, along with Natalie Frank and Julia Kempe, shared their findings on “The Best Algorithm for Adversarial Training.” In the same workshop, Elvis Dohmatob, Yunzhen Feng, and Julia Kempe delved into theoretical aspects of deep learning with their presentation “Towards a Theoretical Understanding of Model Collapse.”
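
For context, adversarial training alternates an inner maximization, which searches for a worst-case input perturbation, with an outer descent step on the resulting loss. Below is a generic PGD-style sketch of one such step, with an illustrative model and hyperparameters; this is the standard recipe, not the specific algorithm the presentation argues for:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
eps, alpha, steps = 0.1, 0.02, 5

# Inner loop: ascend the loss within an L-infinity ball of radius eps.
delta = torch.zeros_like(x, requires_grad=True)
for _ in range(steps):
    loss = F.cross_entropy(model(x + delta), y)
    (g,) = torch.autograd.grad(loss, delta)
    with torch.no_grad():
        delta += alpha * g.sign()          # gradient ascent step
        delta.clamp_(-eps, eps)            # project back onto the ball

# Outer step: descend on the adversarial loss.
opt.zero_grad()
adv_loss = F.cross_entropy(model(x + delta.detach()), y)
adv_loss.backward()
opt.step()
print(f"adversarial loss after one training step: {adv_loss.item():.4f}")
```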

Some CDS members had work at the conference but were unable to attend in person. Yucen Lily Li, a Courant PhD student, co-authored “A Study of Bayesian Neural Network Surrogates for Bayesian Optimization” with CDS Instructor and Faculty Fellow Tim G. J. Rudner and Andrew Gordon Wilson. Additionally, recent CDS PhD alumnus Artem Vysogorets contributed two papers to the Data-centric Machine Learning Research workshop: “Towards Efficient Active Learning in NLP via Pretrained Representations” and, with Julia Kempe, “Towards Robust Data Pruning.”
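
For readers unfamiliar with the setup of Li’s paper, Bayesian optimization queries an expensive black-box function wherever a surrogate model’s predictions look most promising. The sketch below uses a small deep ensemble as an uncertainty-aware neural surrogate, a simple stand-in for the Bayesian neural networks the paper studies, with an illustrative objective and acquisition rule:

```python
import torch

torch.manual_seed(0)

def f(x):                                  # black-box objective to minimize
    return torch.sin(3 * x) + 0.1 * x ** 2

def fit_ensemble(X, y, k=5, epochs=200):
    # Train k small regressors; their spread serves as an uncertainty proxy.
    models = []
    for _ in range(k):
        net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                                  torch.nn.Linear(32, 1))
        opt = torch.optim.Adam(net.parameters(), lr=0.01)
        for _ in range(epochs):
            opt.zero_grad()
            loss = ((net(X) - y) ** 2).mean()
            loss.backward()
            opt.step()
        models.append(net)
    return models

X = torch.tensor([[-2.0], [0.5], [2.0]])   # initial observations
y = f(X)
for step in range(5):
    models = fit_ensemble(X, y)
    cand = torch.linspace(-3, 3, 200).unsqueeze(1)
    with torch.no_grad():
        preds = torch.stack([m(cand) for m in models])   # (k, 200, 1)
    lcb = preds.mean(0) - 2.0 * preds.std(0)   # lower confidence bound
    x_next = cand[lcb.argmin()].unsqueeze(0)   # most promising candidate
    X, y = torch.cat([X, x_next]), torch.cat([y, f(x_next)])
    print(f"step {step}: queried x = {x_next.item():+.3f}")

print(f"best observed value: {y.min().item():.3f}")
```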

CDS members’ experiences at ICLR revealed a few key trends in the field. Many researchers were focusing on understanding the capabilities and limitations of language models. Specifically, the conference highlighted the growing interest in state-space models and their comparison to transformers, as well as the use of communication complexity to analyze these models’ limitations. Additionally, the interdisciplinary nature of the workshops and talks at ICLR underscored the importance of collaboration and knowledge-sharing across different fields to advance the understanding and application of deep learning techniques.

Beyond the research, the CDS members also enjoyed the opportunity to connect with each other and explore Vienna. Merrill fondly recalled visiting the Albertina museum with his fellow NYU attendees. The conference not only provided a platform for showcasing cutting-edge research but also fostered a sense of community and camaraderie among the attendees.

Vishakh Padmakumar (CDS PhD Student), Angelica Chen (CDS PhD Student), Naomi Saphra (CDS Alumna), and William Merrill (CDS PhD Student)

