Blending Theory and Utility: The Vision and Impact of CDS’s MaD Group
Mathematicians want beauty; engineers want to build useful things. Some people want both, and those people, if attending NYU, join the Math and Data research group at the Center for Data Science (CDS), which, thanks to the ever-broader applicability of AI, is now working on some of the most important problems currently facing humanity.
In a series of interviews, Joan Bruna, an associate professor in Computer Science, CDS, and Mathematics, and Qi Lei, an assistant professor in Mathematics and CDS, outlined the vision, significance, and groundbreaking research of this dynamic group.
The MaD Seminar and the Birth of MaD
In 2017, before the MaD group existed, Joan Bruna, Carlos Fernandez-Granda (incoming Interim CDS Director), and Afonso Bandeira (now at ETH Zurich) were colleagues at CDS and NYU’s Courant Institute who recognized that they shared a background in harmonic analysis and were looking to build mathematical foundations for data science. “In the beginning,” said Bruna, “the MaD group was started just to give an umbrella so postdocs could feel identified with an area.”
The group, however, quickly became well known for a seminar that still serves as its flagship: the MaD seminar. Bruna and the early organizers of the MaD group crafted this seminar to be a nexus of research on the theoretical foundations of data science and machine learning. Over the years, the MaD seminar has attracted many of the leading researchers in the field and has established itself as the group’s primary tentpole, enabling networking opportunities and in-depth scientific discussion.
Understanding the Math Behind Neural Nets
These scientific discussions branch off in many directions, but one common goal within the MaD group is to develop a mathematical understanding of neural networks. “We want to use neural networks for so many things, but you’re never going to be able to trust these models 100% if we don’t understand the mathematics that drive them,” said Bruna.
This is an enormously complex — and ambitious — endeavor. It can also be approached in a variety of ways. Bruna and his students do this by looking for mathematical guarantees, or proofs, of how algorithms in neural networks work. The idea is to shed light on the “fundamental mathematical principles that explain [neural nets’] behavior.”
Developing mathematical foundations of neural networks is a uniquely cross-disciplinary challenge that requires blending tools from a variety of fields, such as approximation theory, harmonic analysis, non-convex optimization, partial differential equations, and high-dimensional probability and statistics. These are topics in which the MaD group and, more generally, the Courant Institute have significant expertise.
Beyond the ‘Black Box’
Understanding the principles that explain the behavior of neural networks (or rather, our limitations in doing so) is sometimes called the “black box problem,” but Bruna explained that the MaD group’s approach differs somewhat from what people usually mean by that term. “Our goal is not really to understand the behavior of every single neuron inside this architecture,” said Bruna.
Bruna compared a neural net to gas in a room. “You can understand how steam works without attempting to model the trajectory of every single molecule of gas in that chamber, right?” Similarly, in neural networks, “what happens with every neuron is not something we should aim to understand, at least not in the first instance,” said Bruna. “In ‘trying to open the black box’, we should aim at the more realistic objective of trying to understand the collective behavior of the neurons.”
Likewise, what MaD researchers are doing is similar to what has come to be called interpretability research, but is also importantly different. Interpretability refers to providing a human interpretation of a decision rule of a neural network, but this is very hard to define mathematically, and attempts to do this are generally not aiming to be mathematically precise or rigorous. “This doesn’t mean that that’s not important,” said Bruna. But that kind of interpretability research is different from the goal of generating a theoretical — i.e. mathematical — understanding of the behavior of neural nets. This is, in part, because, as Bruna said, “a mathematical explanation is not necessarily a human explanation.”
Safety, Privacy, and the Future
Qi Lei, who joined the group last year, echoed the importance of working out a theoretical understanding of the behavior of neural networks, though for her the motivation is more specific and carries a tinge of urgency. Citing concerns about AI such as weaponization, deception, and power-seeking behavior, as well as privacy, security, and fairness, Lei said there are many reasons why gaining a better understanding of neural networks should be a priority. With our growing dependence on AI systems, the need for theoretical guarantees is paramount. As further examples, this time concerning consumer-facing large language models such as ChatGPT, Lei pointed to the risk of inadvertent data leaks or reconstruction of user interactions.
Some of these problems do not seem obviously tractable with mathematical tools, said Lei, because goals such as decreasing hallucinations and malicious actions lack a clear metric. Still, Lei is hopeful that suitable techniques can be found.
Research Endeavors: Marrying Mathematics with Practicality
Safe and ethical AI is only one real-world application of the MaD group’s research. Similarly urgent is a project by Joan Bruna, Carlos Fernandez-Granda, and their colleagues at Courant, who are using machine learning techniques to improve climate modeling. Drawing on mathematical frameworks such as the Navier-Stokes equations of fluid dynamics, which underpin our understanding of climate systems, the group is working to overcome one of climate modeling’s most significant challenges: the immense computational power required to run the most sophisticated models. With simulations looking 30 years ahead currently out of reach due to resource limitations, the project aims to use machine learning to speed up and refine these simulations without compromising their precision.
Jonathan Niles-Weed, meanwhile, studies statistical optimal transport, a mathematical field rooted in the eighteenth century with striking modern applications. Originally inspired by a problem in civil engineering, optimal transport studies how to best allocate a collection of objects to different locations. This research area has “been completely transformed” by computational and theoretical advances in the last ten years and is now used across data science to study “objects” of many types — from computer-generated images of realistic faces to elementary particles in high-energy physics experiments and embryos developing in utero. Niles-Weed’s work on statistical optimal transport focuses on how these techniques can be used when only noisy or incomplete information is available.
Uniquely MaD: Theory, Applied
In all the areas in which MaD group researchers work, Lei saw a common thread: the marriage of deep mathematical understanding and practical utility. While mathematicians tend to yearn for profound understanding, and machine learning practitioners for tangible outcomes, the MaD group bridges this gap, striving to meet the aspirations of both worlds. This allows its members not just to theorize but to design effective algorithms grounded in a robust theoretical framework.
Bruna agreed, stating, “We’re all happy working on beautiful mathematical problems. One of the appealing aspects of developing foundations of data science is the possibility of combining mathematical beauty with tangible, real-world applications.” Furthermore, the importance of the problems on which the group focuses “creates a very strong motivation to not compromise on the mathematics. We really need to understand [neural networks] at the fundamental level. I believe this joint purpose is the common DNA of our research as a group.”
The MaD group at NYU’s CDS stands as a testament to the incredible potential that emerges when the elegance of mathematics is married to the pragmatism of machine learning. Their commitment to understanding the core of complex algorithms, ensuring their safe application, and continuously pushing the boundaries of knowledge makes them an invaluable asset to the data science community.
By Stephen Thomas