Where do ideas come from? This is an important question to Itai Yanai, CDS-affiliated professor and Director of the Institute for Computational Medicine at NYU Langone Health. When many people think of science, creativity may not be the first thing that comes to mind. The Professor of Biochemistry and Molecular Pharmacology holds a different perspective, one that he has started a new podcast, Night Science, to explore. What is night science? …
CDS Assistant Professor of Computer Science and Data Science He He recently gave a talk, “Guarding Against Spurious Correlations in Natural Language Understanding,” at the WING NUS Natural Language Processing (NLP) Seminar on July 7. The seminar is hosted by the Web Information Retrieval / Natural Language Processing Group (WING) of the National University of Singapore. The group focuses on research in applied language processing and in information retrieval for the World Wide Web and related technologies. The seminar is tentatively scheduled to run virtually from May 20 to July 20, 2021.

“Guarding Against Spurious…

We are often inclined to think of academic research as existing in one of two worlds: the world of the arts and humanities, and the world of science. However, thinking of these two areas as separate can be very limiting. In fact, it is at the intersection of these two worlds that some of the academic world’s most interesting discoveries have occurred.
Such is the case with a new type of visual illusion dubbed the “Scintillating Starburst.” The illusion, which can best be described as a series of rays of light that burst out from the center of a series…

Masked language modeling (MLM), a task that involves masking part of the input text and then asking the model to predict the missing information, has become the default approach to processing text. A number of alternative approaches have recently been presented to enhance word representations with external knowledge sources. However, these models are designed and assessed in a monolingual setting only, which limits their applicability across languages. Iacer Calixto, a visiting academic currently working with researchers at CDS, has co-authored a project, “Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks,” that…

At CDS, we have no shortage of opportunities to feel proud of our students’ accomplishments. We are happy to celebrate one such accomplishment by PhD student Angelica Chen, who was recently named an honorable mention in this year’s cohort of NSF Graduate Research Fellowship recipients.
The NSF Graduate Research Fellowship Program (or NSF GRFP) is a highly competitive award given by the National Science Foundation each year to outstanding graduate students in NSF-sponsored fields who are pursuing either a master’s or doctoral degree at an accredited US educational institution. Recipients of the award often go…
This entry is a part of the NYU Center for Data Science blog’s recurring guest editorial series. Tianshu Chu and Xinmeng Li are CDS MS students. Huy V. Vo is a PhD student from INRIA and Valeo.ai. Dr. Ronald M. Summers is a senior investigator in the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory at the NIH Clinical Center. Dr. Elena Sizikova is a Moore-Sloan CDS Faculty fellow.

Two CDS master’s students, Tianshu Chu and Xinmeng Li; Huy V. Vo, a PhD student from INRIA and Valeo.ai; and Dr. Elena Sizikova, a CDS Faculty fellow, recently collaborated with Dr. Ronald M…

Sam Bowman, CDS Assistant Professor of Data Science & Linguistics, recently gave a presentation at the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). NAACL organizes conferences and promotes information exchange among communities in related scientific and professional fields. The conference was held virtually from June 6 to June 11, 2021.
Sam’s talk was based on “What Will it Take to Fix Benchmarking in Natural Language Understanding?”, a paper he co-authored with colleague George E. Dahl, a research scientist…

At work, at home, and even at the doctor’s office, AI is being used in more and more areas of our lives. As the application of AI continues to expand, so does the need for models that can recognize human speech and the intent of that speech.
As part of their master’s capstone project, CDS students Sujeong Cha, Wangrui Hou, Hyun Jung, and My Phung published a paper titled “Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs,” which details their efforts to create a spoken language understanding system that can effectively understand the intent…
Joint work with Irina Espejo Morales, Kyle Cranmer, Lukas Heinrich, Gilles Louppe, and more colleagues at CERN.
This entry is a part of the NYU Center for Data Science blog’s recurring guest editorial series. Irina Espejo Morales is a CDS Ph.D. student in data science and also a DeepMind fellow. Kyle Cranmer is a CDS professor of data science and professor of physics at the NYU College of Arts & Science. Lukas Heinrich is a staff scientist at CERN working with the ATLAS experiment at the LHC and a former NYU graduate student. …

In the turbulent times of a global pandemic, the importance of modern medical advances such as drugs and therapeutics is more apparent than ever. However, the process of developing them is time-consuming and expensive. One major hurdle is the sheer number of possible molecules that can be synthesized. With so many possibilities, it is difficult for models to generate outputs with desired properties. Machine learning approaches can help address this problem by generating molecules automatically rather than relying on explicitly enumerated heuristics or expert intuition alone.
A team of scientists comprising CDS PhD student Omar Mahmood, recent Computer Science PhD graduate Elman Mansimov…

Official account of the Center for Data Science at NYU, home of the Undergraduate, Master’s, and Ph.D. programs in Data Science.