CDS PhD Student Presents on Transfer Learning in NLP

NYU Center for Data Science
2 min read · Feb 25, 2021
Phu Mon Htut, CDS PhD Student

Phu Mon Htut, CDS PhD student, gave a talk last month on transfer learning in NLP at the Hamburg Natural Language Processing Meetup, a community for anyone interested in natural language processing (NLP), “a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language”.1 The meetup was a joint event with PyData Hamburg, an educational program of NumFOCUS that offers a forum for users and developers of data analysis tools to collaborate and learn from one another. A video of the event can be viewed on YouTube.

PyData Hamburg NLP Jan ’21: Transfer Learning in NLP & NLP-based Customer Automation at Hapag-Lloyd

In this talk, Htut explained what transfer learning is and what it means for NLP. She described how to maximize the chance of success when using transfer learning, and pointed to useful resources such as libraries and open source tools.

During her presentation, Htut delved into supervised learning, “the machine learning task of learning a function that maps an input to an output based on example input-output pairs”.2 She explained that when there is not enough labeled data for a task, transfer learning can help, since it is essentially a way to share knowledge across languages, domains, or related tasks. Because the model is not built from scratch, transfer learning also requires less training data, saving time and computational resources. Her presentation clarified how transfer learning has led to improvements in NLP and why we should use it. As an example, she referenced the SuperGLUE benchmark, a “new benchmark styled after the original GLUE benchmark with a new set of more difficult language understanding tasks, improved resources, and a new public leaderboard”.3 Htut noted that the SuperGLUE leaderboard is dominated by transfer learning models and methods, which perform exceptionally well on its tasks.
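To make the quoted definition concrete, here is a minimal sketch of supervised learning, fitting a function from example input-output pairs; the toy data and the use of scikit-learn are illustrative assumptions, not material from the talk.

```python
# Supervised learning: learn a function mapping inputs to outputs
# from example input-output pairs (toy data, for illustration only).
from sklearn.linear_model import LogisticRegression

X = [[1.0], [2.0], [3.0], [4.0]]  # example inputs
y = [0, 0, 1, 1]                  # example outputs (labels)

clf = LogisticRegression().fit(X, y)  # learn the mapping from the pairs
print(clf.predict([[3.5]]))           # apply it to an unseen input -> [1]
```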

There are several different transfer learning methods, but Htut focused specifically on sequential transfer learning, in which a model is first trained on one task and the resulting pretrained model is then trained on another task. Additionally, she discussed helpful resources such as jiant, a software toolkit built on PyTorch for NLP research that is “designed to facilitate work on multitask learning and transfer learning for sentence understanding tasks”.4 At the talk’s conclusion, Htut presented a demo of multi-task learning in jiant.
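To illustrate the idea (though not Htut’s jiant demo itself), here is a minimal sketch of sequential transfer learning using the Hugging Face Transformers library; the model name, toy sentences, labels, and hyperparameters are illustrative assumptions.

```python
# Sequential transfer learning sketch: reuse a model pretrained on one task
# (BERT's masked language modeling) as the starting point for another task
# (binary sentence classification). All data below is a toy illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Step 1: load a pretrained model; its encoder weights already carry
# general linguistic knowledge learned from a large corpus.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # only this task head starts from scratch
)

# Step 2: fine-tune on the downstream task with ordinary supervised steps.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
batch = tokenizer(
    ["a delightful film", "a tedious mess"],
    return_tensors="pt", padding=True,
)
labels = torch.tensor([1, 0])  # toy sentiment labels

model.train()
loss = model(**batch, labels=labels).loss
loss.backward()   # gradients also update the pretrained encoder
optimizer.step()
```

Because only the small classification head is trained from scratch, far less labeled data is needed than when training a full model, which is the savings in data, time, and compute that Htut described.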

For more information on Phu Mon Htut and her research, please visit her website. To view her presentation, please watch the event video on YouTube.

References:

  1. Natural language processing, Wikipedia
  2. Supervised learning, Wikipedia
  3. SuperGLUE benchmark website
  4. jiant website

By Ashley C. McDonald

