Meet CDS PhD student Vishakh Padmakumar. He researches NLP and human-AI collaboration and is advised by Assistant Professor of Computer Science & Data Science He He. Vishakh is currently focused on collaborative text generation for creative writing tasks and in other interactive settings.
Prior to joining CDS, Vishakh earned his MS in Computer Science at the Courant Institute of Mathematical Sciences, and interned at LAER.AI, where he worked on unstructured document retrieval and active learning.
This interview has been edited lightly for clarity.
How did you choose what to focus on in your research?
During my MS at NYU, I did an internship at NYU’s Center for Social Media and Politics, and was introduced to my advisor, He He, which was how I started getting interested in natural language processing. Then in 2020, I started my PhD. I got a little bit lucky then because that was just after language models really took off as tools. After that, my research became driven by the recognition that this technology and these models are really good, and I could work backwards to figure out how to use them to meet real user needs. This then led me to the intersection of natural language processing and human-AI collaboration.
The introduction of large language models changed the direction of your research?
Yes. Until around 2021, we were often training models to do some specific task. But once these pre-trained language models progressed beyond a certain point, you essentially had a general-purpose model with many different capabilities that could be deployed to real users. As a result, these days, one line of what I’m thinking about has shifted to the real-world impact of these public-facing models. We recently put out a preprint asking whether people write with less diversity when collaborating with language models. In other words, if everybody uses ChatGPT to write, does their writing lose the idiosyncratic personal characteristics that define their ‘voice’? Another cool space is adapting language models for specific tasks, such as controlled generation across domains, as in our ICML paper.
Can you tell us more about your ICML paper?
Right. So the two-minute version of that project is this: Let’s say you have a statement like, “I had lunch today and the food was very tasty.” And then you want to edit it to have a more extreme sentiment, like “really, really tasty” or “unbelievably scrumptious,” but to a higher degree than anything you’ve seen in your training data. This extrapolation to unseen ranges is a slightly different take on the very well-studied controlled generation problem, and this project started out as a way to achieve this extrapolation by iteratively improving the statement at each step.
But then we kind of fell backwards into proteins, because we found out that the same technique that we were using for natural language text was also applicable in this other domain of proteins. And we found that our original goal, to write a more positive sentence, which didn’t have much real-world use, could be reframed in the protein domain as generating a protein sequence more stable than you’ve ever seen before — which could then be very valuable.
Do you have a sense of how this might be used in industry?
In the real world, a company like AstraZeneca might want to produce candidate mutations that are as stable as possible while retaining some properties of the original protein sequence. So if the training data only contains examples with a stability score of, say, 1 to 10, the company would want to be able to identify mutations at level 11 and 12 and 13, and so on. This is where you get into extrapolation, into things never before seen in the training data: things no one might have seen, but which you can guess might exist.
Can you tell me about the ACL Student Research Workshop you helped run? What was it?
So ACL is one of the biggest annual NLP conferences, and as part of the main conference every year they also have a co-located student research workshop. Since ACL is a very competitive venue, the student research workshop is meant to encourage submissions from students from universities that aren’t as well-represented in the NLP world, or who might not have as much experience. If their paper gets accepted to the workshop, they get to travel and go to the conference and get that exposure.
What was your role in it?
My mentor from my internship last year, Yang Liu, was actually the General Chair of the conference as a whole. So I got the chance to volunteer to organize the student workshop because this outreach is something I am passionate about and felt I could help out with, having seen both sides as an international student in the US. My co-organizers and I put out the call for papers, conducted a mentorship phase where students could get feedback on their work from eminent researchers, assembled the program committee to review the papers, and finally organized the workshop presentations themselves.
Do you have thoughts on where NLP is going in the future?
I think the way that NLP is evolving is shaped more and more by the fact that the current state-of-the-art models are being used by real users. This human-centered vision might involve training smaller models for specific user needs, like training a model to make fine-grained text edits or write poetry. Or it might involve broader societal impact studies, such as the work I mentioned on diversity. At this time it feels like these models are useful and here to stay, so we should be adopting an interdisciplinary view to understand their impact on these real users.
By Stephen Thomas