NYU Center for Data Science Incredible Alumni: Manoj Kumar

Data science is a rapidly growing area of study. Yet many people are still left asking, “What actually is a data scientist?” If there were a simple answer, data science wouldn’t be all the rage it is now. What makes data science so unique, among other things, is the application of the knowledge and skills it takes to be a data scientist to a vast, interdisciplinary array of studies and professions.

Data is everywhere, and data scientists are following. CDS gives students the tools and networks necessary to set them up for success in whatever industry they enter. But don’t take our word for it… Allow us to introduce one of NYU Center for Data Science’s Incredible Alumni: Manoj Kumar!


Manoj graduated from the Master’s in Data Science program at CDS in 2017. He spoke with us about his experience at CDS, his projects during and after the program, and possible future applications of his work.

This interview has been lightly edited for clarity.

How did you get started in the field of Data Science?

I actually did my undergrad in mechanical engineering. During that time I took part in Google Summer of Code, a Google open source program. I used to write blog posts every week, and I contributed bug fixes and improved the differential equation solver of SymPy, a symbolic mathematics library.

Then I moved more into machine learning, starting with an online course. Because I had done Google Summer of Code, I was comfortable learning on the fly and digging into large codebases to grasp the underlying math. I was just getting started with scikit-learn, which is a popular library for machine learning. It had a lot of open issues and bugs, so I slowly hacked my way through them. My start in machine learning came through contributing to an open source library.

Tell me about your experience with CDS.

I felt my experience in machine learning was hackish. If there was a machine learning feature to be implemented in the library, I had learned just enough to know what the feature was about. I needed a more solid understanding of the concepts, so that’s why I applied to the program at CDS.

There are a lot of people from different backgrounds, and the program helped to solidify my theoretical understanding of machine learning. For example, the machine learning course with David Rosenberg was important, and even a stats course was very helpful.

What kinds of projects have you been working on lately?

During CDS I applied and got into the Google Brain Residency program for deep learning research. I mainly worked on two projects. The first was related to hyperparameter search. While I was at CDS I also worked on scikit-learn, and as an extension of that we built a library called scikit-optimize. We packaged the existing algorithms in a way that people can easily use.

The second project was an algorithm for next-frame prediction in videos. You provide the model with the first few frames of a video of a moving robotic arm, and the model completes it for you, suggesting multiple possible completions. We came up with a new algorithm that has a number of advantages compared to previous ones that average possible outcomes.

What future projects are you excited about?

In general, I’m excited about scaling up these predictive models. It’s much more difficult if you want to, say, use them for a movie. I am also interested in other creative applications of generative models. I am interested in working in domains that haven’t been explored before, and in going further than just making pretty pictures. Many applications are limited to vision, sound, and text, and it would be interesting to see more applications of machine learning outside of these domains.

What have been some of the most rewarding moments of your career?

We were working on scikit-optimize, and we looked at it mainly as a learning experience, a way to understand how these different algorithms work. But a few months after we packaged it, it became quite popular. I feel happy when people use my work in some way or another. Similarly, at Google, when researchers started using tools or code that I had worked on, it felt good, like my work was useful.

One of my projects was one of three papers selected for a contributed talk at the INNF Workshop at ICML (International Conference on Machine Learning). There were some machine learning celebrities who spoke before me, so I was really happy to have that opportunity!

Do you have any advice for someone entering a program like CDS?

I got into data science and programming because I was curious; I thought it was cool. I recommend being curious about what’s going on and what people are working on. I would advise someone entering the program to make use of the opportunity to get a solid understanding of the basics of machine learning while working on great projects.

By Mary Oliver
