What Exactly is a Data Scientist? CDS Discusses
The CDS Academics team recently held a panel, “What the Heck is a Data Scientist?”, to address just that question for our students. The panel was facilitated by Tim Baker, CDS Assistant Director of Undergraduate Studies, and hosted by Andrea Jones-Rooy, CDS Director of Undergraduate Studies and Visiting Associate Professor. Panel speakers included Chris Ick, CDS PhD student; Yash Despande, CDS alumnus and Data Scientist at Prometheus Biosciences; Ethan Assouline, CDS alumnus and Data Scientist at HelloFresh; Kiri Oler, IMF, Quantitative Analyst for the Philadelphia Phillies; and Ben Zweig, CEO at Revelio Labs and NYU Adjunct Professor at Stern School of Business.
Andrea kicked off the session by having each panelist walk through “the typical day” of a data scientist from their own experience. Those who are currently students or recent graduates spoke to the flexibility that grad students have, with time split between research, classes, and other tasks in an open-ended manner. Those in the corporate space described their experiences as shaped largely by their environment and company objectives. “My first job as a data scientist was at a hedge fund where I was the only quantitative (technical) person on the desk,” Ben described. “Basically, I would arrive in the morning, put on headphones, and be in the data writing code all day. It was very non-interactive. And (in contrast) my next role was very interactive with a lot of meetings…” The field of data science is so new that there is a wide variety of roles in both the corporate space and academia. “I often hear students say ‘I want to be a data scientist, but I don’t want to sit in a room by myself alone and do nothing but coding’ while others say ‘I want to be a data scientist, but I just want to sit alone in a room with code’… there are many, many varieties of both of those existences,” Andrea recounted.
Two questions yielded particularly insightful answers: what’s great about being a data scientist, and what would you change about the nature of the job if you could? On the first, Yash said, “I think that skills are very transferable.” Seconding that, Ben emphasized that data is everywhere and everyone needs to understand it. Answers to the second question were fairly lighthearted. One panelist expressed their dislike of data cleaning, while another lamented the frustration of models that don’t scale easily.
In terms of general advice, two of Ben’s pointers especially stood out. First, he recommended that early-career data scientists working in industry go a little beyond their minimum requirements, specifically in the areas they are passionate about or interested in. That may sound like common sense to some, but he stressed that doing so paves the way to a more satisfying path. Additionally, once you demonstrate proficiency in what you’re actually interested in, it’s harder to be pigeonholed into areas you’re less excited about.
The second, more specific piece of advice is that early-career data scientists should learn causal inference. “You should know how regression discontinuity works. You should know about instrumental variables and difference-in-differences. These are things that I think are very important for getting an intuition of how data behaves,” he asserted.
Overall, it was an open and helpful discussion, and one we hope students found informative. For questions about upcoming career events at CDS, please contact Tim Baker at tb116@nyu.edu.
By Ashley C. McDonald