Data and Baseball: a CDS alum’s experience working for the LA Dodgers

NYU Center for Data Science
Jan 6, 2021

Data science intersects with almost every field these days, and baseball is a natural fit. Recent CDS alum Esteban Navarro Garaiz, who started working with the LA Dodgers on their quantitative analysis team, spoke with CDS social media team member Colton Laferriere about how he got interested in data science, his love of sports, and what it was like watching the Dodgers win the World Series.

The following was edited for clarity.

CL: How did you get interested in data science academically and professionally?

ENG: I majored in mathematics back home in Mexico, and I knew that the things I wanted to do after school would require further education. In particular, my programming skills were lacking for the kinds of problems I wanted to work on. I was also interested in the machine learning direction that the industry is taking. Because of those two things, grad school felt like a very natural transition for me.

CL: And what about CDS in particular attracted you?

ENG: CDS had a lot of coursework that was relevant to what I wanted to do and had some interesting alumni who had done sports analytics, although making it to the industry was a stretch goal back then. It also felt like New York was the place to be to do data science. And that ended up being true, right? There’s always a job fair or an interesting event at CDS. There’s always some interesting company trying to recruit, and a lot of these jobs are either there or in Silicon Valley. So it felt like the right place to be.

CL: That actually leads me to my next question, which is: can you explain to me, a lay person, how professional sports and data science intersect?

ENG: It’s a really natural application. Sports have clear, defined goals and metrics that you can optimize for, so they lend themselves very easily to data science. There is also as much data as you might ever want, and it is continuously growing. Player tracking data is generated at 30 or 60 frames per second, measuring the positions of everyone on the field and things like the trajectory of the ball. We measure outcomes, body positions, movement, pitch characteristics. The idea is to try to quantify what should have happened and not what happened. A lot of my work revolves around taking luck out of the outcomes. So if you hit the ball very hard, but you hit it right at someone, then that’s going to be an out instead of a hit. But as a hitter, you can really only control how hard you hit it. And in the long term, if you keep hitting the ball hard, on average you will have favorable outcomes.
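The idea of separating quality of contact from luck can be sketched in a few lines of Python. This is a toy illustration only, not the Dodgers’ model: the function names, the 10 mph exit-velocity buckets, and every number in the sample data are made up for the example. It estimates how often balls hit at a given speed become hits across a "league" sample, then compares a hitter’s actual hits to what that contact would produce on average.

```python
from collections import defaultdict


def hit_rate_by_bucket(batted_balls, bucket_mph=10):
    """Estimate P(hit) per exit-velocity bucket from (exit_velocity_mph, was_hit) pairs."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [hits, batted balls]
    for velo, was_hit in batted_balls:
        bucket = int(velo // bucket_mph)
        totals[bucket][0] += int(was_hit)
        totals[bucket][1] += 1
    return {bucket: hits / n for bucket, (hits, n) in totals.items()}


def expected_hits(player_balls, rates, bucket_mph=10):
    """Sum the league hit probability of each ball the player put in play."""
    return sum(rates.get(int(velo // bucket_mph), 0.0) for velo, _ in player_balls)


# Hypothetical league sample: soft contact rarely falls in, hard contact usually does.
league = [(72, False), (75, False), (78, True), (79, False),
          (101, True), (103, True), (105, False), (108, True)]
rates = hit_rate_by_bucket(league)

# Hypothetical hitter who keeps hitting the ball hard, mostly right at a fielder.
unlucky = [(102, False), (104, False), (106, True), (109, False)]
actual = sum(was_hit for _, was_hit in unlucky)
expected = expected_hits(unlucky, rates)

print(f"actual hits: {actual}, expected hits: {expected:.1f}")
```

In this made-up sample the hitter has one hit but the contact was worth about three, which is the "what should have happened" view described above. A real model would use many more inputs, such as launch angle and direction.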

CL: I see, so then you’re taking that information and your insights from it and giving them to the coaches or the players or both?

ENG: Both. Our team works on day-to-day things, like where fielders should be positioned on defense, and on long-term needs of the team, like player evaluation.

The defender placement model is a similar model to what I mentioned earlier, but in the other direction: we have this batter / pitcher matchup, so where do we expect the ball to go? That’s where we want defenders, so that we can get this guy out. And we are also involved in making personnel decisions: acquiring players that bring value to the team and retaining the correct players.
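As a rough sketch of the "where do we expect the ball to go?" question, the snippet below bins past batted balls for a matchup by spray angle and returns the busiest zones. Everything here is an assumption made for illustration: the function name, the angle convention (negative toward left field, positive toward right field), the 15-degree zones, and the sample angles. A production model would be far richer than a frequency count.

```python
from collections import Counter


def busiest_zones(spray_angles_deg, n_fielders=2, zone_width_deg=15):
    """Return the lower edges of the zones where the most batted balls landed."""
    zones = Counter(
        int(angle // zone_width_deg) * zone_width_deg for angle in spray_angles_deg
    )
    return [zone for zone, _ in zones.most_common(n_fielders)]


# Hypothetical spray angles (degrees) for one batter against one pitcher.
past_batted_balls = [-40, -35, -32, -20, -18, 5, 30]

print(busiest_zones(past_batted_balls))  # [-45, -30]
```

With this made-up history, most balls go to one side of the field, so that is where the sketch would shade the defenders.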

CL: Do you find that when you’re trying to project things that they actually end up happening on the field?

ENG: Yeah, this happens often. The defensive placement is a perfect example. I haven’t worked directly on it, but it’s something that my team does and when you see a guy with a very specific defensive alignment to one side and then he hits it right at a defender, I think, “oh, that’s us right there!” and that’s awesome.

CL: So back to CDS. Is there anything in particular, like courses or networks you’ve built there, that has helped prepare you for what you’re doing now?

ENG: Yeah, I think CDS is very strong at teaching the soft skills of data science: taking the technical results and correctly communicating them to a non-technical audience. And I think that is going to be key, both in my current work and throughout my career, because, like you said, the relevant stakeholders here are either front office people who are trying to make player acquisition decisions or coaches who are trying to make in-game decisions, and so they don’t particularly care about the intricacies of the model. At the end of the day, they mainly care about the insights that were generated. And so you need to take this very complex model and distill it into a handful of simple results that you can communicate.

CL: And what parts of the curriculum taught you to do that?

ENG: I think it was the continuous focus on class projects. I was also lucky enough to land a sports analytics Capstone with the Brooklyn Nets while at CDS. That was my first professional sports project, where we had a stakeholder with whom we had weekly meetings. We were presenting our progress, getting feedback, tweaking our work based on that feedback, and steering the direction of the project.

CL: That’s great! We’ve talked a lot about data science, but where did your passion for sports come from?

ENG: I grew up watching and playing soccer back home in Mexico. And later, my passion for data science and sports just grew together. I started doing a lot of similar things to what I’m doing now as a hobby, like fantasy baseball. Basically anything that involved sports and prediction that I could be a part of. I really didn’t have baseball in my life until later and it sort of grew together with that, like, “how can I predict this thing” passion. I am very lucky that I ended up turning it into a career.

CL: You mentioned that your love of sports started during your childhood in Mexico. Can you share anything about how immigrating to the US informed your perspective on the data science field?

ENG: It made me much more excited about it. I knew going abroad would be an important step in my career, and a lot of the cutting-edge research in ML is done here (and a lot of it in CDS!). For sports analytics, except for soccer (which is mostly in Europe), it’s all here. And getting your foot in the door is very, very hard. So I knew if I wanted to make it, physically being in the US would be crucial. There aren’t a lot of international folks in the industry. To the best of my knowledge, I am the only full-time Latino analyst in MLB, and I know of only one other full-time Latino analyst in all major US sports. I am hoping my being here will help leave the door open for others and help more Latinxs make it into front office roles in a league that has 30% Latino players.

CL: What was it like working with the Dodgers when they won the World Series?

ENG: I got to be at the game, which was really exciting. And our two Mexican pitchers, Julio Urías and Víctor González, pitched in that game. So that made it three times as sweet. It was awesome. It’s hard to describe. I was sitting with someone who has worked on analytics for the Dodgers for something like six years now, and she just started crying non-stop the moment we won. It was a very emotional moment, particularly with everything that 2020 has brought, and being away from friends and family.

I can’t claim any of the hard work that has led to this championship; I feel like that kid in the group project that does nothing and gets an A+. But being there and experiencing it, and getting a World Series ring out of it is just immensely cool.
