Saba Nejad
Saba Nejad is a Data Engineer at Point72, working mostly with alternative data in the energy and industrials sectors. She is broadly interested in using mathematics and programming to gain insight from real-world data. Prior to joining Point72, she studied at MIT, where she did research at the Institute for Data, Systems, and Society. She was previously a Product Manager at Quantopian.

Sessions
Let’s say you want to run a machine learning experiment: you want to tag a paragraph with appropriate labels. Given all the models and sample code out there, writing a notebook or Python script that does this is often relatively easy, though running it end to end on a single computer can take a while and slow everything else down, since these processes are compute-heavy.
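As a concrete starting point, here is a minimal sketch of such a tagging script, assuming the Hugging Face transformers library and its zero-shot classification pipeline; the model name, sample paragraph, labels, and score threshold are all illustrative, not taken from the talk.

```python
# A minimal sketch of a paragraph-tagging script, assuming the Hugging Face
# transformers zero-shot classification pipeline. Model, text, labels, and
# threshold below are illustrative choices, not the talk's actual setup.
from transformers import pipeline

# Downloading and loading the model is itself slow and memory-hungry,
# which is part of why running this locally can bog a machine down.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

paragraph = ("Crude inventories fell sharply last week "
             "as refinery runs increased.")
candidate_labels = ["energy", "industrials", "technology", "healthcare"]

result = classifier(paragraph, candidate_labels, multi_label=True)

# The pipeline returns labels ranked by score; keep the confident ones as tags.
tags = [label for label, score in zip(result["labels"], result["scores"])
        if score > 0.5]
print(tags)
```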
Now, let’s say you like what your notebook is doing: you have written the best tagging script out there, and you want all your users to be able to tag their paragraphs using your magic. How would you deploy this to many users? What steps can you take so your users aren’t all waiting minutes before they get their tagged output?
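One common first step is to stop paying the model-loading cost on every request: load the model once when the server process starts and serve every request from the warm copy. Here is a sketch of that pattern, assuming FastAPI on top of the pipeline above; the endpoint path and request shape are hypothetical.

```python
# A sketch of one common deployment step: load the model once at process
# startup and reuse it across requests, instead of reloading per request.
# Assumes FastAPI and the zero-shot pipeline above; endpoint is illustrative.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Loaded once when the worker process starts; every request reuses it.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
LABELS = ["energy", "industrials", "technology", "healthcare"]

class TagRequest(BaseModel):
    paragraph: str

@app.post("/tag")
def tag(req: TagRequest):
    result = classifier(req.paragraph, LABELS, multi_label=True)
    return {"tags": [label for label, score
                     in zip(result["labels"], result["scores"])
                     if score > 0.5]}
```

Served with something like `uvicorn app:app`, each worker process pays the loading cost once, and requests after that pay only for inference.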
Running machine learning inference on your local machine and serving it to hundreds of users can look very different. In this talk, I will walk through a case study and share lessons learned from optimizing a tagging project for latency and memory.
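As a taste of the kind of optimization involved (and not necessarily a technique the talk itself covers), here is a sketch of PyTorch dynamic quantization, one standard way to cut a model’s memory footprint and CPU inference latency by storing Linear-layer weights as int8.

```python
# One illustrative latency/memory optimization, offered as an example only:
# PyTorch dynamic quantization converts Linear-layer weights to int8, which
# typically shrinks the model and speeds up CPU inference.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "facebook/bart-large-mnli")

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```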