Episode #96: Serverless and Machine Learning with Alexandra Abbas

Serverless Chats

เนื้อหาจัดทำโดย Jeremy Daly and Rebecca Marshburn เนื้อหาพอดแคสต์ทั้งหมด รวมถึงตอน กราฟิก และคำอธิบายพอดแคสต์ได้รับการอัปโหลดและจัดเตรียมโดย Jeremy Daly and Rebecca Marshburn หรือพันธมิตรแพลตฟอร์มพอดแคสต์โดยตรง หากคุณเชื่อว่ามีบุคคลอื่นใช้งานที่มีลิขสิทธิ์ของคุณโดยไม่ได้รับอนุญาต คุณสามารถปฏิบัติตามขั้นตอนที่อธิบายไว้ที่นี่ https://th.player.fm/legal

3y ago 44:19

MP3•หน้าโฮมของตอน

About Alexa Abbas

Alexandra Abbas is a Google Cloud Certified Data Engineer & Architect and Apache Airflow Contributor. She currently works as a Machine Learning Engineer at Wise. She has experience with large-scale data science and engineering projects. She spends her time building data pipelines using Apache Airflow and Apache Beam and creating production-ready Machine Learning pipelines with Tensorflow.

Alexandra was a speaker at Serverless Days London 2019 and presented at the Tensorflow London meetup.

Personal links

Twitter: https://twitter.com/alexandraabbas
LinkedIn: https://www.linkedin.com/in/alexandraabbas
GitHub: https://github.com/alexandraabbas

datastack.tv's links
Web: https://datastack.tv
Twitter: https://twitter.com/datastacktv
YouTube: https://www.youtube.com/c/datastacktv
LinkedIn: https://www.linkedin.com/company/datastacktv
GitHub: https://github.com/datastacktv
Link to the Data Engineer Roadmap: https://github.com/datastacktv/data-engineer-roadmap

This episode is sponsored by CBT Nuggets: cbtnuggets.com/serverless and
Stackery: https://www.stackery.io/

Watch this video on YouTube: https://youtu.be/SLJZPwfRLb8

Transcript
Jeremy: Hi, everyone. I'm Jeremy Daly, and this is Serverless Chats. Today I'm joined by Alexa Abbas. Hey, Alexa, thanks for joining me.

Alexa: Hey, everyone. Thanks for having me.

Jeremy: So you are a machine learning engineer at Wise and also the founder of datastack.tv. So I'd love it if you could tell the listeners a little bit about your background and what you do at Wise and what datastack.tv is all about.

Alexa: Yeah. So as you said, I'm a machine learning engineer at Wise. So Wise is an international money transfer service. We are aiming for very transparent fees and very low fees compared to banks. So at Wise, basically, designing, maintaining, and developing the machine learning platform, which serves data scientists and analysts, so they can train their models and deploy their models, easily.

Datastack.tv is, basically, it's a video service or a video platform for data engineers. So we create bite-sized videos, educational videos, for data engineers. We mostly cover open source topics, because we noticed that some of the open source tools in the data engineering world are quite underserved in terms of educational content. So we create videos about those.

Jeremy: Awesome. And then, what about your background?

Alexa: So I actually worked as a data engineer and machine learning engineer, so I've always been a data engineer or machine learning engineer in terms of roles. I also worked, for a small amount of time, I worked as a data scientist as well. In terms of education, I did a big data engineering Master's, but actually my Bachelor is economics, so quite a mix.

Jeremy: Well, it's always good to have a ton of experience and that diverse perspective. Well, listen, I'm super excited to have you here, because machine learning is one of those things where it probably is more of a buzzword, I think, to a lot of people where every startup puts it in their pitch deck, like, "Oh, we're doing machine learning and artificial intelligence ..." stuff like that. But I think it's important to understand, one, what exactly it is, because I think there's a huge confusion there in terms of what we think of as machine learning, and maybe we think it's more advanced than it is sometimes, as I think there's lower versions of machine learning that can be very helpful.

And obviously, this being a serverless podcast, I've heard you speak a number of times about the work that you've done with machine learning and some experiments you've done with serverless there. So I'd love to just pick your brain about that and just see if we can educate the users here on what exactly machine learning is, how people are using it, and where it fits in with serverless and some of the use cases and things like that. So first of all, I think one of the important things to start with anyways is this idea of MLOps. So can you explain what MLOps is?

Alexa: Yeah, sure. So really short, MLOps is DevOps for machine learning. So I guess the traditional software engineering projects, you have a streamlined process you can release, really often, really quickly, because you already have all these best practices that all these traditional software engineering projects implement. Machine learning, this is still in a quite early stage and MLOps is in a quite early stage. But what we try to do in MLOps is we try to streamline machine learning projects, as well as traditional software engineering projects are streamlined. So data scientists can train models really easily, and they can release models really frequently and really easily into production. So MLOps is all about streamlining the whole data science workflow, basically.

And I guess it's good to understand what the data science workflow is. So I talk a bit about that as well. So before actually starting any machine learning project, the first phase is an experimentation phase. It's a really iterative process when data scientists are looking at the data, they are trying to find features and they are also training many different models; they are doing architecture search, trying different architecture, trying different hyperparameter settings with those models. So it's a really iterative process of trying many models, many features.

And then by the end, they probably find a model that they like and that hit the benchmark that they were looking for, and then they are ready to release that model into production. And this usually looks like ... so sometimes they use shadow models, in the beginning, to check if the results are as expected in production as well, and then they actually release into production. So basically MLOps tries to create the infrastructure and the processes that streamline this whole process, the whole life cycle.

Jeremy: Right. So the question I have is, so if you're an ML engineer or you're working on these models and you're going through these iterations and stuff, so now you have this, you're ready to release it to production, so why do you need something like an MLOps pipeline? Why can't you just move that into production? Where's the barrier?

Alexa: Well, I guess ... I mean, to be honest, the thing is there shouldn't be a barrier. Right now, that's the whole goal of MLOps. They shouldn't feel that they need to do any manual model artifact copying or anything like that. They just, I don't know, press a button and they can release to production. So that's what MLOps is about really and we can version models, we can version the data, things like that. And we can create reproducible experiments. So I guess right now, I think many bits in this whole lifecycle is really manual, and that could be automated. For example, releasing t...

142 ตอน

#Cloud IT #Serverless #Lambda #Jeremy Daly #Tech #Industries #Podcasting Education #Faa #Baa #AWS