Serverless on AWS Lambda with Stephanie Prime

newline

เนื้อหาจัดทำโดย newline, Nate Murray, and Amelia Wattenberger เนื้อหาพอดแคสต์ทั้งหมด รวมถึงตอน กราฟิก และคำอธิบายพอดแคสต์ได้รับการอัปโหลดและจัดเตรียมโดย newline, Nate Murray, and Amelia Wattenberger หรือพันธมิตรแพลตฟอร์มพอดแคสต์โดยตรง หากคุณเชื่อว่ามีบุคคลอื่นใช้งานที่มีลิขสิทธิ์ของคุณโดยไม่ได้รับอนุญาต คุณสามารถปฏิบัติตามขั้นตอนที่อธิบายไว้ที่นี่ https://th.player.fm/legal

4y ago 1:00:46

MP3•หน้าโฮมของตอน

newline Podcast Sudo Steph

Nate: [00:00:00] Steph, just tell us a little bit about your work and kind of your background with, like AWS and like what you're doing now.

Steph: [00:00:06] Yes, so I work as a engineer for a manage services provider called Second Watch. We basically partner with other big companies that use AWS or some other clouds sometimes Azure for managing their cloud infrastructure, which basically just means that.

We help big companies who may not, their focus may not be technology, it may not be cloud stuff in general, and we're able to just basically optimize the cost of everything, make sure that things are running reliably and smoothly, and we're able to work with AWS directly to kind of keep people ahead of the curve when.

New stuff is coming out and just it changes so much, you know, it's important to be able to adapt. So like personally, my role is I develop automation for our internal operations teams. So we have a bunch of, you know, just really smart people who are always working on customer specific AWS issues. And we see some of the same issues.

Pop up over and over. Of course, you know, security , auditing, cost optimization. And so my team makes optimizations that we can distribute to all of these clients who have to maintain their own. You know, they have their own AWS account. It's theirs. And we make it so that we're actually able to distribute these automations same way in all of our customers' accounts.

So the idea is that, and it's really wouldn't be doable without serverless because the idea is that everyone has to own their own infrastructure, right? Your AWS account is yours does or your resources, you don't, for security reasons, want to put all of your stuff on somebody else's account. But actually managing them all the same way can be a really difficult, even with scripts, because permissions different places have to be granted through the AWS permissions up with access, I identity and access management, right? So serverless gave us the real tool that we needed to be able to at scale, say, Hey, we came up with a little script that will run on an hourly basis to check to see how much usage these servers are getting, and if they're not production servers, you know, spin them down if they're not in use to save money.

Little things like that when it comes to operations and AWS Lambda is just so good for it because it's all about, you know, like I said, doing things reliably. Doing things in a ways that can be audited and logged and doing things for like a decent price. So like background wise, I used to work at AWS in AWS support actually, and I kind of supported some of their dev ops products like OpsWorks, which is based on chef for configuration management, elastic Beanstalk and AWS CloudFormation, specifically. After working there for a bit, I really got to see, you know, how it's made and what the underlying system is like. And it was just crazy just to see how much work goes into all this, just so you can have a supposedly, easier to use for an end. But serverless just kinda changed all that for the better.

Luckily.

Amelia: [00:02:57] So it sounds like AWS has a ton of different services. What are the main ones and how many are there?

Steph: [00:03:04] So I don't think I can even count anymore because they just, they do release new ones all the time. So hundreds at this point, but really main ones, and maybe not hundreds, maybe a little over a hundred would be a better estimate.

I mean, EC2 which is elastic compute is. The bread and butter. Historically, AWS is just, they're virtualized servers basically. So EC2, the thing that made AWS really special from the beginning and that made cloud start to take over the world was the concept of auto scaling groups, which are basically definitions you attached to EC2 and it basically allows you to say, Hey, if I start getting a lot of traffic on.

This one type of server, right? You know, create a second server that looks exactly the same and load balance the traffic through it. So when they say scaling, that's basically what, how you scale, easy to use auto scaling groups and elastic load balancers and kind of distribute the traffic out. The other big thing besides the scalability of with auto scaling groups is.

Redundancy. So there's this idea of regions within AWS, and within each region there's availability zones. So regions are the general, like you can think of it as the place where data center is kind of like located within like a small degree. So it's usually like. Virginia is one, right? That's us East one.

It's the oldest one. Another one is in California, but they're all over the world now. So the idea is you pick a region to minimize latency, so you pick the region that's closest to you. And then within the region, there's the idea of availability zones, which are basically just discreet, like physical locations of the servers that you administer them the same way, but they're protected.

So like if a tornado runs through and hits one of your data centers. If you happen to have them distributed between two different availability zones, then you'll still be able to, you know, serve traffic. The other one will go down, but then the elastic load balancer will just notice that it's not responding and send the traffic to the other availability zone.

So those are the main concepts that make it like EC2 those are what you need to use it effectively.

Nate: [00:05:12] So with an easy to instance, that would be like a virtual server. I mean, it's not quite a Docker container, I guess we're getting to nuance there, but it's basically like a server that you would have like command line access to.

You could log in and you can do more or less whenever you want on an EC2 instance.

Steph: [00:05:29] Right, exactly. And so it used to be AWS used what was called Zen virtualization to do it. And that's just like you can run Zen on your own computer, you can get a computer and set up a virtual machine, almost just like they used to do it .

So they are constantly putting out like new ways of virtualizing more efficiently. So they do have new technology now, but it's not something that was really, I mean, it was well known, but they really took it to a new kind of scale, which made it really impressive.

Nate: [00:05:56] Okay, so EC2 lets you have full access to the box that's running and you might like load bounce requests against that.

How does that contrast with what you do with AWS Lambda and serverless?

Steph: [00:06:09] So with , you still have to, you know, either secure shell or, you know, furious and windows. Use RDP or something to actually get in there. You care about what ports are open. You have security groups for that. You care about all the stuff you would care about normally with a server you care about.

Is it patched and up today you care about, you know, what's the current memory and CPU usage? All those things don't go away on EC2 just because it's cloud, right? When we start bringing serverless into the mix, suddenly. They do go and away. I mean, and there's still a few limitations. Like for instance, a Lambda has a limit on how much memory it can process with, just because they have to have a way to kind of keep costs down and define the units of them and define where to put them.

Right? But at its core, what a Lambda is, it actually runs on a Docker container. You can think of it like a pre-configured Docker container with some pre-installed dependencies. So for Python, it would have. The latest version of Python that it says it has, it would have boto. It would have the stuff that it needs to execute that, and it would also have some basic, it's structured like it was, you know, basic Linux.

So there's like a attempt. So slash temp you can write files there, right. But really it's like a Docker container. That runs underneath it on a fleet of . As far as availability zone distribution goes, that's already built into land, but you don't have to think about it with . You do have to think about it.

Because if you only run one easy to server and it's in one availability zone, it's not really different from just having a physical server somewhere with a traditional provider.

Nate: [00:07:38] So. There are these two terms, there's like serverless and Lambda. Can you talk a little bit about like the difference between those two terms and when to use each appropriately?

Steph: [00:07:48] Yeah, so they are in a way sorta interchangeable, right? Because serverless technology just means the general idea of. I have an application, I have it defined it an artifact of we'll say zip from our get repo, right? So that application is my main artifact, and then I pass it to a service somewhere. I don't know.

It could be at work. The Google app engine, that's a type of serverless technology and AWS Lambda is just the specific AWS serverless technology. But the reason AWS Lambda is, in my opinion so powerful, is because it integrates really well with the other features of AWS. So permissions management works with AWS Lambda API gateway.

there's a lot of really tight integrations you can make with Lambda so that it doesn't, it's not like you have to keep half of your stuff one place and half of your stuff somewhere else. I remember when like Heroku was really big . A lot of people, you know, maybe they were maintaining an AWS account and they were also maintaining a bunch of stuff and Heroku, and they're just trying to make it work together.

And even though Heroku does use, you know, AWS on the backend, or at least it did back then, it can just make things more complicated. But the whole server, this idea of the artifact is you make your code general, it's like a little microservice in a way. So I can take my serverless application and ideally, you know, it's just Python.

I use NF, I write it the right way. Getting it to work on a different server. This back end, like for, exit. I think Azure has one, and Google app engine isn't really too much of a change. There's some changes to permissions and the way that you invoke it, but at the core of it, the real resource is just the application itself.

It's not, you know, how many, you know, units of compute. Does it have, how many, you know, how much memory, what are the IP address rules and all that. You

Nate: [00:09:35] know. So what are some good apps to build on serverless?

Steph: [00:09:39] Yes. So you can build almost anything today on serverless, there's actually so much support, especially with AWS Lambda for integrations with all these other kinds of services that the stuff you can't do is getting more limited.

But there is a trade off with cost, right? Because. To me the situation where it shines, where I would for no reason ever choose anything but serverless, is if you have something that's kind of bursty. So let's say you're making like a report generation tool that needs to run, but really you only run it three times a week or something like things that.

They need to live somewhere. They need to be consistent. They need to be stable, they need to be available, but you don't know how often they're going to call. And even if they can go from that, there is small numbers of times it's being called, because the cool thing about serverless is , you're charged per every 100 milliseconds of time that it's being processed.

When it comes to , you're charged and units that are, it used to be by the hour, I think they finally fixed it, and it's down to smaller increments. . But if you can write it. Efficiently. You can save a ton of money just by doing it this way, depending on what the use cases. So some stuff, like if you're using API gateway with Lambda, that actually can.

Be a lot more expensive than Lambda will be. But you don't have to worry about, especially if you need redundancy. Cause otherwise you have to run a minimum of two two servers just to keep them both up for a AZ kind of outages situation. You don't have to worry about that with Lambda. So anything that with lower usage 100%.

If it's bursty 100% use Lambda, if it's one of those things where you just don't have many dependencies on it, then Lambda is a really good choice as well. So there's especially infrastructure management, which is, if you look, I think Warner Vogels, he wrote something recently about how serverless driven infrastructure automation is kind of going to be the really key point to making places that are using cloud use cloud more effectively.

And so that's one group of people. That's a big group of people. If you're a big company and you already use the AWS and you're not getting everything out of it that you thought you would get. Sometimes there's serverless use cases that already exist out there and like there's a serverless application repo that AWS provides and AWS config integrations, so that if you can trigger a serverless action based off of some other resource actions. So like, let's say that your auto scaling group scaled up and you wanted to like notify somebody, there's so many things you could do with it. It's super useful for that. But even if you're just, I'm co you're coming at it from like a blank slate and you want to create something .

There are a lot of really good use cases for serverless. If you are, like I said, you're not really sure about how it's going to scale. You don't want to deal with redundancy and it fits into like a fairly well-defined, you know, this is pretty much all Python and it works with minimal dependencies. Then it's a really good starting place for that.

Nate: [00:12:29] You know, you mentioned earlier that serverless is very good for when you have bursty services in that if you were to do it based on and then also get that redundancy one. You're going to have to run while you're saying you'll have to run at least two EC2 instances, just 24 hours a day. I'm paying for those.

Plus you're also going to pay for API gateway. Do you pay hourly for API gateway

Steph: [00:12:53] API gateway? It, it would work the same either way, but you would pay for, in that case, like a load balancer.

Nate: [00:12:59] What is API gateway? Do you use that for serverless?

Steph: [00:13:02] All the time. So API gateway?

Nate: [00:13:04] Yeah. Tell us the elements of a typical serverless stack.

So I understand there's like Lambda, for example, maybe you say like you use CloudFront. In front of your Lambda functions, which may be store images and S3 that like a typical stack? And then can you explain like what each of those services are,

Steph: [00:13:22] how you would do that? Yeah, so you're, you're not, you're on the right track here.

So, okay. So a good way to think about it is, if you look at AWS has published something which a lot of documentations on it called the serverless application management standard. So S a N. And so basically if you look at that, it actually defines the core units of serverless applications. So which is the function, an API, if you, if you want one.

And basically any other permission type resources. So in your case, let's say it was something where I just wanted like a really. Basic tutorial that AWS provides is someone wants to upload an image for their profile and you want to, you know, scale it down to like a smaller image before you store it on your S3.

You know, just so they're all the same size and it saves you a ton, all that. So if you're creating something like that, the AWS resources that you would need are basically an API gateway, which is. Acts as basically the definition of your API schema. So like if you've ever used swagger or like a open API, these standards where you basically just define, and JSON, you know it's a rest API, you do get here, post here, this resource name.

That's a standard that you might see outside of AWS a lot. And so API Gateway is just basically a way to implement that same standard. They work with AWS. So that's how you can think of API gateway. It also manages stuff like authentication integration. So if you want to enable OAuth or something on something, you could set that up the API gateway level.

Nate: [00:14:55] if you had API gateway set up. Then is that basically a web server hosted by Amazon?

Steph: [00:15:03] Yeah, that's basically it.

Nate: [00:15:05] And so then your API gateway is just assigned essentially randomly a DNS name by Amazon. If you wanted to have a custom DNS name to your API gateway. How do you do that?

Steph: [00:15:21] Oh, it's just a setting.

It's pretty. so what you could do, yeah, so if you already have a domain name, right? Route 53 which is AWS is domain name management service, you can use that to basically point that domain to the API gateway.

Nate: [00:15:35] So you'd use route 53 you configure your DNS to have route 53 point a specific DNS name to your API gateway, and your API gateway would be like a web server that also handles like authentication and AWS integration. Okay,

Steph: [00:15:51] got it. Yeah, that's a good breakdown of what that works. So that's your first kind of half of how people commonly trigger Lambdas. And that's not the only way to trigger it, but it's a very common way to do it. So what happens is when the API gateway is configured, part of it is you set what happens when the method is invoked.

So there's like a REST API as a type of API gateway that. People use a lot. There's a few others, like a web socket, one which is pretty cool for streaming uses, and then they're always adding new options to it. So it's a really neat service. So you would have that kind of input go into your API gate.

We would decide where to route it. Right. So in a lot of cases here, you might say that the Lambda function is where it gets routed to. That's considered the integration to it. And so basically API gateway passes it all of this stuff from the requests that you want to pass it. So, you know, I want to give it the content that was uploaded.

I want to give it the IP address. It originally came from whatever you want to give it.

Nate: [00:16:47] What backend would people use for API gateway other than Lambda? Like would you use an API gateway in front of an EC2 instance?

Steph: [00:16:56] You could, but I would use probably a load balancer or application load balancer and that kind of thing.

There's a lot of things you can integrate it for. Another cool one is, AWS API calls. It can proxy, so it can just directly take input from an API and send it to a specific API call if you want to do that. That's kind of some advanced usage, but Lambdas are kind of what I see is the go-to right now.

Nate: [00:17:20] So the basic stack that we're looking at is you use API gateway to be in front of your Lambda function, and then your Lambda function just basically does the work, which is either like a writing to some sort of storage or calculating some sort of response. You mentioned earlier, you said, you know the Lambda function it can be fronted by an API if you want one. And then you mentioned, you know, there's other ways that you can trigger these Lambda functions. Can you talk a little bit about like what some of those other ways are?

Steph: [00:17:48] Yeah, so actually those are really cool. So the cool thing is that you could trigger it based off of basically any type of CloudWatch event is a big one.

And so CloudWatch is basically a monitoring slash auditing kind of service that AWS provides. So you can set triggers that go off when alarms are set. So. It could be usage, it could be, Hey, somebody logged in from an IP address that we don't recognize. You could do some really cool stuff with CloudWatch events specifically. And so those are one that I think for like management purposes are really powerful to leverage. But also you can do it off of S3 events, which are really cool. So like you could just have it, so anytime somebody uploads a. Let's say it was a or CI build, right? You're doing IA builds and you're putting your artifacts into a S three bucket, so you know this is released version 1.1 or whatever, right?

You put it into an S3 bucket, right? You can hook it up so that when ever something gets put into that S3 bucket. That another action is that takes place so you can make it so that, you know, whenever we upload a release, then you know, notify these people. So now an email or you can make it so that it, you know, as complicated as you want, you can make it trigger a different kind of part in your build stage.

If you have things that are outside of AWS, you can have it trigger from there. There's a lot of really cool, just direct kind of things that you don't need. An API for. An S3 is a good one. The notification service, SNS it's used within AWS a lot that can be used. The queuing service AWS provides called SQS.

It works with, and also just scheduled events, which I really like because it replaces the need for a crown server. So if you have things that run, you know, every Tuesday or whatever, right, you can just trigger your Lambda to do that from just one configuration point, you don't have to deal with anything more complicated than that.

Nate: [00:19:38] I feel like that gives me a pretty good grounding in the ecosystem, in the setting. Maybe we could talk a little bit more about tools and tooling. Yeah, so I know that in the JavaScript world, on like the node world, they have the serverless framework, which is basically this abstraction over, I think it's over like Lambda and you know, Azure functions and Google up.

Google cloud probably too. Do they have like a serverless framework for Python or is there like a framework that you end up using? Or do you just generally just write straight on top of Lambda?

Steph: [00:20:06] So that's a great question and I definitely do recommend that even though there is like a front end you could do to just start, you know, typing code in and making the Lambda work right.

It's definitely better to have some sort of framework that. Integrates with your actual, like, you know, wherever you use to store your code and test it and that kind of thing. So serverless is a really big one, and that's, it's kind of annoying because serverless, you know, it also refers to the greater ecosystem of code that runs without managing underlying servers.

But in this particular case, Serverless is also like a third party company in tooling, and it does work for Python. It works for, a whole budget. That's kind of like the serverless equivalent in my head of like Terraform, something that is kind of meant to be kind of generic, but it offers a lot of kind of value to people just getting started. If you just want to put something in your, read me that says, here's how to, you know, deploy this from Github. You know, serverless is a cool tool for that. I don't personally like building with it myself just because I find that this SAM, which is Serverless Application Model, I think I said management earlier, but it's actually model.

I just looked that up. I feel like that has everything I really want for AWS and I get more fine grain control. I don't like having too much obstruction and I also don't like. When you have to install something and something changes between versions and that changes the way your infrastructure gets deployed.

That's a pet peeve of mine, and that's why I don't really use Terraform too much for the same reason. When you're operating really in one world, which in my case is AWS, I just don't get the value out of that. Right. But with the serverless application model, and they have a whole Sam CLI, they have a bunch of tools coming out.

So many examples on their Github repos as well. I find that it's got really everything. I want to use plus some CloudFormation plugs right into that. So if I need to do anything outside of the normal serverless kind of world, I can do that. So it's better to use serverless than to not use anything at all. I think it's a good tool and really good way to kind of get used to it and started, but at least my case where it really matters to have super consistent deployments where I'm sharing between people and accounts and all of that. And I find that SAM really gives me the best kind of best of both worlds.

Amelia: [00:22:17] So, as far as I understand it, serverless is a fairly new concept.

Steph: [00:22:22] You know, it's one of those things it's catching on. Recently, I felt like Google app engine candidate a long time ago, and it was kind of a niche thing for awhile, but it recently it, we're starting to see. Bigger enterprises, people who might not necessarily want bleeding edge stuff start to accept that serverless is going to be the future.

And that's why we're seeing all this stuff come up and it's, it's actually really exciting. But the good thing is it's been around long enough that a lot of the actual tooling and the architecture patterns that you will use are mature. They've been used for years. Their sites you've been using for a long time that.

You don't know that it's serverless on the back end, but it is because it's one of those things that doesn't really affect you unless you're kind of working on it. Right. But it's new to a lot of people, but I think it's in a good spot where it's more approachable than it used to be.

Nate: [00:23:10] When you say that there's like a lot of standard patterns, maybe we could talk about some of those.

So when you write a Lambda function and code, either with like Python or Java script or whatever, there are bloods, they say Python because you use Python primarily right? Well, maybe we could talk a little bit about that. Like why do you prefer Python?

Steph: [00:23:26] Yeah, so just coming from my background, which is, like I said, I did some support, did some straight dev ops, kind of a more assisted mini before the world kind of became a more interesting place kind of background.

Python is just one of those tools that is installed on like every Linux server in the world and works kind of predictably. Enough people know it that it's, it's not too hard to like. Share between people who may not be, you know, super advanced developers, right? Cause a lot of people I work with, maybe they have varying levels of skills and Python's one of those ones you can start to pick up pretty quickly.

And it's not too foreign really to people coming from other languages as well. So it's just a practicality thing for a lot of it. But there's also a lot of the tooling that is around. Dev ops type stuff is in Python, like them, Ansible for configuration management, super useful tool. You know, it's all Python.

So really there's, there's a lot of good reasons to use Python from, like in my world it's, it's one of the things where you don't have to use one specific language, but Python is just, it has what I need and it gets, I can work with it pretty quickly. The ecosystems develop. There's still a lot of people who use it and it's a good tool for what I have to do.

Nate: [00:24:35] Yeah, there's tons, and I was looking at the metrics. I think Python is like, last year was like one of the fastest growing programming languages too. There's a lot of new people coming into Python,

Steph: [00:24:44] and a lot of it is data science people too, right? People who may not necessarily have a strong programming background, but there's the tooling they need in a Python already there.

There's community, and it sucks that it's not as scary looking as some other languages, frankly. You know.

Nate: [00:24:58] And what are some of the other like cloud libraries that Python has? Like I've seen one that's called like Boto

Steph: [00:25:03] Boto is the one that Amazon provides as their SDK, basically. And so every Lambda comes bundled with Boto three you know, by default.

So yeah, there was an older version of ODA for Python too. But Boto three is the main one everyone uses now. So yeah, Bodo is great. I use it extensively. It's pretty easy to use, a lot of documentation, a lot of examples, a lot of people answering questions about it on StackOverflow, but I'm really, every language does have an SDK for AWS these days, and they all pretty much work the same way because they're all just based off of.

The AWS API APIs and all the API APIs are well-defined and pretty stable, so it's not too much of a stretch to use any other language, but Bono's the big one, the requests library in Python is super useful just because it makes it easier to deal with, you know, interacting with API APIs or interacting with requests to APIs.

It's just all about, you know, HTP requests and all that. Some of the new Python three. Libraries are really good as well, just because they kind of improve. It used to be like with Python 2, you know, there's URL lib for processing requests and it was just not as easy to use. So people would always bundle a third party tool, like requests, but it's getting better.

Also, you know, Python, there's some. Different options for testing Py unit and unit test, and really there's just a bunch of libraries that are well maintained by the community. There's a kazillion on PyPy, but I try to keep outside dependencies from Python to a total minimum because again, I just don't like when things change from underneath me, how things function.

So it's one of the things where I can do a lot without. Installing third party libraries, so wherever I can avoid it, I do.

Nate: [00:26:47] So let's talk a little bit about these patterns that you have. So Lambda functions generally have a pretty well defined structure, and it's basically through that convention. It makes it somewhat straightforward to write each function. Can you talk a little bit about like, I don't know, the anatomy of a Lambda function?

Steph: [00:27:05] Yeah, so at its basic core, the most important thing that every Lambda function in the world is going to have is something called a handler. And so the handler is basically a function that is accessed to begin the way that it starts.

So, any Lambda function when it's invoked. So anytime you are calling it, it's called invoking a Lambda function. It sends it parameters that are event. An event is basically just the data that defines, Hey, this is stuff you need to act on. And it sends it something called context, which a lot of time you never touched the context object.

But it's useful too, because AWS provides it with every Lambda and it's basically like, Hey, this is the ID of the currently running Lambda function. You know, this is where you're running. This is the Lambdas name. So like for logging stuff, context can be really useful. Or for stuff where it's like your function code may need to know something about where it is.

You can save yourself time from, you don't have to use like an environment. They're able, sometimes if you can look in the context object. So at the core it's cause you have at least a file, you can name it whatever you want. A lot of people call it index and then within that file you define a function called handler.

Again, it doesn't have to be called handler, but. That makes it easy to know which one it is, and it takes that event and context. And so really, if that's all you have, you can literally just have your Lambda file be one Python file that says, you can say def handler takes, you know, object and then return something.

And then that can be it. As long as you define that index dot handler as your handler resource, which is, that's a lot of words, but basically we need to find your Lambda within AWS. The required parameters are basically the handler URI, which is the name of the file, and then a.in the name of the handler function.

So that's at its most basic. Every Lambda has that, but then you start, you know, scoping it out so you can actually know, organize your code decently. And then it's just a matter of, is there a read me there. Just like any other Python application really, you know, do you have a read me? Do you want to use like a requirements.txt file to like define, you know, these are the exact third party libraries that I'm going to be using.

That's really useful. And if you're defining it with SAM, which I really recommend. Then there's a file called template.yaml And that's just contains the actual, like AWS resource definition, as well as any like CloudFormation defined resources that you're using. So you can make a template.yaml as the infrastructure kind of as code, and then everything else, just the code as code.

Nate: [00:29:36] Okay. So unpacking that a little bit, you'll invoke this function and they'll basically be two arguments. One is the event that's coming in the event in particular, and then it'll also have the context, which is sort of metadata about the context in which this request is running. So you mentioned some of the things that come in the context, which is like what region you're in or what the name of the function is that you're on.

What are some of the parameters in the event object.

Steph: [00:30:02] So the interesting thing about the event object. Is, it can be anything. It just has to be basically a Python dictionary or basically, you know, you could think of it like a JSON, right? So it's not predefined and Lambda itself doesn't care what the event is.

That's all up to your code to decide what is it, what is a valid event, and how to act on it. So API gateway if you're using that. There's a lot of example events, API gateway will send and so if you like ever try it, look at like the test events for Lambda, you'll see a lot of like templates, which are just JSON files with like expected outputs.

But really it can be anything.

Nate: [00:30:41] So the way that Lambda's structured is that API gateway will typically pass in an event that's maybe like the request was a POST request, and then it has these like query parameters or headers attached to it. And all of that would be within like the request object. But the event could also be like you mentioned like CloudWatch, like there's like a CloudWatch event that could come in and say, you basically just have to configure your handler to handle any of the events you expect that handler to receive.

Steph: [00:31:07] Yeah, exactly.

Nate: [00:31:09] So let's talk a little bit more about the development tooling. How in the world do you test these sorts of things? Like with, do you have to deploy every single time or tell us about like the development tooling that you use to test along the way.

Steph: [00:31:22] Yeah. So I'm, one of the great things about SAM and there's some other tools for this as well, is that it lets you test your Lambdas locally before you deploy it, if you want.

And the way that it does that is, I mentioned earlier that Lambda is really at its core, a container, like a Docker container running on a server somewhere. Is, it just creates a Docker container that behaves exactly like a Lambda would, and it sends your events. So you would just define basically a JSON with the expected data from either API gateway or whatever, right?

You make a test one and then it will send it to that. It'll build it on demand for you and you test it all locally with Docker. When you like it, you use the same tool and it'll package it up and deploy it for you. So yeah, it's actually not too bad to test locally at all.

Nate: [00:32:05] So you create JSON files of the events that you want it to handle, and then you just like invoke it with those particular events.

Steph: [00:32:12] Yeah, so basically like if I created it like a test event, I would save it to my repo is tests slash API gateway event.json Had put in the data I expect, and then I would do like a SAM. So the command is like SAM, a local invoke, and then I would give it to the file path to the JSON, and it would process it.

I'd see the exact same output that I would expect to see from Lambda. So it'll say, Hey, this took this many milliseconds to invoke the response code was this, this is what was printed. So it's really useful just for. It's almost a one to one with what you would get from Amazon Lambda output.

Amelia: [00:32:50] And then to update your Lambda functions.

Do you have to go inside the AWS GUI or can you do that from the command line.

Steph: [00:32:57] yeah, no, you can do that from the command line with Sam as well. So there's a Sam package and Sam deploy command. It's useful if you need to use basically any type of CII testing service to like manage your deployments or anything like that.

Then you can get a package it and then send it the package to your, Whatever you're using, like Gitlab or something, right. For further validation and then have Gitlab deploy it. Like if you don't want people to have deployed credentials on their local machine, that's the reason it's kind of broken up into two steps there.

But basically you just do a command, Sam deploy, and what it does is it goes out to Amazon. It says, Hey, update the Lambda to point to this as the new resource artifact to be invoked. And if you're using and which I think it's enabled by default, not actually the versioning feature, it actually just adds another version of the Lambda so that if you need to roll back, you can just go to the previous one, which is really useful sometimes.

Nate: [00:33:54] So let's talk a little bit about deployment. One of the things that I think is stressing when you are deploying Lambda functions is like, I have no idea how much it's going to cost. How is it going to cost to launch something, and how much am I going to pay? And I guess maybe you can kind of calculate if you estimate the number of requests you think you're going to get, but how do you approach that when you're writing a new function?

Steph: [00:34:18] Yeah, so the first thing I look at is what's the minimum, basically timeout, what's the minimum memory usage? So number of invocations is a big factor, right? So like if you have free tier, I think it's like a million invocations you get, but that's like assuming like a hundred under a hundred milliseconds each.

So when you just deploy it, there's no cost for just deploying it. You don't get charged until it's invoked. If you're storing like an artifact and as three, there's a little cost for you keeping it in as three. But it's usually really, really minimal. So the big thing is really, how many times are you give it?

Is it over a million times and or are you not on free tier? The costs, like I said, it gets batched

together and it's actually really pretty cheap just in terms of number of invocations cause at the bigger places where you can normally save costs. Is it over-provisioned for how much memory you give it?

Right. I think the smallest unit you can give it as 128 that can go up to like two gigabytes maybe more now. So if you have it set where, Oh, I want it to use this much memory and it really never is going to use that much memory and that's kind of like wasteful or if you know, if it uses that much, that's like something's wrong

Nate: [00:35:25] cause you pay, you configure beforehand, like we're going to use max 128 megabytes of memory and then it's allocated on invocation or something like that.

And then if you set it too high, you were going to pay more than you need to. Is that right?

Steph: [00:35:40] Yeah. Well and it's more like, I think I'll have to double check cause it actually just show you how much memory you use each time in Lambda is invoked. So you can sort of measure if it's getting near that or if you think you need more than it might give an error.

If it doesn't, it isn't able to complete . But in general, like. I haven't had many cases where the memory has been the limiting factor. I will say that, the timeout can sometimes get you, because if a Lambda's processing forever, like let's say API gateway, a lot of times API gateway has its own sort of timeout, which is, I think it's like 30 seconds to respond.

And if your Lambda is set to, you know, you give it five minutes to process it always five minutes processing. If you, let's say that you program something wrong and there's like a loop somewhere and it's going on forever, it'll waste five minutes. Computing API gateway will give up after 30 seconds, but you'll still be charged for the five minutes that Lambda was kind of doing its thing.

Nate: [00:36:29] it's like, I know that AWS is services and Lambda are created by like world-class engineers. It's the highest performing infrastructure probably in the world, but as a user, sometimes it feels like there's giant Rube Goldberg machine, and I have like no idea. All of the different aspects that are involved in, like how do you manage that complexity?

Like when you're trying to learn AWS, like let's say someone who's listening to this, they want to try to understand this. How do you. Go about making sense of all of that. Like difficulty.

Steph: [00:37:02] You really have to go through a lot of the docs, like videos, people showing you how they did something isn't always the best just because they kind of skirt around all the things that went wrong in the process, right? So it's really important just to understand, just to look at the documentation for what all these features are before you use them. The marketing people make it sound like it's super easy and go, and to a degree, it really is like, it's easier than the alternative, right?

It's where you put your complexities the problem

Nate: [00:37:29] yeah, and I think that part of the problem that I have with their docs is like they are trying to give you every possible path because they're an infrastructure provider, and so they support like these very complex use cases. And so it's, it's like the world's most detailed choose your own adventure.

It's like, Oh, have you decide that you need to take this path? Go to or this one path B. Path C there's like so many different like paths you can kind of go down. It's just a lot when you're first learning.

Steph: [00:37:58] It is, and sometimes like the blog posts have better kind of actual tutorial kind of things for like real use cases.

So if you have a use case that is general enough, a lot of times you can just Google for it and there'll be something that one of their solution architects wrote up about had actually do it from like a, you know, user-friendly perspective that anything with the options is that you need to be aware of them too, just because the way that they interact can be really important.

If you do ever do something that's not done before and the reason why it's so powerful and what, you know why it takes all these super smart people to set up and all this stuff is actually because are just so many variables that go into it that you can do so much with that. It's so easy to shoot yourself in the foot.

It always has been in a way, right? But it's just learning how to not shoot yourself in the foot and use it like with the right agility. And once you get that down, it's really great.

Amelia: [00:38:46] So there's over a hundred AWS services. How do you personally find new services that you want to try out or how does anyone discover any of these different services.

Steph: [00:38:57] What I do is, you know, I get the emails from AWS whenever they release new ones, and I try to, you know, keep up to date with that. Sometimes I'll read blog posts that I see people writing about how they're using some of them, but honestly, a lot of it's just based off of when I'm doing something, I just keep an eye out.

If there's something like, I wished that it did sometimes, like, I used some AWS systems manager a lot, which is basically. You can think of it. It's sort of like a config management an orchestration tool. It lets you, basically, it's a little agent. You can sell on servers and you can, you know, just automate patching and all this other like little stuff that you would do with like Chef or Puppet or other config management tools.

And. It seems like they keep announcing services. What are really just like tie ins to existing ones, right? Which is like, Oh, this one adds, you know, for instance, like the secret management and the parameter store would secrets. A lot of them are really just integrations to other AWS services, so it's not as much.

The really core ones that everyone needs to know is, you know, EC2 of course Lambda, so big API gateway and CloudFormation because it's basically. The infrastructure as code format that is super useful just for structuring everything. And I guess S3 is the other one. Yeah. Let's talk about

Nate: [00:40:15] cloud formation for a second.

So earlier you said your Lambda function is typically going to have a template.yaml. Is that template.yaml CloudFormation code.

Steph: [00:40:26] So at its core, yes. But the way you write it is different. So how it works is that the Sam templating language is defined to simplify. What you would with CloudFormation.

So a CloudFormation you have to put a gazillion variables in.

And it's like, there's some ways to like make that easier. Like I really like using a Python library called Tropo sphere, where you can actually use Python to generate your own cloud formation templates for you. And it's really nice just cause, you know, I like to know I'll need a loop for something or I'll need to like fetch a value from somewhere else.

And it's great to have that kind of flexibility with it . The, the Sam template is specifically a transform, is what they call it, of cloud formation, which means that it executes against the CloudFormation service. So the CloudFormation service receives that kind of turns it into the core that it understands and executes on it.

So at the core of it, it is executing on top of CloudFormation. You could create a mostly equivalent kind of CloudFormation template usually, but there's more to it. But there's a lot of just reasons why you would want to use Sam for serverless specifically, just because they add so many niceties and stuff around, you know, permissions management that you don't have to like think of as much and shortcuts and it's just a lot easier to deal with, which is a nice change.

But the power of CloudFormation is that if you wanted to do something. That like maybe SAM didn't support the is outside the normal scope. You could just stick a CloudFormation resource definition in it and it would work the same way cause it's running against it. It's one of those services where people, sometimes it gets a bad rap because it's so complicated, but it's also super stable.

It behaves in a super predictable way and it's just, I think learning how to use that when I worked at AWS was really valuable.

Nate: [00:42:08] What tools do you use to manage the security when you're configuring these things? So earlier you mentioned IAM, which is a, I don't know what it stands for.

Steph: [00:42:19] Identity and access management,

Nate: [00:42:20] right?

Which is like configuration language or configuration that we can configure, which accounts have access to certain resources. let me give you an example. One question I have is how do you make sure each system has the minimum level of permissions and what tools you use? So for example, I wrote this Lambda function a couple of weeks ago.

Yeah. I was just following some tutorial and they said like, yeah, make sure that you create this IAM role as like one of the resources for launching this Lambda function, which I think they're like, that's great. But then like. How do I pin down the permissions when I'm granting that function itself permissions to grant new IAM roles. So it was like I basically just had to give it route permission according to my low, my skill level, because otherwise I wasn't able to. Create, I am roles without the authority to create new roles, which just seems like root permissions.

Steph: [00:43:13] Yes. So there are some ways that's super risky, honestly, like super risky.

Nate: [00:43:17] Yeah. I'm going to need your help,

Steph: [00:43:19] but it is a thing that there are case you can, you can limit it down with the right kind of definition. So

IAM. It's really powerful. Right? So the original case behind a MRI was that, so you're a servers so that if you had a, an application server and a database server separately.

You could give them separate IAM roles so that they could have different things they could do. Like you never really want your database server to maybe. Interface directly with, you know, an S three resource, but maybe you want your application server to do that or something. So it was nice because it really let you limit down the scope from a servers and you don't, cause you have to leave keys around if you do it .

So you don't have to keep keys anywhere on the server if you're using IAM roles to access that stuff. So anytime you're storing like an AWS secret key on a server, or like in a Lambda, you kinda did something wrong. The thing they are just because that's, AWS doesn't really care about those keys. It just looks, is it a key?

Do it here. But when you actually use IAM policies, you could say it has to be from this role. It has to be executed from, you know, this service. So it can make sure that it's Lambda or the one doing this, or is it somebody trying to assume Lambda credentials? Right? There's so much you can do to kind of limit it.

With I am. So it was really good to like learn about that. And like all of the AWS certifications do focus on IAM specifically. So if anyone thinking about taking like an AWS certification course, a lot of them will introduce you to that and help a lot with understanding like how to use those correctly.

But for what you talked about with you, like how do you deal with a function that passes, that creates a role for another function, right? What you would do in that kind of case is there's an idea of IAM paths. So basically you can give them like as namespacing for IAM permissions, right? So you can make a, I am role that can grant functions that can create roles .

Only underneath its own namespace. Within its own path.

Nate: [00:45:20] When you say namespaces, I mean did inherit permissions. But the parent permission has?

Steph: [00:45:28] Depends. So it doesn't inherit itself. But like, let's say that I was making a build server . And my build server, we had to use a couple of different roles for different pieces of it. For different steps. Cause they used different services or something. So we would give it like the top level one of build. And then in my S3 bucket, I might say aloud upload for anyone whose path had built in it. So that's, that's the idea that you can limit on the other side, what is allowed.

And so of course, it's one of the things where you want to by default blacklist as much as possible, and then white list what you can. But in reality it can be very hard to go through some of that stuff. So you just have to try to, wherever you can, just minimize the risk potential and understand what's the worst case that could happen if someone came in and was able to use these credentials for something.

Amelia: [00:46:16] What are some of the other common things that people do wrong when they're new to AWS or DevOps?

Steph: [00:46:22] One thing I see a lot is people treating the environment variables for Lambdas as if they were. Private, like secrets. So they think that if you put like an API key in through the environment variable that that's kind of like secure, but really like I worked in AWS support, anyone would be able to see that if they were helping you out in your account.

So it's not really a secure way to do that. You would need to use a surface like secrets manager, or you'd have some kind of way to, you would encrypt it before you put it in and then the Lambda would decrypt it, right? So there's ways to get around that, but like using environment variables as if there were secure or storing.

Secure things within your git repositories that get pushed to AWS is like a really big thing that should be avoided. And we said, what else did you ever own?

Nate: [00:47:08] I'm pretty sure that I put an API key in mine

Steph: [00:47:11] before. So yeah, no, it's one of the things people do, and it's one of those things that. A lot of people, you know, maybe nothing will go wrong and it's fine, but if you can just reduce the scope, then you don't have to worry about it.

And it just makes things easier in the future.

Amelia: [00:47:27] What are like the new hot things that are up and coming?

Steph: [00:47:30] So I'd say that there's more and more kind of uses for Lambda at edge for like IOT integration, which is pretty cool. So basically Lambda editor. Is basically where you can process your lamb dos computers, basically, like, you know, like, just think of it as like raspberry pi.

It's like that kind of type thing, right? So you could take a

small computer and you could put it like, you know, maybe where it doesn't have a completely like, consistent internet connection . So maybe if you're doing like a smart vending machine or something. Think of it like that. Then you could actually execute the Lambda logic directly there and deploy it to there and manage it from AWS whenever it does have like a network connection and then you can basically, it just reduces latency.

A lot and let your coat and lets you test your code both like locally and then deploy it out. So it was really cool for like IOT stuff. There's been a lot of like tons of stuff happening in machine learning space on AWS too much for me to even keep on top of. But a lot of the stuff around Alexa voices is super cool, like a poly where you can just, if you play with your Alexa type thing before, it's cool, but you could just write a little Lambda program to actually generate, you know, whatever you want it to say in different accents, different voices on demand, and integrate it with your own thing, which is pretty cool. Like, I mean, I haven't had a super great use case for that yet, but it's fun to play with.

Amelia: [00:48:48] I feel like a lot of the internet of things are like that.

Steph: [00:48:52] Oh, they totally are. That they really are. But yeah, it's just one of the things you had to keep an eye out for. Sometimes the things that, for me, because I'm dealing so much with like enterprisey kind of stuff that excite me are not really exciting to other people cause it's like, yay, patching has a way to like lock it down to a specific version of this at this time.

You know, it's like, it's like, it's not really exciting, but like, yeah.

Nate: [00:49:14] And I think that's one of the things that's interesting talking to you is like I write web apps, I think of serverless from like a web app perspective, but it's like, Oh, I'm going to write an API that will let her know, fix my images on the way up or something.

But a lot of the uses that you alluded to are like using serverless for managing, other parts of your infrastructure, they're like, you're using, you've got a monitor on some EC2 instance that sends out a cloud watch alert that like then responds in some other way, like within your infrastructure.

So that's really interesting.

Steph: [00:49:47] Yeah, no, that's, it's just been really valuable for us. And like I said, I mentioned the IAM stuff. That's what makes it all possible really.

Amelia: [00:49:52] So this is totally unrelated, but I'm always curious how people got into DevOps, because I do a lot of front end development and I feel like.

It's pretty easy to get into front end web development because a lot of people need websites. It's fairly easy to create a small website, so that's a really good gateway, but I've never like on the weekend when it to spin up a server or any of this,

Steph: [00:50:19] honestly for me, a lot of it was like my first job in college.

Like I was basically part-time tech support / sys admin. And I always loved L nuxi because, and the reason I got into Lennox in the first place is I realized that when I was in high school that I could get around a lot of the schools, like, you know, spy software that won't let you do fun stuff on the internet or with the software if you just use a live boot Linux USB.

So part of it was just, I was using it. So, you know. Get around stuff, just curiosity about that kind of stuff . But when I got my first job, that's kind of like assist admin type thing. It kind of became a necessity. Because you know when you have limited resources, it was like me and like another part time person and one full time person and hundreds of people who we had to keep their email and everything.

Working for them. It kind of becomes a necessity thing cause you realize that all the stuff that you have to do by hand back then, you can't keep track of it all. You can't keep it all secured for a few people. It's extremely hard. And so one way people dealt with that was, you know, offshoring or hiring people, other people to maintain it.

But it was kind of cool at the time to realize that the same stuff I was learning in my CS program about programming. There's no reason I couldn't use that for my job, which was support and admin stuff. So, I think I got introduced to like chef, that was the first tool that I really, I was like, wow, this changes everything.

You know, because you would write little Ruby files to do configuration management and then your servers would, you know, you run the chef agent to end, you know. You know, they'd all be configured exactly the same way. And it was testable. And there's all this really cool stuff you could do with chef that I, you know, I had been trying to play to do with like, you know, bash script or just normal Python scripts.

But then chef kind of gave me that framework for it. And I got a job at AWS where one of the main components was supporting their AWS ops work stool, which was basically managed chef deployments. And so that was cool because then I learned about how does that work at super high scale. What are other things that people use?

And right before I actually, you know, got my first job as a full time dev ops person was when they, they were releasing the beta for Lambda. So I was in the little private beta for AWS employees and we were all kind of just like, wow, this changes a lot. They'll make our jobs a lot easier, you know, in a way it will reduce the need for some of it.

But we were so overloaded all the time. And I feel like a lot of people from a perspective know what it feels like to be like. There's so much going on and you can't keep track of it all and you're overloaded all the time and you just want it to be clean and not have to touch it and to do less work at dev ops was kind of like the way forward.

So that's really how I got into it.

Amelia: [00:52:54] That's awesome. Another thing I keep hearing is that a lot of dev ops tests are slowly being automated. So how does the future of DevOps look if a lot of the things that we're doing by hand now will be automated in the future?

Steph: [00:53:09] Well, see, the thing about dev ops is really, it's more of like a goal.

It's an ideal. A lot of people, if they're dev ops purists and they'll tell you that it means it's having a culture where. There are not silos between developers and operations, and everyone knows how to deploy and everyone knows how to do everything. But really in reality, not everyone's a generalist.

And being a generalist in some ways is kind of its own specialty, which is kind of how I feel about the DevOps role that you see. So I think we'll see that the dev ops role, people might go by different names for the same idea, which is. Basically reliability engineering, like Google has a whole book about site reliability engineering is the same kind of philosophy, right? It's you want to keep things running. You want to know where things are. You want to make things efficient from an infrastructure level. But the way that you do it is you use a lot of the same tools that developers use. So I think that we'll see tiles shift to like serverless architect is a big one that's coming up because that reliability engineering is big.

And we may not see people say dev ops is their role as much, but I don't think the need for people who kind of specialize in like infrastructure and deployment and that kind of thing is going to go away. You might have to do more with less, right? Or there might be certain companies that just hire. A bunch of them, like Google and Amazon, right?

They're pro still going to be a lot of people, but maybe they're not going to be working at your local place because if they're going to be working for the big people who actually develop the tools that are used for that resource. So I still think it's a great field and it might be getting a little harder to figure out where to enter in this because there's so much competition and attention around the tools and resources that people use, but it's still a really great field overall. And if you just learn, you know, serverless or Kubernetes or something that's big right now, you can start to branch out and it's still a really good place to kind of make a career.

Nate: [00:54:59] Yeah. Kubernetes. Oh man, that's a whole nother podcast. We'll have to come back for that.

Steph: [00:55:02] Oh, it is. It is.

Nate: [00:55:04] So, Steph, tell us more about where we can learn more about you.

Steph: [00:55:07] Yeah. So I have a book coming out.

Nate: [00:55:10] Yes. Let's talk about the book.

Steph: [00:55:12] Yeah. So I'm releasing a book called, Fullstack Serverless. See, I'm terrible.

I should know exactly what the title, I don't

Nate: [00:55:18] know exactly the title. . Yeah. Full stack. Python with serverless or full-stack serverless with Python,

Steph: [00:55:27] full stack Python on Lambda.

Nate: [00:55:29] Oh yeah. Lambda. Not serverless.

Steph: [00:55:31] Yeah, that's correct. Python on Lambda. Right. And that book really has, it could take you from start to finish, to really understand.

I think if you read this kind of book, if I, if I had read this before, like learning it, it wouldn't feel so maybe. Some people confusing or kind of like it's a black box that you don't know what's happening. Cause really at its core lambda that you can understand exactly everything that happens. It has a reason, you know it's running on infrastructure that's not too different from people who run infrastructure on Docker or something.

Right. And the code that you write. Can be the same code that you might run on a server or on some other cloud provider. So the real things that I think that the book has that maybe kind of hard to find elsewhere is there's a lot of information about how do you do proper testing and deployment?

How do you. Manage your secrets, so you aren't storing those in them in those environment variables. Correct. It has stuff about logging and monitoring, all the different ways that you can trigger Lambda. So API gateway, you know, that's a big one. But then I mentioned S3 and all those other ones. there's going to be examples of pretty much every way you can do that in that book.

Stuff about optimizing cost and performance and stuff about using that. SAM, serverless application, a repository, so you can actually publish Lambdas and share them and even sell them if you want to. So it's really a start to finish everything you need to. If you want to have something that you create from scratch.

In production. I don't think there's anything left out that you would need to know. I feel pretty confident about that.

Nate: [00:57:04] It's great. I think one of the things I love about it is it's almost like the anti version of the docs, like when we talked about earlier that the docs cover every possible use case.

This talks about like very specific, but like production use cases in a very approachable, like linear way. You know, even though you can find some tutorials online, maybe. Like you mentioned, they're not always accurate in terms of how you actually do or should do it, and so, yeah, I think your book so far has been really great in covering these production concerns in a linear way.

All right. Well, Steph is great to have you.

Steph: [00:57:37] Thank you for having me. It was, it was great talking to you both.

9 ตอน

#Tech #Newline #Amelia Wattenberger #Nate Murray #Careers #JavaScript #Programming #Web Development

Serverless on AWS Lambda with Stephanie Prime

newline

published 4y ago

แบ่งปัน