Saving My Pennies And Saving My Dimes

Reducing Costs By Serving Static Content From Amazon S3 Instead Of Lambda

by JamesQMurphy | December 23, 2019

DevOps engineers have a different perspective than developers, because they solve different problems. I know this because I've been in both roles, and I was reminded of the difference just last week. Thanks to the logging dashboard I described in my previous post, I realized that I had set up my API Gateway all wrong. From a developer's standpoint, it was totally fine. But when I looked at it from a DevOps perspective (and, more importantly, from an account owner's perspective), I realized that my mistake could have cost me money.

All changes described in this blog post were made in releases v0.3.7 and v0.3.8.

Static Versus Dynamic Content

First, a quick recap of how I have my site set up. This is a simplified diagram; a full diagram is available on the site's About page. Web content can come from two places: the S3 bucket (static content only) and the Lambda function (static or dynamic content):

Diagram of site showing API Gateway forwarding static content to AWS S3 and dynamic content to AWS Lambda

By static content, I mean content that doesn't have to be generated specifically for each user -- images, style sheets, JavaScript files, etc. Static content can be (and should be) cached in the browser by supplying a Cache-Control response header. In a typical ASP.NET MVC application, the static content is placed inside the wwwroot folder, which makes deployment easier. Dynamic content, of course, includes content that changes depending on the request context, and includes most (if not all) of the HTML that comprises your site, as well as data returned from web services. It requires actual running code to generate the content on-the-fly.

Most ASP.NET MVC web applications will serve both static and dynamic content. To ease the demand on web servers, many large organizations will place the static content on CDN (content delivery network) type devices, and use some sort of network routing or reverse proxy to send the requests for the static content to the device. Using an S3 bucket is simply the cloud-based analog of this approach, and I described the basic setup in a previous post. However, I was approaching the problem from the wrong angle, as I'll describe later.

AWS Pricing

Let's compare the costs of the three services in the diagram (API Gateway, Amazon S3, and AWS Lambda). I calculated the price of one million hits for each service in the "US East (N. Virginia)" region (us-east-1), which is the region in which this site is hosted. My calculations are based on the basic options for each of the services, and the initial pricing tier (the rates go down a little when you hit certain levels). I've linked to each service's pricing page if you want further details. All prices mentioned are at the time of this writing (December 2019).

API Gateway: The API Gateway REST API, which is what this site uses, charges per request and per GB transferred (source: AWS API Gateway Pricing). The per-request rate is $3.50 per one million hits. The data transfer rate, however, is a little harder to find. Way down at the bottom, under "Data transfer", the page states that "If you use external data transfers, you will be charged at the EC2 data transfer rate." This rate can be found here, and at the time of this writing, it's $0.09 per GB. The first GB is free, although the per-request rate still applies.

Amazon S3: The cost of standard S3 storage itself is pretty cheap... only $0.023 (that's less than three cents) per GB per month. The data transfer rate is also comparatively cheap. A million GET requests cost $0.40 [1]. If we were serving the S3 content directly to the Internet, the rate would also be $0.09 per GB, but since I'm transferring it via another AWS service (API Gateway) in the same region, it costs nothing additional. There is a free tier, but it's only good for a year (source: Amazon Simple Storage Service Pricing).

Lambda: Here's where costs get interesting. AWS Lambda charges per request, per duration, and per GB of RAM allocated (source: AWS Lambda - Pricing). For the duration, let's assume the very best Lambda performance for static content and use the minimum billable processing time, which is 100 milliseconds. The Lambda function is currently allocated at 320MB of RAM (0.32 GB), so by using the rate of $0.000016667 per GB-second, a million requests would cost 1,000,000 × 0.1 sec × 0.32 GB × $0.000016667 per GB-sec = $0.53. Add on the flat per-request rate of $0.20 per million, and we arrive at $0.73 per million requests. Since I'm not using the provisioned concurrency feature, I qualify for the free tier, so the first million hits at these execution times wouldn't cost me anything in any given month.
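The Lambda arithmetic above can be sketched as a small function. This is only an illustration: the function name is hypothetical, the rates are the ones quoted in this post, the free tier is ignored, and it assumes that duration is billed in 100 ms increments rounded up (consistent with the 100 ms minimum mentioned above).

```python
import math

GB_SECOND_RATE = 0.000016667      # USD per GB-second, as quoted above
REQUEST_RATE_PER_MILLION = 0.20   # USD per million invocations
BILLING_INCREMENT_MS = 100        # assumption: duration is rounded up to the next 100 ms


def lambda_cost(requests: int, duration_ms: float, memory_gb: float) -> float:
    """Estimated Lambda charge in USD, ignoring the free tier (hypothetical helper)."""
    billed_ms = math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS
    gb_seconds = requests * (billed_ms / 1000) * memory_gb
    request_cost = requests / 1_000_000 * REQUEST_RATE_PER_MILLION
    return gb_seconds * GB_SECOND_RATE + request_cost


# One million requests at the 100 ms minimum with 0.32 GB allocated.
print(f"${lambda_cost(1_000_000, 100, 0.32):.2f}")
```

Passing a longer duration shows how quickly the compute portion grows once a request spills into the next billing increment.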

Here are the prices summarized:

Service       Per Million Hits             Per GB Downloaded       Pricing Links
API Gateway   $3.50                        $0.09 (first GB free)   Per Hit/Per GB
S3            $0.40                        free                    Pricing
Lambda        $0.73 (first million free)   free                    Pricing

From the table above, we can see that the amount of data downloaded gets billed at the same rate, no matter if it comes from Lambda or S3. It's the per-request numbers that differ. Using the assumptions above, one million requests served from S3 cost $3.50 + $0.40 = $3.90, whereas the same million requests from Lambda (assuming 100 millisecond requests at 0.32 GB) cost $3.50 + $0.73 = $4.23. It may not seem like much of a difference, but remember that we assumed that all Lambda requests would only take 100ms. In reality, there are cold starts and other factors that might make the Lambda function take more time. After all, it's running an ASP.NET Core MVC application, so the request gets handled like all other requests.
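To see how the gap widens when Lambda requests take longer than 100 ms, here is a rough sketch that uses the per-million figures from this post. The function names are hypothetical, the free tiers and data transfer are left out, and the durations in the loop are illustrative billed durations rather than measurements from this site.

```python
# Per-million-request figures from this post (December 2019).
API_GATEWAY = 3.50            # USD per million REST API requests
S3_GET = 0.40                 # USD per million S3 GET requests
LAMBDA_REQUESTS = 0.20        # USD per million Lambda invocations
GB_SECOND_RATE = 0.000016667  # USD per GB-second
MEMORY_GB = 0.32              # memory allocated to the Lambda function


def via_s3() -> float:
    """Cost of one million static requests routed to S3."""
    return API_GATEWAY + S3_GET


def via_lambda(billed_ms: int) -> float:
    """Cost of one million requests routed to Lambda at a given billed duration."""
    gb_seconds = 1_000_000 * (billed_ms / 1000) * MEMORY_GB
    return API_GATEWAY + LAMBDA_REQUESTS + gb_seconds * GB_SECOND_RATE


# Illustrative billed durations, not measurements.
for billed_ms in (100, 200, 500, 1000):
    extra = via_lambda(billed_ms) - via_s3()
    print(f"{billed_ms:>5} ms: Lambda ${via_lambda(billed_ms):.2f} "
          f"vs S3 ${via_s3():.2f} (+${extra:.2f})")
```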

In short, unless you somehow stay under the free tier, static content served from Amazon S3 is always cheaper. (And it's also always faster, by the way.) But there's another wrinkle: The million hits include all web requests, which means it includes any stray web request that comes to your site! Even a "404 NOT FOUND" response is more expensive when it comes from AWS Lambda!

But It's Only A Little Static Content, Right?

When I was first setting up this site back in June, I was focused on getting it up and running. I did make sure that I served most of my static content from S3, but not all of it. This is how I initially set up API Gateway:

Resource          Method   Mapped To
/                 GET      Lambda Function
/blogimages       GET      S3 Bucket
/dist             GET      S3 Bucket
/images           GET      S3 Bucket
Everything else   ANY      Lambda Function

If the request was for anything that wasn't in the blogimages, dist, or images folder, the request would be handled by the Lambda function. Initially, I was okay with this, because there were only a few small static content files to worry about:

  • favicon.ico, which would be cached by the browser [2]
  • robots.txt, which would only be requested by bots (and the curious)
  • js/site.js, which would also be cached by the browser

But I totally underestimated the number of "404 NOT FOUND" responses that could occur.

Giddy Up, Giddy Up, 404

In my last post I illustrated how I used CloudWatch Logs Insights to create a "Top 10 404 Responses" dashboard. The results were certainly eye-opening (well, at least to me they were):

Page                                         Count
/apple-touch-icon-precomposed.png            22
/apple-touch-icon.png                        22
/apple-touch-icon-152x152-precomposed.png    14
/apple-touch-icon-152x152.png                14
/apple-touch-icon-120x120-precomposed.png    8
/apple-touch-icon-120x120.png                8
/wp-login.php                                4
/support/troubleshooting/china-status        1
/support/troubleshooting                     1
/support/troubleshooting/china-status/       1

The majority of the 404 responses were requests for PNG files beginning with apple-touch-icon. These requests come from Apple's Safari web browser looking for Webpage Icons for your site. As you might guess from the requested file names, Safari looks for various sizes of the icon. If it doesn't find an icon, Safari will use a rather bland-looking Bananagrams-style tile containing the first letter of the title of your website:

Picture of Safari using a Bananagrams-style tile for Cold-Brewed DevOps

I honestly would never have known about these icons if I hadn't looked at the logs. So I added the icons to the root of the wwwroot folder [3], and the result looks great:

Picture of Safari using the apple-touch-icon for Cold-Brewed DevOps

The other 404 responses were bots just probing the site, looking for well-known avenues of attack. I understand the probe for wp-login.php, which is the login page for WordPress sites, but the "support/troubleshooting" requests were a surprise.

Map The Routes, Not The Content

The point is, there's nothing you really can do to stop all the 404 responses -- they're going to happen. What you want to ensure, however, is that they aren't costing you money. That's why I changed the way that API Gateway was set up. Now, I explicitly map the routes to AWS Lambda and assume everything else is static content. (I still need a resource mapped to the /blogimages folder, since blog images don't live in the same location as the source code.)

Resource          Method   Mapped To
/                 GET      Lambda Function
/account          ANY      Lambda Function
/admin            ANY      Lambda Function
/blog             ANY      Lambda Function
/blogimages       GET      S3 Bucket
/home             ANY      Lambda Function
Everything else   GET      S3 Bucket

This is what it looks like in the AWS console:

Screenshot of API Gateway Setup in AWS Console
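As a rough mental model of the new mapping, the routing decision can be sketched like this. It is a simplified illustration, not how API Gateway is implemented: the function and dictionary names are hypothetical, and it does not model what API Gateway actually returns when no method is mapped.

```python
from typing import Optional

# Explicit resources from the mapping described above; each also covers its subpaths.
EXPLICIT_ROUTES = {
    "account": "lambda",
    "admin": "lambda",
    "blog": "lambda",
    "blogimages": "s3",
    "home": "lambda",
}


def resolve(method: str, path: str) -> Optional[str]:
    """Return "lambda", "s3", or None when no method is mapped (simplified model)."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        # Root resource: only GET is mapped, and it goes to Lambda.
        return "lambda" if method == "GET" else None
    target = EXPLICIT_ROUTES.get(segments[0])  # exact, case-sensitive match
    if target == "lambda":
        return "lambda"  # ANY method is accepted on the dynamic routes
    # /blogimages and the catch-all proxy accept only GET and go to S3.
    return "s3" if method == "GET" else None


print(resolve("GET", "/blog/some-post"))        # dynamic route
print(resolve("GET", "/apple-touch-icon.png"))  # falls through to the catch-all
print(resolve("POST", "/wp-login.php"))         # no POST mapping outside the routes
```

In this model a stray request such as /apple-touch-icon.png never reaches the Lambda function, which is the whole point of mapping the routes instead of the content.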

Re-doing the Nested Stacks

Previously, I only had two resources in the API calling the Lambda function -- the root resource (/) and the proxy resource underneath (/{proxy+}). Now, although the root resource still needs to call the Lambda function, the main proxy method (which is now a GET method) needs to call S3. In the CloudFormation template, I renamed the method to TheProxyGetMethod and pointed it to S3:

TheProxyGetMethod:
  Type: 'AWS::ApiGateway::Method'
  Properties:
    RestApiId: !Ref TheGatewayRestAPI
    ResourceId: !Ref TheProxyResource
    HttpMethod: GET
    AuthorizationType: NONE
    RequestParameters:
      method.request.path.proxy: true
    MethodResponses:
      - StatusCode: 200
        ResponseParameters:
          'method.response.header.Timestamp': true
          'method.response.header.Content-Length': true
          'method.response.header.Content-Type': true
          'method.response.header.Cache-Control': true
    Integration:
      Type: AWS
      IntegrationHttpMethod: GET
      Credentials: !GetAtt TheRoleForTheProxyGetMethod.Arn
      Uri: !Sub arn:aws:apigateway:${AWS::Region}:s3:path/${S3BucketForCodeParameter}/${S3BucketPathForStaticFilesParameter}/{fullpath}
      PassthroughBehavior: WHEN_NO_MATCH
      RequestParameters:
        integration.request.path.fullpath: 'method.request.path.proxy'
      IntegrationResponses:
      - StatusCode: 200
        ResponseParameters:
          'method.response.header.Timestamp': 'integration.response.header.Date'
          'method.response.header.Content-Length': 'integration.response.header.Content-Length'
          'method.response.header.Content-Type': 'integration.response.header.Content-Type'
          'method.response.header.Cache-Control': !Sub "'public, max-age=31536000'"

Because of this change, I could remove the resources that pointed at static content (/dist and /images), and instead specify resources that pointed at the dynamic content. Since I now had four new resources (one for each route) to point to the Lambda function, I defined a new stack template named cf-apiGatewayToLambda.yaml, which encapsulates all necessary elements to map an API Gateway resource to a Lambda function:

  • The resource itself, mapped to the Lambda function
  • An ANY method for the resource
  • A proxy resource underneath the main resource, to handle all subpaths
  • An ANY method for the proxy resource
  • A Lambda Permission resource

The source code for the new stack template can be found here. Invoking the template in the main template is straightforward; here is how it is called for the home route:

  TheHomeRouteResourceStack:
    Type: 'AWS::CloudFormation::Stack'
    Properties:
      TemplateURL: !Sub 'https://${S3BucketForCodeParameter}.s3.amazonaws.com/${S3BucketPathForStaticFilesParameter}/cf-apiGatewayToLambda.yaml'
      TimeoutInMinutes: 10
      Parameters:
        RestApiIdParameter: !Ref TheGatewayRestAPI
        ParentResourceIdParameter: !GetAtt TheGatewayRestAPI.RootResourceId
        ApiResourceNameParameter: home
        LambdaArnParameter: !GetAtt TheLambdaFunction.Arn

The same stack template is used for the account, admin, and blog routes.

Totally Un-Static

After I created Release 0.3.7, I realized that the Lambda function was not serving static content any more, so I could remove the call to app.UseStaticFiles(). Well, sort of. I couldn't totally remove it, since I need it when the site runs locally on my desktop. But I could control whether or not it was used via configuration:

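// Serve static files only when enabled in configuration ("true" locally,
// "false" on AWS Lambda, where S3 serves the static content instead).
// string.Equals is null-safe, so a missing key is treated as "false".
if (string.Equals(Configuration["UseStaticFiles"], "true", StringComparison.OrdinalIgnoreCase))
{
    app.UseStaticFiles();
}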

I made this change and released it as v0.3.8. The value defaults to "true" but is set to "false" when deployed to AWS Lambda.

Some Minor Drawbacks

There are some things to watch out for with this approach:

  • Case now matters. API Gateway sees /home and /Home as two different routes; ASP.NET does not. However, I saw this as an opportunity to standardize on (lower) case, since Google also sees them differently. As part of Release 0.3.7, I also made all routes lowercase.

  • New routes will require a new nested stack. If I create a new route, I need to remember to create a new stack in the CloudFormation file. As a general rule, I'm not a fan of having to "remember" to do anything, and while it is conceivable to build a mechanism that links the two, it isn't worth it. Unlike the traditional approach, I don't have to worry about setting up a route or filtering rule in a networking device -- it's still a source code change, and it lives with the rest of the source code [4]. And if it does get forgotten, it will be caught in the DEV environment.

  • Beware of exposing binary files. The deployment process I use simply unzips the wwwroot folder and uploads it to an S3 bucket. If your solution serves files from other locations, make sure you're not accidentally serving files that you shouldn't be, such as config files or DLLs. S3 doesn't discriminate... it will return whatever it is told to return.
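One way to guard against the last point is to audit the folder before uploading it. The sketch below is only an illustration: the allowlist, the function name, and the wwwroot path are assumptions to adapt to your own deployment, and it is not part of the deployment process described in this post.

```python
from pathlib import Path
from typing import List

# Hypothetical allowlist; adjust it to whatever your site actually serves.
ALLOWED_EXTENSIONS = {
    ".css", ".js", ".map", ".png", ".jpg", ".svg", ".ico", ".txt", ".woff", ".woff2",
}


def find_unexpected_files(root: Path) -> List[Path]:
    """Return files under root whose extension is not on the allowlist."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() not in ALLOWED_EXTENSIONS
    )


if __name__ == "__main__":
    # Assumes the unzipped static content sits in ./wwwroot.
    for path in find_unexpected_files(Path("wwwroot")):
        print(f"unexpected file: {path}")
```

Running a check like this as a build step, and failing the build when it finds anything, keeps DLLs and config files from ever reaching the bucket.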

Summary

For any type of cloud application, big or small, you need to watch costs -- and this is only feasible if you are also collecting the metrics. AWS is very up-front with their pricing and billing structure, and every service they use writes to CloudWatch logs. As long as you are gathering the information, you should have all the data you need to make an informed decision.


  1. Even though it's relatively inexpensive, be aware that this rate includes the requests that occur if you're just browsing the contents of your S3 buckets in the AWS console!

  2. There is also an issue when serving .ico files from Lambda functions via API Gateway -- they don't get decoded properly. Previous versions of the site were not serving the favicon.ico file properly, and as a result, the browsers were constantly re-requesting this file, even when the Cache-Control header was present. This issue was eliminated when I switched it over to serving it from S3, so I didn't pursue it any further. Check out this Stack Overflow question for more details.

  3. I could have used <link rel="apple-touch-icon" href="..."> links on all pages to specify a different folder for the icons, but this wouldn't have completely eliminated the requests in the root folder. If somebody browsed directly to an image file on this site, there would be no accompanying <link> tag, and Safari would default back to looking for the icon in the root folder.

  4. You probably know this, but this is what DevOps folks call infrastructure as code, and this example illustrates why it's a good thing.

 
