Cheap Insights

Logging API Gateway Calls Using AWS CloudWatch Logs Insights and PowerShell Lamba Functions

by JamesQMurphy | November 26, 2019

Did you know that you can write AWS Lambda Functions in PowerShell? Honestly, I didn't know it until a few weeks ago.

AWS announced Lambda support of PowerShell Core over a year ago, but I didn't learn about it until I was chatting with another attendee at Microsoft Ignite 2019¹. And apparently, I wasn't the only one who didn't know. He went on to tell me that his coworkers insisted that it wasn't possible. Naturally, he took great pleasure in showing them his PowerShell scripts already running as Lambda functions. This was big news for me, because I'm a terrible JavaScript programmer, and I don't know a lick of Python². But I know PowerShell very well, and I was thrilled to learn that I could start writing rich Lambda functions in my favorite scripting language, without having to ramp up on JavaScript or Python first.

When you think about it, it makes sense. AWS Lambda supports .NET Core 2.1, which is what PowerShell 6.0 runs on. But AWS took it one step further. This is a fully-supported solution, complete with tooling that takes care of everything you need to package and deploy PowerShell scripts as Lambda functions. I won't get into the mechanics of how the AWS libraries do it; that's a topic for a future blog post. Instead, I'll focus on my first use of the technique -- AWS CloudWatch Logs Insights.

How AWS Does Logging

The built-in logging system in AWS is CloudWatch Logs, and to my knowledge, every AWS service uses it for logging. CloudWatch consists of Log Groups (similar to folders), which in turn contain Log Streams (akin to files). Each instance of a service within AWS writes log messages to Log Streams within its own Log Group. For example, in the screenshot below, you can see the Log Groups created by Lambda functions as well as the ones created by API Gateway:

Screenshot of CloudWatch Log Groups Page

Clicking into a Log Group (in this case, one of the API Gateway Log Groups), you can see the various Log Streams. Note that API Gateway keeps several empty Log Streams available in case it needs to quickly write logs:

Screenshot of a CloudWatch Log Group Page, showing a list of Log Streams

Here is a snippet of one of the CloudWatch Log Streams for API Gateway. Note that I needed to turn on logging inside of API Gateway; this is done at the deployment level in case you need to do it.

Screenshot of CloudWatch Logs for API Gateway, showing timestamp and message fields

Each log message in the stream has just two parts: the timestamp, and the message itself. As you can see, API Gateway writes the Request ID into each message, and then writes any details on the end. Compare this to a CloudWatch Log Stream from AWS Lambda, which simply writes the console output to the log. (Those familiar with .NET Core will recognize most of the messages as .NET Logging messages. I set the default level to Information.)

Screenshot of CloudWatch Logs for Lambda, showing timestamp and message fields

Each service can (and does) write to CloudWatch however it chooses. As we will soon see, this poses a challenge when attempting to analyze the logs, even with the built-in tools like CloudWatch Logs Insights.

Trying CloudWatch Logs Insights

I first learned of AWS CloudWatch Logs Insights through, of all things, a banner on the CloudWatch Logs page:

Screen capture of AWS CloudWatch Logs with

I decided to give CloudWatch Logs Insights a try and see what it could analyze from my website's Log Groups. For starters, I selected the Log Group for the API Gateway service. This is what the default query returned:

CloudWatch Logs Insights screen with default query returning @ingestionTime, @logStream, @message, and @timestamp

For API Gateway, CloudWatch Logs Insights will only discover the basic built-in fields such as @timestamp and @message. It didn't even discover the Request ID field.

Next, I looked at the sample queries. There was nothing specific to API Gateway, but there were some other AWS services listed:

Sample queries dropdown showing options for Lambda queries, VPC flow log queries, CloudTrail queries, Common queries, Route 53 queries, and AWS AppSync queries

The sample queries for Lambda functions piqued my interest, so I selected the Log Group for my website's Lambda function and chose the "Find the most expsensive requests" query. The query worked, and furthermore, a few more fields were automatically discovered:

CloudWatch Logs Insights for a Lambda query with many more discovered fields

However, even with these additional fields, the query didn't provide much useful information. In the screenshot above, we can see that two requests were billed at 4500 milliseconds, but since we only have the request IDs, we don't know what those requests actually are without further querying.

Turning back to the API Gateway logs, we can see that all of the relevant information is present in the logs. The problem is that the information is smeared across several log messages. What I needed was a way to combine the relevant parts of those messages into a singular, more useful log message.

It turns out, there is a way... and this is where I got to use my PowerShell skills.

Subscribing to CloudWatch Logs

One of the coolest features of Lambda Functions is that they can be triggered in response to just about anything within AWS. In this case, I took advantage of CloudWatch Logs Subscriptions, which can trigger a Lambda function after a Log Stream has been written. And since Lambda Functions can be written in PowerShell, I now had an easy way to run a PowerShell script to process the CloudWatch Logs from API Gateway.

The full source of the resulting PowerShell script can be found [here].

When running as an AWS Lambda Function, PowerShell scripts get two built-in variables: $LambdaInput and $LambdaContext. For CloudWatch triggers, $LambdaInput contains a GZip-compressed, Base 64-encoded stream of the log messages that were written. This is how to decode the data in PowerShell:

$ZipBytes = [System.Convert]::FromBase64String($LambdaInput.awslogs.data)
$ZipStream = New-Object System.IO.Memorystream
$ZipStream.Write($ZipBytes, 0, $ZipBytes.Length)
$ZipStream.Seek(0, [System.IO.SeekOrigin]::Begin) | Out-Null
$GZipStream = New-Object System.IO.Compression.GZipStream($ZipStream, [System.IO.Compression.CompressionMode]([System.IO.Compression.CompressionMode]::Decompress), $true)
$streamReader = New-Object System.IO.StreamReader($GZipStream, [System.Text.Encoding]::UTF8)
$event = $streamReader.ReadToEnd() | ConvertFrom-Json

In the resulting $event object, the log messages themselves are in the logevents property, so it's a simple call to ForEach-Object to iterate through the messages. Using the Request ID GUID as a grouping mechanism, I create a custom object ($requestObj) to hold the details of each HTTP request. After the loop, I wind up with a collection (as a hashtable) of objects, each one representing a single HTTP request:

$regexGuid = '[0-9a-f]{8}[-][0-9a-f]{4}[-][0-9a-f]{4}[-][0-9a-f]{4}[-][0-9a-f]{12}'
$requestObjects = @{}
$event.logEvents | ForEach-Object {

    $thisLogEvent = $_

    # See if message matches patten:  (GUID) logMessage
    if ($thisLogEvent.message -match "^\(($regexGuid)\) (.+)`$") {
        $requestId = $Matches[1]
        $logMessage = $Matches[2]

        if ($requestObjects.ContainsKey($requestId)) {
            $requestObj = $requestObjects[ $requestId ]
        }
        else {
            $requestObj = New-Object -TypeName PSObject -Property @{
                id = $requestId
                timestamp = $thisLogEvent.timestamp
                'aws.loggroup' = $event.logGroup
                'aws.logstream' = $event.logStream
            }
        }

        # Look for log messages with specific patterns (see below)
        switch -Regex ($logMessage) {
            ...
        }

        $requestObjects[ $requestId ] = $requestObj
    }
}

To gather the specific properties of each request, I look for specific log messages using regular expressions. The majority of the properties come from the Method request headers and Method response headers messages. To parse those, I wrote a function named Split-Headers; you can find the source code to this function in the PowerShell script. Note that I use the Successfully completed execution message to calculate the total elapsed time of the request:

        # Look for log messages with specific patterns
        switch -Regex ($logMessage) {

            '^HTTP Method: (\w+), Resource Path: (.+)$' {
                $requestObj |
                    Add-Member -MemberType NoteProperty -Name method       -Value $Matches[1] -PassThru |
                    Add-Member -MemberType NoteProperty -Name resourcepath -Value $Matches[2]
            }

            '^Method request headers: \{(.*)\}$' {
                Split-Headers -headers $Matches[1] -prefix request | ForEach-Object {
                    $requestObj | Add-Member -MemberType NoteProperty -Name $_.name -Value $_.value
                }
            }

            '^Method response headers: \{(.*)\}$' {
                Split-Headers -headers $Matches[1] -prefix response | ForEach-Object {
                    $requestObj | Add-Member -MemberType NoteProperty -Name $_.name -Value $_.value
                }
            }

            '^Method completed with status: (\d+)$' {
                $requestObj |
                    Add-Member -MemberType NoteProperty -Name httpstatus -Value $Matches[1]
            }

            '^Successfully completed execution$' {
                $elapsed = $thisLogEvent.timestamp - $requestObj.timestamp
                $requestObj |
                    Add-Member -MemberType NoteProperty -Name elapsed -Value $elapsed
            }
        }

Once I have the request objects, it's simply a matter of writing them out in JSON format to a new CloudWatch stream, since AWS CloudWatch Logs Insights will automatically discover fields in JSON-formatted data. For simplicity, I simply wrote the contents to the console, knowing that they would be captured in the AWS CloudWatch Logs:

# Write out each request object
$requestObjects.Values.GetEnumerator() | ForEach-Object {
    Write-Host ($_ | ConvertTo-Json -Compress)
}

Below are the results. The "[Information]" text comes from the PowerShell host that AWS uses (which I'll explore in another post). Since it sits outside of the JSON formatting, it doesn't get in the way.

CloudWatch Log Steam output from PowerShell Lambda Function

In fact, expanding the record right within the AWS Console will format the JSON nicely:

CloudWatch Log Steam output from PowerShell Lambda Function with JSON message expanded

If I use CloudWatch Logs Insights on the PowerShell scrpts Log Group, it discovers all of the JSON fields:

CloudWatch Logs Insights with all JSON fields discovered

And now we can set up some more useful queries. Here are a couple that I popped into a dashboard. Of course, I don't have any real traffic yet, but I'm prepared for when it happens!

CloudWatch Logs Insights Dashboard with Top Ten Pages Requested and Top Ten 404 Errors

The actual queries used in the dashboard are:

Top Ten Pages Requested

filter resourcepath like /^/[^\.]*$/
| stats count(*) as Count by resourcepath as Page
| sort Count desc
| limit 10

Top Ten 404 Errors

filter httpstatus = '404'
| stats count(*) as Count by resourcepath as Page
| sort Count desc
| limit 10

Hey, What Are Those?

Even for a site that gets almost no traffic, I learned something from analyzing the logs. The list of 404 errors is especially telling. Initially, I had no idea what was causing the apple-touch-icon-XXX.png requests. After a couple of quick searches, I had my answer, but the point is that I would have never have known about them without the logging analysis. (And I'll explain them in an upcoming post!)

A Disadvantage To This Approach

Setting up this logging mechanism was quick and easy, especially since I could leverage my PowerShell knowledge and I didn't have to worry about creating my own CloudWatch Log Stream. But one slight disadvantage to this approach is that the timestamps used by CloudWatch Logs Insights are the timestamps of the script, not the timestamps of the original events! They are close; in the example above, the timestamp of the processed log message was only 29 seconds behind the timestamp of the original log message. But they are different, which may or may not be significant in your use case.

The proper fix for this, of course, is to simply write the JSON message to a separately-created Log Stream using the CloudWatch Logs API, where you can pass the actual timestamp value as a parameter. This approach requires some coding, and it also requires that the Lambda function has permission to create a Log Stream.

One other point is that the logs here come exclusively from API Gateway. CloudWatch Logs Insights does not allow for any "join" operations between different Log Groups, so if you include the logs from another source (e.g., Lambda), they would not be related in any way.

Summary

Although I should still learn Python, AWS Lambda's support of PowerShell means that my years of PowerShell experience can be put to use to write rich Lambda functions. Using this knowledge, plus the built-in CloudWatch Logs Insights tooling, I was able to quickly spin up a couple of dashboards that provided me some real data -- and some new items to work on!

One of the best parts of attending Microsoft Ignite, at least for an extrovert like me, is the random conversations you have with people from all parts of the industry. This story is a prime example of learning something awesome at a conference, not from a conference.↩
For now, at least. Python is extremely popular these days and as a result, it's showing up everywhere. The writing is on the wall... I need to learn it.↩