Querying CloudTrail Logs with AWS Athena
Overview
AWS CloudTrail is a tool to record all activities occur in your account. You can use CloudTrail to view, search, download, archive, analyze, and respond to account activity across your AWS infrastructure. By viewing Event history, you can troubleshoot operational and security incidents over the past 90 days in the CloudTrail console.
AWS Athena is a serverless interactive query service that makes it easy to analyze data directly in Amazon S3 using standard SQL. There is no infrastructure to set up or manage, and you pay only for the queries you run.
By combining these two services is a powerful way to enhance your analysis of AWS service activity. For example, you can use queries to identify trends and further isolate activity by attributes, such as source IP address or user.
Scenario
In this lab, you will learn to transfer logs from CloudTrail to AWS Athena. Then, we spend a lot of effort discussing how to query logs.
Step by step
CloudTrail logs to AWS Athena
You can automatically create tables for querying CloudTrail logs directly from the CloudTrail console.
-
Make sure CloudTrail had already created a Trail and it is recording.
-
On the service menu, select CloudTrail, Event history and click Run advanced queries in Amazon Athena.
-
Select bucket stored CloudTrail logs and click Create table.
According to the Cloudtrail setting, all logs will be stored in a specific bucket.
-
Remember the Athena table name which will be used later.
- Go to Athena to query logs -> click the button Go to Athena. > If you find some error during implementing this lab, it means you have already used the service before. You can copy the following Query sample to Athena Console of the Region in CloudTrail logs.
- Wait a minute for jumping to Athena.
Query Logs in Athena
Using Athena with CloudTrail logs to enhance your analysis of AWS service activity.
-
You can use SQL query to find data.
-
Copy the SQL query below and Run Query.
This query helps you find out What IAM User : Tina create, launch, start or run on 2019-07-02 and ordered by time.
Please change the <_table NAME_> created in step four.
SELECT eventtime, eventname, eventSource, awsRegion, requestID, eventID FROM
WHERE userIdentity.username = ‘Tina’ AND (eventname LIKE ‘%Create%’ OR eventname LIKE ‘%Launch%’ OR eventname LIKE ‘%Start%’OR eventname LIKE ‘%Run%’) AND eventtime Like ‘%2019-07-02%’ ORDER BY eventtime DESC
If you want to know more about SQL, look up more tips for querying cloud trail logs.
- Results look like this. > To save the results of the most recent query to CSV, choose the file icon.
Create a new table for querying results
Create tables from query results, without repeatedly querying raw data sets. This makes it easier to work with raw data sets.
- Click create a table from query.
-
Enter information below and Next.
- Database : select default.
- Table name :
YourTableName
> Note : Must be lowercase and only use underscore special characters - Output location :
s3://mybucket/myfolder/
> Replce your own bucket and folder name. - Output data format : Parquet
Before you typing this information you should create the bucket and the folder first.
-
After finishing checking the existing bucket and folder, click next -> Click Create and wait for two minutes.
Now you can run advanced SQL for your logs.
Save query results
- Click Save as next to Create.
-
Enter information below:
- Name :
YOUR NAME
- Description:
CreateRunStartLaunch
- Click Save. > replace name and description on your own.
- Name :
-
On the navigation bar, click Saved Queries, and you can see the following page showing the saved queries like below.
By saving query results, you can keep query forever until you delete it.
Run Advanced Query in Athena
- If you had noticed the format of the table created from CloudTrail to Athena, you will see some parameters whose format are string, such as requestParameters and responseElements.
-
If you want to extract information from that JSON format column, you can use json_extract.
Please change the <YourTableName> to your own table name created from the query.
SELECT DISTINCT eventtime, eventname, eventSource, awsRegion, requestID, eventID, json_extract(requestParameters, ‘$.roleName’)as roleName , json_extract(requestParameters, ‘$.resourcesSet’)as resourceitem , json_extract(requestParameters, ‘$.tagSet’)as tag, json_extract(requestParameters, ‘$.routeTableId’)as routeTableId, json_extract(requestParameters, ‘$.vpcId’)as vpcId, responseelements FROM
WHERE userIdentity.username = ‘Tina’ AND (eventname LIKE ‘%Create%’ OR eventname LIKE ‘%Launch%’ OR eventname LIKE ‘%Start%’OR eventname LIKE ‘%Run%’) AND eventtime Like ‘%2019-07-02%’ ORDER BY eventtime DESC LIMIT 100 -
Run query for one minute until it finishes.
-
Slip down the page, and you will find some Json format information in the Results. You can take out any important information from parameters.
History of query
- On the navigation bar, click History next to Saved Query. > Athena retains query history for 45 days. > You can see Query state, run time, download results and more error details.
- Click the Query from History like the following picture.
- You will be redirected to Query Editor page, and you can see the same Results come up before.
Conclusion
Now you can start getting hands dirty with using both services. With Athena and CloudTrail, it is easy to find, analyze, and respond to changes and activities in an AWS account. In addition, you can clean your logs out and combine with other services, such as Quicksight, redshift. You can visualize your data or make a prediction to enhance security in your account.
Reference
Tag:AWS, AWS Athena, AWS Cloud Trail, CloudTrail logs, S3, serverless