Amazon Athena Query Auditing Using Amazon EventBridge
Introduction
Amazon Athena is a serverless, interactive query service that makes it easy to query data stored in Amazon S3.
Typically, large enterprises have many different lines of business and use AWS Organizations to manage many AWS accounts for different environments such as Dev, QA, and Prod. Often, these lines of business use Amazon Athena in their own accounts to query the data stored in Amazon S3.
For compliance and regulatory purposes, there is a need to audit-log all the Amazon Athena queries run across the enterprise in one place.
Currently, AWS CloudTrail logs capture Amazon Athena audit information such as which user ran which queries and the underlying data each query accessed. At the moment, however, CloudTrail does not let you capture only specific events for analysis: it records all events, and we then have to filter them down to the ones of interest, such as the Amazon Athena events in this example.
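To make the "capture everything, then filter" approach concrete, here is a minimal sketch of the filtering step over CloudTrail records that have already been loaded into Python. The sample records are hypothetical and heavily trimmed; the only field relied on is `eventSource`, which CloudTrail sets to `athena.amazonaws.com` for Athena API calls.

```python
from typing import Any, Dict, Iterable, List

ATHENA_EVENT_SOURCE = "athena.amazonaws.com"


def athena_records(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the CloudTrail records emitted by Athena."""
    return [r for r in records if r.get("eventSource") == ATHENA_EVENT_SOURCE]


# Hypothetical, trimmed records; real CloudTrail records carry many more fields.
sample_records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"eventSource": "athena.amazonaws.com", "eventName": "StartQueryExecution"},
    {"eventSource": "athena.amazonaws.com", "eventName": "GetQueryResults"},
]

for record in athena_records(sample_records):
    print(record["eventName"])
```

This after-the-fact filtering is the work that the EventBridge rule in the rest of this post avoids.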
In this post, we will look at using Amazon EventBridge to capture specific Amazon Athena audit API events from the different accounts of the enterprise in one central account.
Technical Solution
We will make use of the following AWS services:
• Amazon CloudWatch Events/Amazon EventBridge rules that trigger on CloudTrail API events. These rules capture only Amazon Athena events.
• Event bus – We will make use of the default event bus.
• IAM role for sending events to the default event bus of the “Central/Master” account.
• Each line-of-business account will use Amazon EventBridge to capture the Amazon Athena audit trail and use “Event bus in another account” as the target. This makes sure all the different accounts within the enterprise use the event bus of the “Central/Master” account as their target.
• Amazon Kinesis Data Firehose for persisting the data on S3, where it can later be used for analytics, or we can simply create an Amazon Athena table to query the Amazon Athena audit log information.
• AWS Glue Crawler/AWS Glue Catalog – This is optional. When there is a need to query the Amazon Athena audit events, we can use a Glue Crawler to crawl the data persisted on S3 and create a Glue Catalog table.
• Amazon Athena on the Central/Master account – This is optional. Once the data is cataloged, we can use Amazon Athena on the Central/Master account to query the Amazon Athena audit event information.
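The IAM role in the list above can be sketched as two policy documents. This is a minimal illustration, assuming the role lives in each line-of-business account and is assumed by EventBridge to put events on the central bus; the account ID, region, and function names are hypothetical placeholders.

```python
import json
from typing import Any, Dict


def events_trust_policy() -> Dict[str, Any]:
    """Trust policy letting the EventBridge service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "events.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def put_events_policy(central_bus_arn: str) -> Dict[str, Any]:
    """Permission policy allowing PutEvents on the central event bus only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "events:PutEvents",
                "Resource": central_bus_arn,
            }
        ],
    }


# Hypothetical central account ID and region.
CENTRAL_BUS_ARN = "arn:aws:events:us-east-1:111111111111:event-bus/default"
print(json.dumps(events_trust_policy(), indent=2))
print(json.dumps(put_events_policy(CENTRAL_BUS_ARN), indent=2))
```

Scoping the permission to the single bus ARN, rather than `*`, keeps the role limited to forwarding events to the central account.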
Below is the high-level architecture diagram, which shows how the Amazon Athena API calls are captured using Amazon CloudWatch Events/Amazon EventBridge in the different line-of-business AWS accounts and then sent to the default event bus in a Central/Master account. Once the data is on the event bus, it can be used in different ways depending on the requirement.
Amazon CloudWatch Events/Amazon EventBridge currently supports different targets such as SNS, SQS, and Kinesis Data Streams. For example, we can use Amazon Kinesis Data Streams to analyze the data in real time and look for anomalies in the users submitting the queries.
In this solution, the captured data is streamed through Amazon Kinesis Data Firehose and persisted on S3.
Step by Step Setup
• Set up an Amazon CloudWatch Events/Amazon EventBridge rule in the line-of-business account to capture Athena API calls recorded by CloudTrail. The rule can be created from either the CloudWatch Events console or the EventBridge console.
• The target of the rule should be “Event bus in another account”, since we want to capture all these events in a central account. If you use AWS Organizations to manage multiple AWS accounts, the account ID should be the AWS account ID of the Master account in the organization.
• Because the target is a cross-account event bus, we have to enable permissions in the Central/Master account to accept events from the different AWS accounts in the organization.
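The rule and target described in the steps above could look roughly like the following. This is a sketch rather than the exact configuration used here: the rule name, account IDs, region, and role ARN are hypothetical, and `events_client` is assumed to be a boto3 EventBridge client (`boto3.client("events")`) created in the line-of-business account.

```python
import json
from typing import Any, Dict


def athena_api_event_pattern() -> Dict[str, Any]:
    """Event pattern matching Athena API calls recorded by CloudTrail."""
    return {
        "source": ["aws.athena"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {"eventSource": ["athena.amazonaws.com"]},
    }


def central_bus_target(central_account_id: str, region: str, role_arn: str) -> Dict[str, str]:
    """Target entry pointing at the default event bus of the central account."""
    return {
        "Id": "central-event-bus",  # hypothetical target ID
        "Arn": f"arn:aws:events:{region}:{central_account_id}:event-bus/default",
        "RoleArn": role_arn,
    }


def create_forwarding_rule(events_client: Any, rule_name: str, target: Dict[str, str]) -> None:
    """Create the rule, then attach the cross-account event bus as its target."""
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(athena_api_event_pattern()),
        State="ENABLED",
    )
    events_client.put_targets(Rule=rule_name, Targets=[target])


# Hypothetical IDs; replace with the real central account, region, and role.
target = central_bus_target(
    "111111111111",
    "us-east-1",
    "arn:aws:iam::222222222222:role/athena-audit-forwarder",
)
print(json.dumps(athena_api_event_pattern(), indent=2))
print(target["Arn"])
```

With boto3 installed and credentials for the line-of-business account, the rule would be created with `create_forwarding_rule(boto3.client("events"), "athena-audit-forwarder", target)`.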
Log in to the Central/Master account
For this setup, we will use the default event bus of the account, although you can also create a custom event bus for this purpose.
Navigation
Amazon EventBridge -> Event Buses -> Default Event Bus -> Manage Permissions.
Here we have enabled access for all the accounts within the organization.
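The same organization-wide permission can also be granted programmatically. The sketch below builds the arguments for the EventBridge `put_permission` API, using a condition on `aws:PrincipalOrgID`; the organization ID and statement ID are hypothetical placeholders.

```python
from typing import Any, Dict


def org_put_events_permission(
    org_id: str,
    statement_id: str = "AllowOrgPutEvents",  # hypothetical statement ID
    event_bus_name: str = "default",
) -> Dict[str, Any]:
    """Arguments for put_permission that let every account in the org put events."""
    return {
        "EventBusName": event_bus_name,
        "Action": "events:PutEvents",
        "Principal": "*",  # any principal, narrowed by the condition below
        "StatementId": statement_id,
        "Condition": {
            "Type": "StringEquals",
            "Key": "aws:PrincipalOrgID",
            "Value": org_id,
        },
    }


# Hypothetical organization ID.
kwargs = org_put_events_permission("o-exampleorgid")
print(kwargs)

# In the Central/Master account, with boto3 installed:
#   import boto3
#   boto3.client("events").put_permission(**kwargs)
```

Using the organization ID as the condition avoids listing each line-of-business account individually, so new accounts in the organization are covered without further changes.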
• Now, in order to capture the Athena audit API events, we have to create an Amazon CloudWatch Events/Amazon EventBridge rule in the Central/Master account as well.
• Out of the box, an Amazon CloudWatch Events/Amazon EventBridge rule supports different targets such as SNS, SQS, Kinesis Data Streams, and Kinesis Data Firehose.
In this example, we use a Firehose delivery stream to persist the audit data to S3.
Once the Amazon CloudWatch Events/Amazon EventBridge rule is saved, it starts capturing the events from the different accounts.
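As an illustration of the central rule, the sketch below builds an event pattern and a Firehose target, and checks the pattern locally against a sample event. Several things here are assumptions: the optional `account` filter, the delivery stream name, the account IDs, and the role ARN are hypothetical, and `matches` is a simplified exact-value matcher, not EventBridge's full pattern-matching logic.

```python
from typing import Any, Dict, List


def central_athena_pattern(account_ids: List[str]) -> Dict[str, Any]:
    """Pattern for Athena API events, limited to the given sender accounts."""
    return {
        "source": ["aws.athena"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "account": account_ids,  # optional filter, added here as an assumption
        "detail": {"eventSource": ["athena.amazonaws.com"]},
    }


def firehose_target(region: str, account_id: str, stream_name: str, role_arn: str) -> Dict[str, str]:
    """Target entry pointing at a Kinesis Data Firehose delivery stream."""
    return {
        "Id": "athena-audit-firehose",  # hypothetical target ID
        "Arn": f"arn:aws:firehose:{region}:{account_id}:deliverystream/{stream_name}",
        "RoleArn": role_arn,
    }


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Simplified check: every pattern key must be present with an allowed value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical, trimmed event as it might arrive on the central bus.
sample_event = {
    "source": "aws.athena",
    "detail-type": "AWS API Call via CloudTrail",
    "account": "222222222222",
    "detail": {"eventSource": "athena.amazonaws.com", "eventName": "StartQueryExecution"},
}

pattern = central_athena_pattern(["222222222222", "333333333333"])
print(matches(pattern, sample_event))
print(firehose_target("us-east-1", "111111111111", "athena-audit",
                      "arn:aws:iam::111111111111:role/events-to-firehose")["Arn"])
```

The pattern and target would then be passed to `put_rule` and `put_targets` in the Central/Master account, in the same way as for any other EventBridge rule.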
By default, Firehose persists the data on S3 under year, month, day, and hour partitions; alternatively, the newer dynamic partitioning feature can be used. We can then use a Glue Crawler to create a Glue Catalog table and use Amazon Athena to query the audit information.
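To show what the default layout looks like, the sketch below computes the S3 key prefix for a given arrival time, assuming the default `YYYY/MM/dd/HH` layout in UTC appended to an optional custom prefix. The `athena-audit/` prefix and the function name are hypothetical.

```python
from datetime import datetime, timezone


def default_firehose_prefix(arrival_time: datetime, base_prefix: str = "") -> str:
    """Return the year/month/day/hour prefix (UTC) for an object written at arrival_time."""
    utc = arrival_time.astimezone(timezone.utc)
    return f"{base_prefix}{utc:%Y/%m/%d/%H}/"


arrival = datetime(2022, 3, 9, 17, 5, tzinfo=timezone.utc)
print(default_firehose_prefix(arrival, "athena-audit/"))
# prints athena-audit/2022/03/09/17/
```

Knowing the prefix layout helps when pointing the Glue Crawler at the bucket, since the crawler turns these path levels into table partitions.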
Appendix
The CloudTrail event for an Athena API call captures the following details:
- User who ran the query
- IP Address of the user
- The actual query
- Workgroup used (useful for cost tracking)
- Query Location
- Athena query execution ID (this can be used to look up the status of the executed query).
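The details listed above can be pulled out of a captured event with a small helper. This is a sketch under assumptions: the field paths follow the usual CloudTrail record layout for `StartQueryExecution`, "Query Location" is taken to mean the query result output location, and the sample event is hypothetical, trimmed, and not a real captured event.

```python
from typing import Any, Dict, Optional


def audit_fields(event: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Extract the audit details from an EventBridge-wrapped CloudTrail event."""
    detail = event.get("detail") or {}
    request = detail.get("requestParameters") or {}
    response = detail.get("responseElements") or {}
    return {
        "user": (detail.get("userIdentity") or {}).get("arn"),
        "source_ip": detail.get("sourceIPAddress"),
        "query": request.get("queryString"),
        "workgroup": request.get("workGroup"),
        "output_location": (request.get("resultConfiguration") or {}).get("outputLocation"),
        "query_execution_id": response.get("queryExecutionId"),
    }


# Hypothetical sample for illustration only; all values are made up.
sample_event = {
    "detail": {
        "eventName": "StartQueryExecution",
        "userIdentity": {"arn": "arn:aws:iam::222222222222:user/example-user"},
        "sourceIPAddress": "203.0.113.10",
        "requestParameters": {
            "queryString": "SELECT 1",
            "workGroup": "primary",
            "resultConfiguration": {"outputLocation": "s3://example-bucket/results/"},
        },
        "responseElements": {"queryExecutionId": "00000000-0000-0000-0000-000000000000"},
    }
}

for name, value in audit_fields(sample_event).items():
    print(f"{name}: {value}")
```

Fields that are absent from an event come back as `None`, which matters because not every Athena API call carries a query string or an execution ID.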
Example event below: