Utilizing DML Auditing in Amazon Keyspaces (for Apache Cassandra)

Chanci Turner, Learning Manager, Amazon IXD – VGT2

Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, managed database service that is compatible with Apache Cassandra. It lets you run your Cassandra workloads on AWS with the same application code and developer tools you already use. You don’t need to provision, patch, or manage servers, and you don’t need to maintain or operate software.

Recently, Amazon Keyspaces introduced support for auditing Data Manipulation Language (DML) events, extending the existing Data Definition Language (DDL) auditing capabilities. With DML auditing in place, operations such as reads, updates, inserts, and deletes can be logged and audited through AWS CloudTrail. For more information about DDL auditing, refer to the Amazon Keyspaces Data Definition Language (DDL) section in CloudTrail.

This article explains why DML auditing matters to many organizations and walks you through setting it up for Amazon Keyspaces. It also shows how the integration between Amazon Keyspaces and CloudTrail lets you record and analyze audit trails (change events) across multiple tables in a keyspace without additional tools.

Advantages of DML Auditing

Organizations using Amazon Keyspaces may be functioning within highly regulated sectors or areas with stringent data privacy and sovereignty laws that require auditing of data access and usage. Data privacy regulations often dictate that organizations demonstrate that all data access is logged, and that these logs are retained and auditable for a specified duration. Examples of such requirements include:

  • Privacy Regulations: Data governance and privacy typically necessitate the monitoring of authorized data access, requiring the audit trail to include:
    • The timestamp of data access
    • Identity of the user accessing the data (user, source IP)
    • Actions performed by the user (read, update, insert, delete)
  • Security Requirements: Information security mandates can stem from various sources, including company policies and laws for regulated industries. Security focuses on preventing unauthorized data access, often from malicious insiders or external threats.

Database administrators handling sensitive or regulated data in Amazon Keyspaces can now log essential database events and use the CloudTrail and Amazon CloudWatch integrations for automated monitoring and alerting. With CloudWatch Logs, you can track and raise alerts on specific events captured by CloudTrail, such as data deletions, bulk updates, or reads. These logs also form an audit trail that can help you assess security incidents and perform root cause analysis.

The following sections will delve deeper into the process of setting up DML auditing in Amazon Keyspaces and how it can aid in demonstrating compliance with your organization’s security and data governance mandates.

Solution Overview

Amazon Keyspaces is integrated with CloudTrail, which records AWS API calls made by an AWS Identity and Access Management (IAM) identity, such as a user, role, or AWS service. CloudTrail captures relevant API calls for Amazon Keyspaces, logging events from both interactions in the Amazon Keyspaces console and code using the Amazon Keyspaces APIs, encompassing both Cassandra Query Language (CQL) and AWS SDK actions. The information collected by CloudTrail allows you to review requests made to Amazon Keyspaces, including the originating IP addresses, the IAM principals involved, request timestamps, and other pertinent details.

Creating a CloudTrail Log for Cassandra Table DML Events

By default, DML event auditing is not activated in CloudTrail. You can enable DML event logging through the CloudTrail console or AWS Command Line Interface (CLI). In this article, we will utilize the console. For CLI users, refer to the guide on Creating, updating, and managing trails with the AWS CLI.

Follow these steps:

  1. In the CloudTrail console, select Trails from the navigation pane.
  2. Click on Create trail.
  3. Enter a name for your trail.
  4. Choose to create a new S3 bucket.
  5. Ensure that log file SSE-KMS encryption is enabled.
  6. For Log file SSE-KMS encryption, select a new customer-managed key. Provide a name for the AWS Key Management Service (AWS KMS) alias. The key will be auto-generated for you.
  7. Confirm that Log file validation is enabled.
  8. Skip all optional sections and click Next.
  9. Deselect Management events and select Data events. This separation simplifies locating relevant events, as data events will be in their own log file.
  10. For Data events, choose Cassandra table as the event type.
  11. For the Log selector template, select Log all events.
  12. Click Next to review your setup.
  13. Finally, click Create trail.
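For readers who prefer to script the data event selection in steps 9 to 11, it might be expressed as a CloudTrail advanced event selector. The following is a minimal sketch rather than a definitive implementation. It assumes the boto3 SDK is installed, that the trail already exists, and that `AWS::Cassandra::Table` is the resource type behind the console's Cassandra table option. The function names and the selector name are hypothetical.

```python
def build_keyspaces_selector(name="Log all Cassandra table data events"):
    """Return an advanced event selector covering Amazon Keyspaces data events.

    Assumption: AWS::Cassandra::Table is the resource type that matches the
    "Cassandra table" data event type in the console.
    """
    return {
        "Name": name,
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::Cassandra::Table"]},
        ],
    }


def apply_selector(trail_name):
    """Attach the selector to an existing trail (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("cloudtrail")
    return client.put_event_selectors(
        TrailName=trail_name,
        AdvancedEventSelectors=[build_keyspaces_selector()],
    )
```

Note that `put_event_selectors` overwrites whatever selectors the trail already has, so in a shared trail you would likely merge this selector with the existing ones before applying it.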

You have now established a data trail for the following DML data events in Amazon Keyspaces.

CloudTrail eventName | CQL Action | AWS SDK Action
Select               | SELECT     | GetKeyspace, GetTable, ListKeyspaces, ListTables, ListTagsForResource
Insert               | INSERT     | No AWS SDK actions available
Update               | UPDATE     | No AWS SDK actions available
Delete               | DELETE     | No AWS SDK actions available
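To make the event names in the table concrete, here is a small sketch of how you might pick the DML events out of a list of CloudTrail records that have already been parsed into Python dictionaries. It assumes that Amazon Keyspaces events carry the event source `cassandra.amazonaws.com`; the function and constant names are illustrative only.

```python
# Assumption: Amazon Keyspaces events report this eventSource value.
KEYSPACES_EVENT_SOURCE = "cassandra.amazonaws.com"

# The four CloudTrail event names listed for DML operations.
DML_EVENT_NAMES = frozenset({"Select", "Insert", "Update", "Delete"})


def filter_dml_events(records, event_names=DML_EVENT_NAMES):
    """Keep only Amazon Keyspaces records whose eventName is a DML event."""
    return [
        record
        for record in records
        if record.get("eventSource") == KEYSPACES_EVENT_SOURCE
        and record.get("eventName") in event_names
    ]
```

Passing a narrower set, for example `filter_dml_events(records, {"Delete"})`, would isolate deletions only.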

Reviewing a CloudTrail Log File

Approximately five minutes after creating the Amazon Keyspaces DML trail, CloudTrail will deliver the initial set of log files to your Amazon Simple Storage Service (Amazon S3) bucket. Let’s execute some CQL statements on an Amazon Keyspaces table and examine a log file to understand the information included. If you don’t have a sample table yet, refer to Getting started with Amazon Keyspaces (for Apache Cassandra) to create a sample keyspace and table and upload some data from a CSV file.

  1. In the Amazon Keyspaces console, navigate to the CQL editor.
  2. Use a test table in the CQL editor to delete a row. Here is an example CQL statement:
DELETE book_title FROM catalog.book_awards WHERE year=2020 AND award='Richard Roe' AND category='Fiction' AND rank=1;
  3. In the CloudTrail console, select Trails from the navigation pane.
  4. Open the trail you just created.
  5. Under trail details, click on Trail log location. The Amazon S3 console will open, showing you the bucket at the top level for log files.
  6. Navigate through the bucket folder structure to today’s date to find the log for the delete event.
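The folder structure you navigate in the last step follows the standard CloudTrail layout of `AWSLogs/<account ID>/CloudTrail/<Region>/<year>/<month>/<day>/`. As a rough sketch, the prefix for a given day could be built as shown below. The account ID, Region, and function name are placeholders, and the date is assumed to be in UTC.

```python
from datetime import date


def cloudtrail_prefix(account_id, region, day, bucket_prefix=""):
    """Build the S3 key prefix under which CloudTrail stores a day's logs."""
    prefix = bucket_prefix.strip("/")
    parts = [prefix] if prefix else []
    parts += ["AWSLogs", account_id, "CloudTrail", region, f"{day:%Y/%m/%d}"]
    return "/".join(parts) + "/"


# Example with placeholder values:
print(cloudtrail_prefix("111222333444", "us-east-1", date(2024, 1, 5)))
```

The optional `bucket_prefix` argument covers trails that were configured with a custom log file prefix; the trail created in this walkthrough does not set one.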

You should see a file starting with your AWS account ID and ending with the extension .gz. To view a .gz file, you must download it first, then unzip or extract its contents. You can then open the extracted file in a plain text editor or a JSON file viewer.
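If you would rather not unzip the file by hand, the download can also be read directly with Python's standard library. This is a minimal sketch that assumes the file has already been downloaded locally and that, as in standard CloudTrail log files, the events sit in a top-level `Records` array. The function name is hypothetical.

```python
import gzip
import json


def load_cloudtrail_records(path):
    """Read a gzip-compressed CloudTrail log file and return its event records."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        document = json.load(handle)
    # CloudTrail wraps the individual events in a top-level "Records" array.
    return document.get("Records", [])
```

Each element of the returned list is one event, so you can loop over it or filter it instead of scrolling through the raw JSON.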

The following excerpt from the log file entry for our delete event shows the event version and the identity that made the request:

{
  "eventVersion": "1.09",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "EXAMPLEIDQQPPZZYYY22:test-user",
    "arn": "arn:aws:sts::111222333444:assumed-role/Admin/test-user",
    "accountId": "111222333444"
  }
}
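To relate an entry like this to the audit requirements listed earlier (when the data was accessed, by whom, and what was done), a record can be reduced to a few fields. The sketch below is an illustration under assumptions: `eventTime`, `sourceIPAddress`, and `eventName` are standard CloudTrail record fields that the excerpt above does not show, and the function name is hypothetical.

```python
def summarize_event(record):
    """Extract the who/when/what fields an auditor typically asks for.

    Fields missing from the record come back as None rather than raising.
    """
    identity = record.get("userIdentity", {})
    return {
        "when": record.get("eventTime"),
        "who": identity.get("arn"),
        "source_ip": record.get("sourceIPAddress"),
        "action": record.get("eventName"),
    }
```

Applied to the excerpt above, only `who` would be populated, because the other fields are not part of the portion shown.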

By using DML auditing, organizations can strengthen their compliance with data governance and security requirements, ensuring that all necessary data access is logged and can be reviewed as needed.
