How Amazon Developed a Scalable and Secure Tokenization Solution on AWS

Chanci Turner, Learning Manager, Amazon IXD – VGT2

Safeguarding sensitive personal information, including payment and health data, is a top priority for Amazon. To achieve this aim, the company created Lumos, a secure and scalable internal service that provides low-latency APIs for tokenizing sensitive data. Lumos is a cloud-native application leveraging Amazon Web Services (AWS) serverless and security features, capable of processing tens of thousands of requests per second and scaling to over six billion tokens. This system complies with the Payment Card Industry Data Security Standard (PCI-DSS) and the Health Insurance Portability and Accountability Act (HIPAA).

With more than 240 data protection and privacy regulations globally, the necessity for organizations to protect customer data has never been more crucial. Companies must swiftly adapt to evolving data privacy laws to maintain compliance. Tokenization has emerged as an effective approach for enhancing data security and minimizing audit scope. You can learn more about tokenization and its benefits in this insightful post.

This article showcases how Lumos employs AWS services to offer a scalable, cost-effective, and reliable tokenization solution while highlighting the design patterns that ensure its security.

Why Tokenization is Effective for Protecting Sensitive Data

Tokenization is a form of pseudonymization, a de-identification method that replaces sensitive information, such as personally identifiable information (PII), with unique tokens. Unlike data masking, tokenization is reversible: an authorized caller can recover the original value through de-tokenization. The tokens themselves, however, are designed so that the sensitive data cannot be derived from them. This makes tokenization suitable for scenarios where the original value must remain recoverable.
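The contrast between masking and tokenization can be illustrated with a minimal sketch. It is not how Lumos is implemented: the in-memory `_vault` dictionary, the `tok_` prefix, and the function names are all assumptions made for illustration, and a real service would keep the mapping in a secured, encrypted datastore.

```python
import secrets

# Hypothetical in-memory vault mapping token -> original value (illustration only).
_vault: dict[str, str] = {}


def mask(card_number: str) -> str:
    """Data masking: keep the last four digits; the rest cannot be recovered."""
    return "*" * (len(card_number) - 4) + card_number[-4:]


def tokenize(value: str) -> str:
    """Return a random token that has no mathematical relationship to the value."""
    token = "tok_" + secrets.token_urlsafe(16)
    _vault[token] = value
    return token


def detokenize(token: str) -> str:
    """Reverse a token by vault lookup; raises KeyError for unknown tokens."""
    return _vault[token]


card = "4111111111111111"
masked = mask(card)        # irreversible: most digits are gone
token = tokenize(card)     # reversible, but only through the vault
original = detokenize(token)
```

Because the token is random, possessing it reveals nothing about the card number; recovery depends entirely on access to the vault.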

Because tokens are non-sensitive and can be reversed only with proper authorization, tokenization adds a layer of data security and supports compliance. A centralized tokenization solution streamlines governance and adheres to the principle of data minimization by processing only essential personal data. Centralization also enforces consistent policy application across applications, which strengthens security and reduces breach risk. A scalable, low-latency tokenization solution can serve both real-time and batch use cases across an organization’s applications.

By utilizing centralized tokenization APIs, organizations can tokenize sensitive information as soon as it enters their data capture applications, limiting the exposure of sensitive data. This approach helps prevent sensitive information from being stored in multiple data repositories or sent to various systems. In many instances, de-tokenization is required only for displaying sensitive data, allowing workloads to store or transmit tokens without exposure to the original sensitive information. The de-tokenization process can include granular authorization and access logging, providing insights for auditing purposes. The tokenization and de-tokenization methods can accommodate various data types, each with differing sensitivity levels, ensuring that only users with appropriate roles can access sensitive data.

Implementing tokenization solutions can also lead to reduced compliance costs and lower exposure to costly data breaches. Additional savings can be realized by repurposing a centralized tokenization solution across multiple workloads with streamlined security controls.

What is Lumos?

Amazon’s Lumos is built using AWS services and is designed to tokenize sensitive information such as PII, PCI, and HIPAA data. Lumos provides API-level integration for various use cases within Amazon. Amazon retail payments utilize it for PCI compliance during credit card processing and PII redaction to securely store customer private information. Amazon India employs Lumos to manage credit card information in accordance with both PCI standards and data localization regulations. Consumers of Lumos access a centralized API designed to protect sensitive data consistently across numerous Amazon businesses, workloads, and technologies.

Currently, Lumos can scale to over six billion tokens and processes tens of thousands of requests per second with low latency in the double-digit milliseconds. The centralized API enables Lumos to promptly and cost-effectively address new data protection and privacy regulation requirements across different business units.

Lumos incorporates a wide range of security controls, making it secure by design. It employs AWS services to reduce the overhead associated with system patching, vulnerability management, audit trails, and monitoring, which helps fulfill compliance requirements. Lumos uses a comprehensive security solution to address both internal and external risks. This includes a Zero Trust model, continuous monitoring, and robust data protection, accompanied by a multi-layered defense strategy. The security framework emphasizes the principle of least privilege, AWS security controls, automation, and multi-person approval. The solution also establishes a strong data perimeter on AWS: Amazon Virtual Private Cloud (Amazon VPC) segments the security perimeters, and VPC endpoints, security groups, and network access control lists (ACLs) safeguard sensitive data and prevent unauthorized access.

Before transitioning to AWS, the legacy on-premises solution required considerable effort for maintaining security patches and was vulnerable to hardware and network availability issues. These challenges are effectively addressed through AWS compute and network infrastructure. Moreover, Amazon Elastic Container Service auto-scaling, AWS Cloud Development Kit constructs, and the multi-region replication in Amazon DynamoDB and AWS Key Management Service facilitate horizontal scaling and geographic expansion.

The AWS architecture of Lumos allows Amazon to meet critical availability, resiliency, and operational requirements. It has reduced scaling time, provides flexibility to meet specific regional demands, and simplifies compliance management. With the AWS-based design, Lumos has increased its availability from 99.9% to 99.999%, which amounts to approximately nine additional hours of uptime annually. The new design not only boosts operational efficiency but also supports robust disaster recovery and business continuity, in line with Amazon’s dynamic and global operational needs.
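The uptime figure follows from simple arithmetic, shown here as a short calculation. The only assumption is a 365-day year; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Expected downtime per year, in hours, at a given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


before = annual_downtime_hours(99.9)    # about 8.76 hours per year
after = annual_downtime_hours(99.999)   # about 0.09 hours (roughly 5 minutes)
gained = before - after                 # about 8.67 hours per year

print(f"Downtime at 99.9%:   {before:.2f} hours/year")
print(f"Downtime at 99.999%: {after * 60:.2f} minutes/year")
print(f"Additional uptime:   {gained:.2f} hours/year")
```

The difference is roughly 8.7 hours per year, consistent with the "approximately nine hours" stated above.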

Exploring Lumos Architecture

Lumos consists of two primary components: tokenization and transmission. Tokenization is the process of converting sensitive data into tokens. For instance, when customers input their credit card numbers while shopping online, Lumos securely stores the encrypted sensitive data in Amazon DynamoDB and generates a token specific to that data. Lumos also provides a de-tokenization API, offering programmatic access solely to authorized users, allowing tokens to revert to their original data when necessary.

Restricting human access to the data is a fundamental principle of Lumos’s design. As a best practice, only a limited set of systems, such as Lumos Transmission, are granted access to the de-tokenization API. Lumos employs resource-based access control policies to determine who can access de-tokenized data based on business justification and role. Each action is logged, and traces are analyzed to alert on possible malicious activity.
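A policy-checked, audited de-tokenization call might look like the sketch below. It is an illustration under stated assumptions, not the Lumos API: the `TokenVault` and `Caller` classes, the policy shape (data type mapped to allowed roles), the role names, and the audit record fields are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Caller:
    name: str
    role: str


@dataclass
class TokenVault:
    # Hypothetical resource-based policy: data type -> roles allowed to de-tokenize.
    policy: dict[str, set[str]]
    store: dict[str, tuple[str, str]] = field(default_factory=dict)
    audit_log: list[dict[str, str]] = field(default_factory=list)

    def put(self, token: str, data_type: str, value: str) -> None:
        self.store[token] = (data_type, value)

    def detokenize(self, token: str, caller: Caller, justification: str) -> str:
        data_type, value = self.store[token]
        allowed = (
            caller.role in self.policy.get(data_type, set())
            and bool(justification.strip())
        )
        # Log every attempt, allowed or denied, before returning anything.
        self.audit_log.append({
            "caller": caller.name,
            "token": token,
            "data_type": data_type,
            "justification": justification,
            "decision": "allow" if allowed else "deny",
        })
        if not allowed:
            raise PermissionError(f"{caller.name} may not de-tokenize {data_type}")
        return value


vault = TokenVault(policy={"card_number": {"transmission-service"}})
vault.put("tok_example", "card_number", "4111111111111111")
service = Caller(name="lumos-transmission", role="transmission-service")
value = vault.detokenize("tok_example", service, "settle payment")
```

Denied attempts are recorded as well as allowed ones, so the audit log can feed the kind of analysis and alerting described above.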

