In recent years, the rise of machine learning (ML) has transformed how businesses operate, driving the need for tight integration of ML into crucial decision-making processes. ML enhances customer interactions, increases sales, and improves operational efficiency. Investment in machine learning is projected to exceed $209 billion by 2029, an annual growth rate of roughly 38%.
This surge in ML adoption, coupled with explosive data growth (expected to reach around 120 zettabytes in 2023, a 51% increase in just two years), underscores the need to process data quickly and at scale so that decisions can be made promptly. As this demand escalates, choosing the right infrastructure becomes essential for deploying ML functionality at scale. Organizations must focus on seamless model deployment, monitoring, lifecycle management, and effective governance, each of which requires significant operational investment to support production-level ML systems.
In this article, we look at how customers are building online feature stores on AWS using Amazon ElastiCache for Redis to serve mission-critical ML applications that demand ultra-low latency. We present a reference architecture and a sample use case: a real-time loan approval system that generates online predictions from a customer credit scoring model, using features stored in a feature store powered by ElastiCache for Redis.
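To make the loan approval flow concrete, here is a minimal sketch of the inference path. It is an illustration, not the reference architecture itself: the key schema (`customer:{id}:features`), the feature names, and the logistic scoring weights are all hypothetical, and `client` stands in for any redis-py-compatible connection (for example, `redis.Redis(host=...)` pointed at an ElastiCache endpoint), which exposes `hgetall`.

```python
import math

def feature_key(customer_id: str) -> str:
    # Hypothetical key schema for a customer's online features.
    return f"customer:{customer_id}:features"

# Hypothetical credit-scoring model: a logistic function over two features.
WEIGHTS = {"on_time_payment_rate": 3.0, "credit_utilization": -2.0}
BIAS = -0.5

def credit_score(features: dict) -> float:
    """Return a probability-like score in (0, 1)."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def approve_loan(client, customer_id: str, threshold: float = 0.5) -> bool:
    """Fetch online features with a single HGETALL, then score in-process."""
    raw = client.hgetall(feature_key(customer_id))  # hash of string fields
    features = {name: float(value) for name, value in raw.items()}
    return credit_score(features) >= threshold
```

The single hash read keeps the feature lookup to one round trip, which is what makes the sub-millisecond retrieval budget achievable; the model itself runs in the application process.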
Understanding Feature Stores
Feature stores are critical infrastructure in the ML domain and simplify model deployment. They serve as a centralized repository of features for both training and inference use cases. By serving features in a standardized format, a feature store lets models retrieve data that is immediately usable.
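As a simplified illustration of that "standardized format" idea, the sketch below stores each entity's features as a single Redis hash under a hypothetical `{entity_type}:{entity_id}:features` key, so every producer and consumer reads the same fields the same way. The `client` is again any redis-py-compatible connection exposing `hset` and `hgetall`.

```python
def _key(entity_type: str, entity_id: str) -> str:
    # Hypothetical naming convention shared by all producers and consumers.
    return f"{entity_type}:{entity_id}:features"

def write_features(client, entity_type: str, entity_id: str, features: dict) -> str:
    """Store numeric features as one hash, one field per feature."""
    key = _key(entity_type, entity_id)
    client.hset(key, mapping={name: str(value) for name, value in features.items()})
    return key

def read_features(client, entity_type: str, entity_id: str) -> dict:
    """Return features as floats, ready to feed into a model."""
    raw = client.hgetall(_key(entity_type, entity_id))
    return {name: float(value) for name, value in raw.items()}
```

Keeping all of an entity's features in one hash means a model can fetch everything it needs with a single command, and new features can be added as extra hash fields without changing the key layout.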
Organizations like Amazon Music have adopted Amazon ElastiCache to fulfill this requirement, as it offers a robust, scalable, enterprise-grade infrastructure for deploying their ML models.
“Amazon Music Feature Repository (MFR) is a fully managed ML feature store utilized to store, share, and manage ML features to power personalization. MFR supports high throughput at low latency for online inference and is used repeatedly by dozens of teams across Amazon Music to maintain a consistent personalized experience. ElastiCache provides low latency storage and retrieval for 1 TB of ML features and is simple for the MFR team to manage for peak traffic. MFR with ElastiCache raises the bar for consuming teams with strict latency requirements by delivering batches of ML features at low latencies across North America, European Union and Far East regions (Music Search, Voice/Echo).”
Feature stores typically fall into two categories:
- Offline Feature Stores: These are designed to store and process historical data for model training and batch scoring at scale, often utilizing systems like Amazon Simple Storage Service (Amazon S3). They typically manage features that take longer to generate and are less sensitive to latency.
- Online Feature Stores: These require features to be calculated rapidly with low latency—often within single-digit milliseconds. They depend on fast computations and data access, frequently through in-memory datastores for real-time predictions. Examples of low latency feature store use cases include ad personalization, real-time delivery estimates, loan approvals, and fraud detection like credit card anomalies.
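For fraud-detection style use cases like those above, some online features are computed on the write path itself. One common pattern, sketched here with hypothetical key names, keeps a per-card transaction count in fixed time buckets using INCR plus EXPIRE, so the feature is always fresh and stale buckets disappear on their own. The `client` is assumed to expose redis-py-style `incr` and `expire`.

```python
def txn_count_bucket(card_id: str, now: float, window_seconds: int = 60) -> str:
    # Hypothetical key: one counter per card per time bucket.
    return f"card:{card_id}:txn_count:{int(now // window_seconds)}"

def record_transaction(client, card_id: str, now: float, window_seconds: int = 60) -> int:
    """Increment the current bucket and let Redis expire it automatically."""
    key = txn_count_bucket(card_id, now, window_seconds)
    count = client.incr(key)
    # Keep the bucket around a little longer than the window it covers.
    client.expire(key, 2 * window_seconds)
    return count
```

A fraud model can then read the current (and previous) bucket to approximate "transactions in the last minute" with single-digit-millisecond lookups.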
ElastiCache for Redis excels as an online feature store thanks to its in-memory capabilities, delivering the performance necessary for contemporary real-time applications.
ElastiCache: The Low Latency Online Feature Store
ElastiCache for Redis is a rapid in-memory data store that delivers sub-millisecond latency, making it ideal for powering real-time applications at scale. In-memory data stores, like Redis, ensure low latencies and can handle hundreds of thousands of reads per second per node. A Feast benchmark indicates that Redis outperforms other datastores by 4–10 times in feature store scenarios.
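To approach the per-node read rates mentioned above, clients typically batch round trips. The sketch below fetches many feature hashes in a single network round trip using a redis-py-style pipeline; the key names are hypothetical, and `client` is assumed to expose `pipeline()` as redis-py does.

```python
def batch_read_features(client, keys):
    """Fetch many feature hashes in one round trip via a pipeline."""
    pipe = client.pipeline()
    for key in keys:
        pipe.hgetall(key)          # queued locally, not sent yet
    results = pipe.execute()       # one round trip for all queued commands
    return {key: {name: float(value) for name, value in raw.items()}
            for key, raw in zip(keys, results)}
```

Pipelining amortizes network latency across many commands, which matters when a single prediction needs features for dozens of entities within a tight latency budget.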
Numerous AWS customers, including Global Airlines, utilize ElastiCache as an ultra-low latency online feature store to support various use cases such as personalized offerings and electronic cargo loading.
“Global Airlines, one of the largest airlines worldwide, relies on a centralized machine learning platform with a feature store at its core. The airline employs ML models to provide personalized experiences to its customers. With a vast customer base generating millions of requests daily and stringent service level agreements (SLA) for rapid response times, selecting the right feature store was vital.
After evaluating various options, Global Airlines found ElastiCache for Redis to be the most suitable due to its ability to provide ultra-low latency for millions of customers. Features must be readily available in-memory with the latest values for precise predictions from the ML models, so they are stored in a feature store. ElastiCache ensures these features are updated as frequently as possible, enabling accurate predictions.
Global Airlines uses ElastiCache as its online feature store for managing real-time traffic. A key advantage is its support for global data stores, allowing the ML model to serve traffic from multiple AWS Regions. This capability offers a robust and fail-safe approach to serving customers and providing personalized recommendations, such as determining the best destination or identifying appropriate products during check-in.”
Benefits of ElastiCache for Redis
Customers favor ElastiCache for its exceptional performance, fully managed operations, high availability and reliability, scalable architecture, and strong security controls.
- High Performance: ElastiCache is a fully managed in-memory caching service that scales to millions of operations per second with sub-millisecond read and write response times, something typically unattainable with disk-based systems. The enhancements in ElastiCache for Redis 7, which supports improved I/O multiplexing, significantly boost throughput and reduce latency at scale, providing a 72% increase in throughput and a 71% decrease in P99 latency compared with previous versions.
Organizations like Swiggy, a leading online food ordering and delivery service in India, have successfully built a highly scalable, performant feature store that serves millions of customers with extremely low latency.
“Swiggy encountered challenges managing vast amounts of feature data while developing ML models. This data rapidly expands to billions of records, with millions actively retrieved during model inference, all under tight latency constraints.
By utilizing ElastiCache, Swiggy effectively manages this complexity and enhances its operational efficiency.”
Conclusion
In summary, we recommend ElastiCache for Redis for workloads that serve real-time traffic and require low-latency access to features for efficient data delivery.