This post was created by Jordan Smith, a DevOps Engineer, and Chanci Turner, a Software Engineer at AWS.
FireLens for Amazon Elastic Container Service (Amazon ECS) was introduced to simplify the process for ECS users to send and manage logs using popular open-source logging tools—Fluentd and Fluent Bit. If you’re not already acquainted with FireLens, it’s worthwhile to explore the accompanying documentation and the blog detailing its architecture and functionality.
Use of FireLens at SpaceX
SpaceX strives to revolutionize space technology with the ultimate goal of enabling human life on Mars. Their services demand scalable solutions due to their rapidly expanding operations. To meet these needs, they selected Fargate for their backend services, which allows for dynamic scaling with minimal management. Alongside this, they maintain a centralized logging system consisting of an EC2-based Logstash cluster that processes logs from their spacecraft and operational systems, forwarding data to Amazon OpenSearch Service.
FireLens delivers logs from their containers to Logstash efficiently and in near real time. Because the log router runs alongside each Fargate task, it scales with those tasks automatically, which leaves the team free to focus on scaling Logstash and OpenSearch. Given the variable nature of user activity, particularly during peak mission phases, effective scaling is crucial.
One notable feature of Fluent Bit that SpaceX appreciates is the Memory Buffer Limit (Mem_Buf_Limit), which can be defined in the Input section of the Fluent Bit configuration. Although this feature isn’t enabled by default in FireLens, it was essential for their Fargate tasks.
Why Implement a Memory Buffer Limit?
The entire logging infrastructure must exhibit resilience; the collector (Fluent Bit) should withstand periods when the logging destination may not be reachable. In such cases, FireLens will buffer logs in memory until delivery can resume, up to the Retry Limit. During stress tests simulating a downstream logging system failure, it was observed that high log volumes could consume all available memory in a Fargate task, resulting in an OutOfMemoryError.
By establishing a Memory Buffer Limit, Fluent Bit will buffer logs in memory only up to the defined threshold. Once this limit is reached, new logs will not be stored in memory until some of the existing buffers are cleared. This is a trade-off between log integrity and service availability, and each organization must weigh it against its own requirements. For SpaceX, service availability is paramount, as user experience takes precedence over the logs collected for debugging.
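To make the trade-off concrete, here is a toy Python model of a capped in-memory buffer. It is only a sketch of the behavior described above, not Fluent Bit's actual implementation; the class and method names (BoundedLogBuffer, ingest, flush) are invented for illustration, and real Fluent Bit pauses the input plugin rather than counting refused records.

```python
from collections import deque


class BoundedLogBuffer:
    """Toy model of an input buffer with a memory cap (hypothetical, for illustration)."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self.chunks = deque()
        self.refused = 0  # records not accepted while the buffer was full

    def ingest(self, record: bytes) -> bool:
        """Accept a record only while usage is below the cap."""
        if self.used_bytes >= self.limit_bytes:
            self.refused += 1
            return False
        # The cap is checked before adding, so usage can overshoot by one record.
        self.chunks.append(record)
        self.used_bytes += len(record)
        return True

    def flush(self, max_records: int) -> int:
        """Simulate successful delivery of up to max_records, freeing memory."""
        flushed = 0
        while self.chunks and flushed < max_records:
            self.used_bytes -= len(self.chunks.popleft())
            flushed += 1
        return flushed


if __name__ == "__main__":
    buf = BoundedLogBuffer(limit_bytes=10)
    for line in (b"12345", b"67890", b"lost?"):
        print(line, "accepted" if buf.ingest(line) else "refused")
    buf.flush(1)  # destination is reachable again, one record is delivered
    print(b"later", "accepted" if buf.ingest(b"later") else "refused")
```

In this model the application is never blocked and memory stays bounded, at the cost of records that arrive while the buffer is full. That is the same availability-over-completeness choice described above.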
It’s important to note that the Memory Buffer Limit isn’t a strict cap on the FireLens container’s memory usage, as memory is also allocated for other functions. Tests showed that with a Mem_Buf_Limit of 100MB, the container consistently stayed below 250MB total memory usage, even under high load conditions.
Understanding FireLens Configuration for Fluentd and Fluent Bit
Before delving into input parameter adjustments, it’s crucial to grasp FireLens’s operational mechanics, particularly how it constructs the Input section of Fluent Bit configurations.
As outlined in “Under the Hood: FireLens for ECS Tasks”, Fluentd and Fluent Bit are robust tools, but their extensive feature sets can introduce complexity. FireLens was designed with two primary user groups in mind:
- Users seeking a straightforward method to send logs anywhere, powered by Fluentd and Fluent Bit.
- Users desiring full utilization of Fluentd and Fluent Bit, with AWS handling the heavy lifting of directing Task logs to these logging routers.
FireLens enables Fluentd and Fluent Bit within ECS, incorporating configuration management features for ease of use. The ECS Agent generates the Input plugin definitions used for log collection and translates container log configuration options into Output plugin definitions.
Consequently, the Fluentd or Fluent Bit configuration file is “fully managed” by ECS. Users can import custom configurations via the config-file-type and config-file-value options, but the input definitions are always generated by ECS. FireLens combines the generated file with the user-specified one, so user-defined settings augment the generated configuration rather than replace it.
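As a rough illustration of where these options live, the following Python sketch builds the log router portion of a task definition and prints it as JSON. The container name and the /extra.conf path are placeholders chosen for this example; the firelensConfiguration keys are the ones ECS documents, but treat the overall shape as a minimal sketch rather than a complete task definition.

```python
import json


def log_router_definition(config_path: str) -> dict:
    """Build a FireLens log router container definition (illustrative values only)."""
    return {
        "name": "log_router",  # hypothetical container name
        "image": "amazon/aws-for-fluent-bit:latest",
        "essential": True,
        "firelensConfiguration": {
            "type": "fluentbit",
            "options": {
                # "file" means the extra config is read from a path inside the image
                "config-file-type": "file",
                "config-file-value": config_path,  # placeholder path
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(log_router_definition("/extra.conf"), indent=2))
```

A file imported this way is added to the configuration that ECS generates, so it cannot change the generated input section. That limitation is what the rest of this post works around.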
The generated configuration files are consistently mounted into the logging container at specific locations:
- Fluentd: /fluentd/etc/fluent.conf
- Fluent Bit: /fluent-bit/etc/fluent-bit.conf
Most Fluentd and Fluent Bit images (including the Fluent OSS distributions and AWS for Fluent Bit) adhere to these default configuration paths. Users can override these defaults by creating custom Fluentd or Fluent Bit images with alternative configuration paths, which is the approach we will use to implement Mem_Buf_Limit.
Tutorial: Configuring Input Parameters
The input definitions that FireLens generates are fixed and do not vary based on user inputs.
Logs are consistently read from a Unix Socket located at /var/run/fluent.sock within the container. As a FireLens user, you can customize your input configuration by overriding the default entry point command for the Fluent Bit container. This tutorial will focus on Fluent Bit and demonstrate how to configure the Mem_Buf_Limit parameter, although the same method can apply to other input parameters and Fluentd as well.
Begin by creating a Fluent Bit configuration file with the following input section:
[INPUT]
    Name          forward
    unix_path     /var/run/fluent.sock
    Mem_Buf_Limit 100MB
Your complete Fluent Bit configuration file should also encompass your output settings and any other features you wish to activate. Save this file as fluent-bit.conf in your project directory. Next, create a Dockerfile containing the following contents:
FROM amazon/aws-for-fluent-bit:latest
ADD fluent-bit.conf /fluent-bit/alt/fluent-bit.conf
CMD ["/fluent-bit/bin/fluent-bit", "-e", "/fluent-bit/firehose.so", "-e", "/fluent-bit/cloudwatch.so", "-e", "/fluent-bit/kinesis.so", "-c", "/fluent-bit/alt/fluent-bit.conf"]
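Once the image is built and pushed, the task definition needs to reference it as the log router. The Python sketch below assembles the two container definitions involved; the image URIs and container names are placeholders, and the example assumes your outputs are defined in the fluent-bit.conf baked into the image, so the application container sets only the awsfirelens log driver without options.

```python
import json


def container_definitions(custom_router_image: str, app_image: str) -> list:
    """Pair an application container with a custom Fluent Bit log router (sketch)."""
    return [
        {
            "name": "log_router",  # hypothetical name
            "image": custom_router_image,  # the image built from the Dockerfile above
            "essential": True,
            "firelensConfiguration": {"type": "fluentbit"},
        },
        {
            "name": "app",  # hypothetical name
            "image": app_image,
            "essential": True,
            # No output options here: this sketch assumes outputs come from
            # the custom fluent-bit.conf inside the router image.
            "logConfiguration": {"logDriver": "awsfirelens"},
        },
    ]


if __name__ == "__main__":
    defs = container_definitions(
        "example.invalid/custom-fluent-bit:latest",  # placeholder URI
        "example.invalid/my-app:latest",  # placeholder URI
    )
    print(json.dumps(defs, indent=2))
```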