Amazon Bedrock has empowered users to craft innovative experiences for their clients through generative artificial intelligence (AI). As a fully managed service, Amazon Bedrock presents a selection of high-performing foundation models (FMs) from leading AI firms like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a unified API. This enables the development of generative AI applications while ensuring security, privacy, and responsible AI practices. With access to top-tier FMs through Amazon Bedrock, customers are innovating at an unprecedented speed. However, as businesses strive to operationalize these generative AI applications, the need for straightforward, prescriptive methods to monitor their health and performance becomes crucial.
In this blog post, we will highlight various features designed to provide quick and effective visibility into Amazon Bedrock workloads within the context of your broader application. Utilizing the contextual conversational assistant example found in the Amazon Bedrock GitHub repository, we will demonstrate how to tailor these views to enhance your visibility according to your specific use case. We will particularly focus on how to leverage the new automatic dashboard in Amazon CloudWatch to achieve a comprehensive view of Amazon Bedrock model usage and performance. By customizing dashboards with widgets, you can gain insights into components and operations such as Retrieval Augmented Generation in your application.
Introducing the Automatic Dashboard for Amazon Bedrock in CloudWatch
CloudWatch now provides automatic dashboards that allow users to swiftly gain insights into the health and performance of their AWS services. A new automatic dashboard specifically for Amazon Bedrock has been introduced to deliver insights into essential metrics for Amazon Bedrock models.
To access the new automatic dashboard from the AWS Management Console:
- Navigate to the Dashboards section in the CloudWatch console and click on the Automatic Dashboards tab. You will see an option for the Amazon Bedrock dashboard among the available dashboards.
- Choose Bedrock from the list of automatic dashboards to instantiate it. This will give you centralized visibility into key metrics such as latency and invocation metrics. Understanding latency performance is crucial for customer-facing applications like conversational assistants, ensuring that your models provide outputs in a consistent and timely manner for an optimal customer experience.
The automatic dashboard collects vital metrics across foundation models offered through Amazon Bedrock. You also have the option to select a specific model to focus on its metrics. The “Monitor Amazon Bedrock with Amazon CloudWatch” section provides a detailed overview of available metrics, including invocation performance and token usage.
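The same invocation and latency metrics shown in the automatic dashboard can also be pulled programmatically. The sketch below, assuming the CloudWatch metric names documented for Bedrock (`AWS/Bedrock` namespace, `Invocations` and `InvocationLatency` metrics, `ModelId` dimension) and a hypothetical model ID, builds a `GetMetricData` request; the actual API call is shown as a comment since it requires boto3 and AWS credentials.

```python
import json

# Sketch of a CloudWatch GetMetricData request for Bedrock metrics.
# Namespace, metric names, and the ModelId dimension follow the metrics
# CloudWatch publishes for Bedrock; verify against your account.
MODEL_ID = "anthropic.claude-v2"  # hypothetical model choice

metric_queries = [
    {
        "Id": "invocations",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Bedrock",
                "MetricName": "Invocations",
                "Dimensions": [{"Name": "ModelId", "Value": MODEL_ID}],
            },
            "Period": 300,   # 5-minute buckets
            "Stat": "Sum",
        },
    },
    {
        "Id": "latency",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Bedrock",
                "MetricName": "InvocationLatency",
                "Dimensions": [{"Name": "ModelId", "Value": MODEL_ID}],
            },
            "Period": 300,
            "Stat": "Average",  # average latency per period
        },
    },
]

# With boto3 installed and credentials configured, the call would be:
# import boto3, datetime
# cw = boto3.client("cloudwatch")
# resp = cw.get_metric_data(
#     MetricDataQueries=metric_queries,
#     StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=3),
#     EndTime=datetime.datetime.utcnow(),
# )

print(json.dumps(metric_queries[0]["MetricStat"]["Metric"], indent=2))
```

Queries like this are useful when you want to feed the same numbers into alarms or reports rather than view them only in the console.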
With the new automatic dashboard, you have a consolidated view of important metrics that can help troubleshoot common challenges like invocation latency, track token utilization, and identify invocation errors.
Creating Custom Dashboards
Beyond the automatic dashboard, CloudWatch allows users to create personalized dashboards that consolidate metrics from various AWS services into application-level dashboards. This capability is essential not only for performance monitoring but also for debugging and implementing custom responses to potential issues. You can also use a custom dashboard to analyze the invocation logs generated from your prompts, surfacing information that metrics alone may not provide, such as identity attribution. With CloudWatch Logs sensitive data protection, you can additionally identify and mask sensitive data in your logs.
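Custom dashboards are defined as a JSON body and published with the CloudWatch `PutDashboard` API. The sketch below is illustrative, not the dashboard from the sample repository: the widget layout, region, and dashboard name are assumptions, and the publishing call is shown as a comment since it requires boto3 and credentials.

```python
import json

# Sketch of a custom CloudWatch dashboard body with two metric widgets.
# Widget geometry (x, y, width, height) uses CloudWatch's 24-column grid.
dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Bedrock invocations",
                "region": "us-east-1",  # assumption: use your region
                "stat": "Sum",
                "period": 300,
                "metrics": [["AWS/Bedrock", "Invocations"]],
            },
        },
        {
            "type": "metric",
            "x": 12, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Invocation latency",
                "region": "us-east-1",
                "stat": "Average",
                "period": 300,
                "metrics": [["AWS/Bedrock", "InvocationLatency"]],
            },
        },
    ]
}

# With boto3 and credentials in place:
# import boto3
# boto3.client("cloudwatch").put_dashboard(
#     DashboardName="My-Bedrock-Dashboard",  # hypothetical name
#     DashboardBody=json.dumps(dashboard_body),
# )

print(len(dashboard_body["widgets"]))
```

Because the dashboard is plain JSON, the same body can be version-controlled and deployed through infrastructure-as-code tooling.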
For additional context, implementing Retrieval Augmented Generation (RAG) to customize models for specific use cases is a popular approach, allowing you to enhance models with domain-specific data. RAG architectures combine several components, such as external knowledge sources, models, and compute resources for orchestration and workflow execution. Monitoring all these components is vital for your overall monitoring strategy. In this section, we will guide you through creating a custom dashboard using an example RAG-based architecture that utilizes Amazon Bedrock.
This blog post builds on the contextual conversational assistant example to create a custom dashboard that provides insights into the core components of a sample RAG-based solution. To replicate the dashboard in your AWS account, follow the contextual conversational assistant instructions to set up the required example before creating the dashboard using the steps outlined below.
Once you have established the contextual conversational assistant example, generate some traffic by experimenting with the sample applications and trying different prompts.
To create and view the custom CloudWatch dashboard for the contextual conversational assistant app:
- Modify and run the example script that creates the custom CloudWatch dashboard for the contextual conversational assistant.
- Access Amazon CloudWatch from the console and select Dashboards from the left menu.
Under Custom Dashboards, look for a dashboard titled Contextual-Chatbot-Dashboard. This dashboard offers a comprehensive view of metrics related to:
- The number of invocations and token usage from the Amazon Bedrock embedding model used to create your knowledge base and embed user queries, as well as the Amazon Bedrock model that responds to user queries based on the provided context. These metrics help in tracking application usage anomalies and costs.
- Context retrieval latency for search and ingestion requests, which is key to assessing the health of the RAG retrieval process.
- The number of indexing and search operations on the OpenSearch Serverless collection created during knowledge base setup, allowing you to monitor the OpenSearch collection’s status and quickly identify potential RAG issues, such as retrieval errors.
- Invocation usage attributed to specific users, giving insight into which users are consuming the most tokens and invocations. For further details, see the Usage Attribution section.
- The number of throttles of the Lambda function that orchestrates the application, an essential health signal for the orchestrating Lambda functions.
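The usage-attribution view above is driven by Bedrock model invocation logs. One way to aggregate them, assuming invocation logging is enabled and using field names from the invocation log schema (`identity.arn`, `input.inputTokenCount`, `output.outputTokenCount`), is a CloudWatch Logs Insights query; the log group name below is hypothetical and the API call is shown as a comment.

```python
# Sketch: a Logs Insights query that sums token usage per caller identity
# over Bedrock invocation logs. Confirm the field names against your own
# log group before relying on this.
USAGE_BY_IDENTITY = """
fields @timestamp, identity.arn
| stats sum(input.inputTokenCount)   as inputTokens,
        sum(output.outputTokenCount) as outputTokens,
        count(*)                     as invocations
  by identity.arn
| sort invocations desc
""".strip()

# With boto3 and credentials configured, the query could be started with:
# import boto3, time
# logs = boto3.client("logs")
# q = logs.start_query(
#     logGroupName="/my/bedrock/invocation-logs",  # hypothetical log group
#     startTime=int(time.time()) - 3600,
#     endTime=int(time.time()),
#     queryString=USAGE_BY_IDENTITY,
# )

print(USAGE_BY_IDENTITY.splitlines()[0])
```

The query results can be added to the custom dashboard as a Logs Insights widget, putting per-user token consumption alongside the metric graphs.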
By leveraging these tools and dashboards, you can significantly enhance your visibility into Amazon Bedrock’s performance and usage, paving the way for better decision-making and performance optimization in your AI applications.