How the Intel Olympic Technology Group Developed a Smart Coaching SaaS Application Using Pose Estimation Models – Part 1

The Intel Olympic Technology Group (OTG), a branch of Intel dedicated to integrating innovative technology into athletic training for Olympic athletes, partnered with AWS Machine Learning Professional Services (MLPS) to create a smart coaching software as a service (SaaS) application. This application leverages computer vision (CV) pose estimation models to enhance athlete training. Pose estimation is a class of machine learning models that uses CV techniques to identify key body points, such as joints. These points are the basis for calculating biomechanical attributes (BMA) like velocity, acceleration, and posture, which are central to understanding an athlete's performance.

The Intel OTG team aims to offer this technology in the context of smart coaching. The BMA metrics derived from pose estimation can enrich the guidance coaches provide to athletes and facilitate the monitoring of their progress. Traditionally, accurate data collection without CV relies on specialized IoT sensors that must be attached to athletes during their workouts. These sensors are often difficult to obtain, found mainly in elite performance centers, and can be cumbersome and costly—setting up a motion capture lab can exceed $100,000. In contrast, this new solution enables pose estimation at a fraction of that expense, utilizing standard mobile phone video for analysis. The ability to process video through simple, lightweight methods is a significant advantage.

In the first part of this two-part series, we will explore the design requirements and the construction of the solution on AWS with assistance from MLPS. Part two will provide a more detailed examination of each architectural phase.

A Versatile Video Processing Platform

Currently, the Intel OTG team focuses on track and field movements, particularly sprinting, while also experimenting with other athletic movements during the 2021 Tokyo Olympics. They aimed to create a flexible platform that could serve a wide range of users for video ingestion and analysis.

MLPS worked with Intel OTG to develop scalable processing pipelines that utilize AWS services to run ML CV model inference on athlete videos, catering to three distinct user groups, each with unique needs:

  • Developers: Use a Python SDK to facilitate application development, including submitting jobs and adjusting compute cluster settings. These capabilities can be integrated into larger applications or user interfaces; a sketch of what such a client might look like follows this list.
  • Application Users: Submit videos and engage with a user-friendly interface, which could be a CLI with specific commands or a coaching dashboard linked to an inference processing compute cluster via an API.
  • Independent Software Vendor (ISV) Partners: Utilize infrastructure as code (IaC) to deploy the solution in their environments, seeking customization and control over their infrastructure.
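
To make the developer path concrete, the following is a minimal sketch of what a thin Python client over the solution's REST API could look like. The base URL, the /jobs routes, the API-key header, and the request fields are assumptions for illustration only, not the actual SDK surface exposed by the solution.

```python
"""Minimal sketch of a Python client for the smart coaching API.

The base URL, routes (/jobs, /jobs/{id}), and payload fields are hypothetical
placeholders, not the actual Intel OTG SDK surface.
"""
import requests


class CoachingClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url.rstrip("/")
        # API Gateway commonly authenticates callers with an API key header.
        self.headers = {"x-api-key": api_key}

    def submit_job(self, video_s3_uri: str, movement: str = "sprint") -> str:
        """Submit a pose estimation job for an uploaded video; returns a job ID."""
        resp = requests.post(
            f"{self.base_url}/jobs",
            json={"video_uri": video_s3_uri, "movement": movement},
            headers=self.headers,
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["job_id"]

    def get_job(self, job_id: str) -> dict:
        """Fetch current status and any available biomechanical metrics."""
        resp = requests.get(f"{self.base_url}/jobs/{job_id}", headers=self.headers, timeout=30)
        resp.raise_for_status()
        return resp.json()


if __name__ == "__main__":
    client = CoachingClient("https://example.execute-api.us-west-2.amazonaws.com/prod", "demo-key")
    job_id = client.submit_job("s3://athlete-videos/sprint-session.mp4")
    print(client.get_job(job_id))
```

A CLI or coaching dashboard for application users could wrap the same two calls, which is one reason a single API layer can serve all three user segments.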

Addressing Technical Design Requirements

Once the MLPS team grasped the business needs of the various user segments, they collaborated with Intel OTG to outline a comprehensive set of technical design requirements. The following diagram illustrates the proposed solution architecture.

Key design principles that shaped the final architecture included:

  • API, CLI, and SDK layers tailored to the access needs of different user segments.
  • Configurable IaC, such as AWS CloudFormation templates, allowing customization and independent deployment for various ISV users.
  • A maintainable architecture built on microservices, minimizing infrastructural overhead through serverless AWS services.
  • The capacity to optimize latency for pose estimation jobs by maximizing parallel processing and resource utilization.
  • Granular control of latency and throughput for different end-user tiers.
  • A portable runtime and inference environment both within and outside AWS.
  • An adaptable data model.

Ultimately, the MLPS team met these requirements with the high-level process flow described below. An API layer powered by Amazon API Gateway serves as the intermediary between the user interaction layers (CLI, SDK) and the processing backend. The AWS Cloud Development Kit (AWS CDK) was used for rapid development and deployment, and it provides a straightforward framework for future resource deployments. The process consists of the following steps:

  1. Upon video upload, AWS Step Functions orchestrates the workflow and manages job submissions through a series of AWS Lambda functions that invoke serverless AWS microservices (a sketch of this orchestration follows the list).
  2. Videos are batched and sent to Amazon Kinesis Data Streams to parallelize job processing through sharding (see the producer sketch below).
  3. Additional parallelization and throughput control are managed by individual consumer Lambda functions, which process each batch of video frames (the same sketch includes a consumer handler).
  4. The inference engine, powered by an Amazon Elastic Kubernetes Service (Amazon EKS) cluster, allows for a flexible runtime and inference environment, encapsulating ML models and workflows within Kubernetes containers. An Amazon Aurora Serverless database supports a versatile data model to track users and submitted jobs, ensuring different user groups can access varying levels of throughput and latency (see the job-tracking sketch below).
  5. Logging is handled using Amazon Kinesis Data Firehose, which captures data from services like Lambda and stores it in Amazon Simple Storage Service (Amazon S3) buckets. For instance, each processed frame batch is logged with timestamps, action names, and Lambda function responses saved to Amazon S3 (see the logging sketch below).
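
To make step 1 concrete, here is a minimal AWS CDK (Python) sketch of a Step Functions state machine that chains two Lambda tasks, one to register an uploaded video and one to batch its frames onto the stream. The construct names, handler paths, and the two-step chain are illustrative assumptions, not the production workflow.

```python
# Illustrative CDK v2 stack: a Step Functions workflow chaining two Lambda tasks.
# Construct names and handler code paths are placeholders.
from aws_cdk import Stack, aws_lambda as _lambda, aws_stepfunctions as sfn, aws_stepfunctions_tasks as tasks
from constructs import Construct


class PoseWorkflowStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        register_fn = _lambda.Function(
            self, "RegisterVideo",
            runtime=_lambda.Runtime.PYTHON_3_9,
            handler="register.handler",
            code=_lambda.Code.from_asset("lambda"),  # hypothetical asset directory
        )
        batch_fn = _lambda.Function(
            self, "BatchFrames",
            runtime=_lambda.Runtime.PYTHON_3_9,
            handler="batch.handler",
            code=_lambda.Code.from_asset("lambda"),
        )

        # Chain the two Lambda invocations into a simple linear workflow.
        definition = tasks.LambdaInvoke(self, "Register", lambda_function=register_fn).next(
            tasks.LambdaInvoke(self, "Batch", lambda_function=batch_fn)
        )
        sfn.StateMachine(self, "PoseEstimationWorkflow", definition=definition)
```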
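
Steps 2 and 3 hinge on Kinesis sharding: records that share a partition key land on the same shard, so keying each record by video and batch index spreads batches across shards, and each consumer Lambda invocation then processes one batch of frame references. The stream name, record shape, and batch size below are assumptions for illustration.

```python
# Illustrative producer and consumer for a frame-batch stream.
# Stream name, record shape, and batch size are placeholders.
import base64
import json

import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "pose-frame-batches"  # hypothetical stream name


def publish_frame_batches(video_id: str, frame_keys: list, batch_size: int = 30) -> None:
    """Group frame references into batches and put them on the stream.

    Keying each record by video ID plus batch index fans the batches out
    across shards so they can be consumed in parallel. Note that put_records
    accepts at most 500 records per call; larger jobs would need chunking.
    """
    records = []
    for start in range(0, len(frame_keys), batch_size):
        payload = {"video_id": video_id, "frames": frame_keys[start:start + batch_size]}
        records.append({
            "Data": json.dumps(payload).encode("utf-8"),
            "PartitionKey": f"{video_id}:{start}",
        })
    kinesis.put_records(StreamName=STREAM_NAME, Records=records)


def consumer_handler(event, context):
    """Consumer Lambda: decode each Kinesis record and process its frame batch."""
    for record in event["Records"]:
        batch = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # In the real pipeline this is where the batch would be forwarded to
        # the EKS-hosted pose estimation models.
        print(f"Processing {len(batch['frames'])} frames for video {batch['video_id']}")
```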
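
For the job and user tracking in step 4, Aurora Serverless can be reached without connection management through the RDS Data API. The table layout, cluster and secret ARNs, and the tier column used to distinguish user groups are illustrative assumptions, not the actual schema.

```python
# Illustrative job-tracking write against Aurora Serverless via the RDS Data API.
# ARNs, database name, and schema are placeholders.
import boto3

rds_data = boto3.client("rds-data")

CLUSTER_ARN = "arn:aws:rds:us-west-2:123456789012:cluster:pose-jobs"          # hypothetical
SECRET_ARN = "arn:aws:secretsmanager:us-west-2:123456789012:secret:pose-db"   # hypothetical


def record_job(job_id: str, user_id: str, tier: str, status: str = "SUBMITTED") -> None:
    """Insert a row for a newly submitted job, tagged with the caller's service tier."""
    rds_data.execute_statement(
        resourceArn=CLUSTER_ARN,
        secretArn=SECRET_ARN,
        database="coaching",
        sql=(
            "INSERT INTO jobs (job_id, user_id, tier, status) "
            "VALUES (:job_id, :user_id, :tier, :status)"
        ),
        parameters=[
            {"name": "job_id", "value": {"stringValue": job_id}},
            {"name": "user_id", "value": {"stringValue": user_id}},
            {"name": "tier", "value": {"stringValue": tier}},
            {"name": "status", "value": {"stringValue": status}},
        ],
    )
```

Recording a tier on each job is one way a backend like this can grant different user groups different levels of throughput and latency.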
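
Finally, the per-batch logging in step 5 can be as simple as writing a newline-delimited JSON record to a Firehose delivery stream that buffers into S3. The delivery stream name and the fields logged here are illustrative.

```python
# Illustrative Firehose logging call made after each processed frame batch.
# Delivery stream name and log fields are placeholders.
import json
import time

import boto3

firehose = boto3.client("firehose")
DELIVERY_STREAM = "pose-processing-logs"  # hypothetical delivery stream name


def log_batch_result(video_id: str, batch_index: int, action: str, response: dict) -> None:
    """Emit one newline-delimited JSON record; Firehose buffers and delivers it to S3."""
    entry = {
        "timestamp": time.time(),
        "video_id": video_id,
        "batch_index": batch_index,
        "action": action,
        "lambda_response": response,
    }
    firehose.put_record(
        DeliveryStreamName=DELIVERY_STREAM,
        Record={"Data": (json.dumps(entry) + "\n").encode("utf-8")},
    )
```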

Incorporating Innovative Computer Vision Capabilities

The application’s primary benefit is providing athletes and coaches with critical biomechanical insights into their movements to enhance training and performance. A central focus for the Intel OTG team is to minimize obstacles in delivering this feedback, which means requiring only simple inputs, like 2D video footage from a cell phone camera, without the need for specialized equipment. This allows input and feedback to occur in real time, whether on the field or in a training environment.
