Building a Global, Scalable, Low-Latency, and Secure Machine Learning Platform for Medical Imaging Analysis on AWS

Introduction

Chanci Turner Amazon IXD – VGT2 learning manager

Machine learning (ML) is becoming a driving force for innovation in medical imaging. Researchers, developers, startups, and established companies are building, training, and deploying ML applications that aim to transform medical workflows and strengthen imaging’s role in diagnosis and treatment.

To achieve significant advancements in this field, researchers must navigate several hurdles when training and deploying machine learning models. First, they require access to extensive datasets that are often fragmented across various locations worldwide. Second, they need to implement standardized tools to produce ground truth on reference datasets. Finally, a secure and cost-effective environment must be established to facilitate collaboration among diverse research teams.

This necessity led the Diagnostic Image Analysis Group (DIAG) at the Radboud University Medical Center in Nijmegen, Netherlands, to partner with AWS, migrating their open-source platform, grand-challenge.org, from an on-premises data center to the cloud. Established in 2012, grand-challenge.org hosts machine learning challenges in biomedical image analysis and connects over 45,000 registered researchers and clinicians globally to foster collaboration in developing innovative ML solutions.

In March 2020, as CT imaging emerged as a critical tool for diagnosing and assessing COVID-19, the Dutch Radiological Society swiftly proposed a standardized assessment scheme for CT scans, known as CO-RADS. Radiologists used the grand-challenge.org platform to gather imaging data and evaluate CO-RADS, which showed excellent discriminatory power for diagnosing COVID-19 from CT scans (area under the ROC curve 0.91, 95% CI 0.85-0.97, for positive RT-PCR results).

On the platform, DIAG has given all registered users access to the COVID-19 dataset, training courses for radiologists on CO-RADS assessment, exams, and machine learning models. However, the on-premises setup degraded the experience for radiologists outside Europe, because the server-side rendering systems were subject to high network latency, and the processing capacity of the AI tools was limited to the hardware available at the onset of the pandemic.

In April 2020, DIAG began collaborating with AWS to implement globally distributed browser-based viewing systems and elastic scaling, extending these tools to machine learning and clinical researchers worldwide. Working together, DIAG’s Research Software Engineering team and AWS completed the migration of the grand-challenge.org platform to the cloud in less than two weeks. This effort overcame various technical challenges and resulted in a more robust, efficient, and scalable application that continues to support the medical imaging community throughout the pandemic and beyond.

This article outlines the architecture and services employed for the global medical imaging analysis platform and discusses the challenges, solutions, and outcomes achieved, including:

Data exchange with the global research community
Low-latency and scalable web-based viewer
Secure and cost-effective deployment and distribution of machine learning models
Swift migration of data and compute resources to the cloud

Data Exchange with the Global Research Community

Creating effective machine learning solutions for biomedical imaging requires substantial access to annotated training data. The quantity of data generated by medical devices, such as MRI and CT scanners, next-generation sequencers, and digital pathology machines, is steadily rising as sensors become increasingly sophisticated. Unfortunately, this vast data is often confined within siloed databases and proprietary formats, complicating collaboration across institutional boundaries from both technical and compliance perspectives.

On grand-challenge.org, DIAG has introduced features that allow researchers to establish archives for easy data sharing, apply algorithms to that data, and conduct reader studies to invite expert annotations. Traditionally, sharing large datasets in medical imaging has involved physically transporting hard drives across sites, but AWS has enabled direct uploads to Amazon Simple Storage Service (Amazon S3) with accelerated transfers for global data collection. Users can upload data in various medical imaging formats, including DICOM, which is automatically validated and converted to MetaImage or TIFF to simplify ML research.
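As an illustration of the kind of first-pass check that upload validation can begin with, the sketch below tests for the DICOM Part 10 file signature: a 128-byte preamble followed by the four bytes "DICM". It is not the platform's actual validation code. The function names and the extension-based fallback are assumptions made for this example, and a real pipeline would go on to parse the dataset before converting it.

```python
import io
from typing import BinaryIO

DICOM_PREAMBLE_LENGTH = 128  # bytes of preamble before the magic marker
DICOM_MAGIC = b"DICM"        # signature defined by the DICOM file format


def looks_like_dicom(stream: BinaryIO) -> bool:
    """Return True if the stream starts with a DICOM Part 10 header."""
    header = stream.read(DICOM_PREAMBLE_LENGTH + len(DICOM_MAGIC))
    expected_length = DICOM_PREAMBLE_LENGTH + len(DICOM_MAGIC)
    return len(header) == expected_length and header[DICOM_PREAMBLE_LENGTH:] == DICOM_MAGIC


def triage_upload(filename: str, stream: BinaryIO) -> str:
    """Pick a (hypothetical) processing path for an uploaded file."""
    if looks_like_dicom(stream):
        return "dicom"  # would be validated further, then converted
    lowered = filename.lower()
    if lowered.endswith((".mha", ".mhd")):
        return "metaimage"
    if lowered.endswith((".tif", ".tiff")):
        return "tiff"
    return "rejected"


# Example: a synthetic 132-byte header is enough to pass the signature check.
sample = io.BytesIO(b"\x00" * 128 + b"DICM")
print(triage_upload("series-001", sample))  # prints "dicom"
```

Checking the signature before any heavier parsing lets obviously wrong uploads be rejected cheaply.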

Amazon S3 serves as the storage solution for all imaging data on the grand-challenge.org platform. This relieves DIAG of the need to scale storage by hand as more scans from suspected COVID-19 patients arrive. To ensure rapid access to data, files are served through Amazon CloudFront, and the Django backend signs the CloudFront URLs so that only authorized users can download them.
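The sketch below shows, under stated assumptions, how a backend can assemble a CloudFront signed URL with a canned policy. The policy layout, the query parameter names (Expires, Signature, Key-Pair-Id), and the base64 character substitutions follow CloudFront's documented scheme. The function names, the example domain, and the `rsa_sha1_sign` callable are hypothetical. In practice the callable would sign the policy with the RSA private key of the CloudFront key pair, and a Django view would call `signed_url` only after its own permission check has passed.

```python
import base64
import json
import time
from typing import Callable, Optional


def _cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then replace the characters CloudFront treats as URL-unsafe."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def canned_policy(url: str, expires_at: int) -> bytes:
    """Build the whitespace-free canned policy document that gets signed."""
    policy = {
        "Statement": [
            {
                "Resource": url,
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires_at}},
            }
        ]
    }
    return json.dumps(policy, separators=(",", ":")).encode("utf-8")


def signed_url(
    url: str,
    key_pair_id: str,
    rsa_sha1_sign: Callable[[bytes], bytes],  # assumption: wraps the private key
    lifetime_seconds: int = 300,
    now: Optional[float] = None,
) -> str:
    """Return `url` with the query parameters of a canned-policy signed URL."""
    issued_at = time.time() if now is None else now
    expires_at = int(issued_at) + lifetime_seconds
    signature = rsa_sha1_sign(canned_policy(url, expires_at))
    separator = "&" if "?" in url else "?"
    return (
        f"{url}{separator}Expires={expires_at}"
        f"&Signature={_cloudfront_b64(signature)}"
        f"&Key-Pair-Id={key_pair_id}"
    )
```

Passing the signer in as a callable keeps the private key handling out of the URL-building code, and short lifetimes limit how long a leaked link remains usable.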

Low-Latency and Scalable Web-Based Viewer

Currently, the majority of medical imaging data processing and viewing occurs on-premises using dedicated workstations capable of server-side rendering required for standard operations like maximum intensity projection (MIP) viewing or 3D volumetric rendering. As collaboration among radiologists from various institutions increases, and secondary uses of medical imaging for ML development rise, there is a pressing need for globally accessible solutions. The Radboud University Medical Center recently faced this challenge due to significant interest in their CO-RADS Academy program, which educates physicians on interpreting COVID-19 CT images.

DIAG developed a web-based medical imaging viewer named CIRRUS, built on MeVisLab from MeVis Medical Solutions. CIRRUS offers a suite of essential tools for radiologists to engage with medical imaging data. Using server-side rendering, CIRRUS facilitates quick loading of medical imaging data and leverages powerful rendering hardware for 3D multiplanar reformation, pre-loading series in memory, and GPU acceleration. Rendered scenes are streamed to clients via a WebSocket connection to a VueJS single-page application, allowing for client-side interactions as needed. These workstations are deployed using Docker containers, launching one container instance per user, with users directed to their respective instance via Traefik.
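One way to route each user to a dedicated container is to attach routing labels to the container when it starts, which Traefik's Docker provider then discovers. The sketch below generates such labels. The label keys follow Traefik v2 conventions, but the source does not state which Traefik version or routing rule DIAG uses, so the per-session hostname scheme, the `viewer-` prefix, the default port, and the function name are all assumptions for illustration.

```python
import re
from typing import Dict


def session_labels(session_id: str, region_domain: str, port: int = 8080) -> Dict[str, str]:
    """Build Docker labels that route one hostname to one user's container.

    Hypothetical scheme: every session gets its own subdomain under the
    regional domain, and Traefik matches requests on that Host header.
    """
    # Router and service names should stay simple, so keep only [a-z0-9].
    name = "viewer-" + re.sub(r"[^a-z0-9]", "", session_id.lower())
    host = f"{name}.{region_domain}"
    return {
        "traefik.enable": "true",
        f"traefik.http.routers.{name}.rule": f"Host(`{host}`)",
        f"traefik.http.services.{name}.loadbalancer.server.port": str(port),
    }


labels = session_labels("AB-12", "eu.example.org")
for key, value in labels.items():
    print(f"{key}={value}")
```

The resulting dictionary could be passed as the labels of a new container, so that starting and stopping a container is all it takes to add or remove its route.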

Through this initiative, DIAG established rendering servers on AWS in Europe, Japan, and North America using Amazon Elastic Compute Cloud (Amazon EC2). Starting a container for a new user takes under 30 seconds, and the compute pool can be scaled horizontally by adding EC2 instances in each region. The medical imaging data is stored in an Amazon S3 bucket located in Europe. To keep loading times short, DIAG relies on a combination of efficient data management and infrastructure optimization.
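With rendering pools in several regions, each new session has to be placed in one of them. The sketch below shows one plausible placement rule: choose the region with the lowest measured latency that still has a free container slot. The source does not describe how DIAG assigns users to regions, so the `RegionPool` type, the latency figures, and the capacity numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RegionPool:
    name: str
    capacity: int    # container slots across the region's EC2 instances
    in_use: int = 0  # containers currently serving users

    @property
    def has_room(self) -> bool:
        return self.in_use < self.capacity


def pick_region(latency_ms: Dict[str, float], pools: Dict[str, RegionPool]) -> Optional[str]:
    """Return the lowest-latency region with spare capacity, or None if all are full."""
    candidates = [
        name for name, pool in pools.items()
        if pool.has_room and name in latency_ms
    ]
    if not candidates:
        return None  # signal that a pool needs more instances
    return min(candidates, key=lambda name: latency_ms[name])


pools = {
    "europe": RegionPool("europe", capacity=2, in_use=2),  # full
    "japan": RegionPool("japan", capacity=4, in_use=1),
    "north-america": RegionPool("north-america", capacity=4),
}
measured = {"europe": 20.0, "japan": 180.0, "north-america": 95.0}
print(pick_region(measured, pools))  # prints "north-america"
```

A `None` result is the natural trigger for horizontal scaling: add an EC2 instance to the preferred region, raise its capacity, and retry.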
