Amazon SageMaker Automatic Model Tuning Introduces Three New Completion Criteria for Hyperparameter Optimization

Amazon SageMaker now supports three new completion criteria in its automatic model tuning feature, giving you finer control over when a hyperparameter tuning job stops. In this article, we look at the new criteria, when to use them, and the advantages they offer.

Understanding SageMaker Automatic Model Tuning

Automatic model tuning, or hyperparameter tuning, aims to identify the optimal version of a model based on a designated metric. This process initiates multiple training jobs on the provided dataset, utilizing the chosen algorithm and specified hyperparameter ranges. Each training job can terminate early if the objective metric does not show significant improvement, a practice known as early stopping.
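
As a concrete illustration, here is a minimal sketch of launching such a tuning job with the SageMaker Python SDK. The container image, role, S3 paths, objective metric, and hyperparameter ranges are placeholder assumptions, not values from this article (validation:auc assumes a built-in algorithm such as XGBoost that emits this metric):

    from sagemaker.estimator import Estimator
    from sagemaker.tuner import (
        ContinuousParameter,
        HyperparameterTuner,
        IntegerParameter,
    )

    # Placeholder estimator; substitute your own image, role, and output path.
    estimator = Estimator(
        image_uri="<training-image-uri>",
        role="<execution-role-arn>",
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path="s3://<bucket>/output",
    )

    tuner = HyperparameterTuner(
        estimator=estimator,
        objective_metric_name="validation:auc",
        objective_type="Maximize",
        hyperparameter_ranges={
            "eta": ContinuousParameter(0.1, 0.5),
            "max_depth": IntegerParameter(3, 10),
        },
        max_jobs=100,                # total training jobs the tuner may launch
        max_parallel_jobs=10,        # training jobs run concurrently
        early_stopping_type="Auto",  # stop unpromising training jobs early
    )

    tuner.fit({
        "train": "s3://<bucket>/train",
        "validation": "s3://<bucket>/validation",
    })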

Previously, control over the tuning job was limited to settings like the maximum number of training jobs, which could be somewhat arbitrary. A higher count could lead to increased costs, while a lower one might not always yield the best model. With the latest enhancements, SageMaker automatic model tuning introduces multiple completion criteria that operate at a higher abstraction level, enhancing overall flexibility.

Advantages of Tuning Job Completion Criteria

The new criteria give you better control over tuning job duration, which saves cost by preventing overly long and expensive runs. At the same time, they help ensure the job doesn’t stop prematurely, so you still obtain a model of sufficient quality for your objectives. You can now stop a job when models cease to improve after a set number of training jobs, or when the estimated remaining improvement no longer justifies the compute resources and time.

In addition to the existing criteria (the maximum number of training jobs and a desired target metric), SageMaker now adds maximum tuning time, improvement monitoring, and convergence detection. The following section walks through all of them.

Overview of the New Criteria

  1. Maximum Tuning Time
    Previously, users could only cap the number of training jobs, which made the total runtime of a tuning job hard to predict. With the new maximum tuning time criterion, you can set a time limit instead: the tuning job automatically stops once the specified number of seconds has elapsed.

    {
        "ResourceLimits": {
            "MaxParallelTrainingJobs": 10,
            "MaxNumberOfTrainingJobs": 100,
            "MaxRuntimeInSeconds": 3600
        }
    }

    Setting this limit helps control both cost and runtime and keeps the job within your tuning budget.

  2. Desired Target Metric
    This criterion allows users to set a specific target for the objective metric upfront. The tuning process will cease once the best model reaches the defined threshold for the specified metric.

    {
        "TuningJobCompletionCriteria": {
            "TargetObjectiveMetricValue": 0.95
        },
        "HyperParameterTuningJobObjective": {
            "MetricName": "validation:auc", 
            "Type": "Maximize"
        }
    }

    This approach is particularly useful for those who know the performance benchmarks they aim to achieve.

  3. Improvement Monitoring
    This criterion tracks the objective metric across training jobs and halts tuning if the best objective value has not improved within a specified number of jobs.

    {
        "TuningJobCompletionCriteria": {
            "BestObjectiveNotImproving": {
                "MaxNumberOfTrainingJobsNotImproving": 10
            }
        }
    }

    By setting this parameter, you can balance model quality with overall workflow efficiency.

  4. Convergence Detection
    This criterion lets automatic model tuning decide when to stop based on its own estimate that additional training jobs are unlikely to meaningfully improve the objective.

    {
        "TuningJobCompletionCriteria": {
            "ConvergenceDetected": {
                "CompleteOnConvergence": "Enabled"
            }
        }
    }

    It’s particularly useful when you aren’t sure which stopping settings to choose. A combined configuration sketch covering all four criteria follows this list.
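
All four criteria plug into the same API request. Below is a hedged sketch of a single boto3 create_hyper_parameter_tuning_job call that combines the snippets above; the job name, hyperparameter range, container image, role, and S3 paths are illustrative assumptions:

    import boto3

    sm = boto3.client("sagemaker")

    sm.create_hyper_parameter_tuning_job(
        HyperParameterTuningJobName="tuning-with-completion-criteria",  # hypothetical name
        HyperParameterTuningJobConfig={
            "Strategy": "Bayesian",
            "HyperParameterTuningJobObjective": {
                "MetricName": "validation:auc",
                "Type": "Maximize",
            },
            "ResourceLimits": {
                "MaxParallelTrainingJobs": 10,
                "MaxNumberOfTrainingJobs": 100,
                "MaxRuntimeInSeconds": 3600,  # maximum tuning time for the whole job
            },
            "ParameterRanges": {
                "ContinuousParameterRanges": [
                    {"Name": "eta", "MinValue": "0.1", "MaxValue": "0.5"}
                ]
            },
            "TrainingJobEarlyStoppingType": "Auto",  # early stopping of individual jobs
            "TuningJobCompletionCriteria": {
                "TargetObjectiveMetricValue": 0.95,  # desired target metric
                "BestObjectiveNotImproving": {       # improvement monitoring
                    "MaxNumberOfTrainingJobsNotImproving": 10
                },
                "ConvergenceDetected": {             # convergence detection
                    "CompleteOnConvergence": "Enabled"
                },
            },
        },
        TrainingJobDefinition={
            "AlgorithmSpecification": {
                "TrainingImage": "<xgboost-image-uri>",  # placeholder image URI
                "TrainingInputMode": "File",
            },
            "RoleArn": "<execution-role-arn>",  # placeholder IAM role
            "InputDataConfig": [
                {
                    "ChannelName": "train",
                    "DataSource": {
                        "S3DataSource": {
                            "S3DataType": "S3Prefix",
                            "S3Uri": "s3://<bucket>/train",  # placeholder S3 path
                        }
                    },
                },
                {
                    "ChannelName": "validation",
                    "DataSource": {
                        "S3DataSource": {
                            "S3DataType": "S3Prefix",
                            "S3Uri": "s3://<bucket>/validation",
                        }
                    },
                },
            ],
            "OutputDataConfig": {"S3OutputPath": "s3://<bucket>/output"},
            "ResourceConfig": {
                "InstanceType": "ml.m5.xlarge",
                "InstanceCount": 1,
                "VolumeSizeInGB": 30,
            },
            # Per-training-job limit, separate from ResourceLimits.MaxRuntimeInSeconds.
            "StoppingCondition": {"MaxRuntimeInSeconds": 1800},
        },
    )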

Experimenting with Completion Criteria

In a recent experiment, we conducted three tuning trials for a regression task involving two hyperparameters and a total of 200 configurations using a direct marketing dataset. The first trial employed the BestObjectiveNotImproving criterion, the second used CompleteOnConvergence, and the third had no defined criteria. Results indicated that the BestObjectiveNotImproving criterion yielded the best balance of resource use and time efficiency, achieving the target with significantly fewer jobs. The CompleteOnConvergence criterion also demonstrated improved efficiency compared to having no criteria.
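
To run a comparison like this yourself, you can pull each trial’s results into a pandas DataFrame and count how many training jobs it launched. A small sketch, assuming a hypothetical completed tuning job name:

    from sagemaker.analytics import HyperparameterTuningJobAnalytics

    # Hypothetical tuning job name; substitute the job you want to inspect.
    analytics = HyperparameterTuningJobAnalytics("tuning-with-completion-criteria")
    df = analytics.dataframe()  # one row per training job launched

    print("training jobs launched:", len(df))
    print("best objective value:", df["FinalObjectiveValue"].max())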

To learn more, refer to the Amazon SageMaker automatic model tuning documentation.
