Amazon SageMaker has unveiled support for three new completion criteria in its automatic model tuning feature, giving users finer control over when a hyperparameter optimization job stops. In this article, we delve into these new criteria, their applications, and the advantages they present.
Understanding SageMaker Automatic Model Tuning
Automatic model tuning, or hyperparameter tuning, aims to identify the optimal version of a model based on a designated metric. This process initiates multiple training jobs on the provided dataset, utilizing the chosen algorithm and specified hyperparameter ranges. Each training job can terminate early if the objective metric does not show significant improvement, a practice known as early stopping.
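To make this concrete, here is a minimal sketch of the core pieces of a tuning job configuration, using boto3-style field names; the strategy, metric, and hyperparameter ranges are illustrative assumptions rather than values from this article.

```python
# A minimal sketch of a HyperParameterTuningJobConfig, using boto3-style
# field names. The strategy, metric, and hyperparameter ranges below are
# illustrative assumptions, not values from this article.
tuning_config = {
    "Strategy": "Bayesian",
    "HyperParameterTuningJobObjective": {
        "MetricName": "validation:auc",  # metric emitted by the training job
        "Type": "Maximize",
    },
    "ParameterRanges": {
        "ContinuousParameterRanges": [
            {"Name": "eta", "MinValue": "0.1", "MaxValue": "0.5"}
        ],
        "IntegerParameterRanges": [
            {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"}
        ],
    },
    "ResourceLimits": {
        "MaxParallelTrainingJobs": 10,
        "MaxNumberOfTrainingJobs": 100,
    },
    # Ask SageMaker to stop individual training jobs early when the
    # objective metric is unlikely to improve further (early stopping).
    "TrainingJobEarlyStoppingType": "Auto",
}
```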
Previously, control over the tuning job was limited to settings like the maximum number of training jobs, which could be somewhat arbitrary. A higher count could lead to increased costs, while a lower one might not always yield the best model. With the latest enhancements, SageMaker automatic model tuning introduces multiple completion criteria that operate at a higher abstraction level, enhancing overall flexibility.
Advantages of Tuning Job Completion Criteria
The new criteria give you better control over tuning job duration, saving costs by preventing overly long and expensive runs. They also let you ensure the job doesn’t conclude prematurely, so the resulting model is of sufficient quality to meet your objectives. Options now include stopping the job when models cease to improve after a set number of training jobs, or when the estimated remaining improvement does not justify the compute resources and time.
In addition to the existing maximum number of training jobs criterion, SageMaker now includes options for maximum tuning time, improvement monitoring, and convergence detection.
Overview of the Completion Criteria
Maximum Tuning Time
Previously, users could only specify a maximum number of training jobs, which sometimes led to inefficient training durations. With the new maximum tuning time criterion, users can define a time limit for the tuning job, which automatically terminates once the specified time (in seconds) elapses:

```json
{
  "ResourceLimits": {
    "MaxParallelTrainingJobs": 10,
    "MaxNumberOfTrainingJobs": 100,
    "MaxRuntimeInSeconds": 3600
  }
}
```
Setting this limit helps control both cost and runtime, keeping the tuning job within your budget.
Desired Target Metric
This criterion allows users to set a specific target for the objective metric upfront. The tuning process ceases once the best model reaches the defined threshold for the specified metric:

```json
{
  "TuningJobCompletionCriteria": {
    "TargetObjectiveMetricValue": 0.95
  },
  "HyperParameterTuningJobObjective": {
    "MetricName": "validation:auc",
    "Type": "Maximize"
  }
}
```
This approach is particularly useful for those who know the performance benchmarks they aim to achieve.
Improvement Monitoring
This criterion evaluates the models’ progress after each iteration and halts tuning if no improvement is observed after a specified number of training jobs:

```json
{
  "TuningJobCompletionCriteria": {
    "BestObjectiveNotImproving": {
      "MaxNumberOfTrainingJobsNotImproving": 10
    }
  }
}
```
By setting this parameter, you can balance model quality with overall workflow efficiency.
Convergence Detection
This criterion enables automatic model tuning to decide when to halt tuning based on estimates of the remaining potential improvement:

```json
{
  "TuningJobCompletionCriteria": {
    "ConvergenceDetected": {
      "CompleteOnConvergence": "Enabled"
    }
  }
}
```
It’s particularly beneficial when users are unsure of the optimal stopping settings.
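Each of the JSON fragments above slots into the HyperParameterTuningJobConfig passed to CreateHyperParameterTuningJob. The sketch below shows one way to combine several criteria in a single boto3 request; the job name, the training job definition, and the reuse of the earlier tuning_config are assumptions for illustration, not configuration from this article.

```python
import boto3

sm = boto3.client("sagemaker")

# Combine completion criteria in one tuning job config. Field names follow
# the boto3 CreateHyperParameterTuningJob API; the values are illustrative,
# and tuning_config is assumed to be defined as in the earlier sketch.
tuning_config["TuningJobCompletionCriteria"] = {
    # Stop as soon as the best model reaches the target metric value.
    "TargetObjectiveMetricValue": 0.95,
    # Stop if the best objective has not improved for 10 training jobs.
    "BestObjectiveNotImproving": {
        "MaxNumberOfTrainingJobsNotImproving": 10
    },
    # Let SageMaker stop the job when it detects convergence.
    "ConvergenceDetected": {"CompleteOnConvergence": "Enabled"},
}
tuning_config["ResourceLimits"]["MaxRuntimeInSeconds"] = 3600  # 1-hour cap

sm.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="demo-completion-criteria",  # placeholder
    HyperParameterTuningJobConfig=tuning_config,
    TrainingJobDefinition=training_job_definition,  # assumed defined elsewhere
)
```

In a setup like this, the tuning job ends once any one of the configured conditions is satisfied, so pairing a hard runtime cap with improvement monitoring gives you both a budget guardrail and a quality-driven stop.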
Experimenting with Completion Criteria
In a recent experiment, we conducted three tuning trials for a regression task involving two hyperparameters and a total of 200 configurations using a direct marketing dataset. The first trial employed the BestObjectiveNotImproving criterion, the second used CompleteOnConvergence, and the third had no defined criteria. Results indicated that the BestObjectiveNotImproving criterion yielded the best balance of resource use and time efficiency, achieving the target with significantly fewer jobs. The CompleteOnConvergence criterion also demonstrated improved efficiency compared to having no criteria.