Amazon SageMaker Automatic Model Tuning now automatically selects tuning configurations to improve usability and cost effectiveness.

Amazon SageMaker Automatic Model Tuning has introduced Autotune, a new feature that automatically chooses hyperparameter configurations on your behalf. Autotune provides a faster and more efficient way to find hyperparameter ranges, and can deliver significantly better budget and time management for your automatic model tuning jobs.

In this post, we discuss this new feature and some of the benefits it brings.

Hyperparameter overview

When training any machine learning (ML) model, you generally deal with three types of data: input data (also called training data), model parameters, and hyperparameters. You use the input data to train your model, and during training the model learns its parameters. While learning, your ML algorithm tries to find the model parameters that are optimal for the data while meeting the goals of your objective function. For example, when a neural network is trained, the weights of the nodes in the network are learned from the training data, and each weight indicates how much influence its node has on the final prediction. These weights are the parameters of the model.

Hyperparameters, on the other hand, are parameters of the learning algorithm rather than the model itself. The number of hidden layers and the number of nodes per layer are examples of hyperparameters you can define for a neural network. The difference between model parameters and hyperparameters is that model parameters are learned during the training process, while hyperparameters are set before training and remain constant throughout it.
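To make the distinction concrete, the following is a minimal standalone PyTorch sketch (separate from the tuning examples later in this post): the hidden layer size and learning rate are hyperparameters fixed before training, while the weights inside the layers are the model parameters learned from the data.

import torch
import torch.nn as nn

# Hyperparameters: chosen before training and held fixed while the model learns
hidden_units = 128     # number of nodes in the hidden layer
learning_rate = 0.01   # step size used by the optimizer

# Model parameters: the weights and biases inside these layers are learned from the data
model = nn.Sequential(
    nn.Linear(784, hidden_units),
    nn.ReLU(),
    nn.Linear(hidden_units, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)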

Pain points

SageMaker automatic model tuning, also called hyperparameter tuning, runs many training jobs on your dataset using a range of hyperparameters that you specify. It can accelerate your productivity by trying many variations of a model. It searches for the best model automatically by focusing on the most promising combinations of hyperparameter values within the ranges that you specify. However, to get good results, you need to choose the right ranges to search.

But how do you know which ranges are the right ones to start with? When using hyperparameter tuning, we assume that the optimal set of hyperparameters lies within the ranges we specified. What happens if the chosen range is wrong and the optimal hyperparameters actually fall outside it?

Choosing the right hyperparameters requires experience with the ML technique you are using and an understanding of how its hyperparameters behave. It is important to understand the implications of each hyperparameter, because each hyperparameter you choose to tune can increase the number of trials required for a successful tuning job. You need to strike an optimal trade-off between the resources allocated to the tuning job and the achievement of your goals.

The SageMaker Automatic Model Tuning team is constantly innovating on behalf of our customers to optimize their ML workloads. AWS recently announced support for new completion criteria for hyperparameter optimization: a maximum runtime criterion, which is a budget control that bounds the cost and runtime of a tuning job, as well as desired target metrics, improvement monitoring, and convergence detection, which monitor model performance and support early stopping if models do not improve after a certain number of training jobs. Autotune is a new automatic model tuning feature that helps save you time and reduce wasted resources when finding optimal hyperparameter ranges.
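As a rough illustration of those completion criteria, the following sketch shows how they might be expressed as a HyperParameterTuningJobConfig fragment for the CreateHyperParameterTuningJob API. The field names follow the published API shape, but the specific values are illustrative only, not recommendations.

# Hedged sketch of the completion criteria mentioned above (values are illustrative)
completion_criteria_config = {
    "ResourceLimits": {
        "MaxParallelTrainingJobs": 2,
        "MaxRuntimeInSeconds": 3600,  # budget control: cap total tuning runtime at 1 hour
    },
    "TuningJobCompletionCriteria": {
        "TargetObjectiveMetricValue": 0.95,  # stop once the objective metric reaches this value
        "BestObjectiveNotImproving": {
            "MaxNumberOfTrainingJobsNotImproving": 10  # improvement monitoring
        },
        "ConvergenceDetected": {"CompleteOnConvergence": "Enabled"},  # convergence detection
    },
}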

The benefits of Autotune and how automatic model tuning alleviates those pain points

Autotune is a new configuration in the CreateHyperParameterTuningJob API and in the HyperparameterTuner class of the SageMaker Python SDK. It removes the need to specify the hyperparameter ranges, tuning strategy, objective metrics, or number of jobs that were previously required as part of the job definition. Autotune automatically selects the optimal configurations for your tuning job, helps prevent wasted resources, and accelerates productivity.

The following examples showcase which settings become redundant when using Autotune.

The following code creates a hyperparameter tuner using the SageMaker Python SDK without Autotune:

from sagemaker.pytorch import PyTorch
from sagemaker.tuner import (
    HyperparameterTuner,
    ContinuousParameter,
    CategoricalParameter,
)

estimator = PyTorch(
    entry_point="mnist.py",
    instance_type="ml.p4d.24xlarge",
    hyperparameters={
        "epochs": 1, "backend": "gloo"
    },
)

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name="validation:rmse",
    objective_type="Minimize",
    hyperparameter_ranges={
        "lr": ContinuousParameter(0.001, 0.1),
        "batch-size": CategoricalParameter([32, 64, 128, 256, 512])
    },
    metric_definitions=[{...}],
    max_jobs=10,
    strategy="Random"
)

tuner.fit(...)

The following example shows how many of those settings can be omitted when Autotune is enabled:

estimator = PyTorch(
    entry_point="mnist.py",
    instance_type="ml.p4d.24xlarge",
    hyperparameters={
        "epochs": 1, "backend": "gloo", "lr": 0.01, "batch-size": 32
    },
)

# Ranges, strategy, metric definitions, and max_jobs are omitted; Autotune selects them
tuner = HyperparameterTuner(
    estimator,
    objective_metric_name="validation:rmse",
    objective_type="Minimize",
    autotune=True
)

If you are using the API directly through Boto3, the equivalent code would be:

import boto3

sm = boto3.client("sagemaker")
sm.create_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuning_job_name,
    HyperParameterTuningJobConfig=tuning_job_config,
    TrainingJobDefinition=training_job_definition,
    Autotune={'Mode': 'Enabled'},
)

These code examples demonstrate some of the key benefits of Autotune:

  • The main choice in a tuning job is which hyperparameters to tune and their ranges. Autotune makes this selection for you based on the list of hyperparameters you provide. Using the previous example, the hyperparameters that Autotune can choose to tune are lr and batch-size.
  • Autotune will automatically select hyperparameter ranges on your behalf. Autotune uses best practices as well as internal benchmarks to select appropriate ranges.
  • Autotune automatically chooses a strategy for how to select the combinations of hyperparameter values to use for the training jobs.
  • Early termination is enabled by default when using Autotune. When using early termination, SageMaker stops training jobs run by hyperparameter tuning jobs when they are unlikely to perform better than previously completed training jobs to avoid using additional resources.
  • The maximum expected resources consumed by the tuning job (such as parallel jobs and maximum runtime) are calculated and recorded in the tuning job record as soon as the tuning job is created. These reserved resources are not increased during the tuning job, which keeps an upper bound on cost and duration that is easy for you to predict. A maximum runtime of 48 hours is used by default.

You can override any settings chosen automatically by Autotune. For example, if you specify your own hyperparameter ranges, they will be used alongside the inferred ranges, and any user-specified hyperparameter range takes precedence over a similarly named inferred range:

estimator = PyTorch(
    ...
    hyperparameters={
        "epochs": 100, "backend": "gloo", "lr": 0.01, "beta1": 0.8
    },
)

tuner = HyperparameterTuner(
    ...
    hyperparameter_ranges={
        "lr": ContinuousParameter(0.001, 0.01)  # takes precedence over the inferred "lr" range
    },
)

Autotune creates a number of settings as part of the tuning job. Any customer-specified setting with the same name overrides the corresponding setting selected by Autotune, and any customer-provided settings with different names are added alongside the Autotune-selected settings.

Checking the parameters selected by Autotune

Autotune reduces the time you would normally spend deciding on the initial set of hyperparameters to tune. But how do you get visibility into which hyperparameter values Autotune has chosen? You can get information about the decisions made for you in the description of the running tuning job (in the response of the DescribeHyperParameterTuningJob operation). After you submit a request to create a tuning job, the request is processed and all missing fields are set by Autotune. All set fields are reported in the DescribeHyperParameterTuningJob operation.

Alternatively, you can inspect the HyperparameterTuner class fields to see the settings selected by Autotune.

The following XGBoost example shows how you can use DescribeHyperParameterTuningJob to check the hyperparameters selected by Autotune.

First, we create a tuning job with automatic model tuning:

from sagemaker.tuner import HyperparameterTuner
from sagemaker.xgboost import XGBoost

hyperparameters = {
    "objective": "reg:squarederror",
    "num_round": "50",
    "verbosity": "2",
    "max_depth": "5",  # overlap with ranges is ok when Autotune is enabled
}
estimator = XGBoost(hyperparameters=hyperparameters, ...)

hp_tuner = HyperparameterTuner(estimator, autotune=True)
hp_tuner.fit(wait=False)

After the tuning job is successfully created, we can find out what settings Autotune has selected. For example, we can describe the tuning job by the name stored in hp_tuner:

import boto3 
sm = boto3.client('sagemaker')

response = sm.describe_hyper_parameter_tuning_job(
   HyperParameterTuningJobName=hp_tuner.latest_tuning_job.name
)

print(response)

We can then check the generated response to review the settings selected by Autotune on our behalf.
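For example, a minimal sketch of pulling a few of the Autotune-selected settings out of that response might look like the following. The field names follow the DescribeHyperParameterTuningJob response shape; depending on how the job was defined, the inferred ranges may appear under the tuning job config or under the training job definition.

config = response["HyperParameterTuningJobConfig"]
print(config["Strategy"])                      # tuning strategy chosen by Autotune
print(config["ResourceLimits"])                # max training jobs, parallel jobs, max runtime
print(config.get("TrainingJobEarlyStoppingType"))  # early stopping, enabled by default with Autotune
# Inferred hyperparameter ranges may live in either of these locations
print(config.get("ParameterRanges")
      or response.get("TrainingJobDefinition", {}).get("HyperParameterRanges"))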

If the settings of the running tuning job are not sufficient, you can stop the tuning job:

hp_tuner.stop()

Conclusion

SageMaker Automatic Model Tuning allows you to reduce model tuning time by automatically searching for the best hyperparameter configuration in the ranges you specify. However, choosing the right hyperparameter ranges can be a time-consuming process and can have a direct impact on the cost and duration of your training.

In this post, we discussed how you can now use Autotune, a new feature introduced as part of automatic model tuning, to automatically select an initial set of hyperparameter ranges on your behalf. This can reduce the time it takes to start the tuning process for your model. In addition, you can evaluate the ranges selected by Autotune and adjust them according to your needs.

We also showed how Autotune can automatically choose optimal parameter settings on your behalf, such as the number of training jobs, the strategy for selecting hyperparameter combinations, and early stopping enabled by default. This can result in significantly optimized budgets and timelines that are easy to predict.

To learn more, see Perform automatic model tuning with SageMaker.


About the authors

Jas Singh is a Senior Solutions Architect helping public sector customers achieve their business outcomes by architecting and implementing innovative and flexible solutions at scale. Jas has more than 20 years of experience designing and implementing mission-critical programs and holds a master’s degree in computer science from Baylor University.

Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the financial services industry with their operations on AWS. As a machine learning enthusiast, Gopi works to help clients succeed in their ML journey. In his spare time, he likes to play badminton, spend time with his family and travel.

Raviteja Yelamanchili is an Enterprise Solutions Architect with Amazon Web Services based in New York. He works with large enterprise financial services customers to design and deploy highly secure, scalable, reliable, and cost-effective applications on the cloud. He brings over 11 years of experience in risk management, technology consulting, data analytics, and machine learning. When he’s not helping customers, he likes to travel and play PS5.

Yaroslav Shcherbati is a machine learning engineer at AWS. He primarily works on improving the Amazon SageMaker platform and helping customers get the most out of its features. In his free time, he enjoys going to the gym, doing outdoor sports such as ice skating or hiking, and following new research in artificial intelligence.
