(Re)introducing Amazon SageMaker Experiments

SageMaker Experiments gets a major overhaul in 2022 - is the experiment registry finally usable?

Dec 19, 2022

SageMaker Experiments, and experiment registries in general, are basically ML-and-context-aware loggers. They keep track of what you're training: which parameters and data you used, how the training went and what results you achieved. This lets teams compare the performance and characteristics of the models they own, often by comparing charts of ML metrics. They also let you reproduce experiments months after they were first run.

With the newest version, the UI was modified, the SDK was rebuilt and new features were introduced. You can now do all the basic (yet crucial) data science things without UI quirks or weird code additions. The SDK just feels... natural to use now.

TL;DR of SageMaker Experiments in late 2022:

allows you to store and browse metrics, logs, metadata and artifacts about every training run you perform
can compare charts of multiple runs
can store metadata from TrainingJobs, ProcessingJobs and TransformJobs
integrates with HPO and Pipelines automatically (transparently creates Experiments resources for you)
can be used outside of the SageMaker realm, for example to log experiments run in local Jupyter notebooks
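To make the list above a bit more concrete, here is a minimal sketch of logging a run from a local notebook with the Run class from the sagemaker.experiments module. The helper log_training_summary, the experiment and run names, the metric name and the numbers are all made up for illustration. main() is deliberately not called, because it needs the sagemaker package, AWS credentials and a default region.

```python
from typing import Dict, Sequence


def log_training_summary(run, params: Dict[str, float], losses: Sequence[float]) -> None:
    """Log hyperparameters once, then one loss value per epoch.

    `run` is expected to behave like sagemaker.experiments.run.Run,
    i.e. to offer log_parameters() and log_metric().
    """
    run.log_parameters(params)
    for epoch, loss in enumerate(losses):
        # `step` lets the UI plot the metric as a curve over epochs.
        run.log_metric(name="train_loss", value=loss, step=epoch)


def main() -> None:
    # Imported here so the helper above can be used without the SDK installed.
    from sagemaker.experiments.run import Run

    # Hypothetical names and values; replace with your own.
    with Run(experiment_name="demo-experiment", run_name="local-run-1") as run:
        log_training_summary(
            run,
            params={"learning_rate": 0.01, "epochs": 3},
            losses=[0.9, 0.6, 0.4],
        )
```

Calling main() from a notebook cell should create the experiment if it doesn't exist yet and attach the parameters and the loss curve to the run.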

Note - they also introduced... pricing! The service was previously free; now it charges you per million metrics ingested, retrieved and stored. The good thing is that it has a free tier AND shouldn't ever accrue too many charges. This was a necessary "addition", as you can now track experiments without using other SageMaker services. Previously, it only worked with SageMaker Jobs and was kinda included in their pricing.

The change is backward compatible, meaning previous code (that used Trials and TrialComponents) still works. However, you should eventually refactor it to use the newly introduced Runs and RunGroups. Oh, and the smexperiments library is no longer necessary - this functionality is now part of the standard SageMaker SDK, in the sagemaker.experiments module.
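If you're not sure how much legacy code you have, a quick scan for smexperiments imports gives you a refactoring to-do list. This is just a sketch using Python's ast module; the function name find_legacy_imports is my own, not part of any SDK.

```python
import ast
from typing import List, Tuple

LEGACY_PACKAGE = "smexperiments"


def find_legacy_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line number, module name) for every import of the legacy package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Match the package itself or any of its submodules, nothing else.
            if name == LEGACY_PACKAGE or name.startswith(LEGACY_PACKAGE + "."):
                hits.append((node.lineno, name))
    return sorted(hits)
```

Run it over the contents of each training script or notebook export; every hit is a place where the old tracker code can be swapped for the classes in sagemaker.experiments.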

IMO - before this change, SageMaker Experiments was quite subpar and I've seen many teams use alternative solutions. This change should put the service on a par with competitors in the experiment tracking area.

Great news for teams that prefer to stay entirely in AWS. It could easily have been named SageMaker Experiments v2. 😅

To see it in action, read the blogpost created by AWS.

