Don't reinvent the wheel: use Amazon SageMaker Projects to reuse code, patterns and ideas
One tiny service to maximise your company's SageMaker productivity.
How do you reuse patterns, workflows, code and ideas across multiple Amazon SageMaker projects? How do you bootstrap new projects quickly? How do you stop reinventing the wheel and unify tech between teams?
Setting up a production-grade project in Amazon SageMaker becomes quite repetitive after a while. Most parts stay more or less the same - you've got model training pipelines, CI/CD for endpoints, model registries, an experiment tracker and so on. Only after configuring everything can you finally move to production (or even begin the data exploration).
Can we speed this process up?
Yes - by leveraging a tiny service called Amazon SageMaker Projects. Think of it as the Yeoman, Hygen or Cookiecutter of Amazon SageMaker.
What is a Service Catalog?
To better explain it, let's first introduce the concept of a service catalog. A service catalog is a place to store, catalog and maintain ready-to-use technical solutions for your company. Think of it as a place where you keep the blueprints for solving various tasks - for example “a template that deploys a containerized app with a Postgres database and Redis”, “a data lake with an ETL mechanism” or “a basic VPC networking configuration”.
An example of a service catalog.
Some teams provide and maintain these solutions to retain business and technical knowledge, others just consume and use them.
Enter SageMaker Projects
The concept of a service catalog was implemented as Amazon SageMaker Projects. The service lets you use ML-specific blueprints of entire projects from within SageMaker. Two clicks and you're ready to begin work - with all the MLOps configuration automatically set up for you.
AWS provides some useful example templates, like the one that sets up SageMaker Pipelines with a CI/CD mechanism deploying to staging and production environments. Of course, you can customise these templates or build your own from scratch.
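The same thing can be done without the console. Below is a minimal sketch of creating a project from a template with boto3's SageMaker `create_project` call. The helper names, the project name and the product and artifact IDs are placeholders made up for illustration; the real IDs come from the Service Catalog product behind the template you pick.

```python
def build_create_project_request(project_name, product_id, artifact_id, description=""):
    """Build the keyword arguments for SageMaker's CreateProject API call.

    product_id and artifact_id identify the Service Catalog product (the
    template) and its version. The values used in the example are placeholders.
    """
    request = {
        "ProjectName": project_name,
        "ServiceCatalogProvisioningDetails": {
            "ProductId": product_id,
            "ProvisioningArtifactId": artifact_id,
        },
    }
    if description:
        request["ProjectDescription"] = description
    return request


def create_project(request):
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("sagemaker")
    return client.create_project(**request)["ProjectArn"]


if __name__ == "__main__":
    # Placeholder IDs: replace them with the ones from your own account.
    example = build_create_project_request(
        "churn-model", "prod-examplexxxx", "pa-examplexxxx", "Demo project"
    )
    print(example)
```

Keeping the request builder separate from the API call makes it easy to inspect or unit-test the payload before anything is provisioned.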
Under the hood, SageMaker Projects uses AWS Service Catalog, a more generic service that is not bound only to Amazon SageMaker. You describe your template as a CDK or CloudFormation document, publish it as a product in an AWS Service Catalog portfolio and then access it from SageMaker.
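As a rough sketch of that publishing flow, the snippet below registers a CloudFormation template as a Service Catalog product and attaches it to a new portfolio using boto3. The helper names, the portfolio and product names and the template URL are assumptions for illustration. The `sagemaker:studio-visibility` tag is what makes a custom template show up in SageMaker Studio.

```python
# Tag that marks a Service Catalog product as visible in SageMaker Studio.
STUDIO_VISIBILITY_TAG = {"Key": "sagemaker:studio-visibility", "Value": "true"}


def build_product_request(name, owner, template_url, version="v1"):
    """Build the keyword arguments for Service Catalog's CreateProduct call.

    template_url should point at a CloudFormation template stored in S3;
    the URL used in the example below is a placeholder.
    """
    return {
        "Name": name,
        "Owner": owner,
        "ProductType": "CLOUD_FORMATION_TEMPLATE",
        "ProvisioningArtifactParameters": {
            "Name": version,
            "Type": "CLOUD_FORMATION_TEMPLATE",
            "Info": {"LoadTemplateFromURL": template_url},
        },
        "Tags": [STUDIO_VISIBILITY_TAG],
    }


def publish_template(portfolio_name, provider, product_request):
    """Create a portfolio, create the product and link the two.

    Needs boto3 and AWS credentials, so it is not called on import.
    """
    import boto3  # imported lazily so the builder above works without AWS access

    sc = boto3.client("servicecatalog")
    portfolio = sc.create_portfolio(DisplayName=portfolio_name, ProviderName=provider)
    portfolio_id = portfolio["PortfolioDetail"]["Id"]
    product = sc.create_product(**product_request)
    product_id = product["ProductViewDetail"]["ProductViewSummary"]["ProductId"]
    sc.associate_product_with_portfolio(ProductId=product_id, PortfolioId=portfolio_id)
    return portfolio_id, product_id


if __name__ == "__main__":
    # Placeholder values: the bucket and template path are hypothetical.
    request = build_product_request(
        "ml-training-template",
        "ML Platform Team",
        "https://example-bucket.s3.amazonaws.com/template.yaml",
    )
    print(request)
```

In a real setup you would typically also grant your SageMaker users access to the portfolio (for example with `associate_principal_with_portfolio`) and add a launch constraint with a suitable IAM role. Both are left out here to keep the sketch short.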
Why would I use SageMaker Projects?
Leveraging SageMaker Projects makes sense for most teams. It:
saves time when building a new solution
reduces errors by preconfiguring the hard parts
maximises productivity by letting data scientists focus on the business problem (while still building a rock-solid solution)
unifies technologies between teams and helps share new concepts
doubles as a learning resource, showing you how to connect and integrate various SageMaker services
What premade SageMaker Projects templates are available?
Most of the ready-to-use templates maintained by AWS cover model training, model evaluation and model deployment, in various combinations and flavours.
Look them up here.
Where can I see SageMaker Projects in action?
As usual, SageMaker Studio Immersion Day, a free, self-paced workshop, comes to the rescue.
Lab 6 Option 3 of that workshop is built entirely on SageMaker Projects. You will immediately see how much stuff has been deployed and pre-configured for you in a matter of minutes.