AWS DevDays 2020: Build, Train, and Deploy Models with Amazon SageMaker

March 26, 2020
Traditional machine learning development is a complex, expensive, iterative process made even harder because there are no integrated tools for the entire machine learning workflow. Amazon SageMaker Studio solves this challenge by providing all components in a single, web-based visual interface. You can quickly upload data, create new notebooks, train models, compare results, and deploy models to production all in one place, making you much more productive. In this session, we will explain how it works, including a demo.

Code: https://github.com/juliensimon/dlnotebooks

For more content, follow me on:
* Medium: https://medium.com/@julsimon
* Twitter: https://twitter.com/juliensimon

Transcript

Hello everyone and welcome to this webinar on Amazon SageMaker. If you have questions during the session, you can submit them in the questions pane on the control panel, and I will answer them at the end. You will also find a copy of the slides in the handout tab on the control panel, and you will receive a copy of the recording in a follow-up email shortly after the event. Alright, let's get started. This webinar focuses on Amazon SageMaker, a managed service for machine learning. The first thing to explain is why we built this service. As you probably know, if you're doing machine learning today, machine learning workflows are quite complex. They involve many different steps from data preparation to building and experimenting with your first models, to training and tuning models, and, of course, the hardest part—deploying and managing models in production. Customers who work with machine learning today use a collection of tools to do this, some off-the-shelf and some bespoke, and it's not easy to have a smooth transition from experimentation all the way to production. This is why we built SageMaker, to make this process simpler, faster, and to help customers deliver high-performance models in production more quickly. SageMaker was released a few years ago, and over time we added lots of new capabilities. I will try to cover as many as possible in this session and dive deeper into some of these capabilities in later sessions today. SageMaker is a modular service that aims to cover the full scope of a machine learning project, from collecting and annotating data to preparing training data. We'll see which services let you do that. Then we have a collection of built-in algorithms and built-in frameworks that are available off the shelf to help you quickly experiment with your data and find the algorithm that fits your data best. 
We also have a collection of features and APIs for debugging, optimizing models, and comparing your experiments so that you can quickly find high-performance models and move on to deploying them in production. Models can be deployed in different ways, such as real-time endpoints or for batch processing. We also have monitoring and scaling capabilities. SageMaker is a modular service. Some customers find that they want to use the whole spectrum of capabilities. To make this even simpler, we launched Amazon SageMaker Studio at re:Invent, which is a web-based IDE for machine learning that includes all these capabilities and makes them really simple to use. However, some customers have more specific needs. Maybe they want to train on AWS, on SageMaker, and deploy on their own infrastructure. Maybe they have existing models that they want to import to SageMaker and then deploy on AWS. Whatever fits your project is fine. You can just pick the APIs and capabilities that you need and use only those. It's not a silo that traps you into just one way of doing things. Data preparation is a very time-intensive task. Here's an example of what it takes to annotate images for a computer vision problem, probably an autonomous driving problem. On the left, we have the raw images, and on the right, we have the annotated images. This is called semantic segmentation, where you have to assign each pixel in the image to a specific instance. The greenish stuff is a moving car, and the yellowish stuff is vegetation, etc. You have to do this for every single picture, so even if you had nice graphical tools, imagine how long it would take to do this for one picture and then multiply that by hundreds of thousands or millions of pictures. Clearly, this is a very time-intensive task. The same goes for annotating text. If you have to annotate a text file for entity extraction, it takes a lot of time, and these datasets tend to be huge. 
To help customers with this, we launched a capability called SageMaker Ground Truth, which lets you annotate your datasets at scale quickly and efficiently. The first step is to upload your dataset to S3, and then you create a workforce. This could be a private workforce of people from your own company who have domain knowledge and can annotate the data, a third-party workforce from partner companies, or Amazon Mechanical Turk to scale to tens or hundreds of thousands of human labelers. But it's not just about humans labeling data. You can use a capability called active learning, where a machine learning model is automatically trained on annotated data. When the accuracy of that model exceeds that of human labelers, the model starts labeling at scale. For example, if you have 1 million images, human labelers might need to annotate only 10-20%, and the rest will be automatically labeled by the model. This is a great way to speed up annotation and save money. You can annotate images and text, and build custom workflows. This is a pretty fun service. Let me quickly show you what it looks like. You'll find Ground Truth in the SageMaker console, and you can define datasets by uploading data to S3. Here's a silly example I've uploaded: a little guitarist. Then you build a workforce, which can be a Mechanical Turk workforce, private workforce, or vendor workforce. The steps are slightly different, but in the end, you will have a group of people who can log in and label content using nice graphical tools with your instructions. Once the job is complete, you will find the annotated dataset in S3. Here are some examples I did myself for semantic segmentation, and you can view them in the console. You have this annotation information in a manifest file in S3, which you can then feed to your machine learning algorithm, in this case, a computer vision algorithm.
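As a rough sketch of how such a manifest might be consumed: Ground Truth manifests use the JSON Lines format, with one JSON object per labeled item and a `source-ref` key pointing at the source object in S3. The bucket, file names, and the `my-job-ref` label attribute below are made up for illustration; the real attribute name depends on how the labeling job was configured.

```python
import json

# Two illustrative manifest lines. The bucket, file names, and the
# "my-job-ref" label attribute are assumptions for this sketch.
SAMPLE_MANIFEST = """\
{"source-ref": "s3://example-bucket/images/001.jpg", "my-job-ref": "s3://example-bucket/masks/001.png"}
{"source-ref": "s3://example-bucket/images/002.jpg", "my-job-ref": "s3://example-bucket/masks/002.png"}
"""

def parse_manifest(text, label_key="my-job-ref"):
    """Return (image_uri, label_uri) pairs from JSON Lines manifest text."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        pairs.append((record["source-ref"], record[label_key]))
    return pairs

pairs = parse_manifest(SAMPLE_MANIFEST)
for image_uri, mask_uri in pairs:
    print(image_uri, "->", mask_uri)
```

In practice you would download the manifest from S3 first, or pass its S3 location directly to a training job that understands this format.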
If you want to know more about Ground Truth, I have an end-to-end demo on my YouTube channel, and I'll provide the URL on the final slide. Another way to prepare data is to run processing jobs. You might need to run feature engineering, ETL tasks, or cleaning tasks. Real-life data is not perfect, so it needs to be cleaned. Customers primarily use Scikit-Learn and Spark for this. To make their life easier, we added a capability to SageMaker called SageMaker Processing, which makes it very easy to run batch processing jobs on SageMaker using either Scikit-Learn or Spark. This is a time-saver; you don't have to build a framework to run those jobs. You can provide your own code and use our containers for processing or bring your own containers if you want to. Once you have a dataset ready, you can start building models. The first step is to inspect the data and run preliminary analysis. The preferred way to do this today is to use Jupyter notebooks. You could run your own Jupyter server or even run Jupyter on your own machine and use SageMaker APIs. However, to make your life easier, we built SageMaker notebook instances, which are managed instances pre-installed with everything you need, including Jupyter and environments for popular libraries. They also include a lot of security features, such as running inside your VPCs and encrypting storage. You can literally get to work in minutes by firing up a notebook instance, cloning your repo, and starting to run your notebooks. These instances range from very small to very large, including GPU instances. However, we don't recommend running very heavy workloads on notebook instances; these are meant for experimentation. If you need to train and process data at scale, we recommend using fully managed infrastructure, as we will see in a minute. Recently at re:Invent, we announced Amazon SageMaker Studio, which is still in preview. 
It's an integrated environment for machine learning based on JupyterLab, so it looks and feels familiar. You can run your notebooks, manage your experiments, and it's integrated with SageMaker Autopilot, our AutoML capability, which I will cover in a later session. It's a fast and easy way to have your entire machine learning workflow in a single pane of glass. It's currently available in the US East 2 region. SageMaker Studio has a nice dark theme for Jupyter, and you can run your notebooks, compare experiments, and build visualizations using all the model training parameters and metrics. When it comes to training, we have several options. The first is to look at the AWS Marketplace for Machine Learning, which is a collection of algorithms and models designed by AWS partners and vetted by AWS. You might find a model that solves your problem exactly or close enough to get your proof of concept going while you work on the final model. You can deploy these models in just a few clicks on SageMaker, and some are free or free for a while, while others are commercial models. Another option is to use SageMaker Autopilot, which is a newer capability from re:Invent. You can build a machine learning model with zero code, especially if you use Studio, which has a UI workflow that lets you upload your data and start an Autopilot job without writing a line of code. I'll show you that later today. If you have a more specific problem, you need to select an algorithm and maybe provide your own code. The next option is to look at built-in algorithms. We have 17 built-in algorithms available off the shelf in SageMaker, so you don't need to write machine learning code. You just need to write code that selects the right algorithm, defines the location of the data, and starts training. We'll see examples of this later today. If you're using open-source frameworks like PyTorch or TensorFlow, these are supported, and all you have to do is bring your own code.
I'll show you in the demo how to use Keras and TensorFlow, taking existing Keras code and running it in the built-in TensorFlow environment. If you need something else, such as R or C++, or custom Python, you can run it on SageMaker as well. You need to build your own container for storing your training and prediction code, and you can train and deploy on SageMaker. You can literally run anything on SageMaker, but I highly recommend looking at the marketplace first. You might save yourself from a six-month machine learning project, and if that doesn't work, you can consider other options. The common ground is that whatever you do, all your training will run on fully managed infrastructure. You will never have to manage a single server, and you can use Spot instances to optimize your training costs. You can focus on the machine learning problem and ignore all the infrastructure concerns. Here's the list of built-in algorithms. The color code is orange for supervised learning and yellow for unsupervised. You can see the usual suspects of statistical machine learning, such as linear regression, factorization machines, KNN, K-means, PCA, and some deep learning algorithms for computer vision, NLP, and time series. Just grab them, send them data, and you're on your way. No machine learning coding required. If you use built-in frameworks, you don't have to build those environments. You can use the built-in container for TensorFlow and the built-in container for PyTorch. These are open-source containers, so you can grab the container definitions on GitHub, build them, run them, customize them, and do anything you like. You can use local mode, which is a way to train and predict with that container on your local machine, whether it's a notebook instance or your laptop. This is not as fast or scalable as using large instances on AWS and distributed training, but it's convenient for small-scale experimentation. I'll show you how to run local mode. It's very easy. 
Another feature is script mode, where you can take existing framework code and, with minimal changes, run it inside one of those containers. There's no learning curve here. If you have existing TensorFlow or PyTorch code and want to train it on SageMaker, you can adapt it in five minutes. I'll show you how to do that. I mentioned Autopilot. We'll cover this in another session today, but in a nutshell, it's a white-box AutoML service that can train classification and regression models. All you have to do is upload your data to S3, and Autopilot takes it from there. It will automatically figure out which problem you're trying to solve, come up with feature engineering and data processing steps, identify candidate algorithms, and launch training and tuning jobs to give you the best performance. You can also see what's happening, which is why I call it white-box AutoML. Autopilot will generate notebooks that show you the processing pipelines and training steps, so you can run that code in a notebook and reproduce the work Autopilot did. You can keep tweaking if you want to. If you're interested in AutoML, don't miss that session. Once you have a model that fits your problem, it's time to train at scale and optimize. How do you work with SageMaker? SageMaker is based on a Python SDK, which we call the SageMaker SDK. It's very easy to learn and is a high-level SDK. You don't deal with infrastructure objects; you deal with algorithms, training jobs, and deployment jobs, focusing on the machine learning workflow and not on infrastructure. There's also a Spark SDK, which I won't cover today, but feel free to ask questions. Spark for ETL and SageMaker for machine learning are a great combo. Of course, you also find service-level APIs, just like S3 and EC2 APIs. These are low-level APIs great for scripting and automation if you want to finely control your VPC configuration. For machine learning and experimentation, you should use the Python SDK, not the AWS SDK. 
We added another layer on top of the training and deployment API at re:Invent called SageMaker Experiments. SageMaker Experiments is a way to organize, track, and compare all your experiments. Over time, you will end up training hundreds, maybe thousands, of models, and you want to compare them. There's no better way than to visualize everything, which is exactly what Experiments does. It lets you track all your training and job metrics, organize them, and compare them. We'll see this in later sessions today. When it comes to optimizing models, SageMaker supports automatic model tuning, which we will cover in detail in a future session. Automatic model tuning is a clever way to quickly figure out which hyperparameters deliver a high-performance model. It uses machine learning algorithms to find the best parameters for your training job and will quickly converge to high-performance models. Another capability added at re:Invent is SageMaker Debugger. SageMaker Debugger lets you inspect model state, capturing model parameters and metrics during training and storing them in S3. There's an SDK to load that data and explore and visualize it. Debugger also lets you configure debugging rules to identify unwanted conditions during training, such as loss not decreasing, exploding gradients, or vanishing gradients. It will alert you if such a condition is detected, and you can implement your own custom rules if needed. When it comes to deploying and managing models, once you have a model that performs well and answers your business question, you want to deploy it and manage it in production. Models are stored in S3, and you can deploy them in different ways. The first is to deploy to a real-time endpoint, which is one line of code. This creates a fully managed HTTPS endpoint backed by at least one fully managed instance, but you can have several and implement auto-scaling. You can post data to that HTTPS endpoint and receive predictions.
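To make the "post data and receive predictions" step concrete, here is a minimal sketch using boto3, which takes care of signing the HTTPS request. It assumes a TensorFlow Serving model that accepts a JSON body of the form `{"instances": [...]}`; the endpoint name is hypothetical, and the `invoke` function only works with valid AWS credentials and a deployed endpoint.

```python
import json

def build_payload(images):
    """Serialize a batch of images (nested lists) as a TensorFlow Serving JSON request."""
    return json.dumps({"instances": images})

def invoke(endpoint_name, images):
    """Send the batch to a SageMaker endpoint and return the parsed JSON response."""
    import boto3  # imported lazily so build_payload works without boto3 installed
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(images),
    )
    return json.loads(response["Body"].read())

# One fake 2x2 "image", just to show the shape of the request body.
payload = build_payload([[[0.0, 0.5], [1.0, 0.25]]])
print(payload)

# With credentials and a live endpoint (hypothetical name):
# result = invoke("my-fashion-mnist-endpoint", batch)
```

The same request can be sent from any language that has an AWS SDK, which is what makes the endpoint usable from Java or Node.js applications.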
It's a vanilla endpoint, so you can use any tool or language to invoke it. The second way is to use batch transform. Some customers don't need real-time predictions; they might need to predict 10 gigabytes of data every week. In this case, you can transform a dataset in S3 into predictions, and all of this is fully managed. If you don't want to deploy on SageMaker, you can deploy on container services like ECS, EKS, Fargate, or your own Docker cluster on EC2. The model is in S3, so you can grab it, load it into a TensorFlow AWS deep learning container, or use your own container, and run it on a container service. All these models are vanilla models, so if you train with TensorFlow, you get a TensorFlow model. You can copy it from S3 into your favorite container and deploy it anywhere. You can even run it on your laptop, an EC2 instance, or a server in the closet. We believe in freedom, so you can pick whatever option works best for you. If you use endpoints or batch transform, it's one line of code, fully managed, and includes scaling, monitoring, and logging. If you do it another way, you have to manage those aspects yourself. An extra capability added at re:Invent is SageMaker Model Monitor. Model Monitor lets you capture data automatically and save it in S3. It will build a baseline from your training dataset and compare incoming data with that baseline. If your incoming data starts drifting or looking different from the training data, it will alert you with anomaly reports. This is useful for identifying missing features, mistyped features, or drifting statistical properties in your features, which can impact the quality of your predictions. Model Monitor makes it easy to set up this kind of monitoring and get alerts. I'll show you a detailed example later today. As you can see, there's a lot to SageMaker. Let's run a demo to highlight the basic workflow. Later today, we'll dive into other capabilities like Autopilot, model tuning, debugger, and model monitor. 
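To illustrate the idea of comparing incoming data with a baseline, here is a deliberately simplified sketch in plain Python. It is not Model Monitor's actual implementation or API: the feature names, the sample data, and the "three standard deviations" threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def build_baseline(rows):
    """Compute per-feature (mean, standard deviation) from training rows (list of dicts)."""
    features = list(rows[0].keys())
    return {f: (mean([r[f] for r in rows]), stdev([r[f] for r in rows])) for f in features}

def check_batch(baseline, rows, threshold=3.0):
    """Return a list of human-readable violations found in an incoming batch."""
    violations = []
    for name, (base_mean, base_std) in baseline.items():
        values = []
        for row in rows:
            value = row.get(name)
            if value is None:
                violations.append(f"{name}: missing value")
            elif not isinstance(value, (int, float)):
                violations.append(f"{name}: unexpected type {type(value).__name__}")
            else:
                values.append(value)
        # Flag drift when the batch mean is far from the baseline mean.
        if values and base_std > 0 and abs(mean(values) - base_mean) > threshold * base_std:
            violations.append(f"{name}: mean drifted")
    return violations

training = [{"age": 30, "income": 50.0}, {"age": 40, "income": 60.0}, {"age": 50, "income": 70.0}]
baseline = build_baseline(training)
incoming = [{"age": 41, "income": 500.0}, {"age": "n/a", "income": 520.0}, {"income": 480.0}]
report = check_batch(baseline, incoming)
print(report)
```

The sample batch triggers all three kinds of issue mentioned above: a mistyped feature, a missing feature, and a feature whose statistics have drifted away from the training data.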
This is SageMaker Studio. You can launch different notebooks, create new ones, or open a terminal. I've already cloned my repo here. All my files are available, and I see running notebooks and the experiments I've already run. I also see an endpoint running. For this demo, I'm going to use TensorFlow 2.0 and the Keras API to build an image classifier for the Fashion MNIST dataset. If you're not familiar with Fashion MNIST, it's a drop-in replacement for MNIST, with the same number of samples and image size but dealing with fashion articles instead of digits. The goal is to classify these images into the right class. First, I update my SDKs, import the SageMaker SDK, and download the Fashion MNIST dataset, which is a standard dataset in Keras. It's already split into training and validation sets. Next, I look at my actual Keras code. This is vanilla Keras code for TensorFlow 2.0, creating a simple convolutional model with a convolution block and a dense layer to classify images into 10 classes. I load the dataset, do basic processing like normalizing pixel values and creating batches, create the model, compile, train, and save it. This is as simple as it gets. Now, let's look at script mode. This is how SageMaker will invoke my code inside the TensorFlow container and run it like a Python script. I need to pass hyperparameters as command-line arguments and read environment variables that tell me where the training set, validation set, and model directory are, and how many GPUs I have. I also need to save the model in the right place. This is all it takes to adapt your existing framework code to script mode in SageMaker. The next step is to upload the dataset to S3. I use a default bucket provided by SageMaker and configure my training job using the TensorFlow estimator from the SageMaker SDK. I specify my training script, an IAM role, and request one p3.2xlarge GPU instance.
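The script-mode adaptation described above can be sketched as follows. This is an illustration, not the exact script from the demo: the argument names are assumptions, and the `SM_CHANNEL_TRAINING` and `SM_CHANNEL_VALIDATION` variables assume the data channels are named `training` and `validation` when the job is launched.

```python
import argparse
import os

def parse_args(argv=None):
    """Read hyperparameters from the command line and locations from environment variables."""
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line arguments.
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=128)
    # Data locations, model directory, and GPU count come from environment
    # variables set inside the training container.
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument("--training", default=os.environ.get("SM_CHANNEL_TRAINING"))
    parser.add_argument("--validation", default=os.environ.get("SM_CHANNEL_VALIDATION"))
    parser.add_argument("--gpu-count", type=int, default=int(os.environ.get("SM_NUM_GPUS", "0")))
    # Ignore any extra arguments the container may pass.
    args, _ = parser.parse_known_args(argv)
    return args

args = parse_args(["--epochs", "10"])
print(args.epochs, args.batch_size, args.gpu_count)
# Next steps in a real script: load data from args.training and
# args.validation, train the model, and save it under args.model_dir.
```

Because every value has a default, the same script can also be run on a laptop outside the container, which is handy for quick testing.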
I specify the TensorFlow version, use Python 3, use script mode, and pass one hyperparameter, training for 10 epochs. I call `fit` with the location of the training and validation sets in S3. The training log shows the creation of the P3 instance, pulling the TensorFlow container, copying the dataset, and running my code inside the container. The command inside the container shows how hyperparameters and the model directory are passed as command-line arguments. The training log shows that we achieved 91.7% validation accuracy, which is not great but acceptable. We trained for 469 seconds and are billed for 469 seconds. The GPU instance shuts down automatically, so you never overpay for training. I'll show you later how to use Spot instances to save money on training. Now, let's deploy the model. We create a unique endpoint name and call `deploy` to deploy the model to an m4.xlarge instance. After a few minutes, the API is ready, and you can see it in the SageMaker console. If you want to use your Java app or Node.js app to send predictions, this is the URL you would use. I'll keep using the SageMaker SDK to call the `predict` API, ensuring the prediction request has the TensorFlow format. I grab five random images from the validation dataset as a NumPy array and push them to the endpoint using the `predict` API. The predictions are printed out. We see a few mistakes, but that's expected with 92% accuracy. Some images are difficult even for the human eye. Once you're done, you can delete the endpoint to stop paying for it. You can deploy it again later if needed. This is the basic workflow for SageMaker: grab some data, put it in S3, create an estimator, specify the infrastructure, launch your training job, and deploy and predict using the HTTPS endpoint. How do you get started? You can experiment with SageMaker in the free tier, which allows limited training. You can also install the SDK on your local machine and work with Jupyter on your local machine. 
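The basic workflow just summarized might look like this in code. This is a sketch under assumptions: the argument names follow version 2 of the SageMaker Python SDK (the SDK available at the time of this session named some of them differently, for example `train_instance_type`), and the script name, role ARN, and version string are placeholders. The `train_and_deploy` function needs the SageMaker SDK and AWS credentials, so only the settings helper runs here.

```python
def estimator_settings(role_arn, epochs=10):
    """Collect the TensorFlow estimator arguments described in the demo."""
    return {
        "entry_point": "fmnist.py",        # hypothetical training script name
        "role": role_arn,                  # IAM role that SageMaker assumes
        "instance_count": 1,
        "instance_type": "ml.p3.2xlarge",  # one GPU instance for training
        "framework_version": "2.0.0",      # assumed TensorFlow 2.0 version string
        "py_version": "py3",
        "hyperparameters": {"epochs": epochs},
    }

def train_and_deploy(role_arn, training_uri, validation_uri, endpoint_name):
    """Launch training, then deploy to a real-time endpoint (needs AWS credentials)."""
    from sagemaker.tensorflow import TensorFlow  # lazy import: requires the SageMaker SDK
    estimator = TensorFlow(**estimator_settings(role_arn))
    # Channel names become SM_CHANNEL_TRAINING and SM_CHANNEL_VALIDATION in the container.
    estimator.fit({"training": training_uri, "validation": validation_uri})
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m4.xlarge",
        endpoint_name=endpoint_name,
    )
    # Call predictor.predict(...) to get predictions, and
    # predictor.delete_endpoint() when done to stop paying for the endpoint.
    return predictor

settings = estimator_settings("arn:aws:iam::123456789012:role/ExampleRole")
print(settings["instance_type"], settings["hyperparameters"])
```

Keeping the settings in a plain dictionary makes it easy to switch, for instance, to a cheaper instance type for experimentation without touching the training script.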
The `ml.aws` page has information on all our machine learning services, including descriptions, documentation, and customer stories. The SageMaker page has the SDK, Spark SDK, and a repo with hundreds of SageMaker notebooks showing you how to get started with built-in algorithms, frameworks, bringing your own, Autopilot, and more. I recommend reading the SageMaker documentation first and then diving into notebooks to familiarize yourself with the SDK. Lastly, check out my collection of notebooks and YouTube channel for more SageMaker and AWS machine learning content. That's it for this session. If you have questions, please ask them now. Thank you for listening, and stick with me for plenty more SageMaker content today. Thank you.

Tags

Amazon SageMaker, Machine Learning, Data Preparation, Model Deployment, AutoML

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.