Amazon SageMaker Studio AutoML with Amazon SageMaker Autopilot, part 4

December 06, 2019
In this last video, the hyperparameter optimization step is done, and I show you how to visualize and compare model metrics.

⭐️⭐️⭐️ Don't forget to subscribe and to enable notifications ⭐️⭐️⭐️

Blog posts:
* https://aws.amazon.com/blogs/aws/amazon-sagemaker-studio-the-first-fully-integrated-development-environment-for-machine-learning/
* https://aws.amazon.com/blogs/aws/amazon-sagemaker-autopilot-fully-managed-automatic-machine-learning/

Follow me on:
- Medium: https://medium.com/@julsimon
- Twitter: https://twitter.com/julsimon

Transcript

Now the hyperparameter tuning step is complete, which means the Autopilot job is complete. We've run our 250 tuning jobs. Remember, these tuning jobs are based on the candidates visible in the candidate generation notebook.

Let's open the AutoML job in the trial component list, and here we see all the jobs associated with the Autopilot job. Some of them are training jobs, and some are batch transform and processing jobs, which basically prepare the datasets using different scripts. Let's focus on the training jobs. We see a lot of information here, including metrics and the objective metric. I can see that my top job has 91.78% accuracy, but I also see other metrics and all the metadata on that job, including all the hyperparameters that were tried out.

Zooming in on this high-performing job, I can see the different stages involved, such as pre-processing the data, training, and so on, as well as metrics, hyperparameters, artifacts, and settings for the job. You can zoom in and find any bit of information on your jobs.

What we might want to do next is compare jobs. Let's pick a few here and add a new chart. These are pretty short jobs, so we don't really have time series available, but we do have summary statistics. We can build a scatter plot comparing, for example, maximum validation accuracy to maximum training accuracy, and color the graph by trial name. I selected the top jobs, so they're all really close to one another, but if I keep zooming, I can see that one of them has a really high training accuracy. This is my top job when it comes to validation accuracy. It might be worth taking a look at these in more detail. These three seem close and interesting; you might want to investigate them further, do cross-validation on them, or do ensemble prediction. You can easily find a job by name and compare the metrics available here. There are quite a lot, so you could compare hyperparameters too. I'll show you more visualizations in another video; here I just want to give you a quick tour of the service.

The last step is to pick the top job and deploy it. Let's deploy it, call the endpoint marketing, and put it on an ml.m5 instance. Let's also save prediction requests and prediction responses. This capability is part of another service called Amazon SageMaker Model Monitor, and it will capture the data sent to the endpoint. You can then configure real-time data analysis against a baseline you prepared, which will warn you about problems such as data drift or missing features. We just provide an S3 path for the captured data and click on Deploy. The model gets deployed, and you can use it just like any other model. I'll show you that step in another video, because I want to discuss Model Monitor, which is another really cool service; here, you can enable it super simply with those two clicks.

That's the end of the SageMaker Autopilot demo for now. I hope you learned a few things. Please feel free to ask questions in the comments or ping me on Twitter. I also added links to the blog posts in the video description, so check those out. All questions are welcome, and I hope this was informative. All right, see you later!
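For readers who prefer code to clicks, here is a minimal sketch of how you could retrieve the same candidate leaderboard programmatically with boto3. The job name is a placeholder; the API calls are the public SageMaker AutoML APIs.

```python
import boto3

sm = boto3.client("sagemaker")

# List the candidates explored by the Autopilot job, sorted by the
# objective metric. 'my-autopilot-job' is a placeholder: use your job name.
candidates = sm.list_candidates_for_auto_ml_job(
    AutoMLJobName="my-autopilot-job",
    SortBy="FinalObjectiveMetricValue",
    SortOrder="Descending",
    MaxResults=10,
)["Candidates"]

# Print each candidate's name and its final objective metric value.
for c in candidates:
    metric = c["FinalAutoMLJobObjectiveMetric"]
    print(c["CandidateName"], metric["MetricName"], metric["Value"])
```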
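Similarly, the scatter plot built in Studio can be reproduced in a notebook with the SageMaker Experiments analytics helper. This is a sketch under assumptions: the experiment name is a placeholder, and the exact metric column names (such as "validation:accuracy - Max") depend on the metrics your candidates actually emit, so inspect the DataFrame columns first.

```python
import matplotlib.pyplot as plt
from sagemaker.analytics import ExperimentAnalytics

# Pull all trial components for the experiment backing the Autopilot job
# into a pandas DataFrame. The experiment name is a placeholder.
analytics = ExperimentAnalytics(experiment_name="my-autopilot-job-experiment")
df = analytics.dataframe()

# Summary statistics are exposed as '<metric> - Max', '<metric> - Min', etc.
# Adjust these column names to match your own metrics.
x = "train:accuracy - Max"
y = "validation:accuracy - Max"
subset = df.dropna(subset=[x, y])

# Scatter plot of maximum training accuracy vs. maximum validation accuracy.
plt.scatter(subset[x], subset[y])
plt.xlabel("training accuracy (max)")
plt.ylabel("validation accuracy (max)")
plt.title("Training vs. validation accuracy per trial")
plt.show()
```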
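The deployment step shown in Studio can also be scripted. Here is a hedged sketch using boto3: the job name, model and endpoint config names, role ARN, instance size, and S3 capture path are all placeholders to adapt to your own account.

```python
import boto3

sm = boto3.client("sagemaker")

# Fetch the best candidate found by the Autopilot job (placeholder name).
best = sm.describe_auto_ml_job(AutoMLJobName="my-autopilot-job")["BestCandidate"]

# Create a model from the candidate's inference containers.
# The role ARN below is a placeholder.
role_arn = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"
sm.create_model(
    ModelName="marketing-model",
    Containers=best["InferenceContainers"],
    ExecutionRoleArn=role_arn,
)

# Endpoint config on an ml.m5 instance, with request/response capture
# enabled so Model Monitor can analyze live traffic later.
sm.create_endpoint_config(
    EndpointConfigName="marketing-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "marketing-model",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
    DataCaptureConfig={
        "EnableCapture": True,
        "InitialSamplingPercentage": 100,
        "DestinationS3Uri": "s3://my-bucket/capture/",
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    },
)

# Deploy the endpoint, named 'marketing' as in the demo.
sm.create_endpoint(EndpointName="marketing", EndpointConfigName="marketing-config")
```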
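Finally, the baseline mentioned for Model Monitor can be produced with the SageMaker Python SDK. A minimal sketch, assuming a CSV training dataset and placeholder S3 paths:

```python
from sagemaker import get_execution_role
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Suggest a baseline (statistics and constraints) from the training dataset.
# Captured endpoint traffic can later be compared against this baseline.
monitor = DefaultModelMonitor(
    role=get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/training/train.csv",  # placeholder path
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline/",              # placeholder path
)
```

From there, monitor.create_monitoring_schedule() lets you check the captured endpoint traffic against the suggested statistics and constraints on a schedule, which is what surfaces issues like data drift or missing features.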

Tags

SageMaker Autopilot, Hyperparameter Tuning, Model Deployment, Machine Learning Metrics, Model Monitoring