SageMaker Fridays Season 4 Episode 5 Predicting with a music recommendation model

September 04, 2021
Broadcast on 03/09/2021. Join us for more episodes at https://pages.awscloud.com/SageMakerFridays ⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️

In this episode, we revisit the music recommendation example from S04E01, and we show you how to:

- run a bias analysis on the trained model,
- deploy a model on a real-time endpoint,
- predict new data with a real-time endpoint,
- configure model monitoring.

Notebook: https://github.com/aws/amazon-sagemaker-examples/tree/master/end_to_end/music_recommendation

Transcript

Hi everybody and welcome to this new episode of SageMaker Fridays. My name is Julien and I'm a principal developer advocate focusing on AI and machine learning. As usual, please meet my co-presenter. Hi everyone, my name is Sigolen and I'm a senior data scientist working with the AWS Machine Learning Solutions Lab. My role is to help customers get their ML projects on the right track in order to achieve business value as fast as possible.

All right, thanks again for being with us. So where are we in this season? This is the first episode of the automation sequence. Today we're going to start discussing ML operations, deploying endpoints, and monitoring models. We have four episodes dedicated to automation, a super important topic. After these, we'll dive into AutoML for the end of the season. So we still have quite a lot of ground to cover. We are revisiting the music recommendation example, which we covered a few weeks ago. Sigolen, can you tell us a little bit about that? For those who didn't watch the previous episode, go and watch it, okay? We're revisiting the example and zooming in on different things.

So, yes, exactly, Julien. This week we are working on a music recommendation problem. During the first episode, we focused on the data science and machine learning aspects, covering data preparation, training, and explainability. This week, we are revisiting this example from the operations angle, discussing deployment, monitoring, and more. We're picking up where we left off last time, which was at training, model debugging, and explainability. We'll summarize those steps. If you watched the episode, this is where we are: we're starting to deploy. If you didn't watch it, don't worry, we'll summarize.

So, let's run this example. It's available on GitHub, of course, and this is the repository. Take a screenshot. I'll give you a few seconds, and we'll try not to forget to show it again at the end of the presentation. Okay?
Let's start looking at this example. Here it is. You may remember this is organized in a series of notebooks. Last time we ran notebooks 1, 2A, 2B, 2C, and 3: data prep and training. We'll quickly summarize what we did last time. Sigolen, please walk us back through the data prep and training to refresh our memory on what this problem was from a machine learning perspective.

So, the idea is to frame this recommendation system as a regression problem. We use a dataset to recommend songs to users. Each track has features like energy, speechiness, tempo, and so on. We prepare the data, aggregate features, and determine user preferences. Thanks to Data Wrangler, we concatenate the track features with the aggregated features of the user. Finally, we use the review rating as the numerical value to predict. The basic idea is to compute aggregated stats for each user to figure out their preferences. These become features. We then join these user preferences with song features to create our dataset. We know the user preferences and song features, and we train on this to predict the review rating. The dataset is reasonably large, with about 140,000 different songs for 250 users, resulting in almost 500,000 user rating events.

We used Data Wrangler and SageMaker Processing to automate the execution. The notebooks 2A, 2B, and 2C cover the different datasets for user preferences, tracks, and ratings. Once we have the processed datasets, we push them to the Feature Store. What happens next? Athena? Yes, we use Amazon Athena to query the three feature groups: ratings, tracks, and user preferences. We join these three datasets, drop duplicates, and split them for training and validation. We train the model as usual and use SageMaker Clarify for explainability. We also used SageMaker Debugger to find issues in the training job. We trained a model, ran bias analysis, and now we want to deploy. Let's move on. I'll close those notebooks. We start from notebook number four.
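The Athena step described above can be sketched as a single join across the three feature groups. The table and column names below are hypothetical: in practice, Feature Store generates the actual Glue table names when each feature group is created, and the query is usually submitted through the feature group's Athena helper or the boto3 Athena client.

```python
# Sketch of the Athena query joining the three feature groups.
# All table and column names are hypothetical placeholders.
ratings_table = '"ratings-feature-group"'
tracks_table = '"tracks-feature-group"'
prefs_table = '"user-preferences-feature-group"'

# DISTINCT drops duplicate rows, matching the dedup step in the notebook.
query = f"""
SELECT DISTINCT r.rating, t.*, p.*
FROM {ratings_table} r
JOIN {tracks_table} t ON r.track_id = t.track_id
JOIN {prefs_table} p ON r.user_id = p.user_id
"""

print(query)
```

The query results land in S3, from where they can be loaded with pandas and split into training and validation sets.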
Once we have a trained model in SageMaker, we can use it in different ways. We can deploy it to a real-time endpoint, an HTTPS endpoint where we can send data for prediction. We can also use batch transform, where we run predictions in batch mode by placing data in S3 and running a batch job. Today, we'll focus on real-time prediction, but it's important to understand the difference between real-time and batch. Some use cases don't require real-time prediction, such as predicting 10 gigabytes of data once a week or once a month. In these cases, batch mode is more efficient. For real-time applications, like autonomous driving or connected services, real-time prediction is essential. For many business applications, batch prediction is sufficient.

First, we set up some parameters and technical details. Here, we're deploying a model trained in a different notebook. In a typical workflow, we would have a single notebook for data prep, training, and deployment, and the estimator object in the SageMaker SDK would be accessible throughout. However, since the estimator object was created in a different notebook, we need to create a model object. We pass the name of the container used for training (XGBoost) and the location of the model artifact in S3. This allows us to redeploy a model. Once we have the model object, we can call deploy. The parameters include the instance type, the number of instances, the model name, and the endpoint name. SageMaker manages the infrastructure, creating instances, load balancing, and an API. When the endpoint is in service, we have a prediction API.

To predict, we need to send an HTTPS request in the correct format to the endpoint. We can use any HTTPS library, but here we'll use the SageMaker SDK, which returns a predictor. If you have an existing endpoint, you can create a predictor object by passing the endpoint name. For user 11005, we use the Feature Store to retrieve user preferences.
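The "model object, then deploy" flow can be sketched with the SageMaker Python SDK. The S3 artifact path, endpoint name, instance type, and XGBoost version below are hypothetical placeholders, not the notebook's actual values; the SDK imports are deferred inside the function so the sketch can be read without a SageMaker environment.

```python
# Hypothetical names and locations -- substitute your own.
MODEL_DATA = "s3://my-bucket/music-recommendation/output/model.tar.gz"
ENDPOINT_NAME = "music-recommendation-endpoint"
INSTANCE_TYPE = "ml.m5.xlarge"

def deploy_existing_model():
    # Imported here so the constants above can be reused without
    # the SageMaker SDK installed.
    import sagemaker
    from sagemaker.model import Model

    session = sagemaker.Session()
    role = sagemaker.get_execution_role()

    # Retrieve the same XGBoost container image that was used for training.
    container = sagemaker.image_uris.retrieve(
        "xgboost", session.boto_region_name, version="1.2-1"
    )

    # Recreate a model object from the existing artifact.
    model = Model(
        image_uri=container,
        model_data=MODEL_DATA,
        role=role,
        sagemaker_session=session,
    )

    # SageMaker provisions the instance(s), load balancing, and HTTPS API.
    return model.deploy(
        initial_instance_count=1,
        instance_type=INSTANCE_TYPE,
        endpoint_name=ENDPOINT_NAME,
    )
```

For an endpoint that already exists, a predictor can be attached directly with `sagemaker.predictor.Predictor(endpoint_name=...)` instead of redeploying.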
The Feature Store has an offline store (S3) and an online store for low-latency feature retrieval. We use the GetRecord API to retrieve features for a given user. We then query the offline store using Athena to get a random sample of 1,000 songs. We join the song features with the user preferences to create the prediction request. We convert the data to CSV format and send it to the endpoint using the predict API. The result is a JSON response with 1,000 ratings. We can then use Pandas to find the top 50 songs and display them in a music app.

The vanilla workflow is to create an estimator, train it, deploy it, and use the predictor to make predictions. However, in real life, you might want to reuse a model or an endpoint. SageMaker provides flexibility for these scenarios. For advanced configurations, you can have different model variants for A/B testing, multi-model endpoints for cost optimization, and dynamic model loading.

In the previous episode, we ran pre-training bias analysis and explainability using SageMaker Clarify. SageMaker Clarify can also perform these analyses post-training. It automatically deploys a temporary endpoint for the analysis and then tears it down. This allows you to check for bias and feature importance after training without manually deploying the model.

Now, with a live endpoint, we can predict and recommend music. However, things can go wrong in production. Model monitoring is crucial to understand how the model performs with real data. SageMaker Model Monitor helps you capture and analyze real-life data. You can enable data capture on the endpoint, store the data in S3, and set up a baseline using the validation set. The monitoring schedule compares the captured data to the baseline and generates violation reports if there are discrepancies. This helps you identify and fix issues with the data. The next step is to automate the entire workflow.
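The request and response handling above can be sketched with plain Python. The feature values, track IDs, predicted ratings, and the JSON response shape are all hypothetical; the actual endpoint call (shown in a comment) goes through the predictor returned by deploy.

```python
import io
import csv
import json

def to_csv_payload(rows):
    """Serialize feature rows (lists of numbers) into the CSV body
    the XGBoost endpoint expects: one sample per line, no header."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()

def top_k(track_ids, predictions, k=50):
    """Rank tracks by predicted rating, highest first."""
    ranked = sorted(zip(track_ids, predictions), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Hypothetical rows: user-preference features joined with track features.
rows = [[0.81, 0.05, 118.0], [0.40, 0.30, 96.0], [0.95, 0.10, 140.0]]
payload = to_csv_payload(rows)

# In the notebook, the call would look like:
#   response = predictor.predict(payload)
# Here we parse a mock JSON response of predicted ratings instead.
response_body = json.dumps({"predictions": [3.2, 1.7, 4.5]})
predictions = json.loads(response_body)["predictions"]

print(top_k(["trackA", "trackB", "trackC"], k=2, predictions=predictions))
# → [('trackC', 4.5), ('trackA', 3.2)]
```

For the monitoring part, the SDK exposes `sagemaker.model_monitor.DataCaptureConfig`, which can be passed at deploy time to start logging request and response data to S3 for later comparison against the baseline.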
In the next episode, we'll walk through an end-to-end pipeline that links all the steps and builds a cool, visualizable pipeline. Thank you, Sigolen. Thanks everyone for watching, and we'll see you next week with another episode dedicated to pipelines. Bye-bye.

Tags

SageMaker, ML Operations, Real-time Prediction, Model Monitoring, Automation

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.