SageMaker Fridays Season 4 Episode 8: Automating an end-to-end workflow for retail recommendation

September 27, 2021
Broadcast on 24/09/2021. Join us for more episodes at https://pages.awscloud.com/SageMakerFridays ⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️ In this episode, we revisit the retail recommendation example from Episode 4, and we show you how to automate it end to end with SageMaker Data Wrangler and SageMaker Pipelines. *** Notebook https://github.com/aws/amazon-sagemaker-examples/tree/master/use-cases/retail_recommend

Transcript

Hi everybody and welcome to this new episode of SageMaker Fridays Season 4. My name is Julien and I'm a Principal Developer Advocate focusing on AI and Machine Learning. As usual, please meet my co-presenter. Hi everyone, my name is Ségolène, Senior Data Scientist working with the AWS Machine Learning Solutions Lab. My role is to help customers get their ML projects on the right track. Thank you again for your help.

So where are we in this season? This is episode eight. Yes, it's the last one in our automation series. And this week we are revisiting episode four. So please remind us what that episode was about and what we are adding today. In episode four, we worked on a recommendation use case for retail applications. Starting from an online retail dataset, we trained a model to predict the quantity of items a customer is likely to buy. We covered the data science aspects last time, and now we are going to see how to automate deployment and the rest of the workflow with pipelines. We keep exploring and showing you different use cases, different examples, and different flavors of SageMaker Pipelines. Let me show you the notebooks we are using today. You can run all this yourself, of course. We'll see that again at the end of the episode. This is what we've done so far: we started from that dataset, trained the model, and now we're going to look at deployment and automation.

Let's jump straight into our notebooks. This is what we did last time around. So, tell us a little bit about the dataset, just quickly, so we remember what we did. We had a dataset containing all the transactions between 2010 and 2011 for a UK-based online retail store: about 500,000 transactions, with information on customers and transactions. It's mostly a B2B dataset, which is why you see large quantities of items. This is the kind of dataset where you might see, for example, six white metal lanterns or six red woolly hats.
It's B2B, so you see large numbers and lots of transactions from the same customers. What we're trying to do here is predict the items a certain customer would be interested in, based on the number of items purchased. It's a recommendation problem, but we're trying to predict the number of items. We covered this in detail in episode four, so go and watch episode four if you need more details. Today, we will discuss how to automate this process.

We have some cleaning code: removing negative quantities, some pandas work, one-hot encoding, and factorization. We discussed the sparsity problem in episode four. The dataset is sparse because you have a large number of different items and customers. If you build a matrix with customers as rows and items as columns, and put the quantity in the cells, most cells will be empty. Storing the data in that format is inefficient. If we build that matrix, it's 99.9% sparse. So, we use a sparse matrix object, which is a compact representation of sparse data, and we store it in protobuf, a very efficient serialization format. We end up with files for the training set and the test set stored as protobuf-encoded sparse matrices.

We trained the model using the factorization machines algorithm, which we discussed in detail last time. We created our estimator and set hyperparameters, using the regression mode because we're predicting quantities. We ran the training job and got a model. Now, we're moving on to deployment, automation, and looking at executions.

Deployment here is simple: we call the deploy API. In previous examples, we trained in one notebook and deployed in another, using different APIs. Here, we use the vanilla workflow: create the estimator, train, and deploy. We deploy on an ml.m4.xlarge instance, and SageMaker provisions the instance, creating an HTTPS API we can invoke. We specify the input and output format, using a custom serializer because the factorization machines algorithm expects data in a particular JSON format.
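To make the custom serializer idea concrete: SageMaker's factorization machines endpoint accepts a sparse JSON layout, where each instance lists only the non-zero feature indices and values. The helper below is an illustrative sketch of that conversion (`fm_json_payload` is a hypothetical name, not the notebook's actual serializer class), assuming the documented `keys`/`shape`/`values` request format.

```python
import json

def fm_json_payload(vectors):
    """Serialize dense feature vectors into the sparse JSON layout that a
    SageMaker factorization machines endpoint expects:
    {"instances": [{"data": {"features": {"keys": [...], "shape": [...], "values": [...]}}}]}
    Only non-zero entries are sent, which keeps sparse payloads small."""
    instances = []
    for vec in vectors:
        keys = [i for i, v in enumerate(vec) if v != 0]
        values = [float(vec[i]) for i in keys]
        instances.append({
            "data": {
                "features": {"keys": keys, "shape": [len(vec)], "values": values}
            }
        })
    return json.dumps({"instances": instances})

# One mostly-zero vector of dimension 6 with two non-zero features
payload = fm_json_payload([[0, 0, 1, 0, 2.5, 0]])
```

In the SageMaker Python SDK, this logic would typically live in a serializer object attached to the predictor, so every `predict` call formats the request the same way.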
For the output format, we use plain JSON. Predicting is straightforward: we call the predict API, but the prediction data needs to be in the same format as the training data. This means reprocessing it with the same transformations as the training set. This is fine for development and testing, but it has issues. We are duplicating code, which can lead to bugs or inconsistencies. It's also not efficient, because it's Python code, adding latency.

A better way is to use inference pipelines. An inference pipeline is a sequence of models deployed as a single unit on a single endpoint; you can have up to five models in the pipeline. For example, you can train a feature engineering model using scikit-learn or Spark, and then chain it with the factorization machines model. Incoming data is automatically processed through the sequence of models, and you get your output. This is a production-grade solution, especially at scale, and it avoids code duplication.

Now, let's talk about automating data processing. We used Python code for processing, which we duplicated for prediction. One way to automate this is to move the processing code to a script, which avoids versioning issues. Another option is to use SageMaker Data Wrangler, where you can apply transforms interactively and export the processing flow as Python code or as a SageMaker Pipelines notebook. The pipeline notebook automatically defines the processing step, making it easy to get started with SageMaker Pipelines. We chose to copy-paste our notebook code into a processing script for the processing step; we could also have used the Python code from Data Wrangler or the pipeline export.

The processing step involves uploading artifacts to S3, defining compute resources, and specifying inputs and outputs. The training step reuses the estimator and takes its inputs from the processing step. We use the model registry to register the model, specifying the content type, response format, and approval status.
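The code-duplication problem discussed above — the same cleaning logic living in both the training notebook and the prediction path — is easy to see in miniature. A minimal sketch (hypothetical helper, not the episode's actual processing script): keep the transformation in one function, and import it from both the processing step and the inference code.

```python
def clean_transactions(rows):
    """Shared preprocessing used identically at training and prediction time.
    Drops rows with non-positive quantities (returns, cancellations) and
    normalizes the item description, mirroring the cleaning steps described
    in the episode. `rows` is a list of (customer_id, item, quantity) tuples."""
    return [
        (customer, item.strip().lower(), qty)
        for customer, item, qty in rows
        if qty > 0
    ]

raw = [
    ("C001", "  WHITE METAL LANTERN ", 6),
    ("C002", "RED WOOLLY HAT", -2),  # a return: filtered out
]
cleaned = clean_transactions(raw)
```

Moving logic like this into a standalone script is exactly what makes it reusable by a SageMaker processing step, while an inference pipeline packages the equivalent transformation as a model in front of the factorization machines model.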
We also have a manual deployment step for testing, but the more reasonable way is to register the model and let another team or a CI/CD toolchain run checks and deploy. We create the pipeline, set parameters, and run it. We can see the pipeline execution in the SageMaker console, including logs and lineage information.

The lineage information is automatically built by SageMaker Pipelines, showing the order of steps and the artifacts used at each step. This is useful for traceability, especially in compliance scenarios. We can retrieve the lineage for any execution, making it easy to understand the origin of a model.

To wrap up, we revisited our retail recommendation example, discussed deployment and feature engineering at prediction time, and explored automation with SageMaker Pipelines. We built an end-to-end pipeline, reused code from previous notebooks, and looked at lineage and traceability. Starting next week, we'll dive into AutoML with SageMaker Autopilot and AutoGluon. It should be fun. Thanks, Ségolène. Thanks, everyone. See you next week to discuss AutoML. Bye-bye.
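As an addendum to the lineage discussion in the transcript: the step list for any pipeline execution can be retrieved programmatically via the SageMaker `ListPipelineExecutionSteps` API and flattened into a quick overview. The helper and the sample response below are illustrative (the step names are made up, and in practice the response comes from boto3's `sagemaker` client against a live execution ARN).

```python
def summarize_steps(response):
    """Flatten a ListPipelineExecutionSteps-style response into
    (step name, status) pairs for a quick execution overview."""
    return [
        (step["StepName"], step["StepStatus"])
        for step in response["PipelineExecutionSteps"]
    ]

# Illustrative response, mirroring the shape boto3 returns; in practice:
#   sm = boto3.client("sagemaker")
#   resp = sm.list_pipeline_execution_steps(PipelineExecutionArn=arn)
sample = {
    "PipelineExecutionSteps": [
        {"StepName": "ProcessData", "StepStatus": "Succeeded"},
        {"StepName": "TrainModel", "StepStatus": "Succeeded"},
        {"StepName": "RegisterModel", "StepStatus": "Executing"},
    ]
}
summary = summarize_steps(sample)
```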

Tags

SageMaker, Machine Learning, Automation, Data Processing, Inference Pipelines

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.