Hi everybody, this is Julien from Arcee. In this video, I would like to show you how to use MLflow to deploy machine learning models on Amazon SageMaker. MLflow is an open-source library that lets you track and organize your machine learning projects, and it can also deploy models to different targets. MLflow supports all kinds of frameworks and algorithms, which is pretty cool. I'm going to use XGBoost here. Let's get to work.
First, I need a dataset, of course, and I'm going to use one I've used before for direct marketing. It's a very simple dataset that you can easily download from the web. It has about 41,000 samples, 20 features, and a label telling you whether a certain customer has accepted a marketing offer or not. So it's an easy one, and we're going to use XGBoost to build a binary classification model. Here's my code to load it. Basically, just grab the file from the web, read it as a pandas DataFrame, do very basic processing here, just one-hot encode categorical variables, and then split the DataFrame into features and labels. After that, split the dataset for training and testing.
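For reference, here's a minimal sketch of what that loading code can look like. The URL is a placeholder, and the semicolon separator and the `y_no`/`y_yes` label columns assume the UCI bank marketing CSV, so adjust them for your copy of the dataset.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

DATASET_URL = "https://example.com/bank-additional-full.csv"  # placeholder URL

def load_dataset(url, test_size=0.2):
    # Grab the file from the web and read it as a pandas DataFrame
    # (the direct marketing CSV uses semicolons as separators)
    data = pd.read_csv(url, sep=";")
    # One-hot encode categorical variables
    data = pd.get_dummies(data)
    # Split the DataFrame into features and label
    # (the 'y' column becomes 'y_no'/'y_yes' after one-hot encoding)
    y = data["y_yes"]
    X = data.drop(["y_no", "y_yes"], axis=1)
    # Split the dataset for training and testing
    return train_test_split(X, y, test_size=test_size)
```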
Those lines are MLflow APIs that let you log all kinds of information, pretty much anything you want, about your experiment. We'll see how this is useful when we're training the model. As you can see, I'm logging the path to the dataset, its shape, some of the splitting parameters, and I'm flagging the fact that I used one-hot encoding on categorical variables. So that's just my loading function here. Nothing weird.
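Those logging calls can look like the snippet below; it assumes an active MLflow run and the `url`, `data`, and `test_size` variables from the loader above, and the parameter names are illustrative.

```python
import mlflow

# Log data-preparation details on the current run
mlflow.log_param("dataset_url", url)
mlflow.log_param("dataset_shape", str(data.shape))
mlflow.log_param("test_size", test_size)
mlflow.log_param("one_hot_encoding", True)
```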
Now, what about training? As I said, I'm going to use XGBoost. The first thing I'm going to do is create an experiment in MLflow. An experiment is just a project where we're going to store information on as many training jobs as we want. Before I do that, I need to start MLflow; the easiest way is to launch the tracking UI with `mlflow ui` and open it in a browser window. That's a local UI, but you can also point at a remote tracking server if you want to. Now let's look at the code. The code is very straightforward. Start a run inside that experiment, load the dataset, which is the piece of code we just saw, and then build an XGBoost classifier using the AUC metric. Train, score, print out the metric, log that metric to MLflow, and then log the model to MLflow so that we can deploy it later. Finally, end the run.
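Putting that together, the training code can look something like this. The experiment name is illustrative, it reuses the `load_dataset` sketch from above, and depending on your MLflow version you may prefer `mlflow.sklearn.log_model` over `mlflow.xgboost.log_model` for a scikit-learn-style XGBoost classifier.

```python
import mlflow
import mlflow.xgboost
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

# Create the experiment if it doesn't exist yet, and make it the active one
mlflow.set_experiment("direct-marketing-xgboost")

# Start a run inside that experiment; the 'with' block ends the run automatically
with mlflow.start_run():
    X_train, X_test, y_train, y_test = load_dataset(DATASET_URL)
    # Build and train an XGBoost binary classifier
    model = XGBClassifier(eval_metric="auc")
    model.fit(X_train, y_train)
    # Score on the test set, then print and log the AUC metric
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print("AUC:", auc)
    mlflow.log_metric("auc", auc)
    # Log the trained model so we can deploy it later
    mlflow.xgboost.log_model(model, artifact_path="model")
```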
Let's just run this code. Creating a new experiment, loading the dataset, and training; it should take just a few seconds. Okay, and we see the AUC is 0.917. Now, if we go to the UI and reload, we can see our training job here. I can see the parameters I stored when loading the dataset, and I can see my metric. It's not really graphed here because I logged it only once, but if you were logging a metric after each epoch, you would see curves. Again, it's a very simple job, but everything is there. If I run it again, I'll see a different run, and I could try different hyperparameters and keep track of all that inside the UI. So that's pretty neat.
Now we have a model, so let's see how we can deploy it. We have two options. One option is to deploy locally, which is one of the cool features in MLflow, and the other is to deploy to SageMaker. Let's do both, starting with local deployment. If we take a look at the MLflow documentation, we can see SageMaker APIs covering all operations, including local deployment, and there's also a CLI. You can use whichever you like best. I'm going to use the CLI, and the command I want here is `mlflow sagemaker run-local`. What it needs is basically the model path and the local port, so I defined some environment variables to make life easier. I'm going to use the path to the model I just trained and deploy locally on port 8888. Then I just need to run this command. Okay, and we see that the model is being deployed locally, downloading the packages required by my training script. All right, so let's just pause the video for a second and I'll be back.
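To recap, that command looks like this; the model path is illustrative (it points at the model logged by the run above), and the flag spellings follow the MLflow 1.x CLI, so check `mlflow sagemaker run-local --help` for your version.

```bash
# Environment variables for the model path and the local port
export MODEL_PATH=mlruns/1/<run-id>/artifacts/model
export LOCAL_PORT=8888

# Serve the model locally in a SageMaker-compatible container
mlflow sagemaker run-local -m $MODEL_PATH -p $LOCAL_PORT
```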
Okay, so the model has been deployed locally. Now the only thing I have to do is load my test set, grab the first 10 samples, and use the `requests` library to HTTP POST to that local model. This is really standard Python code here, just posting in JSON format and printing out results. Let's run this. And I can see the 10 predictions for this model. Of course, these are probabilities between zero and one, because we built a binary classification model. So that's local deployment.
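For reference, that prediction code can look roughly like this; it reuses the loader sketched earlier, and the exact payload format expected by the scoring server depends on your MLflow version (older versions accept pandas "split"-oriented JSON).

```python
import requests

# Load the test set and grab the first 10 samples
_, X_test, _, y_test = load_dataset(DATASET_URL)
samples = X_test[:10]

# HTTP POST the samples to the local model in JSON format
response = requests.post(
    "http://localhost:8888/invocations",
    data=samples.to_json(orient="split"),
    headers={"Content-Type": "application/json; format=pandas-split"},
)
print(response.json())  # ten probabilities between 0 and 1
```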
Now, what about deploying to a SageMaker endpoint? As you know, SageMaker is based on Docker containers: we need a prediction container that gets pushed to Amazon ECR, the Docker registry service. SageMaker uses that container to create the endpoint on a fully managed instance and load the model inside it. Building that container is not always easy, but fortunately, MLflow provides a super simple way to do it. Again, you can use an API or the command line; I'm going to use the command line.
How do we build that container? Well, it's actually super simple. The only thing you need to do is run `mlflow sagemaker build-and-push-container`. This builds a Python environment inside a container and pushes it to one of your ECR repositories, and you can then use that image to deploy the endpoint. Nothing more complicated. I've run this before, so it's going to be super fast here; if you run it for the first time, it will take a few minutes, and you only need to do it once. You don't need to do this every time, because it's a vanilla container: your specific libraries and requirements are installed when you deploy the endpoint. So don't run this all the time; it's not needed. I should be able to see the repo, so let's describe it using `describe-images`. I can see MLflow created the `mlflow-pyfunc` repository and pushed the image into it. So again, you just need to do this once, and then you're good to go.
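For reference, here's what those two commands look like; as far as I know, `mlflow-pyfunc` is the default name of the repository the container is pushed to, so double-check yours in the ECR console.

```bash
# Build the generic MLflow serving container and push it to ECR (one-time step)
mlflow sagemaker build-and-push-container

# Verify that the image landed in the repository
aws ecr describe-images --repository-name mlflow-pyfunc
```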
Now we can move on to deploying, and deploying is a one-liner: `mlflow sagemaker deploy` with the app name (which is really the endpoint name), the model path, the IAM execution role (you can pass the ARN of your SageMaker role), and of course the region you want to deploy to. I can just run this now. Here we go; this is going to take a few minutes, so let's pause the video, and I'll be back.
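Here's a sketch of that one-liner; the app name is illustrative, the role ARN and region are placeholders you need to fill in, and the flag names follow the MLflow 1.x CLI, so check `mlflow sagemaker deploy --help` for your version.

```bash
# Deploy the logged model to a SageMaker endpoint
mlflow sagemaker deploy \
  --app-name direct-marketing \
  --model-uri $MODEL_PATH \
  --execution-role-arn arn:aws:iam::<account-id>:role/<your-sagemaker-role> \
  --region-name eu-west-1
# Optional: pass --instance-type and --instance-count to size the endpoint
```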
The endpoint is up, so let's check the SageMaker console. Yep, I can see the endpoint is here and seems to be okay. By default, this deploys to an ml.m4 instance; I didn't specify an instance type when I deployed the endpoint, but of course you can pass extra parameters like instance type, instance count, and so on. If we look at the log, we see packages being installed, just like I said: this is a generic container, and it automatically installs, using Conda, the packages required by the training script. Then we see Gunicorn starting, the health check passing, and so on. It's all good.
Now let's predict. In order to do this, we're back to SageMaker business as usual. Grab a Boto3 client, describe the endpoint, check it's fine, load the test set once again, grab the first 10 samples, then send them to the endpoint in a single request, and print results. Let's run this bit of code, and we should see predictions. Yep, so we can see the endpoint in service, and once again, I get my 10 predictions. This is really the exact same code you would use on SageMaker because it is a SageMaker endpoint.
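For reference, that code can look like this; the endpoint name matches the app name used at deploy time, and the content type mirrors what worked locally (it may differ across MLflow versions).

```python
import json
import boto3

ENDPOINT_NAME = "direct-marketing"  # the app name used at deploy time

# Describe the endpoint and check it's in service
sm = boto3.client("sagemaker")
status = sm.describe_endpoint(EndpointName=ENDPOINT_NAME)["EndpointStatus"]
print("Endpoint status:", status)

# Load the test set, grab the first 10 samples, and send them in a single request
_, X_test, _, y_test = load_dataset(DATASET_URL)
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType="application/json; format=pandas-split",
    Body=X_test[:10].to_json(orient="split"),
)
print(json.loads(response["Body"].read()))
```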
So there you go. This is how you deploy models using MLflow. What I like about it is that you can work locally. As you saw, I used PyCharm and Docker and worked strictly on my laptop here: debugging my script, checking the different experiments in the MLflow UI, running and testing locally. And then, when I'm happy with the model, I can deploy it to managed infrastructure on SageMaker with minimal fuss. The container is taken care of automatically; no matter which framework I use, I end up with a proper container on SageMaker. Creating the endpoint is super simple. Of course, you can do 100% of this with APIs. I used a mix of APIs and CLI, but if you want to automate these workflows completely, stick to the Python APIs and you can automate all of that very easily.
Well, that's it for today. Hope you liked it, and I'll see you soon with another video. Bye-bye.