Azure ML deploy Hugging Face models in minutes

October 08, 2023
In this video, I show you how to deploy Hugging Face models in one click on Azure, thanks to the model catalog in Azure ML Studio. Then, I run a small Python example to predict with the model.

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️

To get started, you simply need to navigate to the Azure ML Studio website and open the model catalog. Then, you can click on a model to select it. This will initiate the setup process, which takes care of all the required infrastructure for you. Once the setup is complete, Azure ML Studio provides a sample program and you can start testing the model immediately! If you want to experiment with the latest state-of-the-art models, Azure ML Studio provides a hassle-free way to do so. Try it out and explore the possibilities of cutting-edge AI models with just one click!

Azure ML: https://azure.microsoft.com/en-us/products/machine-learning/

Follow me on Medium at https://julsimon.medium.com or Substack at https://julsimon.substack.com.

Transcript

Hi, everybody. This is Julien from Arcee. A few days ago, I posted a video where I showed you how to deploy Hugging Face models in one click on SageMaker Jumpstart. Of course, I shared this video on LinkedIn and elsewhere, and someone reached out and said, "Hey, can you show us the same thing on Azure Machine Learning?" So, challenge accepted. Let's get to it.

My starting point is Azure Machine Learning Studio, as you would expect. If we select the first option, called the model catalog, we'll see a catalog of models from different providers. If we filter on Hugging Face, we'll see the Hugging Face models. I'm not quite sure how many there are; I couldn't find a way to count them. It's a good mix, I would say, of LLMs, including Falcon and a few more, as well as fine-tuned versions of more traditional NLP models. We have the French BERT, CamemBERT, FinBERT, and so on. Quite a few. You can also filter on task types here, as you can see.

Let's say we're interested in Falcon 7B. Click on it. Some information is provided, and there's a link to the model card on the Hugging Face Hub if you'd like to know more. Then you literally click on deploy, and hopefully you have more quota in your Azure account than I do. I'm using my personal account here, which is extremely limited, and I've abandoned all hope of getting GPU capacity. That's a different story. Anyway, next you would set the endpoint name and deployment name, click on deploy, wait a few minutes, and of course you have an endpoint.

I've already done this with a smaller model, unfortunately, but the workflow is really the same. We can see endpoints here. Let's click on that. I can see my endpoint: a RoBERTa-based model fine-tuned for sentiment analysis. It took just a few minutes to deploy and is running on a small CPU instance. Once it's in service, what can we do? We can test it, I guess. Why don't we try that?
We have to use the basic inference format. Let's test it. All right, that worked. So we can test right there in the Studio. Pretty cool.

If we go to the Consume tab, we see the API keys we can use to invoke the endpoint programmatically. Of course, we also have the URL, and we have a code sample. Let's grab this code snippet from the console. It's missing the API key, so let's enter the key. Obviously, it also needs some input data. Can I still reuse my earlier input? No, it's gone. No worries, I'll type it again. So let's just add some data here. The rest should be good to go: we have the URL, we send a request, and we parse the response. Here goes nothing. Yes, all right, that worked. We successfully invoked the endpoint. I'm not sure whether that counts as positive sentiment or negative sentiment, but it doesn't matter; I didn't provide a very good example. That's nice. We also get code samples in C#, for those of you who still do that, and in R, if you're so inclined.

We have monitoring, and we can see some latency numbers. You can see I ran some requests. Good to see that. Do we have logs? What do we have in the logs? Yeah, we have the health check, and we probably see some queries as well. There you go. So you can definitely deploy models from the catalog.

One last thing: how do we delete this? Because I don't want to pay forever. I guess clicking on delete here is going to do it. Yep. I'm not quite sure if I need to delete anything else in Machine Learning Studio, but I'll figure it out.

So there you go. My friend on LinkedIn who asked for this video: there you go. Hope this was useful. I have a few more in store for you in the next few days. We'll see, right? Till then, keep rocking.
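For reference, here is a minimal sketch of what that invocation code looks like, using only the Python standard library. The endpoint URL and API key are placeholders you would copy from your endpoint's Consume tab, and the `{"inputs": ...}` payload is an assumption based on the standard Hugging Face text-classification request format; your endpoint's sample code shows the exact shape it expects.

```python
import json
import urllib.request

# Placeholders: copy the real values from the endpoint's "Consume" tab.
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
API_KEY = "<your-api-key>"


def build_request(url: str, api_key: str, text: str) -> urllib.request.Request:
    """Build the HTTP request: JSON body plus key-based bearer authentication."""
    body = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers)


def predict(url: str, api_key: str, text: str):
    """Send the request and parse the JSON response from the endpoint."""
    req = build_request(url, api_key, text)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    print(predict(ENDPOINT_URL, API_KEY, "This model catalog is really convenient!"))
```

This is essentially what the generated snippet in the console does for you; the only parts you have to fill in yourself are the key and the input data.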

Tags

Azure Machine Learning, Hugging Face Models, Model Deployment, SageMaker Jumpstart, Sentiment Analysis