Hi, everybody. This is Julien from Arcee. In a previous video, we used AutoNLP to fine-tune models on a movie review dataset and build a sentiment analysis application. After a little while, we got a pretty good model with 94.08% accuracy. We took a look at this model on the hub, and I quickly showed you that we could deploy it in different ways. Of course, we could use it with the Transformers library and run it in a notebook. But let's go one step further and build a prototype web app that uses this model. This way, we can show business stakeholders or clients how the model works, and get feedback to iterate on. A web application gives a much better sense of how the model behaves and what it could look like in a product than a notebook, which doesn't tell the full story.
So, how do we do this? First, I'm going to go to my Hugging Face account and create a new space. A space is a web app implemented with Streamlit or Gradio, displaying a user interface where we can enter data and predict with our model. It takes very little code, which is great because I'm not good at writing front-end code. I'm hoping this is easy and friendly enough for me to use. I'm going to name this space "imdb emospace" and use Gradio, which I think is the simplest option. I can make this space either public or private; I'll keep it public and create the space.
We haven't done much so far; we've just created the space, which gives us an empty Git repository on the hub. Let's clone this repo here. There's just a README file, which doesn't say much, so we need to add our code to the repo. There's a placeholder app file here, and we can commit and push to see what happens. We can add dependencies with a requirements.txt file. In the interest of time, I've already written the code, so let me show you what it looks like. As promised, it's really simple.
We import PyTorch, NumPy, Gradio, and the Transformers library. Then we grab the tokenizer and the model from the hub. We write a small function to predict. This function will receive a movie review, tokenize it with the pre-trained tokenizer, predict using our model, and output raw predictions. I want to show probability-style results, so I apply the softmax function to make the two outputs for the positive and negative classes add up to 1, making them look like probabilities. After applying softmax, I convert the torch tensor to a NumPy array. Now I have a NumPy array with two numbers: the first for the negative class and the second for the positive class. I want the larger one, so I use NumPy's argmax to find the index of the largest number in the array and grab that score. Then I return a text string that says the review is XX% positive or negative, depending on the prediction.
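To make the post-processing concrete, here is a minimal sketch of the softmax-plus-argmax logic described above. The tokenizer and model calls are omitted; the `logits` argument stands in for the raw two-class output the fine-tuned classifier would produce, and `format_prediction` is a hypothetical helper name, not the exact code from the video.

```python
import numpy as np

def format_prediction(logits):
    """Turn raw logits [negative, positive] into a readable string.

    In the real app, `logits` would come from running the tokenized
    review through the fine-tuned model.
    """
    # Softmax: exponentiate and normalize so the two scores sum to 1,
    # making them look like probabilities.
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    probs = exp / exp.sum()
    # argmax finds the index of the larger score: 0 = negative, 1 = positive.
    best = int(np.argmax(probs))
    label = ["negative", "positive"][best]
    return f"This review is {100 * probs[best]:.2f}% {label}"
```

For example, `format_prediction(np.array([-2.0, 3.0]))` reports a strongly positive review, while `format_prediction(np.array([4.0, -1.0]))` reports a strongly negative one.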
This code is simple and can be tested on your laptop, completely independent of the UI. The only UI code needed is for creating an interface with two text boxes: one for input and one for output. The input will be the movie review, and clicking a button will call the predict function, displaying the prediction in the output window. That's it. Is this even UI code? Probably not, but I'm happy with it.
Let me copy this file to my repo. I also have a requirements file with Torch and Transformers. In this repo, I have these two new files. I'll add them, commit, and push. After maybe 30 seconds, the page is up and running. Let's try some reviews, starting with a negative one. I input the review, click submit, and the output window displays the result. Pretty cool. Let's try a positive one. If you think Jar Jar is the most amazing character in the Star Wars universe, it's okay. It's like the pineapple on pizza debate; it's fine if you believe it. This is a positive review, and we see the result.
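For reference, the requirements file mentioned here only needs the inference dependencies, since a Gradio space already provides Gradio itself. A sketch of what it can contain:

```text
torch
transformers
```

The space installs these automatically on each push before starting the app.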
We see our files here, and this is how much code we wrote. The predict function is something you'd already have if you were experimenting with your model, and the UI bit is just two lines. This is a cool way to build a quick web page to show stakeholders or customers, or for internal model testing. If you go to the huggingface.co spaces page, you'll find lots of public spaces built by the community. There are many cool examples, and I'm sure every single one looks better than mine, but I hope you'll be the ones writing the great apps.
Remember how we started. We used AutoNLP, which is as simple as filling in a few bits of information, or using the AutoNLP API for automation. We trained several models, took the best one, and used it in a notebook and, more importantly, in Spaces to build a small web page to test it. The longest part was the AutoNLP training, which took about two hours, but you can do other tasks in the meantime, like writing the Spaces app. In just a couple of hours, you can go from dataset to a web page you can show. This is a very productive way to do machine learning and NLP. Give it a try, leave comments, ask questions, and reach out on Twitter or anywhere you can find me. If your company wants to know more about AutoNLP for production workloads, get in touch. That's it for today. I'll see you soon with more videos. Until then, have a good time and keep learning. Bye-bye.
Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.
With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.
Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.
Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.