Hugging Face: the story so far

February 13, 2024
When I meet with customers or speak at conferences, some questions about Hugging Face keep coming up. What does the name mean? Where does the logo come from? When did the company start? What kind of problems is it solving? How did it become so successful in such a short period of time? These are all good questions! In this video, I answer all of them (and more) in one go by taking you through the history of Hugging Face from the very early days until now, highlighting what I think are the critical steps that made the company successful. If you're new to Hugging Face, or if you're not a technical person, this is a beginner-friendly introduction, so don't be afraid to dive in. If you're a battle-hardened Transformer grunt, you may shed a tear or two on that 0.1.1 code sample :*) The past is glorious, and the future is open! ⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at https://julsimon.medium.com or Substack at https://julsimon.substack.com. ⭐️⭐️⭐️

Transcript

Hi everybody, this is Julien from Hugging Face. When I meet with customers or speak at conferences, there are a bunch of questions that keep popping up. What does the name mean? Where does the logo come from? How did the company start? What kind of problem is it solving? And how did it become so successful in such a short period of time? All these are good questions, and I actually gave a talk at a conference last week on this particular topic. It was well received, so I thought, why not record it as a video and share it with all of you? Hopefully, I can answer all those good questions in one go. I'll take you through the history of Hugging Face from the very early days until now, highlighting what I think are the critical steps that made Hugging Face successful. If you like this video, please give it a thumbs up, consider joining my channel, and don't forget to enable notifications so you won't miss anything in the future. Also, why not share this video on your social networks or with your colleagues? If you enjoy it, chances are others will enjoy it as well. Thank you very much. Before we talk about Hugging Face per se, let's set the scene. I have to take you back to 2015 and 2016. When we discussed artificial intelligence or deep learning back then, everybody was obsessed with chatbots. This was the most striking application customers could envision, and everyone wanted to build a chatbot for their mobile app, web app, business app, whatever. You couldn't open a business newspaper or tech blog without reading about chatbots. We started to see companies launching chatbot services. Looking back, and even back then, my opinion was that they were all pretty terrible. They were probably a step forward compared to rule-based chatbots, but honestly, they were really, really bad. That was the starting point. In 2016, Hugging Face set out to build a chatbot of its own, one designed to be friendly, safe, and like a nice friend you would chat with. They looked for a name that would reflect that friendliness and settled on the hugging face emoji, which certainly looks friendly enough. That's where the name of the company comes from. The logo is very easy to remember, and often when I meet folks, they say, "Oh yeah, you're the emoji folks, the funny yellow thing." Yes, that's us. Hugging Face is the name of the emoji. Hugging Face took a bit of angel money, built the chatbot, and launched it. They achieved that first milestone and learned quite a few things about natural language processing, conversational apps, and AI in general. Even though the chatbot wasn't a massive success, it was a good first step in structuring the company and exploring the natural language processing landscape. In June 2017, something really important happened: the release of the "Attention Is All You Need" paper, which showed how the self-attention mechanism could be used to build a new deep learning architecture called the transformer. This came out of Google, and it was a massive breakthrough. Almost overnight, transformers improved the state of the art on natural language processing tasks like translation, classification, and question answering. Very quickly, transformers made previous deep learning architectures like recurrent networks and convolutional networks obsolete for these tasks. This was a massive breakthrough, but it was still research territory, and only the people in the know noticed something was happening.
A year later, this innovation became more concrete when Google released the now-famous BERT model, built on the self-attention layer and the transformer architecture from their earlier work. They published the paper, pre-trained models trained on large quantities of text, and some very strong benchmark results. This was a really exciting moment. I remember going to the GitHub repository for BERT a few days or a week after the release. They were sharing the model code, but I was more interested in the models themselves. There were pre-trained checkpoints for BERT-base and BERT-large, and fine-tuning scripts replicating experiments from the paper on well-known benchmarks. I wanted to start predicting and apply it to some data I had. However, I completely failed at doing that. It was probably my own limitation: not knowing enough about TensorFlow, not being able to figure out the jargon in the paper, the repo, and the code. No offense to the authors, but they didn't do a great job of making it developer-friendly. Unless you knew what you were doing with TensorFlow and transformers, it was really hard to work with. In the meantime, something else happened. The frustration with TensorFlow was shared by many, and starting in late 2016, PyTorch began to gain traction. If you look at the Google search trends, you can see PyTorch's popularity grew very quickly over the year or year and a half after the first release. Andrej Karpathy's quote is extremely funny and reflects exactly what we were all feeling back then: TensorFlow was awesome, but it was also hard to work with, and a friendlier, more Pythonic alternative was more than welcome. PyTorch rose very quickly, and the 1.0 release came at the end of 2018. The Hugging Face folks were watching all of this: a new generation of models was rising, but it was still very difficult to work with because of TensorFlow and because Google hadn't gone the extra mile to deliver developer-friendly tools, while PyTorch was becoming exactly that developer-friendly tool for deep learning applications. The spark happened, and the first visible flame came in November 2018, when Hugging Face released an open source library, still at version 0.x, called PyTorch Pretrained BERT. It was exactly what the name says: a PyTorch-based library providing a re-implementation of BERT that used the weights shared by Google. This was clever, because PyTorch was easier to use and growing more popular: you were still using the weights from the original model, but through a simpler, friendlier library. If you ask me what made Hugging Face successful, it's exactly this: simplifying access to state-of-the-art models and making it possible for less expert people to work with them. For example, I had to dig a little to find this historical code, but it's still out there. This is an example from PyTorch Pretrained BERT 0.1.1, showing how you would download the bert-base-uncased tokenizer and model and predict with PyTorch (see the sketch below). Even for those who are not so technical, this is reasonable Python code that many more people could understand compared to the previous TensorFlow insanity. The one-line access to model artifacts hosted by Hugging Face was also pretty nice, because you didn't have to chase models living in different places and repos; you just downloaded them in one line of code. Very quickly, other transformer models popped up, like OpenAI GPT and GPT-2 (back when the "open" part of their name actually meant something), Transformer-XL, XLNet, XLM, and so on.
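For reference, here's a minimal sketch of what that early code looked like. I'm reconstructing it from the early pytorch-pretrained-bert README rather than copying it verbatim from the 0.1.1 release, so treat the exact calls as approximate:

```python
# Approximate pytorch-pretrained-bert usage (circa late 2018), reconstructed
# from the library's early README; exact details may differ in 0.1.1.
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForMaskedLM

# Download the pre-trained tokenizer and model, one line each
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Tokenize a sentence and mask one token
text = "Who was Jim Henson ? Jim Henson was a puppeteer"
tokens = tokenizer.tokenize(text)
masked_index = 6                  # mask the second "henson"
tokens[masked_index] = "[MASK]"
indexed_tokens = tokenizer.convert_tokens_to_ids(tokens)
tokens_tensor = torch.tensor([indexed_tokens])

# Predict the masked token
with torch.no_grad():
    predictions = model(tokens_tensor)
predicted_index = torch.argmax(predictions[0, masked_index]).item()
print(tokenizer.convert_ids_to_tokens([predicted_index])[0])  # expected: "henson"
```

The point is less the specific calls than how little ceremony there is compared to the original TensorFlow workflow.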
It's funny to look at the release history of this library, because you see new models being added almost every day or week; the model collection grew by the minute. Within a few months, the library was renamed PyTorch Transformers and reached a 1.0 milestone. A few months later, more models and TensorFlow support were added, and it was renamed Hugging Face Transformers. September 2019 is when the Transformers name arrived and things really took off. Adoption started to grow, investors became more interested in the company, and the Series A happened at the end of 2019. Over the next year and a half, the ball kept rolling. Tens of models became hundreds, then a thousand, and the growth never stopped from there. New features were added to the libraries, and new libraries were built, like Datasets and Tokenizers. Hugging Face was on its way. Another major milestone was the first partnership with a large tech company, AWS. Hugging Face stands for open source, and we help the open source community work with state-of-the-art models, but enterprise and commercial customers always need to simplify things and accelerate their path to production. A lot of machine learning and AI runs in the cloud, so it was natural to talk to the cloud providers and partner with them to build a friendly Hugging Face experience inside their machine learning services. AWS was the first large tech company to realize how important open source AI was and how fast it was growing. In March 2021, Hugging Face started integrating its libraries and models into Amazon SageMaker and collaborating with AWS across the board. This partnership led to the Series B, $40 million, a major step in the development of the company. In parallel, transformers were taking over. 2022 was the year transformers dominated the AI landscape: models for natural language processing, computer vision, and speech outperformed legacy architectures, which started to fade away quickly, and adoption grew very fast. GitHub stars are a vanity metric, but they do give a sense of how popular a project is in the open source community, and the Transformers library and our other libraries kept climbing consistently. Compare this to previous technology waves: Hadoop for big data, Spark for large-scale analytics. The PyTorch and Transformers waves rose faster and higher than anything before them. It's crazy how this project became one of the most popular open source data projects in just a couple of years, and it's a strong sign that it's real. People ask me if it's hype, but I show them this and ask whether they think hundreds of thousands of people are using it because it's hype. I don't think so. Investors also believed something was happening, which led to the Series C, $100 million, in May 2022. Fast forward to today, and it's fair to call Hugging Face the focal point of open source AI. We have about half a million pre-trained models hosted on the Hugging Face Hub for natural language processing, computer vision, audio, speech, multimodal, and generative AI. We have 100,000 datasets, and all these models and datasets are openly available: you can download them in seconds and use them with our libraries or in the cloud. Last summer, we welcomed a new group of investors, and it's fascinating that the largest tech companies in the world want to be at the table. They believe in open source AI and want to partner with Hugging Face.
The best way to partner is also to be an investor, because they realized that the future of AI is open source: their customers want to use open source models, and we need to work with these companies to give those customers the best possible experience. Looking at the big picture, we have 100,000 datasets and half a million models available on the public Hub. For enterprise customers, we have the Enterprise Hub, with stronger security, compliance, and control, as well as exclusive collaborative features. We're still working on all our open source libraries, from Transformers to Diffusers and Accelerate, which are the core of the Hugging Face experience. Along the way, we've built a few services of our own. The first is Spaces, a simple way for machine learning teams to host their models inside web applications on managed Hugging Face infrastructure. Customers use it as their internal demo platform, and there's a free tier as well as a commercial tier for larger, more powerful Spaces. Another service we built is Inference Endpoints, which lets you deploy any model from the Hub with one click, running on AWS or Microsoft Azure. It's all open source, so you can also grab the models and the code and deploy them anywhere yourself. This is why people have been calling us the GitHub of machine learning, seeing us as the place to find models and datasets. The analogy worked for a while, but there's much more to Hugging Face. In the last couple of years, we've built some of our own models, like BLOOM, an open source LLM released in 2022 as an alternative to GPT-3, and StarCoder, a code generation model released in 2023. We've also built HuggingChat, our open source answer to ChatGPT, 100% open source, UI, models, and backend, based on the best open models out there. We work with cloud partners to help their customers run Hugging Face workloads easily and cost-effectively across platforms. Last but not least, we work with hardware partners, through the Optimum library, to accelerate model training and inference. For customers who want to engage directly with us on consulting and professional services, we have the Expert Acceleration Program, where our engineers work directly with you to get you to production quicker than you thought possible. You might wonder if we're still true to our roots. I've seen open source companies become successful, lose track of their original vision, and start shipping community editions that end up being toys next to enterprise editions with all the goodies. That's absolutely not what we're doing. We're still focused on helping everyone use open source models in the simplest possible way, anywhere they like. Take the 0.1.1 example revisited, showing how you would do the same thing today: we've made it even simpler, you just download the model from the Hugging Face Hub, prompt it, and get your answer (see the sketch below). We've kept simplifying the developer experience and making it easier for non-experts to work with these models. Even for newer models, like Stable Diffusion XL, an amazing text-to-image model, the process stays simple: download the model in one line of code, create a prediction pipeline, prompt it, generate an image, and save it (also sketched below). Hugging Face has stayed true to the original vision and kept simplifying things. Regardless of how complex the model or the underlying infrastructure is, using it remains very simple; we remove anything unnecessary to make it as simple as possible. The same is true when you want to use these models for commercial purposes on cloud platforms: deploying Mixtral, the latest and greatest open model, is straightforward.
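Here's a minimal sketch of that same masked-word prediction with today's Transformers library, using the pipeline API; only the example prompt and the printed field are illustrative:

```python
# Modern Transformers equivalent of the old BERT example:
# one line to download the model, one line to predict.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
predictions = fill_mask("Jim Henson was a [MASK].")
print(predictions[0]["token_str"])  # the most likely completion
```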
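And here's the Stable Diffusion XL flow, sketched with the Diffusers library; the model ID and the float16/GPU settings are typical choices you would adapt to your own hardware:

```python
# Text-to-image with Stable Diffusion XL and the Diffusers library:
# download the pipeline, prompt it, save the image.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.to("cuda")  # use a GPU if available

image = pipe("a friendly yellow emoji hugging a robot, digital art").images[0]
image.save("hugging_robot.png")
```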
You can deploy it and predict with it using the Transformers library on your own machine, or go to the model page on the Hub, click on Deploy, select Amazon SageMaker, and we generate the SageMaker code to deploy the model in your AWS account; it's just a few lines of code (a sketch follows after the transcript). Our mission is to take the latest amazing models from the open source research community, integrate them into our open source libraries, and make them work seamlessly across clouds and hardware. There's no community edition or enterprise edition; there's the Hugging Face edition, meaning open source, state-of-the-art, simple, and working the same everywhere. An example of our hardware acceleration work is a 7-billion-parameter chatbot running on a single Intel CPU. Through techniques like quantization and hardware acceleration, you can run multi-billion-parameter models on CPU efficiently and cost-effectively (also sketched after the transcript). It's a bit slower than a GPU, but it's faster than I can read, which is all I need: as long as the chatbot generates responses quickly enough, I'm happy, and the cost-performance ratio is very good. This is the kind of work we do with hardware partners like Intel. Summing things up, Hugging Face models are now the de facto standard for AI apps. When I say Hugging Face models, I mean open source models built by the community and hosted by Hugging Face; we're not building all these models ourselves, and the huge majority of the half-million-model collection comes from tech companies, research organizations, and individual contributors. We're happy to be the flag bearer for all these community members. We are much more than the GitHub of machine learning: we have first-party integrations with the most popular clouds and hardware platforms, making it simple to run AI workloads wherever you want. If you need to remember one thing about Hugging Face, it's this: open source, state-of-the-art, simple. OSS, just like open source software. This is what I wanted to tell you today. I hope I answered all those popular questions and highlighted who we are, what we stand for, where we're going, and why we think open source AI is the best way forward. Thank you very much. I'll see you soon. Until then, keep rocking.
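As a reference for the SageMaker path mentioned in the transcript, here's a hedged sketch of what the generated deployment code typically looks like; the instance type, GPU count, and timeout are assumptions, and the Deploy button on the model page is the authoritative source for the exact snippet:

```python
# Sketch: deploying Mixtral on Amazon SageMaker with the Hugging Face LLM container.
# The instance type and environment values are illustrative; use the code generated
# by the model page's Deploy button for your own account.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

hub = {
    "HF_MODEL_ID": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "SM_NUM_GPUS": "8",  # shard the model across the instance's GPUs
}

model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface"),
    env=hub,
    role=role,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.p4d.24xlarge",
    container_startup_health_check_timeout=600,
)

print(predictor.predict({"inputs": "Explain what Hugging Face is in one sentence."}))
```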
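And for the CPU chatbot story, here's one possible sketch using the Optimum Intel integration with OpenVINO and 8-bit weight quantization; the model ID is just an example, and the exact export and quantization options depend on the Optimum Intel version you have installed:

```python
# Sketch: running a 7B chat model on an Intel CPU with Optimum Intel (OpenVINO).
# Model ID and quantization flag are illustrative; check the Optimum Intel docs
# for the options supported by your installed versions.
from transformers import AutoTokenizer, pipeline
from optimum.intel import OVModelForCausalLM

model_id = "mistralai/Mistral-7B-Instruct-v0.2"

# Export to OpenVINO and quantize weights to 8-bit so the model runs well on CPU
model = OVModelForCausalLM.from_pretrained(model_id, export=True, load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

chat = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(chat("What is Hugging Face?", max_new_tokens=128)[0]["generated_text"])
```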

Tags

Hugging Face, Transformers, Open Source AI, Natural Language Processing, Machine Learning Libraries