LLMs from the trenches: Closed model builders have decided for you
June 14, 2024
Excerpt from "Let's Build a Startup S2E2 - Anatomy of a Unicorn: Hugging Face with Julien Simon" https://www.twitch.tv/videos/2170990579
Transcript
You don't know how they've been aligned. That's the thing. No names, but all those closed models have amazing capabilities. Some I like more than others, but it doesn't matter which ones. Their creators have decided for you. They have curated the dataset, designed the alignment process, and written the prompts and system prompts. If that works for you, great; it saves you time. However, it might get in your way, or the model might sometimes behave in extremely stupid ways. We saw a closed model from another company generate historically incorrect figures. It was good fun to watch, and it was obvious those were wrong. The problem is that the errors could be more subtle.
The human element is crucial. Everybody wants to fight bias and have safe models. But if the workforce aligning the model is not diverse enough, they might introduce another bias. It's a complex problem. The road to hell is paved with good intentions. Bias, risk, and alignment are tricky.
From one country to the next, the context is very different. For example, in Singapore and Dubai, the perspectives can differ from those on the US West Coast. People in other parts of the world might not like some of your answers. They might build their own models, which will also give them better language support. When you travel, you realize the importance of respecting cultural differences. When in Rome, do as the Romans do, or don't travel. This applies to AI models too.
There are differences across countries. Language support is an obvious one. Not everyone speaks English or one of the three major languages. For instance, in Indonesia, there are over 100 dialects. Similarly, India has many official languages. Big models can do an okay job with the most important languages, but for high-quality solutions in regions like APAC, India, or Eastern Europe, you often need to build your own models. A Singapore customer, for example, is working with AWS to build a regional model through AI Singapore.
Cultural differences are also significant. In some parts of the world, people don't want a Western or US West Coast view on their country. They want their own perspective on their culture, religion, or history. Models need to account for these differences. If a model doesn't sound like a human expert from the local context, it won't achieve mass adoption or the highest quality. Local users need to build their own models. You can't expect people from the US West Coast to understand your country and culture as well as you do. Public sector stakeholders need to own this process. They can partner with tech companies to speed up their efforts, but they must stay in charge of the data and evaluation. Decentralizing the development of these models is crucial. We shouldn't have just three companies in the world building them.
Tags
AI Alignment, Cultural Sensitivity, Bias in AI, Local AI Development, Language Diversity
Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.
With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.
Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.
Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.