AI From the Trenches and the Pivot to Small Language Models (SLMs) with Julien Simon of Arcee.ai
April 04, 2025
AI isn't about hype; it's about solving real problems. In this episode of The AI Space Podcast, host Sanjay Kalluvilayil is joined by Julien Simon, Chief Evangelist at Arcee.ai, for a wide-ranging conversation on building scalable, cost-effective, and domain-adapted AI solutions using small language models (SLMs). @juliensimonfr
Together, they cut through the noise around automation, open source, and large-scale model performance, zeroing in on what truly matters for enterprises: ROI, responsible innovation, and solving business problems faster and smarter. From Julien’s experience at AWS and Hugging Face to his work now at Arcee.ai, you'll hear powerful insights on enterprise AI adoption, distillation techniques, and how to integrate open-source models into high-impact business workflows.
Whether you’re a founder trying to scale, a technologist looking for better tools, or a corporate team trying to figure out what’s real and what’s fluff in AI—this episode will help you understand how to build AI that actually works.
Want to learn more about AI and business growth? Subscribe for weekly insights!
#TheAISpacePodcast #ArtificialIntelligence #Enterprise #OpenSourceAI #SmallLanguageModels #SanjayKalluvilayil #JulienSimon #ArceeAI #AIstrategy #ResponsibleAI #ScalableAI #TechForBusiness #InnovationLeadership #huggingface #slm #ROI
00:00 The Value of AI in Business
02:52 AI as an Accelerator, Not a Replacement
06:14 The Role of the Technical Evangelist
08:57 Transitioning from AWS to Hugging Face
11:50 Understanding the Importance of User Experience
15:08 The Shift to Enterprise Solutions
17:57 The Role of Small Language Models
21:07 The Limitations of Current AI Models
23:54 The Future of AI in Enterprises
31:52 The Advantages of Small Language Models
36:14 Building and Training Small Language Models
43:43 Democratizing AI: Challenges and Opportunities
53:31 Bootstrapping and Funding Your AI Venture
56:54 Encouragement for Aspiring Entrepreneurs
AI, Artificial Intelligence, AI Podcast, AI News, Tech News, GPT
Join the AI Revolution - Artificial Intelligence Insights for AI Startups & Founders!
Website: https://stonehaasadvisors.com/
Linkedin: https://www.linkedin.com/in/sanjaykalluvilayil/
Linkedin Podcast: https://www.linkedin.com/company/the-aispace-podcast-with-host-sanjay-kalluvilayil/
Spotify: https://open.spotify.com/show/0qslUXvSicQgGezUPcIs88
Linktree: https://linktr.ee/theaispacepodcast
IHeart Radio: https://www.iheart.com/podcast/269-the-aispace-podcast-236280407/
Amazon Podcasts: https://music.amazon.com/podcasts/fb885ab0-1559-45ec-b3ed-5a3f15ef2553/the-aispace-podcast
Host Bio: Sanjay Kalluvilayil is a visionary leader with 30+ years of experience with a specialty in Go-to-Market Strategy & Execution. He is the Founder of Stonehaas Advisors, The AI Space Podcast, and working on some stealth AI start-ups.
Company Bio: Stonehaas Advisors provides a Comprehensive Portfolio of Consultancy & Fractional CXO Services to Guide Companies Across the Full Cycle of Business & Technology Initiatives. The company empowers Founders and businesses struggling with cash flows to grow, transform, and scale to new levels and dimensions.
Transcript
If you're firing everybody through your AI because you have all this automation, then you're a commodity. You're not adding any value. And so then you're not doing anything different. Because you will be using the same models as everyone else. Don't lie to yourself.
Welcome to the AI Space Podcast, where we strive to encourage and empower AI founders, innovators, technologists, startups, and businesses to drive innovation and scale their initiatives with game-changing insights. I'm your host, Sanjay Kalluvilayil, founder and CEO of Stonehaas Advisors, a consultancy dedicated to helping AI leaders struggling with cash flows to create value and maximize valuations to fund growth, successful capital raises, or strategic exits. This is our ninth episode, and we're thrilled to have Julien Simon from Arcee. He is the company's Chief Evangelist, a role he took on after transitioning from Hugging Face, and before that he was with Amazon Web Services as the global technical evangelist for AI and machine learning. In his role at Arcee, Julien is focused on helping enterprise clients develop high-quality and cost-efficient AI solutions using open-source small language models.
So in this agenda today, as always, we cover technology architecture, integrating AI into your technology stack. I'm super excited to learn more about small language models. We'll discuss business scaling, and we have an expert like Julien here to talk about cost-effective strategies to grow the business. We'll also cover ethical and responsible AI, balancing innovation with accountability. We'll talk about AI democratization, empowering businesses and others with unique perspectives through open source. Finally, we'll share some encouragement and thoughts for those grinding it out and trying to be successful in this space.
Julien, we're going to kick it off right away. Bienvenue. Welcome. You're coming from Paris, France. Bonsoir. Comment ça va? Ça va très bien. And thank you, Sanjay, for having me on the show. It's a pleasure to be here.
All right. Awesome. So what do you see as the hottest trends in AI for 2025? Oh, I'm not going to put my influencer cap on because I don't have one. The last thing I want to be is an influencer. I'm a practitioner. So I can tell you what customers tell me or ask me about. The main problem for them right now is, can I get ROI from my AI solutions? Yes or no, right? We're not in 2023 anymore. We're not in 2024 anymore. We're not in the sandbox anymore. People want to solve real business problems with AI. They want to see that it benefits the company or the organization. I'm a very pragmatic guy. To me, that means either you're saving money with AI or you're making money with AI.
Now, that's where the rubber meets the road. Are you making money, saving money? If not, why are you even doing this? You should invest your money, skills, and efforts into something else. If AI doesn't work for you, then it doesn't. Just go do something else. I think that's still number one, showing real value for customers, showing real value for internal users, and generally for all stakeholders. That's really number one. And it has a ton of consequences, and I'm sure we'll double-click on those.
Awesome. And I'll definitely get into that. I appreciate those real practical insights. So next, what insights have you seen in the last year that founders or businesses are using to scale up their businesses?
So, staying close to the ground, if you're going to put AI to work, it has to help folks on a daily basis. There's this fantasy that AI will replace people. Oh, software engineering jobs are down 80%. I keep seeing these posts. It drives me mad because all the companies I talk to, all the folks I meet, they're still trying to find the best developers they can find. Software engineers, architects, data scientists. So I don't know what this means. AI is not going to replace developers. AI is going to be an accelerator. It's augmenting the best people in their company. It's never going to replace your best folks. Never. That's a pipe dream. If it's replacing folks, it means you don't have the right people working for you. A model is still fairly dumb, honestly. If you're replacing dumb folks, and that sounds horrible, but it's probably happening here and there, then why did you hire them in the first place? It's your fault.
So you still need to hire the best people you can find and make them X times more productive, whatever that means to you, with AI. AI is not something you can ignore. AI for productivity or automation is the number one use case. It was the number one use case two years ago. It still is. If you have great people working for you and equip them with the right AI solutions, they're going to be 5x or 10x more productive. This holds for financial analysts, developers, doctors, teachers, and everyone else. That's the one thing you need to focus on if you ask me.
No, I completely agree with you. We've talked about this on this episode multiple times that you can't replace human creativity, spirit, soul, those types of things. And I like what you said too, because if you're firing everybody through your AI because you have all this automation, then you're a commodity. You're not adding any value. And so you're not doing anything different. Because you will be using the same models as everyone else. Do you think your prompts are going to make your company insanely more efficient and competitive? I mean, come on, seriously? I want to scream, laugh, or cry. Both. All of them.
So if your intellectual property, your so-called competitive advantage, is your prompt library, man, you're in trouble. You're deceiving yourself. Or you've automated all your CRM and everything else, and you think you've done it. You're going to have a very hard landing. So no, that's not the way. You need the best people and make them 10x faster, 10x more efficient with AI. That's the winning strategy.
No, that's awesome. And before we continue, I want to invite everyone to follow Julien and, you know, we'll give you some information at the end. But definitely like, comment, and subscribe right now to the AI Space Podcast because this is why we bring on guests like Julien. We're trying to bring on people who are actually in this space for multiple years, grinding it out, learning the hard lessons. For those of you out there trying to figure this space out, grow, and innovate, we want people like Julien to come on and give you what's real. AI from the trenches, that's my job description.
I know a lot of folks feel the same. We're so tired of fluffy, smoke-and-mirrors presentations and LinkedIn feeds full of people posting papers they haven't written. Sometimes they haven't even read them. If LinkedIn could just add a feature where I could filter all that crap, I would actually pay for LinkedIn. I want to see real-life people posting about real-life experiences, not, "Hey, look at me. I'm posting this clever paper on reinforcement learning." Honestly, I don't have the slightest idea how it works or what it is, but hey, reinforcement learning. Go and follow my newsletter. Enough. Enough. So, more of people building real things.
Awesome. So we're going to get a little bit more about your background. You use the term evangelist, which comes from Koine Greek, the spread of the good news. So you think of the gospel of Jesus Christ. Unfortunately, today you may think of a TV evangelist. You're a messenger of good news for Arcee. Maybe explain your version of what evangelist means and tell me a little bit about how you got here. Talk a little about your Hugging Face role and Arcee. What led you into this foray into AI?
So, long story short, I started as a software engineer, worked across industries and different companies, ended up being a CTO and VP Eng in some different startups for about 10 years, got tired of it. That would be another great episode. How did you ever get tired of being a CTO? I should write the book. I joined AWS as a technical evangelist, stayed there for six years, then Hugging Face for three years as chief evangelist, and now Arcee chief evangelist. I've been doing that evangelist role for over 10 years now, and I think I'm still enjoying it.
The job title is weird. I've got funny anecdotes about explaining this in different parts of the world. There is no religious undertone whatsoever. I had to clarify that for some parts of the world, and then we could have a good conversation. It has nothing to do with any kind of organized religion, except maybe technology, if you want to see it as a cult, maybe it's like the religion of science.
To me, it's about being in the shoes of the customer, the user, being what I call developer zero. Whether it was Hugging Face or AWS, I was always trying to get early access to everything and use it not as an AWS employee or Hugging Face employee but as Joe Developer or Jane Developer, trying to figure out what this new service or feature does, just reading the docs. I refrained myself for 10 years from going and actually reading the source code or any internal documentation. I wanted to have the first-run experience that you would have when trying out this new thing.
Most of the time, it wasn't fully baked. Even if the team thought it was fully baked, it wasn't. The docs were bad, it was buggy, and some features made no sense. That's what I call being developer zero. It's like, no, you have to fix this. The developer experience is horrible. We need more examples of that. We need more docs on this. So trying to, for the benefit of the users and customers, and obviously for the benefits of the company itself, bring something to market that actually makes sense, is usable, convincing, and solves problems.
I've often defined my job as wasting hundreds of hours on a launch so that you don't have to. To me, that's my infinite ROI because if I'm spending 50 hours debugging, testing, bickering with engineering teams on what needs to be fixed, and eventually convincing them to fix it, and if I'm saving even one hour of frustration or dark reading for thousands of people, to me, that is infinite ROI. That's my own job satisfaction, knowing that I saved folks a lot of time and frustration. Whatever product we release, they have an easier time testing, are more likely to adopt it, and it's just good business.
So you have to be developer zero if you're looking at developer relations and any kind of technical evangelism. Never be a marketing mouthpiece, never be a company shill, a corporate zombie. Companies have enough of that. You need to be on the other side of the fence and fight for your user community, your developer community, and make sure they get the best possible experience when they try your product. In the end, that's how you get adoption and generate business.
No, that totally resonates with me. The purpose of the show is to be authentic and show the real side, the dark and the good side of AI and different perspectives, what's real and what's not real. So I appreciate that. We also have a lot of people experienced in AI and many new people in the space. I know I have some young college students out there. You went from AWS, which almost everyone knows, to Hugging Face. Tell us a little bit about that decision and what Hugging Face is for those who don't know. Then, why did you go to Arcee?
I'm not a particularly great stock picker or investor in general. My timing is generally awful. However, I seem to have a knack for picking the right company at the right time and going to work for it. When I joined AWS in mid-2015, it was pretty much before the huge lift-off that AWS saw. To me, cloud was inevitable, unavoidable. I saw it, having done about 10 years of physical infrastructure and data centers. I was working on AI/ML at AWS, and AWS was the first cloud company to partner with Hugging Face, which at the time was a 20 or 30-person startup building open-source libraries and hosting open-source models, the so-called transformer models and transformers library.
That started to see some traction. Funny enough, I did write the blog post announcing the partnership between AWS and Hugging Face. I started looking at them more and realized that open source was a thing. I've been a supporter for a long time. I saw that the new models and libraries Hugging Face was building to make it easy to fine-tune and deploy them were massively simpler and more powerful than anything that came before.
Natural language processing was something I'd been working on for a few years, grinding through TensorFlow and very complex tools and horrible GitHub repositories. All of a sudden, Hugging Face with their two lines of Python was eye-opening. I thought, okay, those guys are getting AI. That's the new wave of AI. Six years at Amazon was enough. I had done big tech for six years, survived it, got promoted, built a bit of a reputation, and done a few things. It was time to go back to startup land.
I did that for three years at Hugging Face and helped them the best I could. But striking the right balance between open source and enterprise is difficult. No offense to Hugging Face. I had a good time there as well. The customer of Hugging Face is the open-source community. I'm an enterprise guy. I love talking to enterprise people. I love talking to JP Morgan, Goldman Sachs, Pfizer, or whoever. These people have large-scale problems, complex business challenges, and very complex IT environments. We're not talking about deploying something and not caring if it's broken. It's not your three-person startup.
Those teams have security, compliance, budgeting, and risk management. So, the focus of Arcee is clearly enterprise, even though we still build open-source libraries and work with open-source models. Our main focus is to build enterprise platforms, and that's where my heart is.
Yeah, no, that's awesome. In 2025, 10 years ago, you got into artificial intelligence, machine learning, and natural language processing. The ability for computers to analyze, reason, think like humans, predict patterns, understand human voice and communication, and convert that into computer language. You went to Hugging Face, which has open-source models. You can go to Hugging Face and download Llama, DeepSeek, and different diffusion models to test out. But as you try some of these models and turn different knobs, if you're more refined and have an enterprise background, you can get frustrated because you have to spend a lot of time fine-tuning and tweaking to get the result you want.
From your story, you got to a point where you wanted a more polished version and worked with people with big problems and big solutions. You need more refined, fine-tuned, and polished models. But it's more than the model; the model is just the engine. The engine is great, but if you're running a data science team or a machine learning team at a large company, the model is just a fraction of the problem you have to solve. Finding the right model and maybe tweaking it, fine-tuning it, is one step. But what about deployment, monitoring, scaling, cost management, user experience, and how business apps and users work with the model?
So that's what Arcee is trying to do. Yes, we build models. Yes, we take the best open-source models available today and make them even better with our stack. But we also build SaaS platforms to make it super easy to run inference. We have an agentic platform that lets you build end-to-end workflows with model nodes and integration nodes. If you want to extract data from Salesforce, HubSpot, or any app, we have over 200 integrations to pull that into a model for processing and story writing, and then push it back to Google Docs or Gmail. Drag and drop, zero code.
That's what enterprise users want. They want the end-to-end solution. No one cares about a model with 64 attention heads and 8K context length. People want to solve real problems. For example, one of the largest mobile operators in the world spends $5 billion a year on call centers. Can they optimize that cost and give more money back to their shareholders? That's the problem they want to solve. If you're not interested, don't talk to them. Do R&D instead. In my world, we have to find a solution for that.
At the end of the day, it's going to be a technical discussion, but the technical discussion is not the starting point. The model is not the full discussion. It's a small part of the discussion. That's why I keep yelling at those influencers who post about the latest and greatest model released last night because it doesn't matter. It might be the best model today, but enterprise users are not going to adopt it tomorrow, next month, or even in six months. They're probably still working with Llama 3, which was released quite a while ago, and they're probably getting business value out of it because they're doing it right.
So that's the wake-up call. Stop obsessing about models. Yes, they're fun. Yes, I love to read the papers and test them. But in the enterprise world, this is a tiny part of the discussion.
Yeah, no, you're absolutely right. We talk about this on every episode: are you solving a real problem? Can people understand the problem you're going to solve for them? Do they believe you can deliver the results? They're not buying your model, consultancy, or whatever. They want to believe and understand that you can deliver the results they expect and have a mechanism to do that. They don't care if the mechanism is Pascal, Fortran, C++. AI is a tool. If it gets the job done at a cost-effective rate, and you can have the latest model, that's great. But if you can solve the problem without any machine learning, it's even better.
I've said for years, the best machine learning is no machine learning. Machine learning is complicated, sometimes unpredictable, and probably too fancy for your technical teams. The skill shortage is real. So if you can solve the problem with SQL and your favorite backend language, great. But there's a class of problems that cannot be solved with traditional techniques and have to be solved with machine learning and AI. You need to make sure you're zooming out. AI is just a tool in the IT toolbox. Don't force AI on a problem that isn't AI-related. That's the worst mistake you can make.
So about Arcee. Tell us a little bit about Arcee, the company, where you're based, your target segments, and use cases.
Arcee is a U.S. startup about 18 months old. We've gone through a Series A already. Our founders, two of whom are ex-Hugging Face, and I worked with them at Hugging Face. We stayed in touch, and the connection is important because being at Hugging Face helped us realize that enterprise users get more business value from small language models. We never believed the OpenAI propaganda or promise. We saw it not working for a lot of customers in terms of compliance, performance, business performance, or technical performance, and definitely not working in terms of cost as soon as they started scaling.
We're not philosophical about this. The open-source community can get a little too political sometimes. That's not us. Our beliefs come from observation and customer conversations. We know that open-source models and small language models are cost-efficient. They can be more easily adapted to particular domains and generally work better on enterprise use cases, which are narrow problems.
For example, if you're a mobile operator and want to build an efficient chatbot for customer support, that's the problem you want to solve. You don't want cooking recipes or astronomy questions. You're not writing poetry. If I lost my phone, a nice poem is not going to solve my problem. So 99.9% of the knowledge ingrained into those GPT-4s or Anthropic models is irrelevant and can stand in the way because you never know what those models are going to say.
Working with smaller models makes it easier to focus on a particular problem. They can be adapted, fine-tuned, and are more scalable. They're smaller, so you don't need as much infrastructure to run them. You don't need huge GPU boxes or cloud instances. You may even run them on CPU if you're so inclined. You can host them yourself or in your private cloud. At the end of the day, you get more privacy, better compliance, better domain adaptation, and more ROI because cost performance is so much better.
That's the core philosophy Arcee builds upon. We build models, push some models on Hugging Face, which tend to score very high on the Hugging Face leaderboard, demonstrating the value of our training stack. We use those models to build inference platforms and agentic platforms for enterprise. The value of those small models becomes really obvious when you do that. If you start building workflows that involve two, three, five, or seven model steps, that's a lot of inference. If every inference takes 60 seconds, your workflow is going to be dog slow, and you don't want that. Smaller models run faster, your workflows run faster, and they will run cheaper.
If you need to run thousands of workflows in parallel, they scale better. Instead of using a single LLM, you can use the best model for each step in the workflow. Maybe if you need to do basic translation for a step, you can use a very small model. At the end of the workflow, if you're writing a report with all the data collected, you need a bigger model for better creative writing. It's not just using the right model for the job; it's using the right model for each step in the workflow, which is where having smaller models has a lot of benefits.
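The per-step routing Julien describes can be sketched in a few lines of plain Python. The model names and routing table below are hypothetical placeholders, not Arcee's actual platform API; the point is simply that each workflow step dispatches to the smallest model that handles it well, instead of sending everything to one large LLM:

```python
# Hypothetical routing table: each workflow step gets the smallest
# model that handles it well (names are illustrative, not real models).
STEP_MODEL = {
    "translate": "tiny-1b",    # basic translation: very small model
    "extract":   "small-8b",   # structured data extraction: small model
    "report":    "medium-32b", # creative report writing: bigger model
}

def run_step(step: str, payload: str) -> str:
    """Dispatch one step to its assigned model (stubbed locally here;
    in a real system this would call an inference endpoint)."""
    model = STEP_MODEL[step]
    return f"[{model}] processed: {payload}"

def run_workflow(text: str) -> str:
    """Chain three steps, each served by a right-sized model."""
    out = run_step("translate", text)
    out = run_step("extract", out)
    return run_step("report", out)

print(run_workflow("customer ticket #123"))
```

Because the translate and extract steps hit small, fast models, only the final report step pays the latency and cost of a larger one, which is exactly why multi-step workflows scale better with SLMs.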
That's what we build and show to customers. The process is relatively simple but requires careful data selection and preprocessing to achieve the best results.
The analogy of distillation in AI is similar to removing impurities in a chemical process. I'm invited to speak at an event in Ireland about distillation, and I'll use that as a fun opening joke. When it comes to AI, the fundamental concept is neural networks, which mimic the human brain's neurons and pathways. Large language models need many parameters to break down and understand complex sentences. However, not every task requires 80 billion parameters. A smaller model with 8 billion parameters might be sufficient for specific tasks, much like teaching a kid to shoot a basketball correctly.
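To make the distillation analogy concrete, here is a minimal, stdlib-only sketch of the core training signal in knowledge distillation: the small "student" model is trained to match the large "teacher" model's softened output distribution. The logit values and temperature below are illustrative assumptions, not taken from any specific model:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature
    softens the distribution, exposing the teacher's 'dark knowledge'."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened probabilities against
    the teacher's -- lower means the student mimics the teacher better."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

teacher = [2.0, 1.0, 0.1]        # teacher's logits for three classes
good_student = [2.1, 0.9, 0.2]   # closely mimics the teacher
bad_student = [0.1, 1.0, 2.0]    # disagrees with the teacher

# A student that matches the teacher gets a lower loss.
print(distillation_loss(teacher, good_student)
      < distillation_loss(teacher, bad_student))  # True
```

In practice this term is combined with an ordinary loss on the ground-truth labels, but the teacher-matching term is what lets an 8-billion-parameter student absorb behavior from an 80-billion-parameter teacher for a narrow task.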
Now, let's talk about democratizing AI. The key challenges and opportunities in making AI accessible to various industries and backgrounds include understanding the maturity of the AI stack. A few years ago, it was about downloading a model from Hugging Face and figuring out what to do next. Today, organizations need to be honest about their skills and data infrastructure. Most companies lack ML or AI skills, and fine-tuning models is not a viable option for them. Instead, they should focus on practical, high-level APIs and services that can be integrated with minimal risk and cost.
Prototyping and iterating are crucial. Delay the technology decision until you have validated use cases that benefit the business. Try out different technologies and services to find what works best for your team. Start with something simple and improve it iteratively. Hiring a data science team from scratch without a clear problem to solve often leads to failure. The strategy might be right, but poor execution can derail the entire project. It's important to define the problem, assess the team's skills, and provide necessary training. Internal entrepreneurship is often underestimated, and companies should encourage and support innovative ideas.
In terms of scaling, always start from customer pain points. Customer obsession, as practiced at Amazon, is a valuable principle. Validate your problem and solution through customer interviews and feedback. Building something without customer input is a gamble. If you can't clearly articulate the problem and solution, you're likely to fail. Ensure you have customers willing to pay for your solution. Once you have a validated problem and solution, the rest will fall into place.
For those starting in this space, my advice is to go for it. Whether it's an open-source project, a business idea, or internal entrepreneurship, collect evidence, do your homework, and build a strong case. If your idea is torn apart, find another one. The prep work is fundamental and will serve you well. If your current company doesn't support your ideas, find one that does. Don't be afraid to take chances and leave a job where you're not growing or learning. Your time and energy are your most valuable investments. If you're not happy, you're wasting your potential.
To learn more about Arcee AI or connect with me, visit our website at arcee.ai. You can find me on LinkedIn and YouTube. I'm always happy to connect and answer questions. Check out the cool stuff we're building at Arcee; it might inspire your own efforts. The show has been amazing, and I hope you gained valuable insights to grow and transform your business using AI. Thanks for joining us, and see you next time.
Tags
AI Enterprise Solutions, Cost-Efficient AI, Small Language Models, AI for Business Problems, Ethical AI Practices
Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.
With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.
Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.
Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.