SLM in Action: Arcee Agent, a 7B model for function calls and tool usage

August 06, 2024
In this video, you will learn about Arcee Agent, a new state-of-the-art 7-billion-parameter model created by Arcee.ai from Qwen2-7B.

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at https://julsimon.medium.com or Substack at https://julsimon.substack.com. ⭐️⭐️⭐️

At the time of recording, Arcee Agent is one of the top models for function calls and tool usage, outperforming GPT-3.5 and current versions of GPT-4o. Arcee Agent was built on Arcee Cloud, and you can learn more at https://www.arcee.ai/product/arceecloud.

Here, I run the full-precision model on my M3 MacBook with ollama to build a financial agent able to invoke the Yahoo Finance API to answer questions on listed companies: what's the stock price? Who's the CEO? What does this company do? And more. Along the way, I also show you that you don't need GitHub Copilot to explain code and generate documentation: you can use Arcee-Spark locally instead!

* Blog post: https://blog.arcee.ai/introducing-arcee-agent-a-specialized-7b-language-model-for-function-calling-and-tool-use-2/
* Model page: https://huggingface.co/arcee-ai/Arcee-Agent
* Notebook: https://github.com/juliensimon/arcee-demos/blob/main/arcee-agent/yahoo_finance_assistant.ipynb

00:00 Introduction
00:44 Introducing Arcee-Agent
01:55 Running Arcee-Agent locally with ollama
03:00 Looking at the four functions implemented by our financial agent
06:50 Routing user queries to the appropriate function
08:25 Explaining and documenting our code with Arcee-Spark
10:02 Running inference with our financial agent

Configuration file for ollama:

FROM ./llama-spark-dpo-v0.3-Q5_K_S.gguf

#ai #aws #slm #llm #openai #chatgpt #opensource #huggingface

Sign up for Arcee Cloud at https://www.arcee.ai, and please follow Arcee.ai on LinkedIn to stay on top of the latest Small Language Model action! https://www.linkedin.com/company/99895334

Transcript

Hi everybody, this is Julien from Arcee. There's a lot of excitement around using agents to perform tasks that language models are not great at, such as math or generating API calls to call your own applications. Arcee recently released a model called Arcee Agent, which was specifically built for agent-based applications. In this video, I'm going to talk a little bit about Arcee Agent and what it is, and then we're going to run a demo where I use the model to generate API calls for Yahoo Finance to retrieve company information, stock quotes, etc. Let's get started. Arcee Agent has been specifically built for function calling and using external tools. It's based on the Qwen2 7-billion-parameter model, which is already a very good model, and it was further trained and specialized for agent apps. This is the blog post. Of course, I will put all the links in the video description so you can learn more about the model. It can be used for API integration, database operations, code, and more. You can also see some benchmarks showing it ranking high, outperforming GPT-3.5, a recent version of GPT-4o, and many other models. It is a really good model, and I encourage you to try it. As you would expect, it is available on the Hugging Face Hub, so you can go and grab it there, either the full-precision model or the quantized version if you prefer to run smaller versions, maybe locally.

Let's look at the demo now. In this demo, I'm going to run everything locally. I will run Arcee Agent full precision 7B with ollama, and integrate through LangChain to run inference on this local model, using prompts to perform Python function calls to the Yahoo Finance API to retrieve stock prices, etc. We need a few dependencies for this: LangChain with the ollama integration, and the Yahoo Finance package. Let's import all of those. Next, we need to make sure we have the model locally. I've already done this, so it should be quick. Let's verify it's there. Yes, it's 5.4 gigs. Good.
We should be able to run this locally now. Let's run this cell. Good. Now, let's look at the functions we'd like the model to perform. We have a prompt with four primary functions: checking the last price of a specified stock, finding the name of a company's CEO, finding what a company does, and answering specific questions about a company. The model should use the appropriate function based on the user query. The four functions are `getStockPrice`, `getCEOName`, `getCompanySummary`, and `answerGeneralQuestion`. `getStockPrice` takes the name of a company. For example, "What's the last closing price of McDonald's?" The model should automatically find the stock symbol, the ticker code, and call the relevant Yahoo Finance API to return the last closing price for McDonald's, which might be $250. `getCEOName` works similarly. I'll pass a company name, such as McDonald's, and the model should figure out the ticker code, call the appropriate Yahoo Finance API, and retrieve the name of the CEO, printing something like, "The CEO of McDonald's is [Name]." `getCompanySummary` also follows the same pattern. I'll pass a company name, and the model should output the ticker code, call the API, extract the long summary describing the company's activities, and print that out. The final function, `answerGeneralQuestion`, is a catch-all. If the model does not detect any of the three specific intents in the prompt, it will use this function to answer the question with its built-in knowledge. The instructions are simple: if the user asks about a stock price, use `getStockPrice`; if about a CEO name, use `getCEOName`, etc. This is my prompt. I didn't provide a lot of examples, just a few, to keep the prompt shorter and the calls faster. Now, let's look at how we're going to run these prompts through the model. We'll invoke the model with a function called `LLMPAC`. This function will take our input, including the system prompt and the user query, and pass it to the model. 
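A sketch of what the system prompt and one of the tools might look like. The prompt wording and the helper name are my own, not the notebook's; the only real dependency is the `yfinance` package, whose import is deferred inside the function so the sketch loads even without it installed:

```python
SYSTEM_PROMPT = """You are a financial assistant with four functions:
- getStockPrice(company): last closing price of the company's stock
- getCEOName(company): name of the company's CEO
- getCompanySummary(company): what the company does
- answerGeneralQuestion(question): anything else, answered from your own knowledge
Reply with a single function call, e.g. getStockPrice("McDonald's")."""

def get_stock_price(ticker: str) -> float:
    """Fetch the last closing price for a ticker symbol via yfinance."""
    import yfinance as yf  # deferred import: only needed when the tool actually runs
    history = yf.Ticker(ticker).history(period="1d")
    return float(history["Close"].iloc[-1])
```

In the demo, the model itself maps "McDonald's" to the ticker "MCD" before a helper like this is called.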
The model will return the name of the function to call, and we'll extract and run that function locally. The model will not run the function; it will return the function call, which we extract and execute. To explain this function, let's use another model, Llama-Spark, our iteration on Llama 3.1 8B. Let's run this and ask it to explain what the code does. Wonderful. Now we have a clear, line-by-line explanation. It's a great alternative to GitHub Copilot. If you don't want to pay for Copilot, why not use this instead? Let's improve the notebook with some documentation. Now, let's run some examples and print out the responses, along with the API calls. "What's the stock price for Caterpillar?" The model figured out that "CAT" is the ticker code for Caterpillar, called the API, and got the job done. Let's try another: "Who runs Caterpillar?" This is a fuzzier question, but the model matched it to the CEO name function. Running a company generally means being the CEO, and the model extracted the information accordingly. Next, "What does Caterpillar build?" This is a general question about the company, so the model used the `answerGeneralQuestion` function. Finally, "Who are the main competitors of Caterpillar?" This is another general question, and the model used the `answerGeneralQuestion` function, listing a bunch of companies that seem to make sense. Let's try another company. "What's the stock price for Tesla?" "Who runs Tesla?" The model correctly identified Elon Musk as the CEO, with his full job title of Co-Founder and Technoking. "Do they have any competition?" The model listed a bunch of automakers, which makes sense. That's pretty much what I wanted to show you in this video: the Arcee Agent model and a simple demo on using it to generate Python API calls. Keep in mind, you can do much more, such as SQL and other tasks. Keep exploring, and I'll come back to agent models in future videos. Thank you for watching, and you know what to do. Keep rocking.
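The extract-and-execute step can be sketched with a small regex dispatcher. The function names match the ones the prompt advertises, but the helper implementations here are illustrative stubs standing in for the notebook's real Yahoo Finance calls:

```python
import re

# Illustrative stubs: the real versions would call the Yahoo Finance API.
def get_stock_price(company): return f"(stock price lookup for {company})"
def get_ceo_name(company): return f"(CEO lookup for {company})"
def get_company_summary(company): return f"(summary lookup for {company})"
def answer_general_question(question): return f"(general answer for {question})"

# Map the function names the model emits to local Python callables.
DISPATCH = {
    "getStockPrice": get_stock_price,
    "getCEOName": get_ceo_name,
    "getCompanySummary": get_company_summary,
    "answerGeneralQuestion": answer_general_question,
}

# Matches a call with a single quoted argument, e.g. getCEOName("Tesla").
CALL_RE = re.compile(r'(\w+)\(\s*"([^"]*)"\s*\)')

def run_model_call(model_output: str) -> str:
    """Parse a function call out of the model's output and execute it locally."""
    match = CALL_RE.search(model_output)
    if match is None:
        raise ValueError(f"No function call found in: {model_output!r}")
    name, arg = match.groups()
    return DISPATCH[name](arg)
```

For example, if the model replies `getCEOName("Tesla")`, the dispatcher runs the local `get_ceo_name` helper with `"Tesla"` as its argument; the model never executes anything itself.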

Tags

Arcee Agent, API Integration, Stock Market Analysis, Machine Learning Models, Local Model Execution

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.