ServiceNow’s annual Knowledge conferences are always an eventful time, but 2025 was a particularly interesting year. Hot on the heels of several new acquisition announcements, we’ve now also had the announcement of the AI Control Tower and AI Agent Fabric. These are just some of the exciting AI-related announcements that have become increasingly common in the ServiceNow ecosystem.
In this piece, we’re taking a deep dive into another of these announcements: An exciting new Agentic AI model from ServiceNow and NVIDIA. But what is it? What is it designed to achieve, and how does it compare against the growing number of competing models? Here’s everything you need to know.
What Is Apriel Nemotron 15B?
Apriel Nemotron 15B is a new large language model (LLM) developed through a collaboration between ServiceNow and NVIDIA. It was officially announced on May 6, 2025, at the ServiceNow Knowledge conference and has now been released for general availability.
ServiceNow describes the new product as a ‘high-performance reasoning model’, designed to understand context, solve complex problems, and make autonomous decisions.
In this sense, it’s very similar to competing agentic AI models like QWQ-32b, EXAONE-32b, and O1-mini. Indeed, the technical performance is fairly similar across all these products – as we’ll discuss in more detail below.
But what really sets Apriel Nemotron 15B apart is its emphasis on speed, efficiency, and low latency. To oversimplify, the goal of this model is not to introduce ground-breaking new AI technology to the market. Instead, it aims to help organizations build agentic AI into their workflows and effectively operate the technology at scale.
The model will be integrated into a range of existing ServiceNow tools and functions. In the meantime, an open-source version is available on Hugging Face, so developers can download, play around with it, and integrate it into their own workflows as they choose.
ServiceNow + NVIDIA: A Partnership in AI
“With this new Apriel Nemotron 15B reasoning model, we’re powering intelligent AI agents that can make context‑aware decisions, adapt to complex workflows, and deliver personalized outcomes at scale.
“But the model is just one part of the innovation. Our collaboration building a data flywheel – powered by Workflow Data Fabric and NVIDIA NeMo – enables a virtuous cycle of learning and improvement. This helps us build AI agents that are contextually aware, deeply personalized, and aligned to the real‑time needs of the enterprise.”
Jon Sigler, Vice President of Generative AI Software, NVIDIA
The release of this model is just the latest stage in the NVIDIA and ServiceNow collaboration – which was first announced back in 2023.
ServiceNow, of course, needs no introduction. But for the average reader, NVIDIA might well be a new name. The company is an AI research and infrastructure provider that offers a range of AI models. In particular, the company focuses on industry-specific AI products designed for complex sectors like manufacturing, gaming, and the public sector.
This partnership is fundamental to the new Apriel Nemotron 15B model, since the technology was built by NVIDIA. Essentially, ServiceNow brings workflow data and enterprise domain knowledge, while NVIDIA contributes the infrastructure (via DGX Cloud) and the LLM tooling (via NeMo).
Apriel Nemotron 15B in Detail
“Together with ServiceNow, we’ve built an efficient, enterprise-ready model to fuel a new class of intelligent AI agents that can reason to boost team productivity.
“By using the NVIDIA Llama Nemotron Post-Training Dataset and ServiceNow domain-specific data, Apriel Nemotron 15B delivers advanced reasoning capabilities in a smaller size, making it faster, more accurate, and cost-effective to run.”
Kari Briski, Vice President of Generative AI Software, NVIDIA
The new release aims to resolve a particular problem with recent developments in agentic AI: Many high-performing LLMs are simply too resource-intensive for the average enterprise to deploy at scale.
This is where Apriel Nemotron 15B really stands ahead of the pack: It delivers similarly advanced functionality as competing models, with roughly half of the processing power required. The small size of the model allows it to deliver fast responses with reduced inference costs – without compromising the depth and accuracy of reasoning.
The model is squarely aimed at enterprises, particularly those with automated workflows and complex decision trees, who need reliable, low-latency reasoning AI. It’s particularly well-suited for code assistance and generation, logical reasoning, Q&A functionality, and multi-step tasks.
Nonetheless, a caveat: agentic AI remains an emerging technology. It’s important to be careful of what decisions you delegate to AI and what the consequences are of any errors it may make. Put simply, this technology still isn’t robust enough for anything that has particularly high safety, compliance, or security implications.
How Does it Shape Up? Technical Comparisons
Apriel Nemotron 15B is just one of several Agentic AI models to have been released in recent years. In fact, several models now offer similar functionality, including those from organizations you’ll almost certainly have heard of, such as:
So we already know that Apriel Nemotron 15B offers significant improvements in terms of size and processing power. In fact, it’s about half the size of QWQ-32b and EXAONE-32b, and consumes roughly 40% fewer tokens than QWQ-32b. When it comes to processing costs, latency, and speed of response, there’s really no competition here.
But how does it compare in terms of performance? To see this in more detail, we can consult the detailed technical comparisons available on Hugging Face.
Here, the model has been tested against four other agentic AI models (the three above and another previously-released model from NVIDIA), across seven main areas. For those interested in the details, I’ll explain these seven areas at the bottom of the piece (see appendix). But TL;DR: Each is a test of the AI’s effectiveness in a different area, such as reasoning, coding, or mathematics.
As you can see in the image, there is very little meaningful difference in these areas between Apriel Nemotron 15B and its three main competitors. The only notable outlier is NVIDIA’s earlier model, which underperforms on most metrics. That’s to be expected from a less recent model, so it can be discounted here.

If you consider performance alone, there’s no particular reason to choose Apriel Nemotron 15B over its three main competitors – or vice versa. Some are marginally better at some tests, but the difference overall is fairly minuscule across the board.
But, again, outperforming the competition on performance isn’t really the goal. On the basis of its processing power, the new model clearly stands head and shoulders above the competition.
Final Thoughts: Where to Start With Apriel Nemotron 15B
For developers and other associated AI geeks, the model is open source and available for free from Hugging Face. Otherwise, you can expect to see Apriel Nemotron 15B powering ServiceNow AI products and features over the coming weeks and months.
Appendix: Definitions of Agentic AI Model Tests
- Mostly Basic Python Programming (MBPP): Benchmarks how effectively the model can write code.
- Business Function Call Language (BFCL): Evaluates whether the model can correctly trigger specific tools or systems (e.g., reports or logging issues), based on natural language commands.
- Retrieval-Augmented Generation (Enterprise RAG): Measures how effectively the model can fetch and use internal business documents to generate accurate and useful responses.
- Multi-Turn Benchmark (MT Bench): Checks how naturally and accurately the model can carry out longer, multi-step conversations, similar to customer service or complex internal queries.
- MixEval: Assesses the model’s general abilities across a wide range of tasks, including mathematics, instruction-following, and problem-solving, giving a broad measure of overall competence.
- Instruction-Following Evaluation (IFEval): Tests how well the model can understand and carry out tasks based purely on natural-language instructions, especially for practical enterprise use cases.
- MultiChallenge: Combines multiple difficult benchmarks to see how the model handles a wide variety of reasoning, language, and tool-using tasks all at once.