Defining AI Workloads
The vocabulary around AI has gotten messy. The same word often means three different things depending on who is using it, and most of these categories overlap. Below is a short reference for the workload types that come up most often in 2026, with a definition, a bit of context, and one example for each.
Foundation Models
Large models trained on broad data that can be adapted to many downstream tasks through fine-tuning or prompting.
The term was coined in 2021 by researchers at Stanford's Center for Research on Foundation Models and names the parent category of general-purpose models. LLMs are the text version, but image, audio, and multimodal foundation models also exist. Every other model type in this section sits under this umbrella.
Examples: GPT-5, Claude Opus 4.7, Stable Diffusion, Whisper.
LLMs (Large Language Models)
Neural networks trained on massive text corpora to predict the next token. They are the engine behind almost every text-based AI product today.
LLMs split into two camps based on how they are deployed. Proprietary models (sometimes called first-party or closed-source) are built by a single lab and consumed via the vendor's cloud API, which is fast to integrate but means data leaves your environment.
Open-weight models (sometimes called third-party, in the sense that anyone can take and run them) publish their weights, so teams can self-host them in their own cloud or on on-premises hardware for privacy, cost control, or fine-tuning. Most enterprises end up using both: proprietary models for hard or low-volume work, open-weight models for high-volume or sensitive workloads.
Examples: GPT-5, Claude, Gemini (proprietary, cloud-only); Llama 4, Mistral, DeepSeek-V3, Qwen 3 (open-weight, self-hostable).
SLMs (Small Language Models)
Language models small enough to run on a laptop, phone, or edge device, typically under 10 billion parameters.
They trade raw capability for speed, cost, and the ability to run without sending data to the cloud, which matters for privacy-sensitive or offline use cases. Many production systems route easy queries to an SLM and only escalate to a frontier model when needed.
Examples: Phi-3 Mini (3.8B), Llama 3.2 1B, Gemma 2 2B.
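The routing pattern described above, where easy queries go to an SLM and harder ones escalate to a frontier model, can be sketched roughly as follows. This is a minimal illustration, not how any particular product does it: the `route_query` function, the `HARD_MARKERS` list, and the 30-word threshold are all hypothetical, and real routers often use a trained classifier or the SLM's own confidence instead of keyword rules.

```python
from dataclasses import dataclass

# Hypothetical keywords that hint a query needs a stronger model.
HARD_MARKERS = ("prove", "refactor", "step by step", "analyze")


@dataclass
class Route:
    tier: str    # "slm" or "frontier"
    reason: str  # human-readable explanation, useful for logging


def route_query(query: str, max_easy_words: int = 30) -> Route:
    """Pick a model tier for a query using simple, assumed heuristics."""
    text = query.lower()

    # Long prompts are treated as hard (assumed threshold).
    if len(text.split()) > max_easy_words:
        return Route("frontier", "query is long")

    # Escalate when the query contains a marker of multi-step work.
    for marker in HARD_MARKERS:
        if marker in text:
            return Route("frontier", f"contains '{marker}'")

    # Everything else stays on the cheap, local model.
    return Route("slm", "short query with no hard markers")
```

Calling `route_query("What time zone is Berlin in?")` keeps the request on the SLM, while a query containing "prove" escalates. Returning a reason alongside the tier makes it easier to audit why traffic was escalated.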
Frontier Models
The most capable AI models that exist at any given moment, sitting at the leading edge of what AI can do.
"Frontier" describes a moving line, not a fixed tier. As new models ship, the frontier moves with them, so today's frontier model will be mid-tier within a couple of years. Building one requires hundreds of millions of dollars in compute and a team capable of scaling it, which keeps the field down to a handful of labs. Almost every other AI product on the market is built on top of one.
Current examples include Claude Opus 4.7, GPT-5, and Gemini 3 Pro.
Reasoning Models
A subset of frontier models that spend extra compute at inference time working through a problem before answering.
They usually generate an internal chain of thought the user does not see. That makes each query slower and more expensive than with a standard model, but it produces better results on math, multi-step logic, and complex code work. Most labs now ship a reasoning variant alongside a faster general-purpose one.
Examples: OpenAI o-series, Claude's extended thinking mode, Gemini Deep Think.
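To make the cost difference concrete, here is a rough per-query cost calculation. It assumes hidden reasoning tokens are billed at the same rate as visible output tokens, which is a common arrangement but should be checked against the vendor's pricing page. The function name, token counts, and prices are illustrative assumptions, not real figures.

```python
def query_cost(
    input_tokens: int,
    visible_output_tokens: int,
    reasoning_tokens: int,
    price_in_per_mtok: float,
    price_out_per_mtok: float,
) -> float:
    """Estimate the cost of one query in currency units.

    Assumption: hidden reasoning tokens are billed as output tokens.
    Prices are expressed per million tokens.
    """
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * price_in_per_mtok + billed_output * price_out_per_mtok
    return total / 1_000_000


# Illustrative numbers only: same prompt and answer length,
# with and without 4,000 hidden reasoning tokens.
standard = query_cost(1_000, 500, 0, price_in_per_mtok=2.0, price_out_per_mtok=10.0)
reasoning = query_cost(1_000, 500, 4_000, price_in_per_mtok=2.0, price_out_per_mtok=10.0)
print(f"standard:  {standard:.4f}")
print(f"reasoning: {reasoning:.4f}")
```

With these made-up inputs the reasoning call costs several times more than the standard call even though the visible answer is the same length, because the hidden tokens dominate the bill.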
Generative AI (Gen-AI)
AI that produces new content such as text, images, audio, video, or code, rather than only classifying or predicting from existing data.
Almost every category in this list is technically a subset of Gen-AI, which makes the term more useful as a market description than a precise technical one. It is the layer most non-technical buyers refer to when they say "AI".
Examples: Midjourney (images), ElevenLabs (voice), Suno (music), Runway (video).
Chatbots
A conversational interface where the user drives every turn and the model replies in natural language.
Modern chatbots run on top of frontier models and often include memory, file upload, and tool use, but the design pattern is reactive: the assistant waits for input and responds, rather than going off and completing tasks on its own.
Examples: ChatGPT, Claude.ai, Gemini, Microsoft Copilot.
SaaS AI Agents
AI agents built into a SaaS platform that act on data inside that platform to complete business workflows without per-step human prompting.
They can qualify leads, resolve support tickets, update records, or trigger workflows on their own. This category is reshaping SaaS pricing, with vendors moving from per-seat licensing to per-action or per-outcome billing.
Examples: Salesforce Agentforce, HubSpot Breeze, ServiceNow AI Agents.
Embedded AI Agents
AI features shipped inside another product, where the agent lives within the host application rather than as a destination of its own.
The end user often does not know or care which model is powering the feature. The key distinction from SaaS agents is that these are delivered as a capability inside someone else's UI, not as a configurable workflow tool.
Examples: Notion AI inside Notion, GitHub Copilot inside an IDE, Adobe Firefly inside Photoshop.
Desktop / Endpoint AI Agents
AI agents that run on or control a user's local device by operating the OS the way a person would: clicking, typing, and opening apps.
The advantage is access to local files and any installed software, not just the web. The trade-off is security: an agent with full desktop access can do real damage if it misfires, which is why most products today operate in supervised or sandboxed modes.
Examples: Anthropic's Claude with computer use (an API capability that lets Claude operate a desktop via screenshots and simulated keyboard and mouse), Anthropic's Cowork desktop app, Microsoft Copilot on Windows.
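The supervised mode mentioned above can be pictured as a gate in front of the agent's actions: low-risk actions run directly, and risky ones wait for a human decision. The sketch below is a simplified illustration, not any vendor's implementation. The `Action` type, the `RISKY_KINDS` set, and the `approve` callback are hypothetical, and the "execution" step only records a log line instead of touching the OS.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "open_app", "delete_file"
    target: str  # what the action applies to


# Hypothetical set of action kinds that require human approval.
RISKY_KINDS = {"delete_file", "run_shell"}


def run_supervised(
    actions: List[Action],
    approve: Callable[[Action], bool],
) -> List[str]:
    """Run proposed actions, asking `approve` before any risky one.

    Returns a log of what happened. Nothing is really executed here;
    a real agent would dispatch to OS automation at the marked line.
    """
    log: List[str] = []
    for action in actions:
        if action.kind in RISKY_KINDS and not approve(action):
            log.append(f"blocked {action.kind} {action.target}")
            continue
        # Placeholder for the real side effect (click, keystroke, etc.).
        log.append(f"executed {action.kind} {action.target}")
    return log
```

Passing `approve=lambda a: False` gives a fully locked-down run in which every risky action is blocked, while an interactive prompt in `approve` gives the human-in-the-loop behaviour.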
Agentic Browsers
Web browsers with a built-in AI agent that can read pages, navigate across tabs, and execute multi-step tasks on the user's behalf.
They are similar to desktop agents but scoped to the browser, which is easier to instrument and generally safer to run. Most also carry memory across sessions, turning the browser itself into a workspace.
Examples: ChatGPT Atlas, Perplexity Comet, Claude for Chrome.
Coding Agents
AI agents specialised for software development tasks such as reading codebases, writing patches, running tests, and opening pull requests.
They span a spectrum from inline IDE assistants that suggest code as you type to fully autonomous agents that take a ticket and ship a PR without supervision. Most engineering teams use more than one because different shapes of work suit different autonomy levels.
Examples: Cursor (in the IDE), Claude Code (in the terminal), Devin (autonomous in the cloud).
Voice AI Agents
AI agents that hold real-time spoken conversations over phone or voice channels, combining speech-to-text, a language model, and text-to-speech under tight latency budgets.
They are used heavily for customer support, outbound sales, and appointment scheduling. The hard problems are not linguistic; they are interruption handling, accent robustness, and sub-second turn-taking.
Examples: ElevenLabs Conversational AI, Vapi, Bland.
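A latency budget for the speech-to-text, language model, and text-to-speech pipeline can be reasoned about as a simple sum per conversational turn. The sketch below assumes the stages run strictly one after another, which is a simplification because real systems stream and overlap them. The stage timings and the 1,000 ms budget are illustrative assumptions, not measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StageLatency:
    name: str
    ms: float  # assumed time this stage adds to one turn


def turn_latency(stages: List[StageLatency]) -> float:
    """Total latency of one turn if stages run sequentially."""
    return sum(stage.ms for stage in stages)


def within_budget(stages: List[StageLatency], budget_ms: float = 1000.0) -> bool:
    """Check whether a turn fits inside the (assumed) latency budget."""
    return turn_latency(stages) <= budget_ms


# Illustrative pipeline: every number here is made up.
pipeline = [
    StageLatency("speech-to-text", 200.0),
    StageLatency("language model", 450.0),
    StageLatency("text-to-speech", 250.0),
]
print(turn_latency(pipeline), within_budget(pipeline))
```

Laying the stages out this way makes it clear where the budget is being spent, and why swapping in a slower model can push a turn over a sub-second target even when the other stages are unchanged.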
