Large Language Model (LLM)

Also known as: LLM, foundation model, frontier model

A large language model (LLM) is a neural network trained on vast amounts of text to predict the most likely next unit of text (a token) given everything before it. By doing that one task at enormous scale, it learns to generate fluent language, answer questions, write code, and follow instructions. An LLM holds no memory between requests and no live access to your data unless you supply both at runtime, in the prompt.

How it actually works (and the part the demos skip)

Strip away the marketing and an LLM does one thing: given a sequence of tokens, it outputs a probability distribution over the next token, samples one from it, appends it, and repeats. Training is self-supervised — the model is shown ordinary text with the next token hidden, scored on how close its guess was, and corrected across billions of examples until its predictions track real language. There is no lookup table of facts inside it. What looks like knowledge is statistical regularity compressed into the weights. A sampling setting (often called temperature) controls how sharply it favors the single most likely token over a plausible alternative, which is why the same prompt can return different answers. Three consequences follow directly from this design, and they explain almost every LLM behavior you will fight with in production:

It is stateless. The model remembers nothing between calls. Every 'conversation' is the full history re-sent in the prompt each turn. That context window is finite because attention cost rises with its length, and every token in it is metered — so longer memory means higher latency and higher cost, not just more recall.
It is frozen in time. The weights stop changing when training ends. Anything after the cutoff — or anything that lives only in your CRM, your tickets, your contracts — it cannot know unless you place it in the prompt at runtime.
It is confidently approximate. Because it samples plausible text, it will produce fluent, well-formed answers that are wrong (a 'hallucination'). Fluency is not evidence of accuracy. A correct paragraph and a fabricated one come out of the exact same mechanism.

Why it matters for buyers and analysts

The executive takeaway is counterintuitive: in most business systems, the LLM is the least important and most swappable component. Raw model capability is largely a commodity that resets across vendors every few months, and it is usually the cheapest part of the system to change — often a one-line config swap. What is not commodity, and what actually decides whether the system returns value, is everything around the model: the data you feed it, the tools you let it act through, the guardrails on what it is allowed to do, and the measurement of whether its output moved a real number. A frontier model on stale or missing data loses to a modest model on clean, well-governed data, every time. There is a quiet cost lever here too — a smaller model that clears your quality bar can run several times cheaper and faster than the largest one, and at scale that gap is the difference between a pilot and a P&L line. So 'which model should we use?' is rarely the question that determines ROI. 'What does the model see, what can it do, and how do we know it worked?' is.

Where it fits in an AI system

An LLM on its own is a text generator, not a product. Useful systems wrap it. Retrieval (RAG) pulls the right facts from your data and places them into the prompt so answers are grounded in records instead of guessed from training. Tools and function-calling let the model trigger actions — query a record, file a case, send a quote — turning a writer into something that does work. Orchestration decides which model, which data, and which tools handle each step, and where a human signs off. The LLM is the reasoning layer in that stack; the value, and the liability, live in the connections to your systems and the discipline around them. That is the practical line between a clever demo and a system someone is accountable for in production — and it is where most of the real engineering goes.

LLM alone: drafts, summarizes, classifies — but invents specifics and knows nothing about your business.
LLM + your data (retrieval): grounded answers that cite the actual record instead of a plausible guess.
LLM + tools (agent): not just answers but actions inside your systems, which is exactly where outcomes and risk both concentrate.
LLM + measurement: the only configuration where you can tell whether any of it earned its keep — and the one most teams skip.

Frequently asked

Is a bigger LLM always better?

No. Larger models reason more capably but cost more, respond slower, and still hallucinate and still don't know your private data. For most business tasks, accuracy and ROI are capped by the data and guardrails around the model, not by the model's size. The right move is usually the smallest model that clears the quality bar on your specific task, not the largest one on the menu.

What is the difference between an LLM and an AI agent?

An LLM generates text. An AI agent uses an LLM as its reasoning engine but adds memory, access to your data, and tools it can act through — so it can complete a multi-step job rather than just answer a question. The LLM is one component inside the agent; the agent is the thing that does work in production and that someone has to own when it goes wrong.

Why does an LLM make things up?

Because it predicts plausible text rather than retrieving verified facts. When it lacks the information, it does not stop — it produces a fluent, well-formed guess. The fix is not a better model alone. It is grounding the model in your real data at runtime and constraining what it is allowed to assert and which actions it is allowed to take. If you want help drawing that line for your own systems, that is the kind of work SkySync does — you can start at /start.

Related terms

AI Agent

An AI agent is software that pursues a goal by deciding its own next steps: it reasons with a language model, calls tools or APIs to act on the world, reads the result, and decides again, looping until the goal is met or it stops. Unlike a chatbot that only returns text, an agent takes actions. Unlike a fixed automation, it chooses the path at runtime instead of following one you scripted in advance.

Agentforce

Agentforce is Salesforce’s platform for building AI agents — software that reasons over your business data, makes decisions, and takes actions inside Salesforce, governed by your existing permissions and audit trail. Unlike a chatbot that only replies, an agent can complete a task end to end.

Browse the full glossary

Ready when you are

Worth a
conversation?

Tell us one number you'd like AI to move. We'll show you how we'd do it, what it's worth, and how we'd tie our fee to getting you there.

Book a call hello@skysync.nyc