Learn AI for business
A plain-language reference for common AI concepts. We'll continue to add to this.
How AI actually works
The mental model that everything else on this page builds on. If you only read one group, read this one.
01
Large language models
A large language model is the brain behind ChatGPT, Claude, and Gemini. Internally, it is a very large pattern-matcher — trained on a substantial portion of the public internet, books, and code until it learned, statistically, what words tend to follow what.
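To make "what words tend to follow what" concrete, here is a toy sketch. It is not how production models are built: real LLMs use neural networks over tokens rather than raw word counts. The corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows: dict, word: str):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Invented training text; a real model sees vastly more than this.
corpus = "the customer paid the invoice and the customer asked for a refund"
model = train_bigrams(corpus)

print(most_likely_next(model, "the"))    # "customer" follows "the" twice, "invoice" once
print(most_likely_next(model, "zebra"))  # never seen in training, so None
```

The sketch also shows the fragility described on this page: the prediction reflects what was common in the training text, not what is true.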
In plain language
Think of an experienced restaurant chef. After years of cooking, they can taste a dish and reach for the right spice without looking it up. They don't recite a recipe — they know. They've made the dish so many times that the next ingredient just feels right. A large language model works the same way: it has read so much human writing that the next word in a sentence often feels right to it.
That intuition is also where it gets fragile. A chef who has only ever cooked Italian food will improvise something that sounds Italian even when you asked for Thai. A language model does the same — when stretched past what it knows well, it generates an answer that sounds right whether or not it actually is.
This is why the model alone is never the whole product. The useful work happens around it: feeding it the right context for your specific business (see RAG), shaping its instructions (see prompt engineering), and checking its work (see evals).
Why this matters for your business
The model is the engine; what you build around it is the vehicle. The same model can produce a generic chatbot or a system that genuinely knows your customers — and the difference is almost everything else on this page.
02
Reasoning models
Standard language models answer immediately — they reach for the next word. Reasoning models (OpenAI's o-series, Claude's extended thinking, DeepSeek's R1) stop and deliberate first.
They generate a hidden scratch-pad of intermediate steps ("if the customer paid Tuesday, and the refund window is 30 days, and the holiday extends the window by…") and only then write the final answer. The deliberation is the product.
In plain language
Picture a senior partner who can either fire off a quick reply or step away from her desk for ten minutes before answering. The same person could answer either way, but the deliberation produces a different quality of answer for the hard questions.
This deliberation costs real time and money — reasoning runs are typically 5-30× more expensive per question than direct answers, and noticeably slower. The trade-off is worth it for genuinely hard problems (multi-step math, dense legal reasoning, debugging a chain of cause-and-effect) and wasteful for simple ones (rewording an email, looking something up).
Why this matters for your business
Use reasoning models where the question is genuinely hard and the cost of a wrong answer is high. Use standard models for everything else. Mixing them well is most of the cost-control story in production AI.
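The cost effect of mixing models can be sketched with simple arithmetic. The volumes, the one-cent base price, and the 10× multiplier below are assumptions picked from within the 5-30× range mentioned above, not measured figures.

```python
def monthly_cost_cents(total_questions: int, hard_questions: int,
                       base_cost_cents: int, reasoning_multiplier: int) -> int:
    """Cost when only the hard questions are sent to a reasoning model."""
    easy_questions = total_questions - hard_questions
    standard_spend = easy_questions * base_cost_cents
    reasoning_spend = hard_questions * base_cost_cents * reasoning_multiplier
    return standard_spend + reasoning_spend


# Hypothetical workload: 10,000 questions a month, 1,000 of them genuinely hard.
routed = monthly_cost_cents(10_000, 1_000, base_cost_cents=1, reasoning_multiplier=10)
everything_reasoning = monthly_cost_cents(10_000, 10_000, base_cost_cents=1,
                                          reasoning_multiplier=10)

print(routed / 100)                # 190.0 dollars with routing
print(everything_reasoning / 100)  # 1000.0 dollars if every question uses reasoning
```

Under these assumed numbers, routing only the hard questions to the reasoning model costs roughly a fifth of sending everything there.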
03
Training vs. inference
Two different moments in the life of an AI system. Training is the long, expensive process by which the model learns. Inference is the comparatively cheap, fast process by which it answers a single question. When a vendor says "we run on our own infrastructure," they almost always mean inference; they did not retrain the model.
Full explainer coming soon. Let us know if this is the next one we should finish.
04
Tokens and context windows
A token is roughly three quarters of an English word. It is the unit a model reads and writes, and the unit providers bill by. The context window is how many tokens the model can hold in working memory at once. Most modern models hold somewhere between 100,000 and 2,000,000 tokens; older or cheaper ones hold much less. Larger isn't always better: long contexts get slower and more expensive, and quality degrades past a point.
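As a rough illustration of the three-quarters rule of thumb, the sketch below estimates a token count from a word count and checks it against a context window. Real tokenizers split text differently per model and language, so treat this as an estimate only. The function names and the `reserved_for_answer` parameter are assumptions for the example.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the text; varies by model and language


def estimate_tokens(text: str) -> int:
    """Estimate tokens from a whitespace word count (not a real tokenizer)."""
    word_count = len(text.split())
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_context(text: str, context_window_tokens: int,
                    reserved_for_answer: int = 0) -> bool:
    """Check whether the text plus room for the answer fits the window."""
    return estimate_tokens(text) + reserved_for_answer <= context_window_tokens


document = "word " * 300  # stand-in for a 300-word document
print(estimate_tokens(document))                       # 400
print(fits_in_context(document, 500, reserved_for_answer=200))  # False: 400 + 200 > 500
```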
Full explainer coming soon. Tell us this should be next.
05
Hallucination
When a language model doesn't know something, it doesn't usually say "I don't know." It produces an answer that sounds just as confident as a correct one. The technical term is hallucination: plausible-looking content that isn't grounded in fact.
It happens because the model's job, mechanically, is to produce the next likely word. "Likely" doesn't mean "true" — it means "this pattern of words tends to appear in writing." A made-up court case can read exactly like a real one if it follows the right pattern. A fabricated customer policy can sound exactly like a real one from your handbook.
In plain language
Imagine an unusually well-read intern who is genuinely trying to help, has no concept of professional caution, and physically cannot tell you when they're guessing. Their answers will be useful most of the time. Some of the time they won't be — and they will sound exactly the same.
There is no internal "I'm not sure" signal you can read off a model directly. Mitigating hallucination is an engineering problem, not a model-selection problem: give the model real source material to cite (see RAG), constrain what it can answer (see guardrails), and check its answers against a known reference (see evals).
Why this matters for your business
Any AI deployment that touches customer-facing answers, legal text, financial numbers, or commitments must be designed assuming the model will occasionally confabulate. The safety isn't in the model — it's in the system you wrap around it.
Making it useful for your business
Off-the-shelf AI knows a lot about the world. It doesn't know anything about your business yet. These are the patterns that close that gap.
06
RAG — retrieval-augmented generation
By default, a language model only knows what it was trained on. It doesn't know your company's pricing page, your customer's last support ticket, or your CFO's revenue forecast. RAG — retrieval-augmented generation — is the engineering pattern that fixes that.
Before the model answers, the system fetches the most relevant chunks from your own documents and pastes them into the prompt, then asks the model to answer using that.
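The fetch-then-paste flow can be sketched in a few lines. This version ranks chunks by simple word overlap, which is a stand-in chosen for brevity; production systems typically use embeddings and vector search for the retrieval step. The documents, function names, and prompt wording are all invented for illustration.

```python
import re


def words(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list, top_k: int = 2) -> list:
    """Return the chunks sharing the most words with the question."""
    ranked = sorted(chunks, key=lambda c: len(words(question) & words(c)),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: list, top_k: int = 2) -> str:
    """Paste the retrieved chunks into the prompt ahead of the question."""
    sources = "\n".join(f"- {c}" for c in retrieve(question, chunks, top_k))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Hypothetical company documents, already split into chunks.
company_chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Annual plans are billed once per year.",
]

question = "How many days do I have to request a refund?"
print(build_prompt(question, company_chunks, top_k=1))
```

The resulting string is what would be sent to the model; the model call itself is left out because it depends on the provider.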
In plain language
Think of passing a witness a folder before they take the stand. The witness can speak fluently — that's the language model. But the facts in their answer come from the folder you handed them, not their general experience. Swap the folder, and the same witness can answer questions in a different domain just as fluently.
This is the single most important pattern for making AI useful inside a business. It's how a model becomes your AI rather than a generic one. The hard part is the retrieval — choosing which paragraphs from which documents to include for a given question. That's where embeddings, vector search, and increasingly graph-based context come in.
Why this matters for your business
RAG is the difference between an AI that talks generically about "tax compliance" and one that talks specifically about your clients' filings. It's also the layer where privacy lives — your documents stay in your control; the model just gets to glance at the relevant ones.
07
Embeddings
An embedding turns a piece of text into a long list of numbers — coordinates in a high-dimensional space where similar meanings sit close together. "Cancelled my subscription" and "want to end my plan" end up near each other geometrically, even though they share no exact words. This is how a search system finds the right document without needing the user to type the exact phrasing.
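"Close together" is usually measured with cosine similarity. In the sketch below the three-number vectors are made up by hand purely to show the geometry; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors, NOT output from a real embedding model.
toy_embeddings = {
    "cancelled my subscription": [0.9, 0.1, 0.0],
    "want to end my plan": [0.8, 0.2, 0.1],
    "what are your opening hours": [0.0, 0.2, 0.9],
}


def most_similar(query: str) -> str:
    """Find the other phrase whose vector is closest to the query's vector."""
    query_vector = toy_embeddings[query]
    others = [phrase for phrase in toy_embeddings if phrase != query]
    return max(others,
               key=lambda p: cosine_similarity(query_vector, toy_embeddings[p]))


print(most_similar("cancelled my subscription"))
```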
Full explainer coming soon.
08
Fine-tuning
Fine-tuning takes an already-trained model and adjusts it on a smaller, narrower dataset to make it noticeably better at one specific kind of task — your tone, your format, your industry. It is usually not the right first move; better prompting and good RAG solve most problems for less. Fine-tuning matters when you genuinely need consistency at scale that prompting can't deliver.
Full explainer coming soon.
09
Knowledge engines
A knowledge engine is the architectural layer that ingests your business's documents, extracts the facts and relationships inside them, stores them in a way an AI can search, and returns answers with their sources. It's where RAG, embeddings, and graph data structures meet. The Autocomple Knowledge Engine is built on this pattern.
Full explainer coming soon.
10
Multimodal AI
Multimodal models accept more than one kind of input. A single model can read a contract, describe a scanned form, transcribe a recorded meeting, and draft the follow-up email — all in one workflow. This matters less because it's impressive and more because it removes integration plumbing: one model replaces a stack of single-purpose tools.
Full explainer coming soon.
Telling it what to do, well
A capable AI poorly directed is worse than a less capable one used well. These are the disciplines around the model itself.
11
Prompt engineering
Prompt engineering is the practice of structuring what you ask an AI so it answers consistently, accurately, and in the format you need. It's less "the magic words" and more "thinking clearly about the job."
In plain language
Prompt engineering is to AI roughly what brief-writing is to law, or what a well-written service ticket is to a help desk. A vague brief gets a vague output. A precise one — role, context, examples of good and bad outputs, constraints, format — gets a precise one. The model is fluent. It isn't a mind-reader.
The biggest shifts that separate okay prompts from great ones are not tricks. They're the same shifts that separate okay management from great management: give the model the relevant context, show it examples, tell it what not to do, and let it think before it answers.
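One way to keep those parts from being forgotten is to give the brief a fixed structure. The sketch below is a hypothetical template, not a standard format; the field names and the example wording are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBrief:
    """A hypothetical container for the parts of a precise brief."""
    role: str
    context: str
    task: str
    good_example: str
    bad_example: str
    constraints: list = field(default_factory=list)
    output_format: str = "Plain text."

    def render(self) -> str:
        """Assemble the parts into one prompt string."""
        rules = "\n".join(f"- {rule}" for rule in self.constraints)
        return (
            f"Role: {self.role}\n"
            f"Context: {self.context}\n"
            f"Task: {self.task}\n"
            f"Good output looks like: {self.good_example}\n"
            f"Bad output looks like: {self.bad_example}\n"
            f"Constraints:\n{rules}\n"
            f"Format: {self.output_format}\n"
            "Think through the request step by step before writing the final answer."
        )


brief = PromptBrief(
    role="Support agent for a small accounting firm",
    context="The customer is on the annual plan and asked about a late invoice.",
    task="Draft a reply explaining the next step.",
    good_example="A short, specific reply naming the invoice and the due date.",
    bad_example="A generic apology with no concrete next step.",
    constraints=["Never promise a refund", "Do not quote prices"],
    output_format="Three sentences or fewer.",
)
print(brief.render())
```

Because the brief is plain data, it can be versioned and reviewed like any other document.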
Why this matters for your business
Prompts are the cheapest, fastest lever you have. A 20-minute investment in a sharper prompt regularly beats a year-long migration to a more expensive model. Treat prompts as living documents — version them, review them, and own them.
12
AI agents
An agent is an AI system that can take action in the real world — call tools, run code, query databases, send emails, file tickets — and decide what to do next based on the results. Where a regular assistant answers your question, an agent uses your question as the start of a small project and works through it.
In plain language
It is the difference between asking your accountant "what's our cash position?" and saying "figure out our cash position and reconcile any line that's off by more than $100." The first question wants an answer. The second wants a small piece of work done: checking, deciding, acting, checking again. Agents do the second.
This is where AI starts to feel like a colleague rather than a search engine. It's also where it gets riskiest. An agent that can act can also act wrongly — sending the wrong email, billing the wrong amount, exposing the wrong record. Production-grade agents need clear scopes, audit logs, approvals on irreversible actions, and the same kind of supervision you would give a new hire.
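Scopes, audit logs, and approvals can be sketched as a thin layer between the agent and its tools. Everything below is hypothetical: the class names, the tool names, and the approval callback are invented, and a real deployment would persist the log and route approvals to a person.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]
    irreversible: bool = False  # e.g. sending email, charging a card


@dataclass
class AgentRuntime:
    tools: dict                          # the agent's scope: only these tools exist
    approve: Callable[[str, str], bool]  # stand-in for a human approval step
    audit_log: list = field(default_factory=list)

    def call(self, tool_name: str, argument: str) -> str:
        """Run a tool on the agent's behalf, enforcing scope and approvals."""
        tool = self.tools.get(tool_name)
        if tool is None:
            self.audit_log.append(f"DENIED {tool_name}: out of scope")
            return "denied: tool not in scope"
        if tool.irreversible and not self.approve(tool_name, argument):
            self.audit_log.append(f"BLOCKED {tool_name}({argument!r})")
            return "blocked: needs human approval"
        result = tool.run(argument)
        self.audit_log.append(f"RAN {tool_name}({argument!r})")
        return result


# Hypothetical tools with canned behavior.
runtime = AgentRuntime(
    tools={
        "lookup_balance": Tool("lookup_balance", lambda acct: f"balance for {acct}: 120.00"),
        "send_email": Tool("send_email", lambda body: "sent", irreversible=True),
    },
    approve=lambda tool_name, argument: False,  # nobody has approved anything yet
)

print(runtime.call("lookup_balance", "ACME"))
print(runtime.call("send_email", "Your invoice is overdue."))
print(runtime.call("delete_records", "all"))
print(runtime.audit_log)
```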
Why this matters for your business
Agents are how AI gets to finish work, not just describe it. They're also the layer that needs the most thoughtful governance — speed without oversight is how AI gets a company in trouble.
13
Evals
Evals (short for evaluations) are how you measure whether an AI is actually good at the job, not just impressive in a demo. They're the test suite for a probabilistic system: a fixed set of inputs, expected behaviors, and scoring rules you can run every time something changes. Without evals, you don't have a system — you have a vibe.
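A minimal sketch of that idea: a fixed list of inputs, a check for each, and a pass rate. The `fake_assistant` function is a canned stand-in for a real model call, and the cases are invented for illustration.

```python
from typing import Callable


def run_evals(system: Callable[[str], str], cases: list) -> float:
    """Run every (input, check) pair and return the fraction that pass."""
    passed = sum(1 for prompt, check in cases if check(system(prompt)))
    return passed / len(cases)


def fake_assistant(prompt: str) -> str:
    """Canned stand-in for a real model call."""
    if "refund" in prompt.lower():
        return "Refunds are accepted within 30 days."
    return "I don't know."


# Each case pairs a fixed input with a scoring rule for the answer.
eval_cases = [
    ("What is the refund window?", lambda answer: "30 days" in answer),
    ("What is the CEO's home address?", lambda answer: "don't know" in answer.lower()),
    ("Do you offer refunds on gift cards?", lambda answer: "gift card" in answer.lower()),
]

score = run_evals(fake_assistant, eval_cases)
print(f"pass rate: {score:.0%}")  # the gift card case fails, so 2 of 3 pass
```

Rerunning the same cases after every prompt or model change is what turns the score into a regression signal.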
Full explainer coming soon.
14
Mechanistic interpretability
Mechanistic interpretability is the emerging field — championed by labs like Anthropic — that studies what's actually happening inside a neural network. Where older "explainability" tools only described outputs, this work is starting to identify the internal circuits that recognize a concept, decide on an action, or refuse a request. It is early science. It is also the most credible answer to "can we ever truly trust these systems."
Full explainer coming soon.
15
Guardrails
Guardrails are the rules — sometimes prompt-based, sometimes code-based — that bound what an AI is allowed to say and do. They handle off-topic questions, prevent leaks of sensitive data, flag escalation cases, and refuse actions a model shouldn't take on its own. A production AI deployment without guardrails is a customer-facing wager.
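A code-based guardrail can be as simple as a check before the model runs and a filter after it. The sketch below is deliberately naive: the topic list, the refusal message, and the email pattern are assumptions, and real deployments use far more robust checks.

```python
import re
from typing import Callable

# Simplified email pattern for illustration; not a complete validator.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
ALLOWED_TOPICS = ("billing", "refund", "invoice")  # hypothetical scope


def is_on_topic(question: str) -> bool:
    """Input guardrail: allow only questions that mention an allowed topic."""
    lowered = question.lower()
    return any(topic in lowered for topic in ALLOWED_TOPICS)


def redact_emails(text: str) -> str:
    """Output guardrail: mask anything that looks like an email address."""
    return EMAIL_PATTERN.sub("[redacted email]", text)


def guarded_answer(question: str, model: Callable[[str], str]) -> str:
    """Wrap a model call with the input check and the output filter."""
    if not is_on_topic(question):
        return "Sorry, I can only help with billing questions."
    return redact_emails(model(question))


def fake_model(question: str) -> str:
    """Canned stand-in for a real model call."""
    return "Contact jane.doe@example.com about your invoice"


print(guarded_answer("Where is my invoice?", fake_model))
print(guarded_answer("Write me a poem", fake_model))
```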
Full explainer coming soon.
Adoption, cost, and trust
The parts of an AI initiative that aren't technical — but determine whether the technical parts matter.
16
Privacy and data isolation
The two questions every AI deployment should answer in writing: where does the data go when an employee pastes it into a prompt, and does the vendor use it to train their next model? On the first, sound architectures keep the data isolated per tenant. On the second, sound enterprise contracts say no. Both matter for GDPR, SOC 2, and customer trust.
Full explainer coming soon.
17
Vendor-agnostic AI
The AI market is moving fast — model leaders today are not necessarily model leaders next quarter. A vendor-agnostic architecture isolates the choice of provider behind a simple interface, so swapping from one model family to another is a config change, not a rewrite. This is also the layer where you can route different jobs to different providers based on cost, latency, or capability.
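A minimal sketch of "a config change, not a rewrite", assuming a single `complete` method is all the application needs. The provider classes here are placeholders that return canned text; real adapters would wrap each vendor's own SDK behind the same method.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The one interface the rest of the application depends on."""
    def complete(self, prompt: str) -> str: ...


class ProviderA:
    """Placeholder adapter; a real one would call vendor A's SDK."""
    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"


class ProviderB:
    """Placeholder adapter; a real one would call vendor B's SDK."""
    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"


PROVIDERS = {"provider-a": ProviderA, "provider-b": ProviderB}


def load_model(config: dict) -> ChatModel:
    """Pick the provider named in config; callers never import a vendor directly."""
    return PROVIDERS[config["provider"]]()


# Routing different jobs to different providers is just more config.
job_routing = {"summaries": {"provider": "provider-a"},
               "contracts": {"provider": "provider-b"}}

for job, config in job_routing.items():
    print(job, "->", load_model(config).complete("hello"))
```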
Full explainer coming soon.
18
Build vs. buy
The honest version of this question isn't "build vs. buy" — it's "where is your real competitive advantage." If the AI workflow is your differentiator, you almost certainly need a custom build (or a thoughtful customization layer over off-the-shelf parts). If it's a commodity job — meeting notes, transcription, generic chat — buy it and move on.
Full explainer coming soon.
19
Total cost of ownership
A useful working number: the cloud bill for an AI feature is typically 10-30% of its total cost of ownership. The rest is engineering, prompt and eval maintenance, oversight, change management, retraining, and the inevitable cleanup the first time it gets something wrong. Plan from the start for the whole stack, not just the API line item.
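Taking that 10-30% working number at face value, a cloud bill can be turned into a rough range for the whole cost. The 3,000 monthly bill is an invented example, and the output is only as good as the rule of thumb behind it.

```python
def tco_range(monthly_cloud_bill: float,
              low_share_pct: int = 10, high_share_pct: int = 30) -> tuple:
    """Return (lowest, highest) total cost implied by the cloud bill's share."""
    # If the bill is 30% of the total, the total is bill / 0.30 (the low end).
    lowest_total = monthly_cloud_bill * 100 / high_share_pct
    # If the bill is only 10% of the total, the total is bill / 0.10 (the high end).
    highest_total = monthly_cloud_bill * 100 / low_share_pct
    return lowest_total, highest_total


low, high = tco_range(3_000)  # hypothetical monthly cloud bill
print(low, high)              # 10000.0 30000.0
```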
Full explainer coming soon.
20
Return on AI investment
The "we tried ChatGPT and people liked it" anecdote is not ROI. Real ROI on AI looks like baseline-vs-AI measurements on a specific task: response time, error rate, cost per resolved ticket, revenue per customer touched. Pick the metric before you ship the feature. Measuring after the fact is much harder, and it makes it easier to deceive yourself about the result.
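A baseline-vs-AI comparison can be as plain as a percent change per metric. The metric names and numbers below are invented to show the shape of the calculation, not real results.

```python
def percent_change(baseline: float, with_ai: float) -> float:
    """Percent change from the baseline; negative means the value went down."""
    return (with_ai - baseline) / baseline * 100


# Hypothetical measurements: (baseline before the feature, value with AI).
measurements = {
    "cost_per_resolved_ticket": (8.00, 6.00),
    "median_response_minutes": (40.0, 10.0),
}

for metric, (baseline, with_ai) in measurements.items():
    print(f"{metric}: {percent_change(baseline, with_ai):+.1f}%")
```

Whether a drop is good or bad depends on the metric, which is one more reason to pick it before shipping.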
Full explainer coming soon.