RAG (Retrieval-Augmented Generation)
An architecture that connects an AI model to external data sources at query time — retrieving relevant documents, records, or facts before generating a response. RAG is what allows an AI agent to answer questions based on your actual ERP data, contracts, and supplier records rather than its training data alone.
What is RAG (Retrieval-Augmented Generation)?
Retrieval-Augmented Generation (RAG) is an AI architecture pattern that combines a language model's reasoning capability with live retrieval from an external data source. When a query arrives, the system first searches for relevant information (from a database, a document store, an ERP, or an API) and passes that information to the language model as context. The model generates its response based on what was retrieved, not just what it learned during training.
This matters operationally because language models do not know your data. They know language patterns from training. A model asked "what is the contracted unit price for SKU-4821 with Supplier X?" cannot answer correctly without access to your contracts. RAG provides that access — at the moment the question is asked, using the current version of your data.
How RAG Works: The Architecture
A RAG system has three layers working in sequence:
Indexing: Your documents and data — supplier contracts, product catalogues, ERP records, historical invoices — are processed, chunked, and stored in a vector database or search index. Each chunk is embedded as a numerical representation that captures its meaning.
Retrieval: When a query arrives, the system searches the index for the chunks most semantically relevant to the question. "What is the payment term for Supplier X?" retrieves the relevant contract clause, not the entire contract.
Generation: The retrieved chunks are passed to the language model as context. The model generates its response grounded in that specific, retrieved information. This lowers the risk of hallucination because the answer is constrained by what was actually found, although the output is only as good as the chunks that were retrieved.
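The three stages above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the document snippets are invented for the example, the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector database, and `build_prompt` only assembles the text that would be sent to a language model (no model is called).

```python
import math
import re
from collections import Counter

# Hypothetical contract snippets, invented for illustration only.
DOCUMENTS = [
    "Supplier X framework agreement: payment term is 30 days net from invoice date.",
    "Supplier X framework agreement: unit price for SKU-4821 is EUR 23.40.",
    "Supplier Y delivery terms: goods ship within 10 working days of order.",
]


def embed(text: str) -> Counter:
    """Toy embedding: term counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Indexing: store each chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=1):
    """Retrieval: return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Generation input: the retrieved chunks become the model's context."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


index = build_index(DOCUMENTS)
query = "What is the payment term for Supplier X?"
top_chunks = retrieve(index, query, k=1)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

In this sketch the payment-term question retrieves only the payment clause, not the pricing or delivery snippets, which mirrors the point that retrieval returns the relevant clause rather than the entire contract.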
RAG in Operations: A Concrete Example
A procurement agent at a wholesale distributor needs to validate an incoming invoice from a supplier. The invoice shows a unit price of EUR 24.80 for a fastener component. The contracted price from the last framework agreement is EUR 23.40, with a permitted escalation clause of up to 3% annually.
Without RAG, the AI agent has no access to that contract. It cannot validate the price. A human controller must pull the contract manually.
With RAG, the agent queries the contract index at validation time. The retrieval layer finds the relevant pricing clause and escalation terms. The language model calculates whether EUR 24.80 falls within the permitted range. Assuming one annual escalation applies, the ceiling is EUR 23.40 × 1.03 = EUR 24.10, so the invoiced price exceeds it by EUR 0.70. The agent flags the invoice as exceeding contracted tolerance, attaches the relevant contract clause as evidence, and routes it for human review, all automatically and within seconds.
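The tolerance check in this example is plain arithmetic, and could be sketched as follows. The function names, the `years` parameter, and the rounding of the ceiling to whole cents are assumptions made for illustration; the source only gives the contracted price, the 3% annual escalation cap, and the invoiced price. `Decimal` is used so that currency values are not subject to binary floating-point error.

```python
from decimal import Decimal, ROUND_HALF_UP


def max_permitted_price(contracted: Decimal, escalation_rate: Decimal, years: int = 1) -> Decimal:
    """Highest price the escalation clause allows, rounded to cents (assumed convention)."""
    ceiling = contracted * (Decimal(1) + escalation_rate) ** years
    return ceiling.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def validate_invoice_price(invoiced: Decimal, contracted: Decimal,
                           escalation_rate: Decimal, years: int = 1) -> dict:
    """Compare an invoiced unit price with the contract ceiling."""
    ceiling = max_permitted_price(contracted, escalation_rate, years)
    return {
        "ceiling": ceiling,
        "within_tolerance": invoiced <= ceiling,
        "excess": max(invoiced - ceiling, Decimal("0.00")),
    }


# Figures from the example: EUR 23.40 contracted, 3% cap, EUR 24.80 invoiced.
result = validate_invoice_price(
    invoiced=Decimal("24.80"),
    contracted=Decimal("23.40"),
    escalation_rate=Decimal("0.03"),
)
print(result)
```

With the figures above, the ceiling comes out at EUR 24.10 and the invoice is outside tolerance by EUR 0.70. In a RAG setup, the contracted price and escalation rate would come from the retrieved contract clause rather than being hard-coded.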
This is the operational value of RAG: AI decisions grounded in your actual data, with a retrievable source for every conclusion.