Embeddings
An embedding is a numerical representation of a piece of text, image, or data as a vector — a list of numbers that captures its meaning in a form a machine can compare and reason about. Vectorizing is the process of creating that representation; the embedding is the result.
What are Embeddings?
An AI model cannot work with raw text directly. Words and sentences must first be converted into numbers. Vectorizing converts a piece of content (a sentence, a product description, a supplier name) into a high-dimensional list of numbers called a vector. The resulting vector is the embedding.
The critical property of embeddings is that they preserve meaning. Two sentences that express the same idea, such as "invoice is overdue" and "payment is late", produce vectors that sit close together in the embedding space, where closeness is measured with a metric such as cosine similarity. Two unrelated sentences sit far apart. This makes embeddings the foundation for semantic search, document matching, and classification tasks.
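To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three 4-dimensional vectors are hypothetical values invented for illustration. A real embedding model would produce far longer vectors, and its exact numbers would differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for illustration (not output of a real model).
invoice_overdue = [0.9, 0.8, 0.1, 0.0]   # "invoice is overdue"
payment_late = [0.8, 0.9, 0.2, 0.1]      # "payment is late"
weather_sunny = [0.0, 0.1, 0.9, 0.8]     # an unrelated sentence

sim_related = cosine_similarity(invoice_overdue, payment_late)
sim_unrelated = cosine_similarity(invoice_overdue, weather_sunny)

print(f"related:   {sim_related:.2f}")
print(f"unrelated: {sim_unrelated:.2f}")
```

With these made-up vectors, the related pair scores much higher than the unrelated pair, which is the behavior a good embedding model is expected to show on real text.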
How Vectorizing and Embeddings Work
An embedding model reads input text and outputs a vector, typically a few hundred to a few thousand numbers long (768 and 3,072 are common sizes). These numbers are not human-readable, but they encode the semantic content of the original text. Once you have embeddings for a large document set, you can:
Search by meaning: Query "late delivery from Dutch supplier" and find all relevant purchase orders, even if the exact words do not match.
Cluster similar items: Group supplier contracts with similar terms, or flag invoices that match the pattern of previous disputes.
Power RAG systems: Store embedded document chunks in a vector database so an AI agent can retrieve relevant context before generating a response.
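The search and retrieval uses above share one pattern: embed the query, score it against every stored embedding, and return the best matches. The sketch below illustrates that pattern under loud assumptions. `toy_embed` and its `CONCEPTS` lexicon are hypothetical stand-ins for a trained embedding model, the purchase order texts are invented sample data, and a production system would typically keep the vectors in a vector database rather than a Python list.

```python
import math

# Hypothetical concept lexicon standing in for a trained embedding model.
# Each key is one dimension of the toy vector.
CONCEPTS = {
    "delay": {"late", "delayed", "overdue"},
    "shipping": {"delivery", "shipment", "shipped"},
    "netherlands": {"dutch", "netherlands", "rotterdam"},
    "payment": {"invoice", "payment", "paid"},
}
DIMENSIONS = list(CONCEPTS)

def toy_embed(text):
    """Count how many words of the text fall into each concept dimension."""
    words = {w.strip(".,:;").lower() for w in text.split()}
    return [float(len(words & CONCEPTS[d])) for d in DIMENSIONS]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text with no known concepts matches nothing
    return dot / (norm_a * norm_b)

def search(query, documents, top_k=1):
    """Return the top_k (score, document) pairs, most similar first."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Invented sample data.
documents = [
    "PO 1001: Rotterdam vendor shipment delayed two weeks",
    "PO 1002: invoice paid in full",
    "PO 1003: office chairs order confirmed",
]

results = search("late delivery from Dutch supplier", documents, top_k=1)
print(results[0][1])
```

The query and the top result share no words, yet they land on the same concept dimensions, so the delayed Rotterdam shipment ranks first. In a RAG setup, the retrieved text would then be passed to the model as context before it generates a response.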
Embeddings in Operations
In operational automation, embeddings are the technology behind intelligent document retrieval and matching. When a Lleverage agent needs to match an incoming purchase order against existing supplier contracts, or flag an exception that resembles past incidents, it compares embeddings rather than running a keyword search. The difference is significant: keyword search breaks when terminology varies, while embedding-based search can find the right document even when no single word overlaps. For manufacturing and wholesale companies dealing with thousands of SKUs, suppliers, and document types, this decides whether a search surfaces the right document or misses it.
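The contrast between keyword matching and embedding matching can be sketched as follows. Everything here is illustrative: the order and contract texts are invented, the 3-dimensional vectors are hypothetical values rather than output of a real model, and nothing in this sketch reflects how any particular product implements matching.

```python
import math

def keyword_overlap(a, b):
    """Jaccard overlap of the two word sets, the signal keyword search relies on."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Invented incoming purchase order with a hypothetical embedding.
incoming_text = "order for hex bolts"
incoming_vec = [0.7, 0.6, 0.1]

# Invented contracts, each paired with a hypothetical embedding.
contracts = {
    "supply agreement covering threaded fasteners": [0.6, 0.7, 0.2],
    "office cleaning services agreement": [0.1, 0.0, 0.9],
}

# Keyword view: no contract shares a single word with the order.
keyword_scores = {text: keyword_overlap(incoming_text, text) for text in contracts}

# Embedding view: the fastener contract is clearly the closest vector.
embedding_scores = {text: cosine(incoming_vec, vec) for text, vec in contracts.items()}
best_match = max(embedding_scores, key=embedding_scores.get)

print(keyword_scores)
print(best_match)
```

Keyword overlap is zero for both contracts, so a keyword search has nothing to rank. The embedding comparison still separates the fastener contract from the unrelated one, assuming the model places "hex bolts" and "threaded fasteners" near each other as these made-up vectors do.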