Ontology Just Beat Vector Embeddings. Here Is What That Means for Your Small Business.

May 18, 2026

Editorial note. The publication date shown above may be in the future. That is intentional. Posts on this site are scheduled against an editorial calendar that aligns with product releases, book launches, and platform-signal timing; the datePublished reflects the date the post is slated to go public, which is also the date indexers and syndication partners should treat as canonical. If you are reading this before that date, you are early. Welcome.

In 2023 and 2024, when an AI tool wanted to read your business data, the recipe was the same in almost every product demo. Throw your documents into a vector database. Embed each chunk into a 1,536 dimensional vector. At query time, embed the question, fetch the nearest vectors by cosine similarity, paste the matching chunks into the model, ship the answer.
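For readers who want to see the recipe in code, here is a minimal sketch of the retrieval step in plain Python. Everything in it is an assumption made for illustration: the three-dimensional vectors are invented stand-ins for real embeddings (which come from an embedding model and have far more dimensions), and `cosine_similarity` and `top_k` are names I made up, not any product's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=2):
    """Rank stored chunks by similarity to the query and keep the best k."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk["vector"]),
        reverse=True,
    )
    return [chunk["text"] for chunk in ranked[:k]]

# Toy vectors, invented for illustration only.
CHUNKS = [
    {"text": "Compressor warranty excludes flood damage.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Invoice 1042 was paid on March 3.", "vector": [0.0, 0.2, 0.9]},
    {"text": "Warranty claims require proof of install date.", "vector": [0.8, 0.3, 0.1]},
]

if __name__ == "__main__":
    question_vec = [1.0, 0.2, 0.0]  # pretend embedding of a warranty question
    for text in top_k(question_vec, CHUNKS):
        print(text)
```

Note what the function returns: a ranking of text that looks similar. Nothing in it filters, counts, or follows a relationship.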

This worked well enough to become the default. It is now the default for the wrong reasons.

What changed through late 2025 and the first half of 2026 is that a different shape of AI memory started outperforming vector search on the questions that matter for a real business. Not research lab benchmarks. The boring questions a 12 person HVAC company actually asks about its own data: who has a furnace under warranty that expires this quarter, who installed it, and have we billed that customer for the related service visit? Vector search alone cannot answer that. A graph shaped knowledge structure can answer it in milliseconds.

This post explains what is happening, why it matters even if you do not run a tech company, and a workflow using Claude Code and Codex to migrate your own data into a shape AI can actually reason about.

What an ontology is, in plain words

The word ontology sounds harder than it is. Strip the philosophy out. An ontology is a written down list of:

The kinds of things your business deals with. Customers, properties, work orders, vehicles, invoices, payments, technicians, products, vendors.
The relationships between those things. A customer OWNS a property. A work order IS PERFORMED AT a property. A technician COMPLETED a work order. An invoice IS FOR a work order. A payment IS AGAINST an invoice.
The rules that hold things together. An invoice cannot be created without a work order. A payment cannot exist without an invoice. A vehicle is owned by exactly one customer at a time.

That is all. An ontology is the map of your business as data.
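The three lists above can be written down as data in a few lines. This is a sketch under my own assumptions, not a standard ontology format: the `Relationship` class, the `REQUIRED` rule list, and the `undefined_types` helper are hypothetical names chosen for illustration, and only a handful of the entity types are shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    subject: str  # the kind of thing the arrow starts from
    verb: str     # the label on the arrow
    obj: str      # the kind of thing the arrow points to

# The kinds of things the business deals with (a subset).
ENTITY_TYPES = {"Customer", "Property", "WorkOrder", "Technician", "Invoice", "Payment"}

# The relationships between those things.
RELATIONSHIPS = [
    Relationship("Customer", "OWNS", "Property"),
    Relationship("WorkOrder", "IS_PERFORMED_AT", "Property"),
    Relationship("Technician", "COMPLETED", "WorkOrder"),
    Relationship("Invoice", "IS_FOR", "WorkOrder"),
    Relationship("Payment", "IS_AGAINST", "Invoice"),
]

# The rules: every instance of the type must have the named relationship.
REQUIRED = [("Invoice", "IS_FOR"), ("Payment", "IS_AGAINST")]

def undefined_types(relationships, entity_types):
    """Return any type used in a relationship but missing from the entity list."""
    used = {r.subject for r in relationships} | {r.obj for r in relationships}
    return sorted(used - entity_types)

if __name__ == "__main__":
    # An empty list means every arrow connects two declared entity types.
    print(undefined_types(RELATIONSHIPS, ENTITY_TYPES))
```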

If you have ever made a list of the kinds of customers you have or drawn arrows on a whiteboard connecting your departments, you have done ontology work. You just did not call it that.

The reason it matters now is that AI tools have started reading ontologies natively. When the tool sees that a service visit is connected by an explicit relationship to a property and a technician and a warranty, it can plan an answer that walks that exact path. When the tool only has a pile of similar looking text chunks, it has to guess.

What vector embeddings are actually good at

I want to be fair to the vector approach before I criticize it. It is genuinely useful for one job.

Vector embeddings work when you want fuzzy semantic search across a pile of unstructured documents. Find me the paragraph in this 400 page PDF where someone discusses the warranty exclusion for compressors. Find me the email thread that talked about that vendor switch last fall. Surface the section of the policy manual that probably applies to this customer complaint.

For that kind of question, vectors win. They were designed for it. Nothing on the horizon replaces them for fuzzy "what looks similar" lookups.

What vectors are not good at, and never were:

Counting and aggregation. "How many customers are past due as of last Tuesday" is not a "similar to" question. It is a filter and count question. Vectors do not filter. They rank.
Multi hop reasoning. Which technicians serviced a unit installed by another technician before the warranty expired is a path traversal question. Vectors collapse the path. They do not walk it.
Hard constraints. No invoice without a work order. No payment without an invoice. Vectors do not enforce structure. They enforce proximity.
Reasoning about relationships you did not write down explicitly anywhere. The model can only retrieve text it has stored. If the connection between two things exists only in your head, no amount of similarity search will surface it.
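The counting case is the easiest one to show. Here is a small sketch of the past due question as a filter and count over structured records. The invoice records and the `past_due_customers` name are invented for illustration, and the sketch makes a simplifying assumption: `paid` reflects the current state of the invoice, not its state on the as-of date.

```python
from datetime import date

# Invented sample records; a real system would read these from the billing export.
INVOICES = [
    {"customer": "Alvarez", "due": date(2026, 5, 1), "paid": False},
    {"customer": "Alvarez", "due": date(2026, 5, 10), "paid": False},
    {"customer": "Brooks", "due": date(2026, 5, 5), "paid": True},
    {"customer": "Chen", "due": date(2026, 5, 20), "paid": False},
]

def past_due_customers(invoices, as_of):
    """Filter to unpaid invoices due before `as_of`, then dedupe by customer."""
    return {
        inv["customer"]
        for inv in invoices
        if not inv["paid"] and inv["due"] < as_of
    }

if __name__ == "__main__":
    overdue = past_due_customers(INVOICES, as_of=date(2026, 5, 12))
    print(len(overdue), sorted(overdue))
```

The answer is exact because every record is either in the set or out of it. A similarity ranking has no equivalent of that yes or no test.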

If you have tried a "chat with your documents" tool and felt like the answers were technically related but missed the point, that is what was happening. The tool was finding similar sounding text. It was not following relationships through your business.

The graph approach, said simply

A graph shaped data structure stores your business as nodes (the things) and edges (the relationships). It looks identical to the whiteboard diagram you would draw if I asked you to sketch out how your data fits together.

Customer -> owns -> Property
Customer -> has -> ServiceContract
ServiceContract -> covers -> Property
ServiceContract -> expires_on -> 2026-08-15
WorkOrder -> performed_at -> Property
WorkOrder -> completed_by -> Technician
WorkOrder -> references -> ServiceContract
Invoice -> for -> WorkOrder
Payment -> against -> Invoice

When an AI tool reads that, it has the actual structure of your business. When you ask which active service contracts cover properties where the last work order has gone unpaid for more than 30 days, the tool walks those edges. It does not have to guess.
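To show what walking the edges means, here is a sketch of that exact question against a tiny in-memory graph. All IDs and dates are invented, the helper names (`sources`, `targets`, `contracts_with_unpaid_last_order`) are hypothetical, and I am assuming "unpaid for more than 30 days" means an invoice issued more than 30 days ago with no payment recorded against it.

```python
from datetime import date

# (subject, relationship, object) triples. All IDs are invented.
EDGES = [
    ("contract:1", "covers", "property:A"),
    ("contract:2", "covers", "property:B"),
    ("workorder:10", "performed_at", "property:A"),
    ("workorder:11", "performed_at", "property:B"),
    ("invoice:100", "for", "workorder:10"),
    ("invoice:101", "for", "workorder:11"),
    ("payment:500", "against", "invoice:101"),
]

# Attributes stored on the nodes themselves.
NODES = {
    "contract:1": {"expires_on": date(2026, 8, 15)},
    "contract:2": {"expires_on": date(2026, 8, 15)},
    "workorder:10": {"completed_on": date(2026, 3, 1)},
    "workorder:11": {"completed_on": date(2026, 3, 5)},
    "invoice:100": {"issued_on": date(2026, 3, 2)},
    "invoice:101": {"issued_on": date(2026, 3, 6)},
}

def sources(rel, target):
    """Nodes that have an edge labeled `rel` pointing at `target`."""
    return [s for s, r, o in EDGES if r == rel and o == target]

def targets(source, rel):
    """Nodes that `source` points at through an edge labeled `rel`."""
    return [o for s, r, o in EDGES if s == source and r == rel]

def contracts_with_unpaid_last_order(today, days=30):
    hits = []
    for node_id, attrs in NODES.items():
        # Keep only contracts that have not expired yet.
        if not node_id.startswith("contract:") or attrs["expires_on"] < today:
            continue
        for prop in targets(node_id, "covers"):
            orders = sources("performed_at", prop)
            if not orders:
                continue
            last = max(orders, key=lambda w: NODES[w]["completed_on"])
            for inv in sources("for", last):
                unpaid = not sources("against", inv)
                overdue = (today - NODES[inv]["issued_on"]).days > days
                if unpaid and overdue:
                    hits.append(node_id)
    return hits

if __name__ == "__main__":
    print(contracts_with_unpaid_last_order(today=date(2026, 5, 18)))
```

Each hop in the function matches one arrow in the list above: contract to property, property to work order, work order to invoice, invoice to payment.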

This is not a new idea. Databases have stored relationships this way since the 1970s. What changed in 2025 and 2026 is that AI tools learned how to read the graph, plan a traversal, and answer questions in natural language. Microsoft published the original GraphRAG research in April 2024. Anthropic, OpenAI, and Google have all shipped variants since. The pattern has a name now (GraphRAG) and a maturing set of open source tools (Neo4j with its LLM Graph Builder, LangChain's graph modules, LlamaIndex property graphs).

The practical outcome: questions you used to need a developer to answer ("write me a SQL query for X") are now answerable by typing the question in plain English to an AI tool that has been pointed at your graph.

What this means for a small business with no data team

Here is the part most articles skip. The benefits I just described are real. The path to getting them is the part people gloss over.

You cannot just turn on a "graph mode" in your AI tool. You have to:

Decide which entities and relationships you actually care about.
Pull your data out of wherever it lives. QuickBooks, ServiceTitan, Jobber, an Excel file, a Google Sheet, a folder of PDFs.
Reshape it into a graph form.
Point the AI tool at the graph.

In 2023 that work required hiring someone. In 2026 it does not, because Claude Code and Codex can do most of it for you, and the cost is the API tokens, not a contractor's hourly rate.

The rest of this post is how that actually goes.

Annie's HVAC, a worked example

I am going to use a made up business that is statistically typical for the kind of company that benefits. Annie runs an HVAC service business with 14 technicians, a $4M annual book, and a cluttered file cabinet. Her operational data lives in three places.

ServiceTitan holds work orders, customers, invoices, technician schedules. It is the system of record for the day to day.

QuickBooks Online holds billing, A/R, vendor payments, payroll.

A folder of equipment manuals and warranty PDFs sits on her office computer. The manufacturers send them. Her team consults them when a customer calls about a part.

Annie wants to type questions into her AI tool like:

Which customers have equipment installed in 2020 through 2022 that still has manufacturer warranty coverage on the compressor, and which of them have had at least one service call in the last 12 months?

For any active service agreement expiring this quarter, who is the assigned technician, what was the last visit, and is the customer current on payments?

Which manufacturers' equipment costs us more in callback labor in the first year than our average?

None of these are vector search questions. Every one requires walking multiple relationships across her three data sources.

The two CLI migration workflow

Two terminals open. Claude Code in the left pane. Codex in the right pane. The split is the same one I described in the two CLI workflow post. Claude Code as the main work surface. Codex as the focused task sidecar.

Step 1. Entity modeling. About 45 minutes.

In the left pane, point Claude Code at the business and ask it to propose an ontology. The prompt looks something like this.

I run an HVAC service business. My core systems are ServiceTitan and QuickBooks Online, plus a folder of equipment warranty PDFs. Read the attached field exports (CSV samples from both systems) and propose a graph data model. List the entities, the relationships between them, and any business rules I should enforce as constraints. Keep it to 12 to 15 entity types maximum. Skip anything that is purely a UI concern.

Claude Code reads the samples, asks clarifying questions about a few fields it is unsure about, produces a candidate model. You read it. Push back on anything that does not match how you actually think about your business. Approve.

The model file is plain Markdown. It lists entities, fields, relationships, and constraints. Anyone in your company can read it without learning new syntax. That is the point.

Step 2. Extraction scripts. Focused tasks in the right pane.

Extraction is mechanical work. Codex's o-series handles it well. Three small jobs.

A Python script that calls ServiceTitan's API and writes work orders, customers, equipment, and visits into a JSONL file.

A Python script that calls QuickBooks Online's API and writes invoices, payments, vendors, and the chart of accounts into another JSONL file.

A Python script that uses Claude as the model to read each warranty PDF and extract a structured record. Manufacturer, model, customer, install date, coverage period, exclusions.

Each script is 50 to 150 lines. Codex writes them in under 10 minutes each because the work is well bounded and easy to test. You run them, look at the output, fix any obvious issues, move on.
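The shape of each extraction script is roughly the same, so here is a sketch of the common part. I am deliberately not showing a real ServiceTitan or QuickBooks call: `fetch_work_orders` is a hypothetical stand-in that returns canned records, and the field names are invented. The part worth copying is the one JSON object per line output.

```python
import json
from pathlib import Path

def fetch_work_orders():
    """Hypothetical stand-in for a real API client call; returns sample records."""
    return [
        {"id": "wo-10", "customer_id": "c-1", "technician_id": "t-3", "completed_on": "2026-03-01"},
        {"id": "wo-11", "customer_id": "c-2", "technician_id": "t-4", "completed_on": "2026-03-05"},
    ]

def write_jsonl(records, path):
    """Write one JSON object per line and return how many records were written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, sort_keys=True) + "\n")
            count += 1
    return count

def read_jsonl(path):
    """Read the file back, skipping blank lines, to spot check the output."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]

if __name__ == "__main__":
    written = write_jsonl(fetch_work_orders(), "work_orders.jsonl")
    print(f"wrote {written} work orders")
```

JSONL is a reasonable choice here because a partial file is still readable line by line, which makes the "look at the output" step easy.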

Step 3. Loading into a graph. About 30 minutes.

Back in the left pane. Hand Claude Code your three JSONL files and your ontology model and ask it to write a loader that produces a single graph file. For a 14 technician HVAC business the entire graph fits in a local SQLite file with two tables, nodes and edges. You do not need Neo4j or any other specialized database. SQLite is fine for years of operational data.

Claude Code writes the loader, runs it against your sample data, prints a summary. How many customers, how many properties, how many work orders, how many warranty records, how many edges between them. You eyeball the totals against what you think your business is doing. If the count is off, you fix the loader. If the count is right, you have your first graph.
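Here is a sketch of what a two table loader and its summary could look like, using Python's built in `sqlite3` module. The column names (`id`, `type`, `props`, `src`, `rel`, `dst`) and the function names are my assumptions, not what Claude Code will necessarily produce for you.

```python
import json
import sqlite3

# Two tables: things and the relationships between them.
# The REFERENCES clauses document intent; SQLite only enforces them
# when PRAGMA foreign_keys = ON is set on the connection.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id    TEXT PRIMARY KEY,
    type  TEXT NOT NULL,
    props TEXT NOT NULL DEFAULT '{}'
);
CREATE TABLE IF NOT EXISTS edges (
    src TEXT NOT NULL REFERENCES nodes(id),
    rel TEXT NOT NULL,
    dst TEXT NOT NULL REFERENCES nodes(id),
    PRIMARY KEY (src, rel, dst)
);
"""

def load_graph(conn, nodes, edges):
    """Create the tables if needed and upsert the given nodes and edges."""
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO nodes (id, type, props) VALUES (?, ?, ?)",
        [(n["id"], n["type"], json.dumps(n.get("props", {}))) for n in nodes],
    )
    conn.executemany(
        "INSERT OR IGNORE INTO edges (src, rel, dst) VALUES (?, ?, ?)",
        [(e["src"], e["rel"], e["dst"]) for e in edges],
    )
    conn.commit()

def summarize(conn):
    """Count nodes per type plus total edges, for the eyeball check."""
    counts = dict(conn.execute("SELECT type, COUNT(*) FROM nodes GROUP BY type"))
    counts["edges"] = conn.execute("SELECT COUNT(*) FROM edges").fetchone()[0]
    return counts

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path for a real graph
    load_graph(
        conn,
        nodes=[{"id": "c1", "type": "Customer"}, {"id": "p1", "type": "Property"}],
        edges=[{"src": "c1", "rel": "owns", "dst": "p1"}],
    )
    print(summarize(conn))
```

Rerunning the loader with the same input leaves the counts unchanged, which matters once the extraction runs on a schedule.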

Step 4. The query layer. About an hour.

The graph is sitting in SQLite. It does not know anything about AI yet. Claude Code's job in this step is to write a small query layer that lets the AI tool ask questions of the graph in plain English and get answers back.

The shape is a tiny Python module with two functions.

query_graph(natural_language_question) takes the question, asks Claude to translate it into a SQL query against the graph schema, runs the query, returns the rows.

describe_graph() returns the ontology model as a single Markdown document, so the AI tool can see the structure without having to be told it again every time.

That is the entire query layer. About 80 lines of Python. Tested against a half dozen sample questions Annie cares about.
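A stripped down sketch of that module follows. Two things in it are my assumptions rather than part of the description above: `query_graph` takes the database connection and a `translate` callable as extra parameters so the model call can be swapped out, and `fake_translate` is a hypothetical stand-in that returns canned SQL where the real version would call Claude. The ontology text is a shortened placeholder.

```python
import sqlite3

# Placeholder excerpt; the real module would load the full ontology file.
ONTOLOGY_MD = """# Graph model (excerpt)
Tables: nodes(id, type, props), edges(src, rel, dst)
- Customer owns Property
- Invoice for WorkOrder
"""

def describe_graph():
    """Return the ontology as Markdown so the AI tool can see the structure."""
    return ONTOLOGY_MD

def query_graph(conn, natural_language_question, translate):
    """Translate the question to SQL, run it, and return the rows."""
    sql = translate(natural_language_question, describe_graph()).strip()
    # Guardrail: generated SQL should only ever read from the graph.
    if not sql.lower().startswith("select"):
        raise ValueError("refusing to run a non-SELECT statement")
    return conn.execute(sql).fetchall()

def fake_translate(question, schema_markdown):
    """Hypothetical stand-in for the model call; always returns the same SQL."""
    return "SELECT COUNT(*) FROM nodes WHERE type = 'Customer'"

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id TEXT, type TEXT, props TEXT)")
    conn.execute("INSERT INTO nodes VALUES ('c1', 'Customer', '{}')")
    print(query_graph(conn, "How many customers do we have?", fake_translate))
```

The SELECT check is a crude guardrail. Opening the SQLite file read only is a stronger version of the same idea.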

Step 5. Connect the query layer to the AI tool. 20 minutes.

Codex's job here is to wire the query layer to whatever interface Annie wants to use day to day. Claude Desktop with MCP. A small custom chat page on her intranet. A CLI for the technician dispatchers. The wiring is small and Codex handles it cleanly.

After this step, Annie can type her real questions in plain English and get real answers. Once translated to SQL, the compressor warranty question from her list above runs in about 400 milliseconds against her local graph.

What the migration actually costs

For Annie's size of business, the work above took one person about two and a half days of calendar time, with most of the writing done by the two CLIs. The marginal cost was almost entirely in API tokens.

Item	Estimated cost
Claude API tokens (Claude Code sessions across all steps)	$35 to $60
Codex API tokens (extraction scripts and wiring)	$15 to $25
Storage (SQLite file on her laptop, no cloud)	$0
Specialized graph database (not used)	$0
External vendor for migration	$0
Total	About $50 to $85, one time

Monthly ongoing cost to run the resulting system. About $5 to $10 in Claude API tokens for normal daily query volume. The graph itself updates whenever Annie reruns the extraction scripts, which she set to run nightly via a cron job Codex wrote.
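If you want to rerun the arithmetic with your own numbers, it fits in a few lines. The one time and monthly ranges are the ones quoted above. The first year figure is my own derived sum (one time cost plus twelve months of queries), not a separately sourced estimate, and the function names are made up.

```python
# One time cost ranges from the table above, in dollars (low, high).
ONE_TIME = {
    "Claude API tokens": (35, 60),
    "Codex API tokens": (15, 25),
    "Storage": (0, 0),
    "Specialized graph database": (0, 0),
    "External vendor": (0, 0),
}

# Ongoing monthly query cost range, in dollars (low, high).
MONTHLY = (5, 10)

def total_range(items):
    """Sum the low ends and the high ends separately."""
    low = sum(low_end for low_end, _ in items.values())
    high = sum(high_end for _, high_end in items.values())
    return low, high

def first_year_range(one_time, monthly, months=12):
    """One time cost plus `months` of ongoing cost, as a (low, high) pair."""
    low, high = total_range(one_time)
    return low + monthly[0] * months, high + monthly[1] * months

if __name__ == "__main__":
    print("one time:", total_range(ONE_TIME))
    print("first year:", first_year_range(ONE_TIME, MONTHLY))
```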

The math is not a gimmick. A few years ago this migration would have been a $15,000 to $40,000 project with a small consulting firm. The reason it is not anymore is the two CLI workflow and the fact that the graph is small enough to live in SQLite.

What you should not do

A few warnings, because I have watched both my own clients and other people's fall into these pitfalls.

Do not start with a graph database. Neo4j is a fine product. You do not need it for your first migration. SQLite with two tables (nodes and edges) handles tens of millions of records on a laptop. The decision of whether to graduate to a real graph database can come later, after you actually have a graph and know how you use it.

Do not try to model everything. Annie's first graph has 11 entity types and 18 relationship types. That is plenty. The temptation to capture every field from every system at once is what kills these migrations. You can always add an entity later. You cannot easily delete one that has rotted into your data.

Do not skip the constraint rules. A graph without rules is a pile of nodes. The rule that an invoice cannot exist without a work order is what lets the AI tool answer questions accurately. Without it, you will eventually have orphan invoices and the AI will confidently quote you the wrong count of past due customers.
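Checking a rule like this can be a single query. The sketch below assumes the two table layout from earlier (a `nodes` table with `id` and `type`, an `edges` table with `src`, `rel`, `dst`) and an edge label of `for` between invoices and work orders. Those names are illustrative, so adjust them to whatever your model file says.

```python
import sqlite3

# Invoices that have no outgoing 'for' edge to a work order.
ORPHAN_INVOICES_SQL = """
SELECT n.id
FROM nodes AS n
LEFT JOIN edges AS e
       ON e.src = n.id AND e.rel = 'for'
WHERE n.type = 'Invoice'
  AND e.dst IS NULL
ORDER BY n.id
"""

def find_orphan_invoices(conn):
    """Return the IDs of invoices that break the invoice needs work order rule."""
    return [row[0] for row in conn.execute(ORPHAN_INVOICES_SQL)]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, type TEXT)")
    conn.execute("CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT)")
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [("inv1", "Invoice"), ("inv2", "Invoice"), ("wo1", "WorkOrder")],
    )
    conn.execute("INSERT INTO edges VALUES ('inv1', 'for', 'wo1')")
    print(find_orphan_invoices(conn))
```

Running a check like this after every load is one way to catch orphans before the AI tool starts counting them.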

Do not throw vectors away. The warranty PDFs are still well served by a vector index for fuzzy text search. The right architecture for Annie is a hybrid. Graph for structured queries. Vector for "show me the paragraph where the exclusion is described." Both, working together. The shift is not that vectors are dead. The shift is that vectors should not be doing the structured work they were doing by default.

Do not let the AI tool model your ontology in isolation. The first proposal Claude Code generates is a starting point. It does not know your business as well as you do. Push back. Rename entities to match what your team actually calls them. The graph is yours. The AI is the assistant.

What to do this week

If this is your first time taking this seriously, three small steps.

List your entity types on paper. A blank sheet of paper, ten to fifteen entries. Customer, property, work order, invoice, payment, technician, equipment, vendor, warranty, contract. Whatever the right list is for your business. Spend 20 minutes on it. Do not open a computer.
Draw the relationships. Connect the entries with arrows. Label the arrows. This is your ontology, written in pencil. Total time, another 20 minutes.
Open Claude Code in one terminal and Codex in another. Paste your hand drawn list as the starting prompt. Tell the tools what your business does. Ask them to propose CSV exports you should pull from your existing systems. Spend an hour exploring.

If after those three steps you decide to actually run the migration, the cost is the API tokens above and a long weekend of your own attention. You do not need to hire anyone.

Why this is happening now and not earlier

The graph approach to data has existed for fifty years. What is new is that the cost of building one for a small business dropped from "hire a data engineer" to "one weekend with two CLIs and $80 of API tokens."

That is the broader pattern of the under $100 AI stack. The expensive parts of running a business are getting commoditized one at a time. The data layer is the latest one to fall. Last year it was content production. Two years ago it was customer support. Each time, the same thing happens. A category that used to be expensive and gated becomes a tooling problem any motivated owner can solve in a weekend.

I wrote The $20 Dollar Agency for owners who are watching this happen and want to be on the right side of it. The migration walkthrough above is not in the book. The framing of how a service business positions itself against this shift is exactly the kind of work the book is for.

If you run a business that has good operational data trapped in five different systems, this is the year to do something about it.

Fact check notes and sources

Microsoft Research, "From Local to Global: A Graph RAG Approach to Query Focused Summarization." Published 2024-04-24 on arXiv at arXiv:2404.16130. Open source implementation at github.com/microsoft/graphrag.
Neo4j LLM Graph Builder documentation at neo4j.com/labs/genai-ecosystem/llm-graph-builder/.
LangChain graph modules at python.langchain.com/docs/use_cases/graph/.
LlamaIndex property graphs at docs.llamaindex.ai/en/stable/module_guides/indexing/lpg_index_guide/.
Claude Code documentation at docs.anthropic.com/en/docs/claude-code.
OpenAI Codex documentation at openai.com/codex.
Annie's HVAC is a composite illustration, not a real client. The token cost estimates ($50 to $85 for the full migration) are derived from Anthropic's published pricing for Claude Sonnet 4.6 and OpenAI's published pricing for the o-series, applied to the rough token volume of the workflow described. Actual costs will vary by your data volume and how much back and forth your migration requires.
The claim that vector search is not designed for counting, aggregation, or path traversal reflects the design of vector similarity search and is widely documented. See for instance the Anthropic engineering writeup on Contextual Retrieval, which acknowledges the limits of pure vector retrieval and proposes hybrid approaches.
