A vendor walks into a 30-person small business and pitches "proprietary neural retrieval architecture" for $3,500 a month. The owner has no way to evaluate the claim. The owner signs.
Six months in, the owner is paying $42,000 a year for what is, under the hood, a thirty-line script that calls OpenAI's API. The markup is somewhere between 75x and 1,000x on the underlying compute cost.
This is not a hypothetical. In November 2025, an engineer named Kusireddy reverse-engineered 200 funded AI startups by monitoring their actual network traffic, decompiling their JavaScript bundles, and tracing their API calls. The headline finding: 73% of them had a significant gap between their claimed technology and their actual implementation. The most common pattern? Companies that pitched "our proprietary large language model" were running plain GPT-4 calls with a system prompt instructing the model to "never mention you are powered by OpenAI."
You don't have to take this engineer's word for it. You can run the same check on any AI vendor pitching you, in about 30 seconds, with no tools you don't already have installed.
Here's the cheat sheet.
The 30-second check
Open the vendor's product page (or their demo, or a customer-facing AI feature on their site) in Chrome or Firefox. Press F12. Click the "Network" tab. Now interact with their AI feature: ask a question, run a query, click the "Try it" button, whatever the demo offers.
Watch the network traffic. If you see requests going to any of these hostnames, the vendor is a wrapper around someone else's model:
- api.openai.com
- api.anthropic.com
- api.cohere.ai
- api.together.xyz
- generativelanguage.googleapis.com
- bedrock-runtime.*.amazonaws.com
That doesn't mean they're a fraud. It means they're a wrapper. There's a meaningful difference between an honest wrapper and a dishonest wrapper, and we'll get to that. But it does mean their "proprietary AI" is OpenAI's or Anthropic's or Google's API plus a layer of UI and prompts.
If you don't see API calls to any of those domains in their demo, two possibilities:
- The demo isn't actually live (they're showing you a pre-recorded video or a faked interaction). Ask them to run a real query while you watch.
- They've routed the API call through their own backend, so the browser only sees api.theircompany.com. This is normal architecture. It doesn't tell you what's underneath; you then have to look at the next signal.
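The hostname check itself is scriptable. Here's a minimal sketch in Python: the provider list mirrors the hostnames above, and the example traffic (api.vendorco.com and friends) is made up for illustration.

```python
import fnmatch

# Known model-provider API hostnames; wildcards allowed
# (mirrors the list earlier in this post).
PROVIDER_HOSTS = [
    "api.openai.com",
    "api.anthropic.com",
    "api.cohere.ai",
    "api.together.xyz",
    "generativelanguage.googleapis.com",
    "bedrock-runtime.*.amazonaws.com",
]

def wrapper_evidence(request_hosts):
    """Return any provider hostnames found in a list of observed request hosts."""
    hits = set()
    for host in request_hosts:
        for pattern in PROVIDER_HOSTS:
            if fnmatch.fnmatch(host, pattern):
                hits.add(host)
    return sorted(hits)

# Hypothetical traffic copied out of the Network tab:
seen = ["api.vendorco.com", "api.openai.com", "cdn.vendorco.com"]
print(wrapper_evidence(seen))  # ['api.openai.com']
```

Paste the hostnames you see in the Network tab into `seen`; any non-empty result means the demo is calling a third-party model provider directly from the browser.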
The four secondary signals
When the network tab is inconclusive, here's the rest of the field guide. None of these by themselves prove anything; together, they paint the picture.
Signal 1. Response time
OpenAI's GPT-4 has a recognizable latency pattern: 200 to 400 milliseconds for most short queries, with a characteristic curve when the response streams. Claude has its own slightly different pattern. If every single query comes back in roughly the same time band, and it matches one of the big providers, the vendor is almost certainly routing to that provider.
If responses come back in under 50 milliseconds, it's either (a) a static lookup table dressed up as AI or (b) a much smaller model running on cheap inference. Neither is necessarily bad, but neither is "proprietary neural intelligence" either.
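The latency signal boils down to a rough classifier over a sample of measured response times. A sketch, using the thresholds quoted above; treat them as heuristics, not guarantees:

```python
from statistics import median

def classify_latency(times_ms):
    """Rough bucket for a sample of AI-feature response times, in milliseconds."""
    m = median(times_ms)
    if m < 50:
        # Suspiciously fast for any hosted LLM round trip.
        return "lookup table or tiny model"
    if 200 <= m <= 400:
        return "consistent with a hosted GPT-4-class API"
    return "inconclusive"

print(classify_latency([210, 260, 340, 310]))  # consistent with a hosted GPT-4-class API
print(classify_latency([12, 9, 15, 11]))       # lookup table or tiny model
```

The Network tab shows per-request timing, so you can collect a handful of samples in under a minute.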
Signal 2. Leftover API keys in the frontend
Search the page source of the demo for these strings:
- sk-proj- (OpenAI's project API key prefix)
- sk-ant- (Anthropic's API key prefix)
- openai
- anthropic
- claude
- cohere
You won't find this in a careful vendor's bundle. But in the reverse-engineering study, the engineer found 12 funded companies that left their API keys directly in their frontend code. That's both a security issue (anyone who reads the JS can drain the company's API balance) and a tell that the "proprietary AI" is in fact a couple of API calls.
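The same string search is easy to script against a downloaded JS bundle. A minimal sketch; the bundle text below is fabricated for illustration (real keys are much longer), and the patterns mirror the list above:

```python
import re

# Strings worth flagging in a vendor's frontend bundle.
PATTERNS = {
    "OpenAI project key": r"sk-proj-[A-Za-z0-9_-]+",
    "Anthropic key": r"sk-ant-[A-Za-z0-9_-]+",
    "provider mention": r"\b(openai|anthropic|claude|cohere)\b",
}

def scan_bundle(js_text):
    """Return which suspicious patterns appear in the bundle source, with matches."""
    found = {}
    for label, pattern in PATTERNS.items():
        hits = re.findall(pattern, js_text, flags=re.IGNORECASE)
        if hits:
            found[label] = hits
    return found

# Fabricated example of a sloppy bundle:
bundle = 'fetch("https://api.openai.com/v1/chat", {headers: {Authorization: "Bearer sk-proj-abc123"}})'
print(scan_bundle(bundle))
```

A "provider mention" hit alone just means they use the API somewhere; a key-prefix hit means they shipped a secret to every visitor's browser.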
Signal 3. Vague vs. specific language
This is the matrix:
| What they say | What it might mean |
|---|---|
| "Our proprietary large language model" | Wrapper. Almost always. |
| "Advanced neural retrieval architecture" | Wrapper with embeddings on Pinecone or Weaviate. |
| "Trained on millions of examples" | Could be real, could be fine-tuning through OpenAI's API, i.e. paying OpenAI to train on their examples. |
| "Built on GPT-4 with our own pipeline" | Honest wrapper. This is fine. |
| "Self-hosted model, version 7B, evaluated against the following benchmarks…" | Probably real. They can describe the architecture. |
| "Using OpenAI under the hood, here's why we're worth the difference" | Honest wrapper. This is fine. |
The pattern: vague impressive language hides; specific technical language explains.
Signal 4. Could you build it in a weekend?
Honest question, no judgment either direction. Describe the vendor's product to a developer friend who's spent a weekend with the OpenAI API. If they say "I could build that core in 48 hours, maybe two weeks for polish," then the vendor's price has to be justified by something other than the AI. UX, integrations, domain expertise, ongoing support, compliance work, data pipelines: all of these are legitimate reasons to charge for a wrapper. But the AI itself is not the reason. Don't pay AI prices for what is actually a UX-and-integrations product.
The honest math
Here are the underlying costs the engineer measured for the most common "AI startup" pattern (RAG, retrieval-augmented generation, the standard architecture for any "we'll answer questions about your data" product):
- OpenAI embeddings: about $0.0001 per 1,000 tokens
- Pinecone or Weaviate query: about $0.00004 per query
- GPT-4 completion: about $0.03 per 1,000 input tokens, $0.06 per 1,000 output tokens
- Average query (500 tokens in, 300 tokens out): about $0.033 in API costs
- Common vendor price per query: $0.50 to $2.00
- Implied markup: 15x to 60x
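The per-query math above is easy to reproduce yourself. A sketch using the rates quoted in this post (verify against current provider pricing pages before relying on them):

```python
# Per-unit rates as quoted above (USD); check current provider pricing.
EMBED_PER_1K_TOK = 0.0001   # OpenAI embeddings, per 1,000 tokens
VECTOR_QUERY = 0.00004      # Pinecone/Weaviate, per query
GPT4_IN_PER_1K = 0.03       # GPT-4 input, per 1,000 tokens
GPT4_OUT_PER_1K = 0.06      # GPT-4 output, per 1,000 tokens

def rag_query_cost(tokens_in, tokens_out):
    """API cost of one RAG query: embed the question, one vector lookup, one completion."""
    return (tokens_in / 1000 * EMBED_PER_1K_TOK
            + VECTOR_QUERY
            + tokens_in / 1000 * GPT4_IN_PER_1K
            + tokens_out / 1000 * GPT4_OUT_PER_1K)

cost = rag_query_cost(500, 300)
print(f"per query: ${cost:.4f}")                    # ~$0.033
print(f"markup at $0.50/query: {0.50 / cost:.0f}x")  # ~15x
print(f"markup at $2.00/query: {2.00 / cost:.0f}x")  # ~60x
```

Swap in your own token counts and the vendor's quoted per-query or per-month price, and the markup falls out directly.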
At enterprise scale (a million queries a month), the vendor's actual API bill is about $30,000. The customer pays $150,000 to $500,000. The vendor's gross margin is 80% to 94%.
There's nothing inherently wrong with those margins. Every SaaS has high gross margins. The question is whether you're being charged for the AI or for the wrapper around it. If you're paying $3,500 a month and you're a 30-person company doing maybe 5,000 queries a month, you should be paying for the wrapper (the UI, the integrations, the support) and the wrapper alone. The math on 5,000 queries at roughly $0.033 each is about $165 a month in API cost. The other $3,335 or so is paying for the rest. Decide if the rest is worth it.
The four kinds of AI vendors
Not all wrappers are the same. The honest taxonomy:
The transparent wrapper. Their homepage says "Built on GPT-4" or "Powered by Claude" right at the top. They're selling the workflow, not the AI. Examples: legal document automation, customer-support routing, content moderation. These are fine. Pay them for the workflow.
The dishonest wrapper. Same product as the transparent wrapper, but the homepage says "our proprietary AI" and the founder pretends to investors and customers that there's something special under the hood. They're not necessarily bad at the workflow. They're misrepresenting the technology, which means you're being charged premium prices for commodity infrastructure. Run the F12 check.
The real builder. They actually trained a model. Their case studies show AWS SageMaker training logs, custom inference endpoints, and GPU instance monitoring. They can explain their architecture in detail when you ask. These are rare (the study found 7% of the 200 fit this pattern), and they tend to be in HIPAA-compliant healthcare AI, custom financial risk models, or specialized computer vision. If you need what they make, you'll know it. They're not pitching the 25-person landscaping company.
The smart wrapper. They use OpenAI or Anthropic under the hood, they're honest about it, AND they've built something on top that's actually hard. Multi-model voting systems for higher accuracy. Custom agent frameworks. Novel retrieval architectures over their own data. These are worth paying for. They tend to be honest about the stack because their differentiation is provable.
For most small businesses, the right vendor is either a transparent wrapper or a smart wrapper. The dishonest wrapper is the trap. The real builder is overkill.
What to do with the savings
If the F12 check shows you've been paying $3,500/month for a wrapper, your options:
- Negotiate down. Tell the vendor you understand the architecture and ask what you're paying for besides the OpenAI bill. A reasonable vendor will explain the workflow, the integrations, the support, and the data pipeline. An unreasonable vendor will get defensive. Defensiveness tells you everything.
- Switch to a transparent wrapper. There are vendors who sell roughly the same workflow at $200 to $500/month with "Built on Claude" right on the homepage.
- DIY. If your team has any developer capacity, the same workflow is a weekend project. A Claude Pro account is $20/month. Add a Pinecone free tier (1M vectors free), a free Vercel or Netlify deploy, and you're looking at $20 to $40/month all-in for what you were paying $3,500 for.
The math on option 3, run honestly: most 25-person companies don't have developer capacity, and DIY-ing the wrapper takes the time of someone who could be doing actual work. The right play is usually option 2: switch to a transparent wrapper. The dishonest-wrapper tax is real and avoidable.
The audit tools that catch this for you
These are free, no signup, run-in-the-browser tools. Three of them are directly relevant.
- API Secret Leakage Audit. Scans a vendor's frontend for leaked API keys (OpenAI, Anthropic, AWS, Stripe, others). If their key is sitting in the JavaScript bundle, that's both a security signal and a "they're a wrapper and they were sloppy" signal.
- Third-Party Script Cost. Shows you which external services a vendor's product is calling, with the latency cost of each. Helps you see whether the "AI" is actually 12 different SaaS services patched together.
- FBI Fraud Reflex Card for SMBs. A 60-second pattern for spotting AI-vendor overpromises. Pairs well with the F12 check; covers the language and contract red flags.
And if you've been quoted an AI implementation that costs more than a Claude Pro subscription:
- AI Model Recommender. Compares the four major AI tiers (Claude, GPT, Gemini, open-source) against your actual workload and emits a recommendation with rough monthly cost. Use it as a sanity check on any vendor pitch that's an order of magnitude more expensive.
- LLM Retrieval Cost Estimator. Gives you the actual API math for the workload the vendor is pitching. If the vendor wants $3,500/month and this tool says the underlying API cost is $40/month, you know the markup.
When the wrapper is the right pick
To be clear, because this post can read as anti-wrapper: that's not the argument.
The wrapper is the right pick when:
- You need the integrations more than you need the model. If the vendor has a polished QuickBooks-and-Stripe-and-Slack pipeline that took them six months to build, paying $300/month not to build it yourself is a great trade.
- You need someone to call when something breaks. The model providers (OpenAI, Anthropic) don't answer your phone. A wrapper vendor does.
- You need domain expertise on top of the model. A wrapper built specifically for healthcare or law or insurance has invested in the system prompts, the compliance considerations, and the edge cases. That's worth paying for.
- You need the workflow to be opinionated. An expert in your industry has decided how the AI should approach the task. Don't reinvent that wheel.
The wrapper is the wrong pick when:
- The price is enterprise-tier and the underlying API costs are SMB-tier.
- The vendor is hiding what's underneath. (If they won't tell you, that's the answer.)
- The integrations are stuff you already have (e.g., a wrapper that mostly just calls OpenAI from a custom UI when your team is already paying for ChatGPT Team).
The honest take
Most AI startups in 2026 are wrappers. That's fine. The iPhone app you use is a wrapper around iOS APIs and nobody complains. The website you're reading this on is a wrapper around the browser's rendering engine. Wrappers are how software gets built.
What's not fine is paying agency prices for what is, technically, a $40/month API bill plus a $200/month wrapper. The information asymmetry is huge: vendors know exactly how much OpenAI charges them; customers usually don't even know to ask.
The 30-second F12 check closes most of that gap. Once you've done it on three vendors, you'll start spotting the pattern in their pitch decks before you ever open the browser.
The smartest small-business owners I know now run F12 on every "AI" demo they see. Five seconds. It changes the conversation completely.
The longer version
The $20 Dollar Agency ($9.99 on Kindle, part of the Digital Empire series) walks through the broader argument: that the gap between what AI vendors charge and what AI actually costs has collapsed faster than the market has noticed, and that small operators who close that gap themselves have a structural advantage over operators who pay for it. The wrapper-detection patterns in this post are the tactical version. The book is the strategic map.
Related reading
- AI fraud reflexes for SMBs in 2026, the vendor-evaluation pattern that pairs with the F12 check.
- Claude for Small Business walkthrough, the under-$100 alternative to the $3,500/month wrapper.
- AI employees stack for small business, the broader map of what you can DIY for under $100/month.
- Opus 4.7 rankings and early-adopter cost, the model-tier comparison post.
- LLM Retrieval Cost Estimator, the deep-dive on the math.
Fact-check notes and sources
- Kusireddy's reverse-engineering study (200 funded AI startups, network-traffic analysis, JavaScript bundle decompilation, methodology disclosed): I Reverse-Engineered 200 AI Startups. 73% Are Lying, published Nov 3, 2025, on Medium/Towards AI. The 73% figure, the API-key-in-frontend finding (12 companies), the markup math (75x to 1,000x), and the 7% real-builder count are his.
- OpenAI pricing (GPT-4 and embeddings) confirmed at openai.com/api/pricing.
- Anthropic Claude pricing confirmed at anthropic.com/pricing.
- Pinecone pricing (per-query rate and free tier) confirmed at pinecone.io/pricing.
This post is informational, not legal, contract-review, or vendor-due-diligence advice. Mentions of OpenAI, Anthropic, Pinecone, Weaviate, Cohere, and other third-party services are nominative fair use. No specific named vendor is being accused of fraud; the patterns described are aggregate findings from the cited research. No affiliation is implied.