Perspective · May 2026 · 8 min read

The Case for Open Source AI, and Why Now.

Open weight models have caught up, the inference market has tripled, and enterprises are rewriting their AI stacks. This is the year the default flips.

Every generation of software has had its open vs. closed moment. Operating systems had it with Linux. Databases had it with Postgres. Browsers had it with Firefox and then Chromium. In each case, the open option started as the underdog, then quietly became the substrate the whole industry runs on. The closed option did not disappear, but it stopped being the default.

We think AI is now in that exact moment. Not next year. Not "soon." Now.

This post lays out why we believe open source AI has crossed the line from "credible alternative" to "rational default," what changed in the last 18 months to make that true, and what it means for the next decade of products, infrastructure, and enterprise software.

What changed first: the models actually got good

For most of the GPT era, open weight models were a respectable second place. Capable, community-built, often faster, but trailing on the hardest benchmarks and on the long tail of real-world tasks that decide whether a product works.

That gap has now closed.

The MMLU benchmark gap between leading open and closed models narrowed from 17.5 points in 2024 to 0.3 points in 2025, and on a growing share of evaluations, open models lead outright. Six major labs now ship competitive open weight families: Meta (Llama 4), Mistral (Small 4 and Large 3), Alibaba (Qwen 3.6 Plus), Google (Gemma 4), Zhipu AI (GLM-5), and OpenAI itself with gpt-oss-120b. Llama 4 Scout holds the record for the longest production context window of any model, open or closed, at 10 million tokens.

The pattern is consistent across reasoning, coding, vision, and multilingual tasks. For 80–90% of real-world enterprise use cases, open models now deliver equivalent results to the closed frontier, often at a fraction of the cost.

The technical reason this happened is less mysterious than it looks. Training recipes leaked, in the good sense. Better data mixtures, better synthetic pipelines, better RLHF signals. The same scaling laws that powered the closed labs now apply to a global community of open labs that can actually afford the runs. Mixture-of-experts architectures pushed hundreds of billions of parameters onto a single GPU, which collapsed the deployment economics. Apache 2.0 and MIT licences replaced the bespoke "research only" terms that used to make legal nervous.

The result is that "open" has stopped meaning "almost as good as closed." On cost-adjusted quality, open wins.

What changed second: the economics turned the corner

If parity were the only story, this would still be a polite engineering shift. The bigger story is what happens to the bill at the end of the month.

Closed proprietary models cost roughly eight times as much to run as equivalent open models on a per-token basis: about $1.86 per million tokens against $0.23, which makes the open option roughly 87% cheaper. On premium tiers like GPT-5 mini, the differential climbs to 10–30x.

For a small product team running a low-volume chatbot, this is a rounding error. For an enterprise running customer support, agent workforces, document pipelines, retrieval, and embeddings at production scale, it is the difference between AI being a margin-eroding line item and a margin-accretive one.
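To make the scale point concrete, here is the arithmetic. The per-million-token prices are the figures cited above; the 50-billion-token monthly volume is a hypothetical assumption, not a customer number:

```python
# Illustrative arithmetic only: prices are the cited per-million-token
# figures; the monthly volume is a hypothetical production-scale workload.
closed_per_million = 1.86   # USD per million tokens, closed model (cited)
open_per_million = 0.23     # USD per million tokens, open model (cited)

monthly_million_tokens = 50_000  # hypothetical: 50 billion tokens a month

closed_bill = closed_per_million * monthly_million_tokens
open_bill = open_per_million * monthly_million_tokens
discount = 1 - open_per_million / closed_per_million

print(f"closed: ${closed_bill:,.0f}/mo, open: ${open_bill:,.0f}/mo, "
      f"open is {discount:.0%} cheaper")
```

At that volume the same workload costs about $93,000 a month on the closed prices and about $11,500 on the open ones, which is the difference between a rounding error and a budget line.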

The macro numbers underline the same point. The AI inference market is on track to grow from roughly $106B in 2025 to $254B by 2030, a 19.2% CAGR. The open source segment of that market, specifically, is growing at 21.1% and projected to reach $50B by 2030. Inference is where the money goes. The model is a one-time cost amortised across years of usage, and the inference layer is where every architectural decision gets paid for, every day, forever.
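As a quick sanity check, the cited CAGR follows directly from the two market sizes, compounding over the five years from 2025 to 2030:

```python
# Implied compound annual growth rate from the cited market figures.
start, end, years = 106e9, 254e9, 5  # ~$106B in 2025 -> ~$254B in 2030

cagr = (end / start) ** (1 / years) - 1
print(f"implied CAGR: {cagr:.1%}")  # ~19.1%, in line with the cited ~19.2%
```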

The economic question stops being "which model is best on paper" and starts being "which stack lets us serve our users at margin." That is a different question, and open models answer it differently.

What changed third: enterprises stopped waiting

Twelve months ago, the typical large company AI strategy was "pilot on a closed API, plan to migrate later." Today, the plan is the migration.

An enterprise survey this quarter put open source AI adoption at 89% of large organisations, with those deployments reporting 25% higher return on investment than closed-only stacks. Open source deployment as a primary inference path grew from 23% to 67% in a single year. The European market in particular has shifted from "data residency" thinking to "technical sovereignty": enterprises now want to control the model itself, not just the region the data sits in.

Regulation accelerated this. Most provisions of the EU AI Act become fully applicable in August 2026, with general-purpose AI obligations live since August 2025. Penalties top out at 7% of global annual turnover, higher than GDPR's 4% cap. The compliance posture that most easily passes audit is the one where the model is auditable, the data path is provable, and no third party can change the rules underneath you. Open weights, run on infrastructure you control, are the cleanest version of that posture.

There are also the smaller, sharper incidents. The default training-data policy changes at Atlassian, GitHub Copilot, and other large platforms over the last six months caused real procurement reviews to reopen. Mercor's March breach, with 40,000+ people exposed via a shared inference proxy, made the systemic argument legible to people who do not usually read about AI infrastructure. When the question becomes "what happens to our data if our vendor's vendor gets compromised," self-hostable open weights become a structural answer, not a philosophical one.

What does not change: the hard parts are still hard

It would be dishonest to pretend that switching from a closed API to open weights is free. It is not.

You still need inference infrastructure that scales, fails over, handles bursts, and gives you predictable latency. You still need eval harnesses to know whether a new model release is actually better for your workload. You still need observability, safety filters, rate limits, structured outputs, and tool-calling compatibility with the SDK your team already wrote against. You still need someone to wake up at 3am when a region degrades.

The closed labs have spent years building this surface. It is the reason an OpenAI or Anthropic key still feels like the easier choice, even when the bill is several times bigger.

This is exactly the gap we are building BasedAI to close. Red Hat made Linux enterprise-ready by packaging, supporting, certifying, and standing behind the open thing. We think open weight AI now needs the same treatment. Production-ready inference, drop-in compatible with the tooling teams already use, at margin pricing, with the operational surface enterprises actually need. That is what BasedAPIs is, and it is why we started there.
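As a rough illustration of what "drop-in compatible" means in practice: an OpenAI-compatible endpoint accepts the same request path and JSON schema as the OpenAI chat completions API, so an existing integration only changes its base URL and model name. The URL and model below are illustrative placeholders, not real BasedAPIs values:

```python
import json

# Hedged sketch: only the base URL and model name change when an app
# written against the OpenAI API is pointed at an OpenAI-compatible
# open-weight endpoint. Both values here are hypothetical placeholders.
base_url = "https://your-inference-host.example/v1"

payload = {
    "model": "llama-4-scout",  # any open-weight model the endpoint serves
    "messages": [{"role": "user", "content": "Summarise this ticket."}],
    "temperature": 0.2,
}

# Same path and body shape as the OpenAI chat completions API, which is
# why existing SDKs and tooling keep working unmodified.
request_path = base_url + "/chat/completions"
body = json.dumps(payload)
```

Because the wire format is identical, eval harnesses, observability hooks, and tool-calling code written against the closed API carry over without a rewrite.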

What this means for the next decade

If you accept that the models are good enough, the economics are 5–10x better, and the regulatory wind is at your back, the conclusion is uncomfortable for one side of the market and obvious for the other.

The commercial layer on top of open weight models is where the next decade of software gets built. Not as a fallback. Not as a cost-saving measure. As the default substrate, in the same way Linux became the default substrate for everything that runs in a data centre.

There is room for closed frontier models inside that world. There always was room for proprietary Unix variants too, and there still is. But the centre of gravity has moved, and once a centre of gravity moves in software, it does not move back.

For developers and infrastructure teams, this is the moment to build on a stack you control. BasedAPIs gives you OpenAI-compatible inference against the leading open weight models, priced at margin, with the reliability and operational surface a production app actually needs.

For founders, operators, and small businesses, this is the moment to put AI to work without taking on a research project. Hirebase turns the same open weight foundation into an AI workforce you can hire today, with no model selection, no infra, and no prompt engineering required.

Either way, the substrate is open. The question is which layer you want to live on.