The AI Industry Has a Generosity Problem - And It's Getting Worse

Om

I've been thinking about this for a while, and I think the AI industry has quietly made a decision that nobody is calling out loudly enough: the companies building the best models are also the ones making them the hardest to actually use. Not because the technology isn't there. Not because the compute doesn't exist. But because the product decisions being made at these companies are increasingly out of touch with the people paying for them.

Let me start with the open-source circus. Every few weeks, another lab drops a model with a blog post about democratizing AI, and within hours there's a wave of YouTube videos telling you to run it on your laptop. I get the appeal. Privacy, control, no usage limits - it sounds great in theory. But here's what those videos don't show you: the thermal throttling, the 8-token-per-second generation speed, the fact that your GPU is pulling 400W continuously just to get output quality that's two generations behind what you'd get from a cloud API call. Data centers run these models on hardware clusters purpose-built for the job, with batching, liquid cooling, and interconnects that a consumer PC will never replicate. The efficiency gap isn't small - it's enormous. Open-sourcing model weights has legitimate research value, but packaging local inference as a viable daily driver for regular users is just content farming.

The more important problem, though, is what's happening with the companies that are hosting models properly. Because even they can't seem to agree on how generously to treat the people paying them. Google runs Gemini on its own TPU infrastructure - hardware it designs in-house. That structural cost advantage means they can offer 100 Gemini 1.5 Pro prompts per day on their Pro plan and expand those limits regularly without breaking a sweat. OpenAI uses rolling 3-hour reset windows, which is genuinely smart product design: the cap exists, but it resets fast enough that a normal user never notices it unless they're doing something extreme. And then there's Anthropic - doing weekly blackouts with no usage dashboard, no warnings, and no middle ground between their entry plan and a $200/month tier that's priced for Silicon Valley engineering teams.

What makes Anthropic's situation particularly frustrating is that their model - Claude - is arguably the best for technical and coding work. The writing quality, the reasoning, the way it handles complex, multi-step problems - it's genuinely excellent. So the user who most wants to use Claude heavily is exactly the user most likely to get throttled. That's a painful irony. Claude Code in particular is an agentic tool that reads entire codebases, rewrites files, and loops through tool calls in a single session. It burns through tokens at 10x the rate of a normal chat exchange. Charging that against the same token pool as a casual conversation is architecturally wrong, and it means power users hit their weekly limit after just a few serious sessions.
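To see why an agentic session burns tokens so much faster than chat, here is a minimal sketch. Every number in it is an illustrative assumption, not Anthropic's actual accounting: the key structural point is that an agentic loop re-sends its whole (growing) context on every tool call, while a chat turn pays for one prompt and one reply.

```python
# Hedged sketch: why agentic coding sessions consume tokens far faster than chat.
# All token counts here are made-up illustrative values.

def chat_turn(prompt_tokens=500, reply_tokens=800):
    """One ordinary chat exchange: prompt in, reply out."""
    return prompt_tokens + reply_tokens

def agentic_session(base_context=20_000, tool_output=2_000,
                    reply_tokens=800, loops=10):
    """An agentic loop re-sends the growing transcript on every tool call."""
    total = 0
    context = base_context  # e.g. files read from the codebase
    for _ in range(loops):
        total += context + reply_tokens          # pay for the whole context again
        context += tool_output + reply_tokens    # transcript grows each iteration
    return total

print(chat_turn())        # 1300 tokens for one chat turn
print(agentic_session())  # 334000 tokens for one ten-loop coding session
```

Under these made-up parameters, one coding session costs a few hundred chat turns' worth of tokens - which is why charging both against a single pool punishes exactly the users doing serious work.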

The fix isn't complicated, and it doesn't require Anthropic to spend more on compute. Switch the reset window from weekly to a rolling 3-hour window. The total token budget stays the same - you're just distributing it in a way that doesn't punish users for having an intense workday. Add a usage dashboard so people can see where they stand before they're blindsided mid-session. And separate the token pools for chat and agentic coding work, because treating them identically makes no sense given how differently they consume resources.
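The mechanics of that fix are simple enough to sketch. Below is a minimal rolling-window limiter with separate pools for chat and agentic work; the budget numbers and pool names are my own illustrative assumptions, not anything Anthropic has published. The total spend allowed over any window stays capped - old spend simply expires after three hours instead of a week, and `remaining()` is the usage-dashboard piece.

```python
import time
from collections import deque

# Hedged sketch of the proposed scheme: same capped budget, but enforced over a
# rolling 3-hour window with separate pools. All numbers are made up.

WINDOW_SECONDS = 3 * 60 * 60
BUDGETS = {"chat": 300_000, "code": 3_000_000}  # tokens per rolling window

class RollingLimiter:
    def __init__(self):
        # per-pool history of (timestamp, tokens_spent) events
        self.events = {pool: deque() for pool in BUDGETS}

    def _used(self, pool, now):
        q = self.events[pool]
        while q and now - q[0][0] > WINDOW_SECONDS:
            q.popleft()  # spend older than the window no longer counts
        return sum(tokens for _, tokens in q)

    def allow(self, pool, tokens, now=None):
        now = time.time() if now is None else now
        if self._used(pool, now) + tokens > BUDGETS[pool]:
            return False  # over budget for this window; old spend will expire
        self.events[pool].append((now, tokens))
        return True

    def remaining(self, pool, now=None):
        """The usage-dashboard piece: tokens left in the current window."""
        now = time.time() if now is None else now
        return BUDGETS[pool] - self._used(pool, now)
```

An intense workday can exhaust a window and be fully unblocked three hours later, instead of locking the user out until next week - with no change to the total budget being enforced.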

The pricing tier problem is a separate but equally real issue. Right now there's a massive gap between the entry-level Pro plan - which is quietly designed for casual conversation users - and the Max plan at $200/month. That gap swallows every developer, student, and independent professional who actually wants to use AI seriously but can't justify enterprise pricing. A developer-tier plan at $35–40/month, built specifically for heavier technical use with no weekly blackouts, would capture an enormous segment of users who are currently either getting locked out or quietly switching to competitors. It's not a hard business case to make.

Regional pricing is the other thing nobody at these companies seems to want to address. Twenty dollars a month is a reasonable ask in the United States. In India, that's a meaningful chunk of a software engineer's discretionary budget, and what you're getting for it is often a degraded version of what US users receive on the same plan. Google has started doing purchasing-power parity pricing in some markets. Anthropic hasn't. The result is that users in high-growth markets - places where AI adoption is accelerating fast - are getting the worst value proposition relative to what they're paying. That's not just unfair; it's a bad long-term business decision.
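For what it's worth, purchasing-power parity pricing is not mechanically hard either. A minimal sketch, with entirely made-up PPP factors (not any company's published figures):

```python
# Hedged sketch of PPP-adjusted subscription pricing.
# The factors below are illustrative assumptions, not real published data.

US_PRICE = 20.00  # USD per month

PPP_FACTOR = {  # rough relative purchasing power vs. the US (made up)
    "US": 1.00,
    "IN": 0.30,
    "BR": 0.45,
}

def local_price(country: str) -> float:
    """US price scaled by a purchasing-power factor, rounded to the cent."""
    return round(US_PRICE * PPP_FACTOR[country], 2)

print(local_price("IN"))  # 6.0
```

The hard part is the business decision, not the arithmetic - which is rather the point.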

The underlying issue tying all of this together is that the AI industry is still largely building for a narrow archetype of the "power user" - someone in San Francisco with a corporate card and a high-bandwidth connection who uses AI as a productivity tool between meetings. The reality of who is actually paying for these subscriptions is far more diverse, far more price-sensitive, and far more likely to be doing intensive, creative, technical work from places and budgets that don't fit that archetype. Until these companies start designing products for the actual distribution of their users - not the median Silicon Valley persona - the limits problem isn't going away. It's just going to keep generating frustration from the people who care the most about the technology.