Gemini user hits 5-hour usage cap after a single prompt, Google responds
At a glance:
- A Google AI Pro subscriber hit the five-hour usage cap after a single failed video-generation prompt using the avatar feature.
- Google's new compute-based limits factor in prompt complexity, making usage unpredictable for subscribers.
- Google has acknowledged the complaint and is investigating, amid broader user dissatisfaction with tighter quotas.
The triggering incident
A Google AI Pro user, Ashutosh Shrivastava, reported exhausting his five-hour usage allowance within minutes of attempting a single video-generation task. Shrivastava used Gemini's avatar-based video generation feature, which creates animated avatars from text prompts. According to his account on X, the prompt ran for approximately three to four minutes and then failed, yet it consumed 100% of his rate limit. He backed up the claim with a video showing the quota draining. The incident underscores a growing pain point with Google's recent shift to a compute-based allocation system, in which even unsuccessful tasks can be charged against the quota.
How the new compute-based system works
Google recently replaced its previous fixed prompt limit with a dynamic, compute-based credit system for Gemini. The new approach calculates usage from several factors: the complexity of each prompt, the specific features employed (such as video generation or advanced reasoning), and the overall length of the conversation. Under the Google AI Pro plan, these limits refresh every five hours until the user reaches a broader weekly quota. The system is intended to distribute computational resources more efficiently during high demand, but it has also introduced uncertainty. Users now struggle to gauge how much a single task will consume, because tasks like video generation, which require substantial processing power, can burn through credits quickly, as Shrivastava's experience shows.
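To make the moving parts concrete, here is a minimal sketch of how a compute-based meter of this kind could behave. It is not Google's implementation: the feature weights, the budgets, the `Meter` class, and the choice to charge before a task completes are all assumptions made for illustration, since Google has not published how it weighs these factors.

```python
from dataclasses import dataclass

# Hypothetical numbers throughout: the real weights and budgets are not public.
FEATURE_COST = {"chat": 1.0, "reasoning": 5.0, "video": 100.0}
WINDOW_BUDGET = 100.0    # credits per five-hour window (assumed)
WEEKLY_BUDGET = 1500.0   # credits per week (assumed)
WINDOW_SECONDS = 5 * 60 * 60


@dataclass
class Meter:
    window_start: float = 0.0
    window_used: float = 0.0
    weekly_used: float = 0.0

    def estimate(self, feature, complexity, context_turns):
        # Cost grows with the feature weight, prompt complexity,
        # and the length of the conversation so far.
        return FEATURE_COST[feature] * complexity * (1 + 0.01 * context_turns)

    def window_remaining(self):
        # The tighter of the five-hour window and the weekly quota applies.
        return min(WINDOW_BUDGET - self.window_used,
                   WEEKLY_BUDGET - self.weekly_used)

    def charge(self, now, feature, complexity=1.0, context_turns=0):
        # Start a fresh window once five hours have elapsed.
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start = now
            self.window_used = 0.0
        cost = self.estimate(feature, complexity, context_turns)
        # Assumption: credits are charged up front, so a task that
        # later fails has still consumed them.
        self.window_used = min(WINDOW_BUDGET, self.window_used + cost)
        self.weekly_used = min(WEEKLY_BUDGET, self.weekly_used + cost)
        return self.window_remaining()


meter = Meter()
after_chat = meter.charge(now=0, feature="chat")
after_video = meter.charge(now=60, feature="video")
after_reset = meter.charge(now=WINDOW_SECONDS, feature="chat")
print(after_chat, after_video, after_reset)
```

Under these made-up weights, one video request empties the five-hour window regardless of whether it succeeds, while the same budget covers dozens of plain chat prompts. That asymmetry is the kind of unpredictability subscribers describe.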
User frustration and historical context
Complaints about Gemini's updated quota system are mounting, with many subscribers voicing concerns on platforms like the Gemini subreddit. Users argue that the new limits feel significantly more restrictive than the predictable, fixed-prompt model of the past. The opacity of the compute-based system adds to the frustration, because usage is difficult to estimate in advance. Multiple posts criticize Google for tightening access without clear communication. Google has, however, raised usage quotas for users of Antigravity, its AI coding tool, by as much as 9x compared with the period immediately after the restrictions were first introduced. For most regular Google AI Pro subscribers, the broader caps appear unchanged, which feeds a perception of uneven treatment and leaves the general user base's complaints unresolved.
Google's response and the path forward
Google has responded directly to the high-profile complaint. Josh Woodward, the lead of the Gemini team, replied to Shrivastava's post on X with a succinct acknowledgment: "Yikes, let us take a look!" The reply indicates that the company is aware of the issue and intends to investigate. The incident also highlights a need for Google to address systemic problems with the compute-based limits. To restore user confidence, Google could make the system more transparent, for example by providing real-time usage estimators or clearer guidelines, or it could loosen restrictions and increase overall quotas for paying subscribers. Premium AI tools are expected to offer reliability and accessibility, not to leave users stranded after a single prompt. The outcome may set a precedent for how AI platforms balance resource management with user experience in an increasingly competitive landscape.
Broader implications for AI service reliability
The episode with Gemini reflects a wider challenge in the AI industry: how to monetize and scale powerful models without alienating users through unpredictable limitations. As AI services become more computationally intensive, providers like Google must navigate the tension between cost recovery and user satisfaction. For businesses and creators relying on tools like Gemini for video generation, sudden usage caps can disrupt workflows and erode trust. This incident may prompt Google to refine its metering approach, potentially offering tiered add-ons or more granular control over resource allocation. Monitoring user feedback and competitor strategies—such as those from OpenAI or Anthropic—will be crucial as the market evolves. Ultimately, transparency and flexibility could determine whether compute-based systems enhance or hinder adoption of premium AI offerings.
What to watch next
Key developments to monitor include Google's official findings from its investigation into the five-hour cap issue and any subsequent policy adjustments. Users should watch for updates on whether the company will introduce more predictable usage metrics or expand quotas for Google AI Pro subscribers. Additionally, the response from the broader community—especially on forums like Reddit—will signal if this is an isolated incident or part of a larger pattern of dissatisfaction. Competitors may seize on this as an opportunity to highlight their own usage models, so Google's handling of the situation could impact its standing in the AI platform race. Finally, any changes to the compute-based system will serve as a bellwether for how AI companies balance technical constraints with user expectations in the subscription economy.
Prepared by the editorial stack from public data and external sources.