Usage limits & fair use policy
innoGPT offers a fair flat-rate price. To keep this working for everyone, transparent usage limits apply, ensuring the platform remains stable and fast for every user.
Why are there limits in the first place?
Every request incurs real costs behind the scenes: innoGPT pays the respective model providers (e.g., OpenAI or Anthropic) for each token used, while you pay a fixed flat-rate price. The limits are set generously, but to keep this sustainable and the platform stable for all users, a Fair Use Policy applies.
What consumes a particularly large number of tokens?
Not all requests cost the same. Two factors are particularly significant:
Ultra and Premium models, such as Claude Opus or GPT-5.4 Pro, cost significantly more per request than leaner models. If you work exclusively with these models, you’ll reach the limit much faster.
Deep Research performs many individual requests in the background and therefore consumes a particularly large number of tokens per run—this adds up quickly.
What happens when the limit is reached?
innoGPT does not immediately impose strict restrictions. Instead, there are tiered measures: First, a temporary rate limit may be applied. In the next step, computationally intensive models are temporarily restricted and requests are redirected to more resource-efficient alternatives. In the event of persistently excessive usage, innoGPT will proactively reach out to help find a more suitable plan.
Tip for smart usage
For simple tasks like text summaries, short answers, or standard research, leaner models are perfectly sufficient. Reserve Ultra models for complex queries; that way, you'll get the most out of your quota.