Usage limits & fair use policy

innoGPT offers a fair flat-rate price. To make this work for everyone, transparent usage limits apply so that the platform remains stable and fast for every user.

Why are there limits in the first place?

Every request incurs real costs: innoGPT pays the respective model providers (e.g., OpenAI or Anthropic) for each token used, while you pay a fixed flat-rate price. To ensure this works long-term and the platform remains stable for all users, a Fair Use Policy applies. The limits are calculated generously.

What consumes a particularly large number of tokens?

Not all requests cost the same. Two factors are particularly significant:

Ultra and Premium models, such as Claude Opus or GPT-5.5, cost significantly more per request than leaner models. If you work exclusively with these models, you’ll reach the limit much faster.
Deep Research performs many individual requests in the background and therefore consumes a particularly large number of tokens per run, which adds up quickly.
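
To illustrate why these two factors matter, here is a minimal Python sketch. The relative weights and the number of background requests are made-up placeholders chosen only for illustration; they are not innoGPT's actual prices or Deep Research's actual behavior.

```python
# Hypothetical relative cost of one request per model category.
# These numbers are illustrative assumptions, not real prices.
RELATIVE_COST = {"Standard": 1, "Premium": 5, "Ultra": 25}


def estimated_cost(category: str, requests: int = 1) -> int:
    """Relative budget consumed by `requests` calls to a model of `category`."""
    return RELATIVE_COST[category] * requests


# A normal chat message is a single request.
single_message = estimated_cost("Standard")

# A Deep Research run fans out into many background requests
# (40 is an arbitrary number picked for this example).
deep_research_run = estimated_cost("Standard", requests=40)
```

Under these assumed weights, one Deep Research run consumes as much as 40 ordinary Standard messages, and a single Ultra request weighs as much as 25 of them.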

What happens when the limit is reached?

innoGPT does not take immediate drastic action. Instead, there are graduated measures: First, a temporary rate limit may be imposed. In the next step, computationally intensive models are temporarily restricted and redirected to more resource-efficient alternatives. If the limit is consistently exceeded, innoGPT will proactively reach out to help find a more suitable plan together.
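
The graduated measures can be pictured as a simple escalation ladder. The sketch below is only an illustration: the policy does not state how each step is triggered, so the three boolean inputs and the function name are assumptions.

```python
from enum import Enum


class Measure(Enum):
    NONE = "no action"
    RATE_LIMIT = "temporary rate limit"
    MODEL_REDIRECT = "intensive models redirected to efficient alternatives"
    OUTREACH = "innoGPT reaches out about a more suitable plan"


def next_measure(limit_reached: bool,
                 already_rate_limited: bool,
                 consistently_exceeded: bool) -> Measure:
    """Pick the measure for the current situation (hypothetical trigger logic)."""
    if not limit_reached:
        return Measure.NONE
    if consistently_exceeded:
        # Last step: a conversation about the plan, not a hard block.
        return Measure.OUTREACH
    if already_rate_limited:
        # Second step: restrict computationally intensive models.
        return Measure.MODEL_REDIRECT
    # First step: a temporary rate limit.
    return Measure.RATE_LIMIT
```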

Tip for smart usage

For simple tasks like text summaries, short answers, or standard research, leaner models are perfectly sufficient. Reserve Ultra models for complex queries; this way, you’ll get the most out of your quota.

Available Plans & Models

Depending on your plan, different model categories are available to you. Generally speaking, the higher the plan tier, the more powerful (and computationally intensive) models are unlocked.

Which plan includes which model categories?

Personal / Pro / Business / Partner / Family: Standard, Premium & Ultra
Go: Standard only
Trial (7-day trial period): Standard only
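
The plan list above can be expressed as a small lookup table, for example when checking access in your own tooling. The mapping is copied from the list; the names `PLAN_CATEGORIES` and `is_category_available` are illustrative and not part of any innoGPT API.

```python
ALL_CATEGORIES = {"Standard", "Premium", "Ultra"}

# Plan -> unlocked model categories, as listed above.
PLAN_CATEGORIES = {
    "Personal": ALL_CATEGORIES,
    "Pro": ALL_CATEGORIES,
    "Business": ALL_CATEGORIES,
    "Partner": ALL_CATEGORIES,
    "Family": ALL_CATEGORIES,
    "Go": {"Standard"},
    "Trial": {"Standard"},
}


def is_category_available(plan: str, category: str) -> bool:
    """Return True if `plan` unlocks models of `category`."""
    # Unknown plans unlock nothing.
    return category in PLAN_CATEGORIES.get(plan, set())
```

For example, `is_category_available("Go", "Ultra")` evaluates to `False`, while `is_category_available("Pro", "Ultra")` evaluates to `True`.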

Model overview (as of May 2026)

🔗 Always up to date on the website: you can find all models, including hosting region, at innogpt.com/models

🟢 Standard models

OpenAI: gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.4-mini, gpt-5.4-nano, gpt-4o-mini, gpt-4.1-mini, gpt-4.1-nano, o1-mini, o3-mini, o4-mini
Anthropic: claude-4-5-haiku
Google: gemini-2.0-flash, gemini-2.5-flash, gemini-2.5-pro, gemini-3-flash-preview
Mistral: mistral-large, mistral-small, devstral-2
xAI: grok-4-1-fast-reasoning
DeepSeek: deepseek-r1, deepseek-v3, deepseek-v3.2
Meta: llama-4-maverick
Perplexity: sonar-deep-research
Moonshot: kimi-k2.5

🟡 Premium models

OpenAI: gpt-5-codex, gpt-5.2, gpt-5.2-codex, gpt-5.3-codex, gpt-5.3-instant, gpt-5.4, gpt-4o, gpt-4.1, o1
Anthropic: claude-3-5-sonnet, claude-4-sonnet, claude-4-5-sonnet, claude-4-6-sonnet
Google: gemini-2.0-flash-thinking-mode, gemini-3-pro-preview, gemini-3.1-pro-preview
xAI: grok-3
Cohere: cohere-command-a
Perplexity: sonar-pro

🔴 Ultra models

OpenAI: gpt-5.4-pro, gpt-5.5, gpt-5.5-pro
Anthropic: claude-4-6-opus, claude-4-7-opus

Usage Scope: What does “unlimited messages” mean?

innoGPT does not use strict message quotas per user, but rather a pooled usage budget per workspace across all users.

What’s included:

Standard messages

What is billed separately:

API usage
Add-ons such as PII, videos, podcasts

How does the Workspace budget work?

Expensive models (Premium/Reasoning models) consume more budget per request
Cheaper models consume correspondingly less
When the budget is used up, a soft cap kicks in: Premium models are restricted, while Standard models remain fully available

💡 What does this mean in practice? No one “runs out.” As soon as the Premium budget is used up, users can seamlessly continue working with the more efficient models — no hard cutoff, no blocked workflow.
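
The pooled budget and soft cap can be sketched as follows. This is a simplified model under stated assumptions: the per-request weights, the `Workspace` class, and the rule that every non-Standard category falls back to Standard once the budget is used up are all hypothetical and only illustrate the behavior described above.

```python
from dataclasses import dataclass

# Hypothetical budget weight per request; not innoGPT's real accounting.
COST = {"Standard": 1, "Premium": 5, "Ultra": 25}


@dataclass
class Workspace:
    budget: int    # pooled budget shared by all users in the workspace
    used: int = 0  # budget consumed so far

    def route(self, requested: str) -> str:
        """Return the model category that actually serves the request."""
        category = requested
        if requested != "Standard" and self.used >= self.budget:
            # Soft cap: no hard cutoff, the request is served by a
            # more efficient Standard model instead.
            category = "Standard"
        self.used += COST[category]
        return category


ws = Workspace(budget=10)
served = [ws.route("Premium") for _ in range(3)]
```

With a budget of 10, the first two Premium requests are served as requested and use the budget up; the third is served by a Standard model, and work continues without interruption.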