Billing / Quota FAQ

Howdy folks! Great to see the new Billing FAQ page, but the answer about viewing quota has left me more confused than before.

The answer states:

You can view your quota and system limits in the Google Cloud console.

The link goes directly to the quota page for my project, which lists 497 possible quotas. There is no indication anywhere of which of these I should care about. On top of that, the page shows no activity subject to quota limits being recorded.

If I filter it for “GenerateContent request limit per minute for a region”, it narrows it down to 40, but still says I’m not doing anything worthy of quota measurement.

I'm also a bit concerned that these quotas aren't "by model" either. The equivalent Vertex AI quotas are limited per model, and the pricing page shows different quotas per model, so why isn't that showing up here?

I tested this with v1beta.streamingGenerateContent, v1beta.generateContent, and v1.generateContent against "gemini-1.0-pro-latest", sending roughly one request per second with about 20 tokens in each request.
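For anyone who wants to reproduce this, here is a minimal sketch of the kind of test loop I mean. It is not my exact script: the `GEMINI_API_KEY` environment variable, the helper names, the prompt text, and the request count are all placeholders I made up for illustration, and it only covers the non-streaming v1beta call. The model name may also need swapping for whatever is currently available.

```python
import json
import os
import time
import urllib.error
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com"


def build_url(api_version: str, model: str, method: str) -> str:
    """Build the REST URL, e.g. .../v1beta/models/<model>:generateContent."""
    return f"{BASE_URL}/{api_version}/models/{model}:{method}"


def build_payload(prompt: str) -> bytes:
    """Encode a single-turn text prompt as the JSON request body."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return json.dumps(body).encode("utf-8")


def send_once(url: str, api_key: str, prompt: str) -> int:
    """Send one request and return the HTTP status (429 suggests a quota hit)."""
    request = urllib.request.Request(
        url,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses (including 429) arrive as exceptions; report the code.
        return err.code


def run(api_key: str, count: int = 10, interval_s: float = 1.0) -> None:
    """Send `count` short requests, pausing `interval_s` seconds after each."""
    url = build_url("v1beta", "gemini-1.0-pro-latest", "generateContent")
    for i in range(count):
        status = send_once(url, api_key, "Write one short sentence about quotas.")
        print(f"request {i}: HTTP {status}")
        time.sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical variable name; only runs when a key is actually set.
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        run(key)
```

Running it with the key exported prints one status line per request, which at least confirms the calls are succeeding while the quota page stays empty.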

I'm sure it is hitting the right project, and that this project has billing enabled, since the requests are showing up in the activity metrics.

Am I doing something incorrect here? It should be showing up on the quota page, right?
