Oleksandr Arsentiev • over 1 year ago
Vertex AI: Quota exceeded error
Hello,
I'm experiencing an error when trying to call gemini-1.5-flash more than 5 times per minute:
429 Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-1.5-flash. Please submit a quota increase request.
According to the quota docs, the limit should be 200 requests per minute:
https://cloud.google.com/vertex-ai/generative-ai/docs/quotas
I'm using the quickstart code copied from Vertex AI Studio, so the endpoint should be correct. Please advise on how to increase the quota limit, as 5 RPM is too low.
There is an active thread on this issue:
https://www.googlecloudcommunity.com/gc/AI-ML/Gemini-Pro-Quota-Exceeded/m-p/726747
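Until the limit is raised, one common workaround is to retry 429 responses with exponential backoff. The following is only a minimal sketch: `QuotaExceededError` and `call_with_backoff` are hypothetical names, not part of the Vertex AI SDK, and you would replace the exception with whatever your client library raises for a 429.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the 429 error raised by your client library."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(), retrying on QuotaExceededError with exponential backoff.

    The delay doubles after each failed attempt (capped at max_delay) and
    random jitter of up to half the delay is added so that parallel clients
    do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Wrap the model call in a zero-argument function or lambda and pass it as `fn`. With a 5 RPM limit, a `base_delay` of several seconds is probably more realistic than the default used here.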

2 comments
Shawni Devpost Manager • over 1 year ago
Hi there,
I am checking with the Google Cloud team and will get back to you as soon as possible. Thanks for your patience.
Good luck!
Shawni Devpost Manager • over 1 year ago
Hi there. This is the message from the Google Cloud team:
Hi,
Thanks for reaching out.
Can you confirm the quota assigned to your project? You can check it by following this method: https://cloud.google.com/docs/quotas/view-manage#api_specific_quota
To increase quota, make a request as specified here: https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota
Can you also confirm that you are using the Vertex AI Gemini endpoint and not the AI Studio endpoint, which usually has a lower quota?
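One quick way to check which endpoint was hit is to look at the quota metric in the error itself. The sketch below pulls the service name, metric, and base model out of the message quoted in the original post. The regular expression is an assumption based on that single message, so other error formats may not match, and `parse_quota_error` and `is_vertex_ai` are hypothetical helper names. It relies on `aiplatform.googleapis.com` being the Vertex AI service name, whereas the AI Studio (Gemini API) endpoint uses `generativelanguage.googleapis.com`.

```python
import re

# Pattern inferred from the one error message in this thread (an assumption).
QUOTA_PATTERN = re.compile(
    r"Quota exceeded for (?P<service>[\w.-]+)/(?P<metric>\w+)"
    r"(?: with base model: (?P<base_model>[\w.-]+?))?\.(?:\s|$)"
)

VERTEX_AI_SERVICE = "aiplatform.googleapis.com"


def parse_quota_error(message):
    """Return service, metric, and base model from a quota error, or None."""
    match = QUOTA_PATTERN.search(message)
    return match.groupdict() if match else None


def is_vertex_ai(message):
    """True if the quota named in the error belongs to the Vertex AI service."""
    parsed = parse_quota_error(message)
    return parsed is not None and parsed["service"] == VERTEX_AI_SERVICE


if __name__ == "__main__":
    error = (
        "429 Quota exceeded for aiplatform.googleapis.com/"
        "generate_content_requests_per_minute_per_project_per_base_model "
        "with base model: gemini-1.5-flash. Please submit a quota increase request."
    )
    print(parse_quota_error(error))
    print(is_vertex_ai(error))
```

If the service is `aiplatform.googleapis.com`, the request reached Vertex AI, and the metric name is the one to look up on the project's quota page when filing the increase request.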