Pricing | Generative AI on Vertex AI | Google Cloud
Prices are listed in US Dollars (USD). If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.
Generative AI on Vertex AI Pricing
This page covers pricing for Generative AI on Vertex AI. For all other Vertex AI pricing, including ML Platform and MLOps services, refer to the Vertex AI pricing page.
Multimodal Models
With the multimodal models in Vertex AI, you can input either text or media (images, video). Text is charged per 1,000 characters of input (prompt) and per 1,000 characters of output (response). Characters are counted as UTF-8 code points, and white space is excluded from the count, resulting in approximately 4 characters per token. Prediction requests that lead to filtered responses are charged for the input only. At the end of each billing cycle, fractions of one cent ($0.01) are rounded to one cent.
Media input is charged per image or per second (video).
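The counting and charging rules above can be sketched as a small estimator. This is an illustrative sketch, not an official calculator: the function names and every rate parameter are hypothetical placeholders, and the actual rates must come from the pricing tables for the model you use.

```python
def billable_characters(text: str) -> int:
    """Count code points in `text`, excluding white space."""
    # Python strings are sequences of Unicode code points, so each
    # non-whitespace element counts as one billable character.
    return sum(1 for ch in text if not ch.isspace())


def estimate_request_cost(
    prompt: str,
    response: str,
    *,
    input_rate_per_1k: float,    # hypothetical: USD per 1,000 input characters
    output_rate_per_1k: float,   # hypothetical: USD per 1,000 output characters
    images: int = 0,
    image_rate: float = 0.0,             # hypothetical: USD per image
    video_seconds: float = 0.0,
    video_rate_per_second: float = 0.0,  # hypothetical: USD per second of video
    filtered: bool = False,
) -> float:
    """Estimate the cost of one request under the rules described above."""
    text_input_cost = billable_characters(prompt) / 1000 * input_rate_per_1k
    media_input_cost = images * image_rate + video_seconds * video_rate_per_second
    # A filtered response is charged for the input only.
    output_cost = (
        0.0 if filtered
        else billable_characters(response) / 1000 * output_rate_per_1k
    )
    return text_input_cost + media_input_cost + output_cost
```

For example, billable_characters("Hello, world!\n") returns 12, because the space and the newline are not counted.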
Context Caching
With context caching, you can reduce the cost and latency of content generation by caching the context portion of your input text or media to Gemini models. The amount of time data is stored in the cache, which can be controlled by the user, determines the “Context Cache Storage” charges. Cache hits on input data are charged at a reduced rate, “Cached Input”, instead of the normal input cost. The data size for both storage and input is calculated in the same way as Gemini input pricing.
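To see when caching pays off, the comparison below sketches the cost of repeated requests with and without a cached context. It rests on assumptions that are not stated on this page: storage is modeled as a rate per 1,000 characters per hour, the one-time cost of creating the cache is ignored, and all function names and rates are hypothetical.

```python
def cost_without_cache(
    context_chars: int,
    question_chars: int,
    requests: int,
    *,
    input_rate_per_1k: float,  # hypothetical: USD per 1,000 input characters
) -> float:
    """Every request pays the normal input rate for context plus question."""
    return requests * (context_chars + question_chars) / 1000 * input_rate_per_1k


def cost_with_cache(
    context_chars: int,
    question_chars: int,
    requests: int,
    hours_cached: float,
    *,
    input_rate_per_1k: float,             # hypothetical normal input rate
    cached_input_rate_per_1k: float,      # hypothetical reduced "Cached Input" rate
    storage_rate_per_1k_per_hour: float,  # assumed unit for "Context Cache Storage"
) -> float:
    """Context is stored once and read at the reduced rate on every request."""
    storage = context_chars / 1000 * storage_rate_per_1k_per_hour * hours_cached
    cached_input = requests * context_chars / 1000 * cached_input_rate_per_1k
    fresh_input = requests * question_chars / 1000 * input_rate_per_1k
    return storage + cached_input + fresh_input
```

Under this model, caching is cheaper only when the savings on cached input across all requests exceed the storage charge, so a context that is reused rarely or kept for a long time may cost more cached than uncached.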
Image Generation
With the Image Generation feature of Vertex AI, you can generate novel images and edit existing images based on text prompts that you provide. You can also edit only parts of an image by defining a mask area, and use a host of other capabilities.
Generative AI on Vertex AI charges per 1,000 characters of input (prompt) and per 1,000 characters of output (response). Characters are counted as UTF-8 code points, and white space is excluded from the count. During the Preview stage, charges are 100% discounted. Prediction requests that lead to filtered responses are charged for the input only. At the end of each billing cycle, fractions of one cent ($0.01) are rounded to one cent.
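The Preview discount and the end-of-cycle rounding can be illustrated as follows. This sketch reads "fractions of one cent are rounded to one cent" as rounding any partial cent up to the next whole cent, which is an interpretation rather than something this page spells out. The function name and its parameters are hypothetical.

```python
from decimal import Decimal, ROUND_CEILING


def billed_amount(
    raw_charges_usd: Decimal,
    preview_discount: Decimal = Decimal("0"),  # Decimal("1") means 100% discounted
) -> Decimal:
    """Apply a discount, then round any fraction of a cent up to a whole cent."""
    discounted = raw_charges_usd * (Decimal("1") - preview_discount)
    # Assumed interpretation: a partial cent is billed as one full cent.
    return discounted.quantize(Decimal("0.01"), rounding=ROUND_CEILING)
```

With these assumptions, accumulated charges of 12.3401 USD are billed as 12.35 USD, and any amount with a 100% Preview discount is billed as 0.00 USD. Decimal is used instead of float so that cent boundaries are exact.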
Partner Models
Partner models are a curated list of generative AI models developed by Google partners. Partner models are offered as managed APIs. For more information, see Overview of partner models.
The following table lists pricing details for Google partner models: