Google Launches Imagen 4, Veo 3, and Lyria 2 for Generative Media on Vertex AI

Three creative professionals collaborate in a well-lit studio space. A woman on the left uses a desktop computer to generate an AI image with Google’s Imagen 4, displaying a Mediterranean-style courtyard on screen. In the center, a man stands reviewing AI-generated video content from Veo 3 on a monitor, showing a person hiking through a scenic landscape. On the right, another man works on a laptop running Lyria 2, with a music waveform and playback controls visible. The workspace is decorated with a corkboard of scenic photos, a potted plant, and natural lighting, creating a warm, professional atmosphere that reflects real-world creative production.

Image Source: ChatGPT-4o

At Google I/O 2025, the company introduced its latest generative AI media models—Imagen 4, Veo 3, and Lyria 2—now available on Vertex AI. These models are designed to help creators, marketers, and media teams accelerate the production of images, videos, and music, all while maintaining high fidelity, safety, and creative control.

Google says the new models represent a step-change in media quality and usability. Users are increasingly moving from prompt to production in minutes—generating campaign assets, cinematic content, and custom soundtracks that once took days or weeks to create.

Imagen 4: More Accurate, Higher Quality Image Generation

Now available in public preview, Imagen 4 is Google’s highest-quality image generation model to date. Running on Vertex AI, it brings several upgrades aimed at improving realism, control, and global accessibility:

  • Stronger prompt adherence, especially for rendering detailed text within images

  • Higher overall image quality across visual styles and subjects

  • Support for multilingual prompts, expanding access for non-English users

Imagen 4 can be used via Media Studio or directly through the Google Gen AI SDK for Python, allowing developers and creative teams to embed it into existing workflows.

Prompt: Filmed cinematically from the driver's seat, offering a clear profile view of the young passenger on the front seat with striking red hair. Her gaze is fixed ahead, concentrated on navigating the dusty, lonely highway visible through her side window, which shows a blurred expanse of dry earth and perhaps distant, hazy mountains. Her arm rests on the window ledge or steering wheel. The shot includes part of the aged truck interior beside her – the door panel, maybe a glimpse of the worn seat fabric. The lighting could be late afternoon sun, casting long shadows and warm highlights across her face and the truck's interior. This angle emphasizes her individual presence and contemplative state within the vast, empty landscape.
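For developers embedding Imagen 4 into a workflow, a minimal sketch of the SDK path might look like the following. The model ID, config fields, and helper function here are assumptions for illustration; check the Vertex AI documentation for the identifiers and parameters that apply to your project.

```python
# A minimal sketch of calling Imagen 4 through the Google Gen AI SDK for
# Python (`pip install google-genai`). The model ID and the helper below
# are assumptions for illustration, not confirmed identifiers.

def build_image_request(prompt: str, n: int = 1, aspect_ratio: str = "16:9") -> dict:
    """Gather generation settings in one place (hypothetical helper)."""
    return {"prompt": prompt, "number_of_images": n, "aspect_ratio": aspect_ratio}

def generate_image(request: dict, project: str, location: str = "us-central1"):
    """Send the request to Vertex AI; requires valid Google Cloud credentials."""
    from google import genai
    from google.genai import types

    client = genai.Client(vertexai=True, project=project, location=location)
    response = client.models.generate_images(
        model="imagen-4.0-generate-preview",  # assumed preview model ID
        prompt=request["prompt"],
        config=types.GenerateImagesConfig(
            number_of_images=request["number_of_images"],
            aspect_ratio=request["aspect_ratio"],
        ),
    )
    return response.generated_images  # each item carries the image bytes

# Example (needs a GCP project with Vertex AI enabled):
# images = generate_image(
#     build_image_request("A sunlit Mediterranean courtyard, photorealistic"),
#     project="my-gcp-project",
# )
```

The same client object exposes the other media models on Vertex AI, which is part of the pitch for teams that want one pipeline rather than separate services per medium.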

Veo 3: Video Generation with Sound, Speech, and Greater Fidelity

Veo 3 is the newest video generation model from Google DeepMind and builds on the capabilities of Veo 2 with improved output quality and expanded media control. Currently in private preview, Veo 3 enables users to generate videos from text or image prompts, with added support for:

  • Speech, such as dialogue or voiceovers

  • Audio, including music and environmental sound effects

  • Higher-fidelity output from both text and image prompts, with sharper visuals, smoother motion, and more coherent scene transitions

Several companies are already seeing measurable results from integrating Veo:

  • Klarna has used Veo and Imagen to speed up campaign production, from b-roll to YouTube ads. According to CMO David Sandström, the tools are helping the company scale content creation while improving performance and engagement.

  • Jellyfish, through its AI marketing platform Pencil, collaborated with Japan Airlines to deliver AI-generated in-flight entertainment, citing a 50% drop in costs and time-to-market.

  • Kraft Heinz reports that creative cycles that once took eight weeks now take as little as eight hours, using the company's internal Tastemaker platform powered by Veo and Imagen.

  • Envato, which recently launched a new feature called VideoGen, found Veo 2 outperformed other video models in speed and quality. Early results show high usage and download rates among its creative user base.

Prompt: A medium shot, historical adventure setting: Warm lamplight illuminates a cartographer in a cluttered study, poring over an ancient, sprawling map spread across a large table. Cartographer: "According to this old sea chart, the lost island isn't myth! We must prepare an expedition immediately!"

Lyria 2: High-Fidelity, Customizable AI Music

Also now generally available on Vertex AI, Lyria 2 is Google’s latest text-to-music generation model. Built to complement video and multimedia projects, it enables creators to generate high-quality audio that can be tailored to emotional tone, pacing, and structure.

Lyria 2 supports:

  • High-quality music generation from text prompts

  • Creative control over instruments, tempo, and other musical elements

Real-world applications are already emerging:

  • Captions.ai, which allows users to create AI-generated talking videos, has integrated Lyria 2 into its Mirage Edit feature. Co-founder Dwight Churchill said Lyria composes music that aligns with the tone and rhythm of each scene—eliminating the need for stock music or manual syncing.

  • Dashverse, the company behind platforms like Dashtoon and DashReels, uses Lyria 2 to give users greater control over soundtrack generation for comics and video dramas. CTO Soumyadeep Mukherjee described it as a “storytelling amplifier” that adapts music to match mood, pacing, and transitions in real time.

Prompt: Sweeping Orchestral Film Score, Pristine Studio recording, London, 100-piece Orchestra, Majestic and profound. A blend of soaring melodies, dramatic harmonic shifts, and powerful percussive elements, with instruments such as french horns, strings, and timpani, and a thematic approach, featuring intricate orchestrations, dynamic range, and emotional depth, evoking a cinematic and awe-inspiring atmosphere.

Built-In Safeguards for Responsible Creation

Google emphasized that all three models—Imagen 4, Veo 3, and Lyria 2—are designed with safety and transparency as core principles. The tools include:

  • SynthID watermarking, which embeds invisible identifiers in generated content to indicate that it was AI-created

  • Safety filters that review both inputs and outputs against configurable guidelines, helping users stay aligned with brand values

  • Control over sensitive content, including person generation in visual outputs

These protections are intended to support responsible use at scale, particularly as AI-generated media becomes more mainstream in advertising, entertainment, and user-generated content.

Availability and Getting Started

  • Imagen 4 is in public preview on Vertex AI and available through Media Studio or the Gen AI SDK for Python.

  • Veo 3 is in private preview, with broader access coming soon. Interested users can request early access.

  • Lyria 2 is now generally available on Vertex AI via Media Studio and the model API.

Developers, marketers, and content creators can explore the full documentation for each model on Google Cloud: Imagen 4, Veo 3, and Lyria 2.

What This Means

Google’s latest AI media tools mark an aggressive expansion into the rapidly growing world of generative content production. With Imagen 4, Veo 3, and Lyria 2, Google is aiming not just to improve creative quality but also to shrink production timelines, reduce costs, and give teams tools that are both fast and flexible.

These moves land in an increasingly crowded field. OpenAI, Runway, and other players are also pushing forward in video and music generation, and Adobe continues to embed generative tools across its suite. But Google’s advantage lies in its deep integration with Vertex AI, which positions these creative tools within broader enterprise and developer ecosystems.

For brands, agencies, and creators, this means faster cycles, more control, and fewer handoffs between tools. Instead of stitching together separate services for visuals, audio, and workflow automation, Google is offering a unified production pipeline, from idea to final output.

The question going forward isn’t just how well these models generate media—but how responsibly, securely, and flexibly they scale. As the creative process becomes increasingly AI-assisted, the platforms that provide transparency, control, and cross-medium collaboration may define the next era of content creation.

Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.