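At Google I/O 2025 a few days ago, Google released a series of advanced image and video generation tools. There is so much to play with that I haven't had time to try everything, so today I'm starting with Veo 3 video generation, which has been hugely popular lately. I'll try Imagen 4 and the Flow platform later and share the results. First, a quick introduction.
- Veo 3 is Google's latest video generation model. According to Google, it has a stronger understanding of physics and produces smoother, more realistic motion.
- Veo 3 natively supports audio generation, including ambient sound, sound effects, and even character dialogue, which makes AI-generated video more immersive and realistic.
- The model is available to AI Pro and AI Ultra subscribers.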
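The platform I used is Gemini, although it currently supports text-to-video only. Link: https://gemini.google.com/
Flow works too, and it also supports image-to-video and first/last-frame generation. Link: https://labs.google/fx/tools/flow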
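Since it's the weekend (and I admit I wanted to slack off!), most of today's tests are text-to-video. This one was a lot of fun, and I hope you enjoy it!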
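Click to select "Text to Video", then enter your prompt.
- Text to video: generate a video from a descriptive text prompt.
- Image to video: supports a first frame, a last frame, or both as reference for generating motion (as of 2025-05-25, uploading external images is supported).
- Video from combined elements: extracts the content and style of multiple images and combines them with a prompt to generate a video.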
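Note this setting:
When exporting from Flow, you can export as a GIF or at 720p. For 1080p, you need to click upscale first and then export. Flow also offers features such as video extension and online editing, which I'll cover in detail next time!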
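Below are 10 videos, all generated with Veo 3. The themes are:
- The Final Applause
- Lost Connection
- The Cat Astronaut
- She Doesn't Remember Me
- Reflection of Time
- Night Shift Fox
- To-Do List
- Sea Monster
- Alien Worker
- Running Koala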
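Before the videos, a few words on my general approach. Each clip is only 8 seconds, and text-to-video is even more unpredictable because there is no image to pin down the overall style and subject from the start. The results are something of a gacha-style lottery and fall apart easily, so my thinking was:
- Pack as much content and as many constraints into the prompt as possible. The prompt includes, but is not limited to, the visual style and a story overview, and I also try adding descriptions for the most advanced things the model can currently do: voice-over and subtitles.
- 8 seconds is short, but there is still room for change. Since text-to-video clips are hard to continue, I want those 8 seconds to convey a feeling quickly, so in the prompt I try splitting the 8 seconds into 4 segments, with a scene change, an emotional escalation, or a twist every two seconds.
Note that not everything in these prompts can actually be realized; this is my idealized version. The prompt spells out the content for the full 8 seconds, and if the result achieves 70%-80% of it, that already counts as pretty good.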
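The prompt structure described above (title, visual style, then four two-second beats and an optional key line) can be sketched as a small template helper. This is only an illustration of the layout used by several prompts in this post: `StoryPrompt` and `build_prompt` are hypothetical names, nothing here calls Veo 3 or any Google API, and the resulting text would still be pasted into Gemini or Flow by hand.

```python
from dataclasses import dataclass
from typing import List

CLIP_SECONDS = 8  # clip length mentioned in the post
BEAT_SECONDS = 2  # the post splits each clip into two-second beats


@dataclass
class StoryPrompt:
    """Hypothetical container for the parts of one prompt."""
    title: str
    visual_style: str
    beats: List[str]    # one description per two-second beat
    key_line: str = ""  # optional spoken line or subtitle


def build_prompt(story: StoryPrompt) -> str:
    """Assemble the parts into a 'Seconds 1-2: ...' style prompt."""
    expected = CLIP_SECONDS // BEAT_SECONDS
    if len(story.beats) != expected:
        raise ValueError(f"expected {expected} beats, got {len(story.beats)}")

    parts = [
        f'Story Title: "{story.title}"',
        f"Visual Style: {story.visual_style}",
        "Story Overview:",
    ]
    for i, beat in enumerate(story.beats):
        start = i * BEAT_SECONDS + 1
        end = start + BEAT_SECONDS - 1
        parts.append(f"Seconds {start}-{end}: {beat}")
    if story.key_line:
        parts.append(f'Key Line: "{story.key_line}"')
    return "\n".join(parts)


if __name__ == "__main__":
    demo = StoryPrompt(
        title="Night Shift Fox",
        visual_style="Futuristic neon aesthetic with 80s retro tech vibes",
        beats=[
            "A fox carrying a lunchbox walks along an empty street.",
            "The fox sits on an electrical box eating while robots run past.",
            "It takes a bite of its sandwich and looks up at the virtual moon.",
            "It murmurs its line as the city lights reflect in its eyes.",
        ],
        key_line="The city doesn't sleep, so neither can I.",
    )
    print(build_prompt(demo))
```

Keeping the beats as a fixed-length list makes it obvious when a prompt tries to fit more than four changes into 8 seconds, which is where results tend to drift from the prompt.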
The Final Applause
Story Title: "The Final Applause" Visual Style: Black-and-white sketch style Rough lines outline the classical theater and the aging actor, while the robotic audience is depicted with clean geometric shapes. The stark black-and-white contrast creates a tension of coldness and solitude.
Story Overview: Seconds 1-2: The aging actor stands at the center of the theater, a spotlight illuminating his gaunt face, while the surrounding audience seats are completely empty. Seconds 3-4: The camera shifts to the audience seats, revealing rows of robots sitting neatly, expressionlessly analyzing data. Seconds 5-6: The actor bows for the curtain call, the lights flicker, and the robots rise in unison to applaud, their mechanical clapping sounding like raindrops striking metal. Seconds 7-8: The actor closes his eyes, smiling with tears streaming down, as the camera pans from above to reveal the grand yet hollow theater.
故事标题:“最后的掌声” 视觉风格:黑白素描风格,粗糙的线条勾勒出古典剧院和衰老的演员,而机械化的观众则用干净的几何形状描绘。鲜明的黑白对比营造出一种冷漠和孤独的紧张感。
故事概述:第 1-2 秒:衰老的演员站在剧院中央,聚光灯照亮了他消瘦的脸庞,周围的观众席完全空无一人。第 3-4 秒:镜头转向观众席,显示整齐坐着的机器人,面无表情地分析数据。第 5-6 秒:演员鞠躬谢幕,灯光闪烁,机器人齐声起立鼓掌,机械的掌声听起来像雨滴击打金属。第 7-8 秒:演员闭上眼睛,微笑着泪水流下,镜头从上方移动,展现出宏伟却空洞的剧院。
Lost Connection
In the pitch-black pixel city, a lonely finger points at the flickering "CONNECT" button and gently presses it. A pink data beam carries anticipation across the city, illuminating countless empty windows. On the other end, the silhouette of a waiting pixel girl is lit by the beam. Suddenly, the screen glitches, color blocks tear apart, the signal bar plummets to zero, and the world falls silent. In the darkness, a real heartbeat echoes, scattered garbled codes converge spontaneously with the rhythm, forming two small hearts gazing at each other from afar. The heartbeat stops, the pixel hearts fade, and the screen goes completely black—perhaps the connection was already successful, simply because we are still searching for each other.
漆黑的像素城市,一个孤独的手指着闪烁的“连接”按钮,轻轻按下。粉色的数据光束带着期待穿过城市,照亮无数空荡荡的窗户。在另一端,等待的像素女孩的轮廓被光束照亮。突然,屏幕出现故障,色块撕裂,信号条骤降至零,世界陷入沉寂。在黑暗中,真实的心跳回响,零散的乱码代码自发与节奏汇聚,形成两个小心脏在远处互相凝视。心跳停止,像素心脏褪去,屏幕完全黑暗——或许连接早已成功,只是因为我们仍在互相寻找。
The Cat Astronaut
"The Last Rescue" Seconds 1-2: Inside a dark wormhole, flickering lights illuminate the scene. A cat in a spacesuit floats inside the control cabin, its two paws furiously typing on the keyboard (soft paw pads tapping rapidly). Its little face is intensely focused, with the light reflecting off its helmet twisting like a nebula. Seconds 3-4: The system blares: "Wormhole collapsing! Navigation failed!" It lets out a sharp meow, flips around, and kicks the engine start button with a paw. Seconds 5-6: The ship begins to spin. The cat clings tightly to the edge of the screen, its fur bristling, eyes wide open. The camera zooms in as it declares, "I can't give up... the galaxy still needs cats." Seconds 7-8: A beam of white light envelops the entire spaceship. In the final frame, a photo appears inside its helmet: the cat basking in the sun with its owner. Subtitles emerge: "For home, chasing the last ray of light."
《最后的拯救》 第 1-2 秒:在一个黑暗的虫洞内,闪烁的灯光照亮了场景。一只穿着宇航服的猫漂浮在控制舱内,双爪在键盘上飞快地敲打(柔软的爪垫快速敲击)。它的小脸专注得很,头盔上的光线反射出如星云般扭曲的光辉。 第 3-4 秒:系统响起警报:“虫洞崩溃!导航失败!”它发出尖锐的喵叫声,翻转过来,用爪子猛踢发动机启动按钮。 第 5-6 秒:飞船开始旋转。猫紧紧抓住屏幕边缘,毛发竖起,眼睛睁得大大的。摄像机拉近,它宣告:“我不能放弃……银河系还需要猫。” 第7-8秒:一束白光笼罩整个飞船。在最后一帧中,头盔内出现一张照片:猫和主人一起享受阳光。字幕出现:“为了家,追逐最后一缕光。”
She Doesn't Remember Me
Story Title: "She Doesn't Remember Me" Visual Style: Retro Cyberpunk Under the neon-lit rainy streets, an 80s CRT-style interface flickers. Characters wear old-fashioned metallic implants, complemented by a grainy film texture and red-blue halos. Story Overview: Seconds 1-2: A man walks into an abandoned memory restoration shop on a rainy neon night, holding a chip in his hand, his face weary. Seconds 3-4: A woman's image appears on the screen, her face familiar yet devoid of emotion. He softly calls her name. Seconds 5-6: She looks at him, blinks, and coldly says, "User identification failed." Seconds 7-8: He inserts the chip into his neck, the screen abruptly goes dark, and as the sound of rain echoes, he vanishes into the street's interplay of light and shadow. Key Line: "She doesn't remember me."
故事标题:“她不记得我” 视觉风格:复古赛博朋克 在霓虹灯闪烁的雨夜街道上,80 年代 CRT 风格的界面闪烁。角色佩戴着老式金属植入物,配以颗粒感的胶卷质感和红蓝光环。
故事概述:第 1-2 秒:一个男人在雨夜的霓虹灯下走进一家废弃的记忆恢复商店,手中握着一个芯片,脸色疲惫。第 3-4 秒:屏幕上出现一个女人的影像,她的脸熟悉却毫无表情。他轻声叫出她的名字。第 5-6 秒:她看着他,眨了眨眼,冷冷地说:“用户身份验证失败。”第 7-8 秒:他将芯片插入脖子,屏幕突然变黑,随着雨声的回响,他消失在街道的光影交错中。
关键台词:“她不记得我。”
Reflection of Time
The camera focuses on a massive Time Mirror, reflecting the protagonist's youthful laughter. As her fingertips lightly touch the surface, the reflection instantly ages, the face marked with traces of time. The mirror slowly cracks, and time flows out like liquid through the fissures, seeping into reality. Eventually, the mirror shatters into a black-and-white image. The protagonist stands quietly, as the screen gradually displays the words, "Memory is the reflection of time" .
镜头聚焦在一个巨大的时间镜子上,映照出主角年轻的笑声。当她的指尖轻轻触碰镜面时,反射瞬间老去,脸上带着时间的痕迹。镜子慢慢裂开,时间像液体一样从裂缝中流出,渗入现实。最终,镜子碎成黑白影像。主角静静地站着,屏幕逐渐显示出“记忆是时间的倒影”这句话。
Night Shift Fox
"Night Shift Fox" Visual Style: Futuristic Neon Aesthetic + 80s Retro Tech Vibes The city at night is interwoven with purple, blue, and red lights, with reflective, glimmering streets. The fox wears a tailored suit, its tail sweeping light trails across the ground. The overall scene feels sci-fi yet detailed and realistic, with cold, striking colors and a composition full of tension.
Story Summary: Seconds 1-2: The camera tilts down from an overpass, showing a fox carrying a lunchbox walking along an empty street. Behind it, neon advertisements flash wildly with the slogan "Efficiency Above All." Seconds 3-4: The fox sits on a street corner electrical box eating, surrounded by AI courier rabbits and robotic security dogs running past, with no one stopping to notice it. Seconds 5-6: It takes a bite of its sandwich, oil glistening at the corner of its mouth, then looks up at the virtual moon, pausing in silence. Seconds 7-8: It murmurs, "The city doesn’t sleep, so neither can I." The lights reflect in its eyes, faintly bright and slightly wet.
Key Line: "The city doesn’t sleep, so neither can I."
《夜班狐狸》 视觉风格:未来感霓虹美学 + 80 年代复古科技氛围 夜晚的城市交织着紫色、蓝色和红色的灯光,反射出闪烁的街道。狐狸穿着量身定制的西装,尾巴在地面上扫出光轨。整体场景感觉科幻而细致现实,冷峻而鲜明的色彩,构图充满张力。
故事摘要:第 1-2 秒:镜头从高架桥倾斜下来,显示一只拿着午餐盒的狐狸沿着空荡荡的街道走。它身后,霓虹广告疯狂闪烁,标语是“效率至上”。第 3-4 秒:狐狸坐在街角的电箱上吃东西,周围是跑过的 AI 快递兔和机器人警犬,没有人停下来注意它。第 5-6 秒:它咬了一口三明治,嘴角油光闪闪,然后抬头看向虚拟的月亮,静默片刻。第 7-8 秒:它低声说道:“城市不眠,我也无法入眠。”灯光在它的眼中反射,微微明亮且略显湿润。
关键台词:“城市不眠,我也无法入眠。”
To-Do List
"To-Do List" Visual Style: Paper Craft Animation Style All elements appear as if crafted from real handmade paper, cut, folded, and collaged: characters are silhouette collages, the task list is a tearable sticky note, and the background uses textured paper to create a timeline, alarm clock, and calendar imagery. The camera slowly zooms in, with each frame resembling a framed artwork.
Story Overview: Seconds 1-2: A "Today's To-Do" list made of paper pieces gently falls onto the desk, densely packed with tasks like "Reply to Emails," "Meeting Recap," and "Health Check-In." Seconds 3-4: A paper silhouette character (the protagonist) busily moves around, tearing off one task at a time from the list with increasing speed as tasks are completed. Seconds 5-6: The last sticky note reads "Breathe." The paper figure pauses, hesitating as they look at it. Seconds 7-8: They gently tear off the "Breathe" note but instead of placing it in the completed tasks pile, they stick it to their chest and close their eyes. The entire screen freezes into a textured cover card.
Key Phrases (Text on Paper): Second 6: "Breathe" Second 8: "This, too, is worth completing." (Appears embossed on the cover)
《待办事项列表》视觉风格:纸艺动画风格 所有元素看起来仿佛是用真实的手工纸制作而成,经过剪裁、折叠和拼贴:角色是剪影拼贴,任务列表是可撕的便签,背景使用纹理纸创建时间线、闹钟和日历图像。摄像机缓慢拉近,每一帧都像是一幅装裱好的艺术作品。
故事概述:第 1-2 秒:一份由纸片制成的“今日待办”列表轻轻落在桌子上,密密麻麻地列出“回复邮件”、“会议总结”和“健康检查”等任务。第 3-4 秒:一个纸质剪影角色(主角)忙碌地四处走动,随着任务的完成,越来越快地从列表上撕下一个任务。第 5-6 秒:最后一张便签上写着“呼吸”。纸质角色停下,犹豫地看着它。第 7-8 秒:他们轻轻撕下“呼吸”的便签,但并没有将其放入完成的任务堆,而是贴在自己的胸前,闭上眼睛。整个画面冻结成一个有纹理的封面卡片。
关键短语(纸上的文字):第 6 秒:“呼吸” 第 8 秒:“这也是值得完成的。”(以压印形式出现在封面上)
Sea Monster
"Sea Monster" Visual Style: Hyper-realistic CG + Low-angle handheld perspective + Strong backlit composition The camera remains in a low-angle shot throughout, with extended focal length to emphasize the "endless height of the monster," akin to the "divine fear" depicted in works like *Godzilla*, *Snowpiercer*, and *The Mountain Giant*. Story Synopsis: Seconds 1–2: The torrential rain has just stopped, the night sky looms heavily. A crew member looks up at the distant horizon; the sea surface seems to bulge upward. The shot slowly tilts up from behind him—something begins to rise from the water. Seconds 3–4: The sea monster fully stands up from the ocean. Its fin bones, rock-like armor plates, flickering deep-sea luminescent spots, and partially translucent biological tissues are revealed under the moonlight. Seconds 5–6: The camera pulls back into a low-angle wide shot. The crew member appears as small as a sesame seed. He gasps and stumbles backward, muttering, "It stood up... It really stood up..." His voice begins to crack. Seconds 7–8: The monster's head finally emerges from the sea, its massive form nearly blotting out the sky. A partially folded wing unfurls, stirring up waves. The camera shakes violently amidst water vapor and glowing specks. The screen cuts to black just after a split-second overexposed flash, accompanied by the crew member's scream. Key Dialogue (whispering in terror): "It stood up... It really stood up..." Cinematography: Opening with a low-angle wide shot → Slowly pushing in closer → Mid-section shifts to a upward view of the full body → Ending with intense shaking + overexposed white flash + black screen
《海中巨兽》视觉风格:超现实 CG + 低角度手持视角 + 强烈逆光构图
摄像机始终保持低角度拍摄,使用长焦距来强调“怪物的无尽高度”,类似于在《哥斯拉》、《雪国列车》和《山巨人》等作品中描绘的“神圣恐惧”。
故事概述:第 1-2 秒:倾盆大雨刚刚停止,夜空显得沉重。一名船员抬头望向远方的地平线;海面似乎向上隆起。镜头从他身后缓缓向上倾斜——某种东西开始从水中升起。
第 3-4 秒:海怪完全从海洋中站起。它的鳍骨、岩石般的铠甲、闪烁的深海发光点和部分透明的生物组织在月光下显露无遗。
第 5-6 秒:摄像机拉回到低角度宽镜头。船员显得小得如同芝麻。他喘息着向后退,喃喃自语:“它站起来了……它真的站起来了……”他的声音开始颤抖。
第 7-8 秒:怪物的头终于从海中浮现,它庞大的身形几乎遮住了天空。一只部分折叠的翅膀展开,掀起波浪。摄像机在水蒸气和发光颗粒中剧烈晃动。画面在瞬间曝光的闪光后切换到黑屏,伴随着船员的尖叫。
关键对话(恐惧低语):“它站起来了……它真的站起来了……”
摄影:以低角度宽镜头开场 → 缓慢推近 → 中段转为俯视全身视角 → 以剧烈晃动 + 过度曝光的白光 + 黑屏结束。
Alien Worker
"Alien Worker" Visual Style: Handheld interview-style camera + defocused zoom effect + cartoonish anthropomorphic aliens (soft, round heads with big eyes). Natural street lighting, aliens resembling a blend of Pixar and POP MART designs, dressed in Earth delivery uniforms or carrying food delivery backpacks. The overall visuals are realistic, but the characters feel "deliberately out of place." Story Summary (8-second structure): Seconds 1-2: The camera focuses on the street. A young reporter asks, "Which planet are you from?" The frame shakes slightly as the camera pans to the alien. Seconds 3-4: The alien, visibly exhausted and holding a drink bag, responds, "Sg’r’bl... On our planet, we only work 2 hours a day." Seconds 5-6: Drooping its antennae, the alien sighs, "I just planned to do a temp job on Earth... but now rent, utilities, social security... it's too much." Seconds 7-8: The alien gazes at the sky, mumbling softly, "I wanna go home... I miss my mom's plasma soup..." Background text appears: "Even aliens struggle with labor." Key Lines (adorable anthropomorphic tone): "Sg’r’bl... On our planet, we only work 2 hours a day."
视觉风格:手持采访风格的摄像机 + 虚焦变焦效果 + 卡通化的人形外星人(柔软、圆润的头部和大眼睛)。自然的街道光线,外星人看起来像是皮克斯和 POP MART 设计的结合,穿着地球的快递制服或背着外卖背包。整体视觉效果逼真,但角色感觉“故意不合时宜”。
故事摘要(8 秒结构):第 1-2 秒:摄像机聚焦在街道上。一名年轻记者问:“你来自哪个星球?”镜头稍微晃动,摄像机转向外星人。第 3-4 秒:外星人明显疲惫,手里拿着饮料袋,回答:“Sg’r’bl……在我们星球上,我们每天只工作2小时。”第5-6秒:外星人垂下触角,叹息道:“我本来打算在地球做个临时工……但是现在房租、水电、社会保障……太多了。”第7-8秒:外星人凝视着天空,轻声嘟囔:“我想回家……我想念我妈的等离子汤……”背景文字出现:“即使外星人也在为劳动而挣扎。”
关键台词(可爱的拟人化语气):“Sg’r’bl……在我们星球上,我们每天只工作2小时。”
Running Koala
"Running Koala" Visual Style: Hyper-realistic 3D + Handheld Cinematic Feel
The visuals feature intense camera shakes, rapid focus shifts, and a backdrop of a volcanic eruption with interwoven red, black, and gray tones. The koala’s fur is damp and muddy, its eyes filled with fear and struggle. The overall color palette is dominated by fiery orange-red glows and deep gray ash, reminiscent of "The Revenant" + "Dante's Peak."
Story Overview: Seconds 1-2: Amid shaky camera movements, the focus locks on a koala sprinting along the edge of a volcanic forest. Behind it, the ground cracks open, spewing magma, with the erupting volcano looming in the distance. Seconds 3-4: The camera rapidly zooms in for a close-up of the koala’s face—it glances back, its eyes a mix of terror and defiance. Ash drifts down from the sky as the camera shifts and blurs with its frantic movement. Seconds 5-6: Suddenly, from a low-angle shot, a burning tree trunk crashes down behind the koala, the firelight illuminating its silhouette in red. Seconds 7-8: The koala bursts out of the forest edge, leaping toward the camera. The screen goes black upon impact, and text appears: "Will you run toward hope, or into the flames?"
Key Dialogue (Subtitle): "Will you run toward hope, or into the flames?"
《狂奔的考拉》 视觉风格:超现实主义 3D + 手持电影感
视觉效果:包含剧烈的相机抖动、快速的焦点切换,以及火山喷发的背景,交织着红色、黑色和灰色的色调。考拉的毛发湿漉漉且泥泞,眼中充满恐惧与挣扎。整体色彩调色板以炽热的橙红色光辉和深灰色火山灰为主,让人联想到《荒野猎人》和《但丁峰》。
故事概述:第 1-2 秒:在抖动的镜头运动中,焦点锁定在一只沿着火山森林边缘奔跑的考拉身上。它身后,地面裂开,喷出岩浆,远处火山正在喷发。第 3-4 秒:镜头迅速放大,特写考拉的面孔——它回头一瞥,眼中充满恐惧与反抗的混合情绪。灰烬从天空飘落,镜头随着它的疯狂移动而变得模糊。第5-6秒:突然,从低角度镜头拍摄,一根燃烧的树干在考拉身后坠落,火光将它的轮廓照亮成红色。第7-8秒:考拉从森林边缘冲出,向镜头跃去。撞击时画面变黑,出现文字:“你会奔向希望,还是走进火焰?”
关键对话(字幕):“你会奔向希望,还是走进火焰?”
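After trying this batch of Veo 3 text-to-video clips, I have to say it really is excellent. On video quality, the image is sharp, and general physics and motion are simulated naturally and smoothly. With audio generation and lip sync, the videos feel far more real and immersive. It also understands complex prompts fairly well and handles scene changes reasonably; even without a complex prompt, left to its own devices it can cut between several shots within 8 seconds. In short, on text-to-video alone it is already next level, and I hope other AI video generation tools catch up soon (and bring the price down, please).
Of course, there are still quite a few problems.
- Detail issues. Odd sound effects occasionally appear; once there is more than one subject, voices can be matched to the wrong character; and physical dynamics are still imperfect, with clipping, deformation, and weaknesses in complex body movement, facial micro-expressions, and emotional expression. Dynamic realism still has room to improve, and the model often gets directions confused.
- Complex prompts are understood imperfectly. When a prompt is long and involves shot changes, the result may not match what the prompt asked for.
- Control over the details of relatively grand scenes still needs improvement.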
OK, that's all for today's videos. Which one is your favorite? Looking forward to chatting in the comments!
This article reflects the author's own views and does not represent the position of 优设网. Do not reproduce without permission.