Wan 2.6 Text & Image to Video camera movement example: studio push-in
This Wan 2.6 Text & Image to Video example uses the text-to-video workflow to render a studio unboxing shot with a slow push-in. It highlights audio-enabled output and camera motion control in a 10-second, 16:9, 720p render.
Prompt breakdown
Text-to-video prompt used to generate this render.
Subject
Wide 16:9 full-body unboxing video in a clean studio/kitchen setting. A person is fully visible (head-to-toe or at least head-to-knees) standing behind a minimalist tabletop. They unbox a small generic gadget from a pla…
Workflow
Text to video
Camera
Push In
Output
10s · 16:9 · 720p
Audio
Enabled
Constraints
Text To Video, Audio Enabled, Push In
Full prompt
Wide 16:9 full-body unboxing video in a clean studio/kitchen setting. A person is fully visible (head-to-toe or at least head-to-knees) standing behind a minimalist tabletop. They unbox a small generic gadget from a plain matte cardboard box: peel the seal, open the lid, remove the inner tray, take out the device and accessories, and lay everything neatly on the table. The person occasionally lifts the item toward the camera for a closer look, then places it back down. Realism requirements: natural body proportions, stable identity, realistic skin and clothing fabric, no face warping, no unnatural limb bending. Hands must be highly realistic: correct finger count, natural grip, believable pressure/contact with the box and device, consistent shadows, no extra fingers, no “floating” objects. Keep object geometry stable, no wobbling background, minimal temporal flicker. Camera: single continuous shot, tripod-stable, slight cinematic push-in (very slow), eye-level or slightly above table height. Natural soft daylight, clean shadows, realistic materials and textures. No logos, no brand names, no watermarks. No subtitles. Optional on-screen title at the top (perfectly readable and stable, no jitter): "UNBOXING — FIRST LOOK"
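The settings listed in the prompt breakdown (text to video, push-in camera, 10s, 16:9, 720p, audio enabled) can be captured as structured data, which makes it easy to derive the pixel dimensions of the render. The sketch below is illustrative only: the `RenderSettings` class, its field names, and the string labels are hypothetical and do not reflect any actual Wan 2.6 API. It also assumes that "720p" refers to the frame height, which is the usual convention for landscape output.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RenderSettings:
    """Hypothetical container for the values shown in the prompt breakdown."""
    workflow: str      # e.g. "text_to_video" (label invented for this sketch)
    camera: str        # e.g. "push_in" (label invented for this sketch)
    duration_s: int    # clip length in seconds
    aspect_ratio: str  # "W:H", e.g. "16:9"
    resolution: str    # "NNNp"; assumed here to mean frame height in pixels
    audio: bool        # whether audio output is enabled

    def frame_size(self) -> tuple[int, int]:
        """Derive (width, height) in pixels from the aspect ratio and height."""
        ratio_w, ratio_h = (int(part) for part in self.aspect_ratio.split(":"))
        height = int(self.resolution.rstrip("p"))
        width = height * ratio_w // ratio_h
        return width, height


# Values taken from the breakdown on this page.
settings = RenderSettings(
    workflow="text_to_video",
    camera="push_in",
    duration_s=10,
    aspect_ratio="16:9",
    resolution="720p",
    audio=True,
)

print(settings.frame_size())  # (1280, 720)
print(asdict(settings))
```

Under these assumptions, a 16:9 render at 720p works out to 1280 × 720 pixels. The prompt text itself would be passed alongside these settings, unchanged.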
Why Wan 2.6 Text & Image to Video fits this shot
Wan 2.6 combines text-to-video, image-to-video, and reference-to-video in one card, with multi-shot prompting and 720p/1080p output tiers.
Text prompts
Image input
Reference video
Key frames



Related examples
Wan 2.6 rainy neon thriller sequence example
This Wan 2.6 page shows a rainy neon thriller prompt with multi-shot direction, smooth camera work and audio-enabled pacing.
Kling 3 Pro Mars terraforming sci-fi video example
This Kling 3 Pro watch page shows a 16:9 text-to-video sci-fi terraforming scene on Mars, using a dolly-in, slow orbit and dramatic red-to-green landscape transformation.
Wan 2.5 vertical spy-to-Zoom comedy video example
This Wan 2.5 watch page shows a vertical comedy prompt that opens like a spy action scene and ends with a Zoom-call reveal.
Kling 3 Pro neon street multi-shot example
This Kling 3 Pro example demonstrates a multi-shot neon street sequence with rain reflections, native audio and structured scene anchors.