Wan 2.5 Text & Image to Video
Audio enabled

Wan 2.5 Text & Image to Video audio-enabled video example: city camera move

This Wan 2.5 Text & Image to Video text-to-video example shows a city camera move. It highlights audio-enabled output in a 10-second, 9:16 clip.

Wan 2.5 Text & Image to Video · Text to video · 10s · 9:16 · Audio enabled · $0.65

Prompt breakdown

Text-to-video prompt used to generate this render.

Ultra-realistic walking selfie shot filmed with a smartphone held in one hand. The person is speed-walking through a busy urban street in daylight. Camera movement is dynamic: fast steps, sudden micro-shakes, quick tilt…

Subject

Ultra-realistic walking selfie shot filmed with a smartphone held in one hand. The person is speed-walking through a busy urban street in daylight. Camera movement is dynamic: fast steps, sudden micro-shakes, quick tilt…

Workflow

Text to video

Camera

Audio Enabled

Output

10s · 9:16

Audio

Enabled

Constraints

Text To Video, Audio Enabled, Camera Move

Full prompt

Ultra-realistic walking selfie shot filmed with a smartphone held in one hand. The person is speed-walking through a busy urban street in daylight. Camera movement is dynamic: fast steps, sudden micro-shakes, quick tilts as the person avoids people and obstacles. Natural motion blur, realistic stabilization drift, shifting sunlight and shadows on their face. High-detail skin texture, real reflections in the eyes. The person speaks extremely fast, slightly out of breath, trying to explain something urgently while walking. Lip-sync must perfectly match the following rapid line: “Okay listen, I don’t have much time but everything’s happening way faster than I expected and I swear I’ll explain everything once I get there!” Audio: realistic city ambience (footsteps, passing cars, faint horns), wind hitting the phone mic, breath sounds, occasional clothing rustle. Keep the phone-mic quality: compressed, slightly distorted on loud peaks. Mood: energetic, chaotic, spontaneous. No filters, no beautification. Keep it raw and real.

Why Wan 2.5 Text & Image to Video fits this shot

Wan 2.5 handles 5- or 10-second clips with optional background audio, plus prompt expansion when you need extra detail.

Audio option

5s or 10s

480p–1080p
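
The limits listed above can be checked before a render is submitted. The following is a minimal sketch only: `RenderSettings` and its field names are hypothetical and do not correspond to any documented Wan 2.5 API, the 720p default is an arbitrary value inside the stated range, and the prompt string is a shortened placeholder. The duration choices, the resolution bounds, and the optional audio flag are the only parts taken from this page.

```python
from dataclasses import dataclass

# Limits taken from this page: "5s or 10s" and "480p–1080p".
ALLOWED_DURATIONS_S = (5, 10)
MIN_HEIGHT_P, MAX_HEIGHT_P = 480, 1080


@dataclass(frozen=True)
class RenderSettings:
    """Hypothetical settings container; field names are illustrative only."""

    prompt: str
    duration_s: int = 5
    height_p: int = 720          # assumed default, not stated on the page
    aspect_ratio: str = "9:16"   # the ratio used by this example
    audio: bool = False          # audio is optional

    def __post_init__(self):
        # Reject settings that fall outside the limits listed on the page.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.duration_s not in ALLOWED_DURATIONS_S:
            raise ValueError(
                f"duration_s must be one of {ALLOWED_DURATIONS_S}, got {self.duration_s}"
            )
        if not MIN_HEIGHT_P <= self.height_p <= MAX_HEIGHT_P:
            raise ValueError(
                f"height_p must be between {MIN_HEIGHT_P} and {MAX_HEIGHT_P}, got {self.height_p}"
            )


# Settings matching this example's duration, ratio, and audio choice.
settings = RenderSettings(
    prompt="Walking selfie shot through a busy urban street",  # placeholder prompt
    duration_s=10,
    audio=True,
)
```

Validating in `__post_init__` means an unsupported duration such as 7 seconds fails when the settings object is built rather than after a render has been queued.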

Key frames

Opening frame
Motion beat
Final shot
