Wan2 Video Generation

This API supports multiple Alibaba Tongyi Wanxiang (Wan2) video generation models. Please refer to Alibaba Cloud’s official documentation for more details.

Overview

Generate videos from text prompts or images using various Wan2 models, each optimized for different use cases.

Supported Models

Model	Description	Audio	First Frame	First & Last Frame	Resolution	Duration	FPS	Format
wan2.5-t2v-preview	Text-to-video with auto sound or custom audio	✅	❌	❌	480P/720P/1080P	5s/10s	24fps	MP4
wan2.5-i2v-preview	Image-to-video with auto sound or custom audio	✅	✅	❌	480P/720P/1080P	5s/10s	24fps	MP4
wan2.2-i2v-flash	Fast version, 50% speed improvement	❌	✅	❌	480P/720P/1080P	5s	30fps	MP4
wan2.2-i2v-plus	Professional version, enhanced stability	❌	✅	❌	480P/1080P	5s	30fps	MP4
wan2.1-vace-plus	Multi-modal support, video editing	❌	✅	✅	720P	5s	30fps	MP4
wan2.1-kf2v-plus	First & last frame (keyframe-to-video)	❌	❌	✅	720P	5s	30fps	MP4
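
The table above can be encoded as a lookup for client-side checks before a request is submitted. The following is a minimal sketch, not part of the API: `ModelCaps`, `MODEL_CAPS`, and `check_request` are hypothetical names, and the payload keys (`image`, `last_frame`, `audio`, `audio_url`, `duration`) are the ones used in the example requests in this document.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ModelCaps:
    audio: bool
    first_frame: bool
    first_last_frame: bool
    resolutions: Tuple[str, ...]
    durations: Tuple[int, ...]
    fps: int


# Values transcribed from the Supported Models table.
MODEL_CAPS = {
    "wan2.5-t2v-preview": ModelCaps(True, False, False, ("480P", "720P", "1080P"), (5, 10), 24),
    "wan2.5-i2v-preview": ModelCaps(True, True, False, ("480P", "720P", "1080P"), (5, 10), 24),
    "wan2.2-i2v-flash": ModelCaps(False, True, False, ("480P", "720P", "1080P"), (5,), 30),
    "wan2.2-i2v-plus": ModelCaps(False, True, False, ("480P", "1080P"), (5,), 30),
    "wan2.1-vace-plus": ModelCaps(False, True, True, ("720P",), (5,), 30),
    "wan2.1-kf2v-plus": ModelCaps(False, False, True, ("720P",), (5,), 30),
}


def check_request(payload: dict) -> List[str]:
    """Return the mismatches between a payload and the table (empty list if none)."""
    caps = MODEL_CAPS.get(payload.get("model"))
    if caps is None:
        return ["unknown model: %r" % payload.get("model")]
    problems = []
    if "last_frame" in payload:
        if not caps.first_last_frame:
            problems.append("model does not support first & last frame input")
    elif "image" in payload and not caps.first_frame:
        problems.append("model does not support first-frame input")
    if (payload.get("audio") or payload.get("audio_url")) and not caps.audio:
        problems.append("model does not support audio")
    if "duration" in payload and payload["duration"] not in caps.durations:
        problems.append("unsupported duration: %r" % payload["duration"])
    return problems
```

A check like this only mirrors the table; the server remains the authority on what a model accepts.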

Resolution Options

480P

832×480 (16:9)
480×832 (9:16)
624×624 (1:1)

720P

1280×720 (16:9)
720×1280 (9:16)
960×960 (1:1)
1088×832 (4:3)
832×1088 (3:4)

1080P

1920×1080 (16:9)
1080×1920 (9:16)
1440×1440 (1:1)
1632×1248 (4:3)
1248×1632 (3:4)
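
The tiers above map naturally onto a small lookup from tier and aspect ratio to the `width*height` string that the `size` parameter expects. This is a sketch only; `SIZES`, `size_for`, and `parse_size` are hypothetical helper names.

```python
from typing import Tuple

# Tier -> aspect ratio -> size string, transcribed from the lists above.
SIZES = {
    "480P": {"16:9": "832*480", "9:16": "480*832", "1:1": "624*624"},
    "720P": {
        "16:9": "1280*720", "9:16": "720*1280", "1:1": "960*960",
        "4:3": "1088*832", "3:4": "832*1088",
    },
    "1080P": {
        "16:9": "1920*1080", "9:16": "1080*1920", "1:1": "1440*1440",
        "4:3": "1632*1248", "3:4": "1248*1632",
    },
}


def size_for(tier: str, aspect: str) -> str:
    """Look up the size string for a tier and aspect ratio."""
    try:
        return SIZES[tier][aspect]
    except KeyError:
        raise ValueError("no %s size listed for tier %s" % (aspect, tier)) from None


def parse_size(size: str) -> Tuple[int, int]:
    """Split a 'width*height' string into integer width and height."""
    width, height = size.split("*")
    return int(width), int(height)
```

Several of the listed ratios are nominal rather than exact: 832/480 is about 1.73, not 16/9 (about 1.78), and 1088/832 is about 1.31, not 4/3.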

Audio Features (Wan2.5 only)

Auto-generated Audio

Enabled by default for wan2.5-t2v-preview and wan2.5-i2v-preview
Automatically generates synchronized audio based on video content

Custom Audio

Supported formats: WAV, MP3
Duration: 3-30 seconds
Max file size: 15MB
Behavior: If the audio is shorter than the video, the remainder of the video is silent; if it is longer, the audio is truncated to the video length
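
The custom audio limits can be checked locally before upload, and the shorter/longer behavior can be worked out in advance. This is a minimal sketch with hypothetical helper names (`check_audio`, `audio_coverage`). It assumes the caller already knows the file's duration and size, and it interprets "15MB" as 15 × 1024 × 1024 bytes, which the source does not specify.

```python
import os
from typing import List, Tuple

ALLOWED_AUDIO_EXT = {".wav", ".mp3"}
MIN_AUDIO_S, MAX_AUDIO_S = 3, 30
MAX_AUDIO_BYTES = 15 * 1024 * 1024  # assumption: binary megabytes


def check_audio(filename: str, duration_s: float, size_bytes: int) -> List[str]:
    """Return the documented limits that the audio file violates."""
    problems = []
    if os.path.splitext(filename)[1].lower() not in ALLOWED_AUDIO_EXT:
        problems.append("format must be WAV or MP3")
    if not MIN_AUDIO_S <= duration_s <= MAX_AUDIO_S:
        problems.append("duration must be 3-30 seconds")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("file must be at most 15MB")
    return problems


def audio_coverage(audio_s: float, video_s: float) -> Tuple[float, float]:
    """Return (seconds of audio used, seconds of silent tail) for a video."""
    used = min(audio_s, video_s)
    silent_tail = max(video_s - audio_s, 0)
    return used, silent_tail
```

For example, a 3-second clip on a 5-second video leaves a 2-second silent tail, while a 30-second clip on a 10-second video uses only its first 10 seconds.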

Example Requests

Text-to-Video (wan2.5-t2v-preview)

{
  "model": "wan2.5-t2v-preview",
  "prompt": "A small cat running on a grassy field in the moonlight",
  "size": "1920*1080",
  "duration": 10,
  "audio": true
}

Image-to-Video (wan2.5-i2v-preview)

{
  "model": "wan2.5-i2v-preview",
  "prompt": "The cat starts running forward",
  "image": "https://example.com/cat.jpg",
  "size": "1280*720",
  "duration": 5
}

Fast Generation (wan2.2-i2v-flash)

{
  "model": "wan2.2-i2v-flash",
  "prompt": "Gentle motion, camera slowly pans right",
  "image": "data:image/jpeg;base64,/9j/4AAQSkZJRg...",
  "size": "1920*1080"
}

Keyframe Interpolation (wan2.1-kf2v-plus)

{
  "model": "wan2.1-kf2v-plus",
  "prompt": "A black cat looks curiously at the sky, camera gradually rises from eye-level to overhead",
  "image": "https://example.com/first_frame.jpg",
  "last_frame": "https://example.com/last_frame.jpg",
  "size": "1280*720"
}

With Custom Audio (wan2.5-t2v-preview)

{
  "model": "wan2.5-t2v-preview",
  "prompt": "A person walking through a forest, birds chirping",
  "audio_url": "https://example.com/forest_sounds.mp3",
  "size": "1920*1080",
  "duration": 10
}

With Video Effect Template (wan2.1)

{
  "model": "wan2.1-vace-plus",
  "prompt": "Magical levitation effect",
  "image": "https://example.com/subject.jpg",
  "template": "flying",
  "size": "1280*720"
}

Image Requirements (for i2v and kf2v models)

Property	Requirement
Formats	JPEG, JPG, PNG (no transparency), BMP, WEBP
Dimensions	[360, 2000] pixels for both width and height
File Size	Max 10MB
Input	Public URL or Base64 encoded data
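
The image constraints and the Base64 input option can be sketched as follows. `check_image` and `to_data_uri` are hypothetical helpers. The `data:<mime>;base64,<data>` form follows the wan2.2-i2v-flash example above, and the MIME types chosen for BMP and WEBP, as well as reading "10MB" as 10 × 1024 × 1024 bytes, are assumptions.

```python
import base64
from typing import List

MIME_BY_EXT = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".bmp": "image/bmp",    # assumption: MIME type accepted by the API
    ".webp": "image/webp",  # assumption: MIME type accepted by the API
}
MIN_SIDE, MAX_SIDE = 360, 2000
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: binary megabytes


def check_image(width: int, height: int, size_bytes: int) -> List[str]:
    """Return the documented image limits that are violated."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            problems.append("%s must be in [360, 2000] pixels" % name)
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file must be at most 10MB")
    return problems


def to_data_uri(data: bytes, ext: str) -> str:
    """Encode raw image bytes as a Base64 data URI for the image field."""
    mime = MIME_BY_EXT[ext.lower()]
    encoded = base64.b64encode(data).decode("ascii")
    return "data:%s;base64,%s" % (mime, encoded)
```

Base64 inflates the payload by roughly one third, so a public URL is usually the lighter option for large images.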

Parameters

size vs resolution

Text-to-video models take a size parameter with exact pixel dimensions (e.g., "1920*1080")
Image-to-video models may instead take a resolution parameter that names a quality tier (e.g., "1080P")

The model automatically scales or matches the aspect ratio based on input

duration

Available options depend on model:

wan2.5: 5 or 10 seconds
wan2.2: 5 seconds (fixed)
wan2.1: 3, 4, or 5 seconds (varies by model)

prompt_extend

Default: true
Effect: Uses AI to enhance short prompts
Trade-off: Better results but increases processing time

Prompt Tips

For best results when describing motion:

Specify camera movement (pan left, zoom in, dolly shot)
Describe subject motion (walks forward, turns around)
Include environment details (windy, foggy, sunlit)
For keyframe interpolation, describe the transition between frames

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
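
As an illustration of the header format, a request object can be assembled with Python's standard library. This is a sketch under stated assumptions: `API_URL` is a placeholder rather than the real endpoint (this page does not give one), and `build_request` is a hypothetical helper.

```python
import json
import urllib.request

# Placeholder only: substitute the endpoint from your provider's documentation.
API_URL = "https://api.example.com/video/generation"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying the Bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer %s" % token,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the headers easy to inspect and keeps the token out of logging code paths.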

Body

application/json

Select the Wan2 video generation model you want to call. Each model exposes a tailored parameter set.

model

enum<string>

required

Fixed model name.

Available options:

wan2.5-t2v-preview

prompt

string

required

Text description for the desired video content (max 2000 characters).

Maximum string length: 2000

negative_prompt

string

Negative prompt describing unwanted content (max 500 characters).

Maximum string length: 500

audio

boolean | null

default:true

Enable automatic audio generation. Set to false to force a silent output.

audio_url

string<uri> | null

Custom audio file URL (wav/mp3, 3-30s, ≤15MB). Overrides the audio flag.

size

enum<string>

default:1280*720

Output resolution ("width*height"). Supported tiers:

480P: 832*480 (16:9), 480*832 (9:16), 624*624 (1:1)
720P: 1280*720 (16:9), 720*1280 (9:16), 960*960 (1:1), 1088*832 (4:3), 832*1088 (3:4)
1080P: 1920*1080 (16:9), 1080*1920 (9:16), 1440*1440 (1:1), 1632*1248 (4:3), 1248*1632 (3:4)

Available options: 832*480, 480*832, 624*624, 1280*720, 720*1280, 960*960, 1088*832, 832*1088, 1920*1080, 1080*1920, 1440*1440, 1632*1248, 1248*1632

duration

enum<integer>

Video duration in seconds (24 fps). Supported values: 5 or 10.

Available options: 5, 10

prompt_extend

boolean

default:true

Enable intelligent prompt rewriting (slightly longer latency, better detail).

seed

integer

Random seed [0, 2147483647].

Required range: 0 <= x <= 2147483647
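
The body constraints listed above for wan2.5-t2v-preview can be enforced before the request leaves the client. The following is a minimal sketch; `build_t2v_payload` is a hypothetical helper, and optional fields are simply omitted when unset so that the documented server-side defaults apply.

```python
T2V_SIZES = frozenset([
    "832*480", "480*832", "624*624",
    "1280*720", "720*1280", "960*960", "1088*832", "832*1088",
    "1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632",
])
T2V_DURATIONS = (5, 10)
MAX_SEED = 2147483647


def build_t2v_payload(prompt, negative_prompt=None, size=None, duration=None,
                      audio=None, audio_url=None, prompt_extend=None, seed=None):
    """Validate arguments against the documented limits and return the JSON body."""
    if len(prompt) > 2000:
        raise ValueError("prompt exceeds 2000 characters")
    if negative_prompt is not None and len(negative_prompt) > 500:
        raise ValueError("negative_prompt exceeds 500 characters")
    if size is not None and size not in T2V_SIZES:
        raise ValueError("unsupported size: %r" % size)
    if duration is not None and duration not in T2V_DURATIONS:
        raise ValueError("duration must be 5 or 10")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        raise ValueError("seed must be in [0, 2147483647]")

    payload = {"model": "wan2.5-t2v-preview", "prompt": prompt}
    optional = {
        "negative_prompt": negative_prompt,
        "size": size,
        "duration": duration,
        "audio": audio,
        "audio_url": audio_url,
        "prompt_extend": prompt_extend,
        "seed": seed,
    }
    # Only send fields the caller set explicitly.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Fixing `seed` is the usual way to make repeated runs comparable while other parameters are varied, though the source does not state how deterministic the output is.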

Response

202 - application/json

Accepted - Task created successfully

task_info

object

task_info.id

string<uuid>

required

UUID of the task

task_info.status

enum<string>

required

Task status

Available options: pending, processing, completed, failed

task_info.created_at

string<date-time>

required

Task creation timestamp (ISO 8601)

task_info.updated_at

string<date-time>

required

Task last update timestamp (ISO 8601)

task_info.error

object

Error details (only when status is failed)

task_info.error.code

integer

MuleRouter error code

task_info.error.title

string

MuleRouter error title

task_info.error.detail

string

MuleRouter error detail
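
The response fields above can be parsed into a typed object so that callers can tell when a task has finished. This is a sketch that assumes the 202 body is a JSON object with a top-level `task_info` key, as the field names suggest; `TaskInfo` and `parse_task_info` are hypothetical names.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

TERMINAL_STATUSES = {"completed", "failed"}
KNOWN_STATUSES = {"pending", "processing"} | TERMINAL_STATUSES


@dataclass
class TaskInfo:
    id: uuid.UUID
    status: str
    created_at: datetime
    updated_at: datetime
    error: Optional[dict] = None

    @property
    def is_terminal(self) -> bool:
        """True once the task has completed or failed."""
        return self.status in TERMINAL_STATUSES


def _parse_time(value: str) -> datetime:
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_task_info(body: dict) -> TaskInfo:
    """Convert a decoded response body into a TaskInfo."""
    info = body["task_info"]
    if info["status"] not in KNOWN_STATUSES:
        raise ValueError("unexpected status: %r" % info["status"])
    return TaskInfo(
        id=uuid.UUID(info["id"]),
        status=info["status"],
        created_at=_parse_time(info["created_at"]),
        updated_at=_parse_time(info["updated_at"]),
        error=info.get("error"),
    )
```

Because the request is accepted asynchronously, a client would typically re-fetch the task until `is_terminal` is true and inspect `error` when the status is failed; how the task is re-fetched is not described on this page.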
