POST /vendors/alibaba/v1/wan2.5-t2v-preview/generation
Create Generation Task
curl --request POST \
  --url https://api.mulerouter.ai/vendors/alibaba/v1/wan2.5-t2v-preview/generation \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "<string>",
  "negative_prompt": "<string>",
  "audio": true,
  "audio_url": "<string>",
  "size": "1280*720",
  "duration": 5,
  "prompt_extend": true,
  "seed": 1073741823
}
'
{
  "task_info": {
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "status": "pending",
    "created_at": "2023-11-07T05:31:56Z",
    "updated_at": "2023-11-07T05:31:56Z"
  }
}
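The same request can be issued from Python. The following is a minimal sketch using only the standard library; the function names `build_generation_request` and `create_generation_task` are illustrative and not part of any official SDK, and error handling is omitted.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.mulerouter.ai/vendors/alibaba/v1/wan2.5-t2v-preview/generation"


def build_generation_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a generation task."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def create_generation_task(token: str, payload: dict) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_generation_request(token, payload)
    # urlopen treats any 2xx status (including 202 Accepted) as success
    # and raises urllib.error.HTTPError for other status codes.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_generation_task("<token>", {"prompt": "..."})` should return a dictionary shaped like the sample response above, with the task under the `task_info` key.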
This API supports Alibaba Tongyi Wanxiang (Wan2) video generation models. Please refer to Alibaba Cloud’s official documentation for more details.

Overview

Generate videos from text prompts using the wan2.5-t2v-preview model with optional audio generation.

Key Features

  • Text-to-video generation with auto-generated or custom audio
  • Multiple resolution options (480P/720P/1080P)
  • 5s or 10s duration
  • 24fps output

Resolution Options

480P

  • 832×480 (16:9)
  • 480×832 (9:16)
  • 624×624 (1:1)

720P

  • 1280×720 (16:9)
  • 720×1280 (9:16)
  • 960×960 (1:1)
  • 1088×832 (4:3)
  • 832×1088 (3:4)

1080P

  • 1920×1080 (16:9)
  • 1080×1920 (9:16)
  • 1440×1440 (1:1)
  • 1632×1248 (4:3)
  • 1248×1632 (3:4)
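The tiers above can be kept in a small lookup table so that client code selects a size by tier and aspect ratio instead of hard-coding dimensions. This is a sketch only: `SIZES` and `size_for` are hypothetical names, and the values use the `width*height` string form that the request examples use for the `size` parameter.

```python
# Sizes from the resolution lists above, keyed by tier and aspect ratio.
SIZES = {
    "480P": {"16:9": "832*480", "9:16": "480*832", "1:1": "624*624"},
    "720P": {
        "16:9": "1280*720",
        "9:16": "720*1280",
        "1:1": "960*960",
        "4:3": "1088*832",
        "3:4": "832*1088",
    },
    "1080P": {
        "16:9": "1920*1080",
        "9:16": "1080*1920",
        "1:1": "1440*1440",
        "4:3": "1632*1248",
        "3:4": "1248*1632",
    },
}


def size_for(tier: str, aspect_ratio: str) -> str:
    """Return the `size` string for a tier and aspect ratio, or raise ValueError."""
    try:
        return SIZES[tier][aspect_ratio]
    except KeyError:
        raise ValueError(f"no {aspect_ratio} size in tier {tier}") from None
```

Note that the 480P tier lists no 4:3 or 3:4 option, so `size_for("480P", "4:3")` raises `ValueError`.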

Audio Features

Auto-generated Audio

  • Enabled by default
  • Automatically generates synchronized audio based on video content

Custom Audio

  • Supported formats: WAV, MP3
  • Duration: 3-30 seconds
  • Max file size: 15MB
  • Behavior: If the audio is shorter than the video, the rest of the video is silent; if it is longer, the audio is truncated to the video length
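These limits can be checked on the client before a task is submitted. The sketch below is illustrative, not the service's own validation: the helper names are hypothetical, the format is inferred from the URL's file extension, and "15MB" is read as 15 × 1024 × 1024 bytes, which is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".wav", ".mp3"}
MIN_SECONDS, MAX_SECONDS = 3, 30
MAX_BYTES = 15 * 1024 * 1024  # assumption: "15MB" read as 15 MiB


def check_custom_audio(url: str, seconds: float, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file fits the limits."""
    problems = []
    extension = PurePosixPath(urlparse(url).path).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        problems.append(f"duration {seconds}s is outside 3-30 seconds")
    if size_bytes > MAX_BYTES:
        problems.append(f"file size {size_bytes} bytes exceeds 15MB")
    return problems


def audio_coverage(audio_seconds: float, video_seconds: float) -> tuple[float, float]:
    """Return (seconds of audio played, seconds of silent tail) for a video."""
    played = min(audio_seconds, video_seconds)
    return played, video_seconds - played
```

For example, a 3-second clip on a 5-second video leaves a 2-second silent tail, while a 30-second clip on a 10-second video is cut to its first 10 seconds.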

Example Requests

Basic Text-to-Video

{
  "prompt": "A small cat running on a grassy field in the moonlight",
  "size": "1920*1080",
  "duration": 10,
  "audio": true
}

With Custom Audio

{
  "prompt": "A person walking through a forest, birds chirping",
  "audio_url": "https://example.com/forest_sounds.mp3",
  "size": "1920*1080",
  "duration": 10
}

Silent Video

{
  "prompt": "City night scene timelapse, flowing traffic, brilliant lights",
  "size": "1280*720",
  "duration": 5,
  "audio": false
}
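The three request shapes above differ only in how audio is specified, so they can be produced by one helper. This is a sketch under stated assumptions: `build_payload` and its `silent` argument are hypothetical, and rejecting a custom audio URL combined with `silent=True` is a client-side design choice rather than documented API behavior.

```python
from typing import Optional


def build_payload(
    prompt: str,
    size: str = "1280*720",
    duration: int = 5,
    audio_url: Optional[str] = None,
    silent: bool = False,
) -> dict:
    """Build a request body for auto audio, custom audio, or a silent video."""
    if audio_url is not None and silent:
        raise ValueError("audio_url and silent=True are contradictory")
    payload = {"prompt": prompt, "size": size, "duration": duration}
    if audio_url is not None:
        # A custom audio URL overrides the audio flag, so the flag is omitted.
        payload["audio_url"] = audio_url
    else:
        payload["audio"] = not silent
    return payload
```

With this helper, `build_payload("City night scene timelapse, flowing traffic, brilliant lights", silent=True)` yields the same fields as the silent-video example.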

Prompt Tips

For best results when describing motion:
  • Specify camera movement (pan left, zoom in, dolly shot)
  • Describe subject motion (walks forward, turns around)
  • Include environment details (windy, foggy, sunlit)

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
prompt
string
required

Text description for the desired video content (max 2000 characters).

Maximum string length: 2000
negative_prompt
string

Negative prompt describing unwanted content (max 500 characters).

Maximum string length: 500
audio
boolean | null
default:true

Enable automatic audio generation. Set to false to force a silent output.

audio_url
string<uri> | null

Custom audio file URL (wav/mp3, 3-30s, ≤15MB). Overrides the audio flag.

size
enum<string>
default:1280*720

Output resolution ("width*height"). Supported tiers:

  • 480P: 832*480 (16:9), 480*832 (9:16), 624*624 (1:1)
  • 720P: 1280*720 (16:9), 720*1280 (9:16), 960*960 (1:1), 1088*832 (4:3), 832*1088 (3:4)
  • 1080P: 1920*1080 (16:9), 1080*1920 (9:16), 1440*1440 (1:1), 1632*1248 (4:3), 1248*1632 (3:4)
Available options: 832*480, 480*832, 624*624, 1280*720, 720*1280, 960*960, 1088*832, 832*1088, 1920*1080, 1080*1920, 1440*1440, 1632*1248, 1248*1632
duration
enum<integer>

Video duration in seconds (24 fps). Supported values: 5 or 10.

Available options: 5, 10
prompt_extend
boolean
default:true

Enable intelligent prompt rewriting, which adds detail to the prompt at the cost of slightly longer latency.

seed
integer

Random seed, an integer in the range [0, 2147483647].

Required range: 0 <= x <= 2147483647
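The body constraints listed above can be checked locally before sending a request. The following is a minimal sketch, not the server's validation logic: `validate_payload` is a hypothetical helper, and it measures string length in Python characters (code points), which may differ from how the service counts.

```python
ALLOWED_SIZES = {
    "832*480", "480*832", "624*624",
    "1280*720", "720*1280", "960*960", "1088*832", "832*1088",
    "1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632",
}
ALLOWED_DURATIONS = {5, 10}
MAX_SEED = 2147483647


def validate_payload(payload: dict) -> list[str]:
    """Return a list of constraint violations; an empty list means none found."""
    errors = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str):
        errors.append("prompt is required and must be a string")
    elif len(prompt) > 2000:
        errors.append("prompt exceeds 2000 characters")
    negative = payload.get("negative_prompt")
    if isinstance(negative, str) and len(negative) > 500:
        errors.append("negative_prompt exceeds 500 characters")
    # size is optional; the documented default is 1280*720.
    if payload.get("size", "1280*720") not in ALLOWED_SIZES:
        errors.append("size is not one of the available options")
    if "duration" in payload and payload["duration"] not in ALLOWED_DURATIONS:
        errors.append("duration must be 5 or 10")
    seed = payload.get("seed")
    if seed is not None and not (isinstance(seed, int) and 0 <= seed <= MAX_SEED):
        errors.append("seed must be an integer in [0, 2147483647]")
    return errors
```

A payload that passes this check can still be rejected by the service, for instance if the file behind `audio_url` does not meet the audio limits.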

Response

202 - application/json

Accepted - Task created successfully

task_info
object
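As an illustration of reading the 202 response, the sketch below parses a body shaped like the sample response at the top of this page. The `TaskInfo` class and `parse_task_info` function are hypothetical names, and the field set (`id`, `status`, `created_at`, `updated_at`) is taken from that sample only; real responses may carry additional fields.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TaskInfo:
    id: str
    status: str
    created_at: datetime
    updated_at: datetime


def _parse_timestamp(value: str) -> datetime:
    # Replace the trailing "Z" because datetime.fromisoformat only accepts
    # it from Python 3.11 onward.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_task_info(body: str) -> TaskInfo:
    """Extract the task_info object from a JSON response body."""
    info = json.loads(body)["task_info"]
    return TaskInfo(
        id=info["id"],
        status=info["status"],
        created_at=_parse_timestamp(info["created_at"]),
        updated_at=_parse_timestamp(info["updated_at"]),
    )
```

The returned `id` identifies the task; in the sample response the task starts in the `pending` status.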