POST /vendors/klingai/v1/kling-v3-omni/text-to-video/generation
Text to Video Generation
curl --request POST \
  --url https://api.mulerouter.ai/vendors/klingai/v1/kling-v3-omni/text-to-video/generation \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "A futuristic cityscape with flying cars at sunset",
  "mode": "pro",
  "aspect_ratio": "16:9",
  "duration": 5,
  "sound": "on"
}
'
{
  "task_info": {
    "id": "8e1e315e-b50d-4334-a231-be7d19a372f4",
    "status": "pending",
    "created_at": "2026-03-03T00:00:00Z",
    "updated_at": "2026-03-03T00:00:00Z"
  }
}
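
The same request can be assembled from Python. The following is a minimal sketch using only the standard library. The helper name `build_generation_request` is hypothetical, and the function only constructs the request object; sending it (for example with `urllib.request.urlopen`) and handling errors are left to the caller.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = (
    "https://api.mulerouter.ai/vendors/klingai/v1/"
    "kling-v3-omni/text-to-video/generation"
)


def build_generation_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a text-to-video task.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same body as the curl example; "<token>" is a placeholder.
request = build_generation_request(
    "<token>",
    {
        "prompt": "A futuristic cityscape with flying cars at sunset",
        "mode": "pro",
        "aspect_ratio": "16:9",
        "duration": 5,
        "sound": "on",
    },
)
```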

Overview

Generate videos from text prompts using the Kling V3 Omni model. V3 Omni introduces:
  • First/last frame control: provide first_frame and last_frame images to set the opening and closing frames
  • Multi-shot video: generate multi-scene videos via multi_shot and multi_prompt
  • Audio generation: produce synchronized audio by setting sound to "on"
  • Extended duration: 3 to 15 seconds
  • Element references: up to 7 references per request (images and elements combined)
  • Standard and professional modes: 720P (std) or 1080P (pro) output

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
first_frame
string | null

First frame image (URL or Base64). Sets the opening frame of the generated video.

last_frame
string | null

Last frame image (URL or Base64). Sets the closing frame of the generated video.
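
Both frame fields accept either a URL or Base64 data. Here is a minimal sketch of preparing such a value from a local file. `frame_value` is a hypothetical helper, and it assumes the API expects the raw Base64 string without a `data:` URI prefix, which the descriptions above do not specify.

```python
import base64
from pathlib import Path


def frame_value(source: str) -> str:
    """Return a value usable for first_frame or last_frame.

    Hypothetical helper. URLs are passed through unchanged; anything else
    is treated as a local file path and Base64-encoded. Assumption: the API
    takes raw Base64 with no "data:image/...;base64," prefix.
    """
    if source.startswith(("http://", "https://")):
        return source
    return base64.b64encode(Path(source).read_bytes()).decode("ascii")
```

The result can be assigned directly, for example `payload["first_frame"] = frame_value("opening.png")`, where the file name is illustrative.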

prompt
string | null

Text prompt describing the video to generate. Either prompt or multi_prompt must be provided.

Maximum string length: 2500
multi_prompt
object[]

Multi-segment prompts for finer control. Either prompt or multi_prompt must be provided.

Required array length: 1 - 6 elements
negative_prompt
string | null

Negative prompt to exclude unwanted content.

Maximum string length: 2500
sound
enum<string>
default: off

Whether to generate sound for the video.

Available options: on, off
mode
enum<string>
default: pro

Generation mode: std for standard quality (720P output), pro for higher quality (1080P output).

Available options: std, pro
aspect_ratio
enum<string> | null

Aspect ratio of the generated video.

Available options: 16:9, 9:16, 1:1
duration
integer
default: 5

Duration of the generated video in seconds. Accepted range: 3 to 15.

multi_shot
boolean
default: false

Whether to enable multi-shot generation.

shot_type
enum<string> | null

Shot type configuration.

Available options: customize, intelligence
elements
object[]

Element list. Combined count of images and elements must not exceed 7.

Maximum array length: 7
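
The limits listed for the body fields can be checked on the client before a request is sent. The sketch below is a hypothetical helper, not part of the API. It covers only the constraints stated above, treats null and absent fields the same way, and checks only the length of the elements array, since how frame images count toward the combined limit of 7 is not spelled out here.

```python
# Allowed values for the enum fields described above.
ALLOWED = {
    "sound": {"on", "off"},
    "mode": {"std", "pro"},
    "aspect_ratio": {"16:9", "9:16", "1:1"},
    "shot_type": {"customize", "intelligence"},
}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical client-side check covering only the documented limits.
    """
    errors = []
    multi_prompt = payload.get("multi_prompt")

    if not payload.get("prompt") and not multi_prompt:
        errors.append("either prompt or multi_prompt must be provided")
    for field in ("prompt", "negative_prompt"):
        value = payload.get(field)
        if value is not None and len(value) > 2500:
            errors.append(f"{field} exceeds 2500 characters")
    if multi_prompt is not None and not 1 <= len(multi_prompt) <= 6:
        errors.append("multi_prompt must contain 1 to 6 items")
    for field, options in ALLOWED.items():
        value = payload.get(field)
        if value is not None and value not in options:
            errors.append(f"{field} must be one of {sorted(options)}")
    duration = payload.get("duration", 5)  # documented default is 5
    if not isinstance(duration, int) or not 3 <= duration <= 15:
        errors.append("duration must be an integer from 3 to 15")
    if len(payload.get("elements") or []) > 7:
        errors.append("elements must contain at most 7 items")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once. The server remains the authority on what is accepted.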

Response

Accepted - Task created successfully

task_info
object
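
A response body can be read into a small typed structure. This is a minimal sketch based only on the four fields shown in the example response at the top of this page. `TaskInfo` and `parse_task_info` are hypothetical names, and a real response may carry fields or status values not shown here.

```python
import json
from dataclasses import dataclass


@dataclass
class TaskInfo:
    """Fields seen in the example response; others may exist."""
    id: str
    status: str
    created_at: str
    updated_at: str


def parse_task_info(body: str) -> TaskInfo:
    """Extract task_info from a JSON response body (hypothetical helper)."""
    info = json.loads(body)["task_info"]
    return TaskInfo(
        id=info["id"],
        status=info["status"],
        created_at=info["created_at"],
        updated_at=info["updated_at"],
    )


# The example response from this page.
sample = """
{
  "task_info": {
    "id": "8e1e315e-b50d-4334-a231-be7d19a372f4",
    "status": "pending",
    "created_at": "2026-03-03T00:00:00Z",
    "updated_at": "2026-03-03T00:00:00Z"
  }
}
"""
task = parse_task_info(sample)
```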