POST /vendors/klingai/v1/kling-v3/text-to-video/generation
Text to Video Generation
curl --request POST \
  --url https://api.mulerouter.ai/vendors/klingai/v1/kling-v3/text-to-video/generation \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "A futuristic cityscape with flying cars at sunset",
  "mode": "pro",
  "aspect_ratio": "16:9",
  "duration": 10,
  "sound": "on"
}
'
{
  "task_info": {
    "id": "8e1e315e-b50d-4334-a231-be7d19a372f4",
    "status": "pending",
    "created_at": "2026-02-14T00:00:00.000Z",
    "updated_at": "2026-02-14T00:00:00.000Z"
  }
}

Overview

Generate videos from text prompts using the Kling V3.0 model. V3.0 introduces:
  • First/Last frame control — provide first_frame and last_frame images to guide video generation
  • Multi-shot video — generate multi-scene videos in a single task via multi_shot and multi_prompt
  • Audio generation — produce synchronized audio with sound: "on"
  • Extended duration — 3 to 15 seconds in 1-second increments
  • Element references — reference subjects via elements with frontal and reference images
  • Standard / Professional modes — 720P (std) or 1080P (pro) output

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
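As an illustration of the header format above, here is a minimal Python sketch that builds the authenticated POST request without sending it. The URL and header names come from the curl example on this page; the function name build_request and the use of the standard-library urllib module are choices made for this sketch, not something this API prescribes.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example on this page.
API_URL = "https://api.mulerouter.ai/vendors/klingai/v1/kling-v3/text-to-video/generation"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the task; any other HTTP client works equally well.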

Body

application/json
first_frame
string | null

First frame image (URL or Base64). Sets the opening frame of the generated video.

last_frame
string | null

Last frame image (URL or Base64). Sets the closing frame of the generated video.

prompt
string

Positive text prompt. Cannot exceed 2500 characters.

Required when multi_shot is false or when shot_type is intelligence.

Maximum string length: 2500
negative_prompt
string

Negative text prompt. Cannot exceed 2500 characters.

Maximum string length: 2500
multi_shot
boolean
default:false

Whether to generate a multi-shot video.

  • true: Enable multi-shot mode. Set shot_type and supply multi_prompt (for customize) or prompt (for intelligence).
  • false: Single-shot mode (default).
shot_type
enum<string>

Shot segmentation method. Required when multi_shot is true.

  • customize: Custom shots, requires multi_prompt.
  • intelligence: AI-generated shots, requires prompt.
Available options: customize, intelligence
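The conditional requirements on prompt, shot_type, and multi_prompt can be checked on the client before a request is sent. The following is a minimal sketch of those rules as stated on this page; the function name check_prompt_fields and the wording of the messages are hypothetical, and the server may enforce further rules not listed here.

```python
def check_prompt_fields(body: dict) -> list[str]:
    """Return a list of problems with the prompt-related fields (empty if none)."""
    problems = []
    multi_shot = body.get("multi_shot", False)  # documented default: false
    shot_type = body.get("shot_type")

    # shot_type is required when multi_shot is true.
    if multi_shot and shot_type not in ("customize", "intelligence"):
        problems.append("shot_type is required when multi_shot is true")

    # prompt is required in single-shot mode or when shot_type is intelligence.
    needs_prompt = (not multi_shot) or shot_type == "intelligence"
    if needs_prompt and not body.get("prompt"):
        problems.append("prompt is required")

    # multi_prompt is required for customize shots.
    if multi_shot and shot_type == "customize" and not body.get("multi_prompt"):
        problems.append("multi_prompt is required when shot_type is customize")

    # Both text prompts are limited to 2500 characters.
    for field in ("prompt", "negative_prompt"):
        value = body.get(field)
        if value is not None and len(value) > 2500:
            problems.append(f"{field} exceeds 2500 characters")
    return problems
```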
multi_prompt
object[] | null

Shot prompt list for multi-shot videos.

  • Max 6 shots, min 1 shot.
  • Each shot prompt max 512 characters.
  • Each shot's duration must be at least 1 second and must not exceed the total duration.
  • Sum of all shot durations must equal total task duration.

Required when multi_shot is true and shot_type is customize.

Required array length: 1 - 6 elements
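The shot constraints above are simple arithmetic and can be verified before submission. The sketch below assumes each shot object carries a text under the key "prompt" and a length in seconds under the key "duration"; this page does not list the fields of a shot object, so both key names, like the function name check_multi_prompt, are assumptions made for illustration.

```python
def check_multi_prompt(shots: list[dict], total_duration: int) -> list[str]:
    """Return a list of violations of the documented shot constraints."""
    problems = []
    if not 1 <= len(shots) <= 6:
        problems.append("multi_prompt must contain 1 to 6 shots")
    for i, shot in enumerate(shots, start=1):
        # "prompt" and "duration" are assumed key names, not confirmed by this page.
        if len(shot.get("prompt", "")) > 512:
            problems.append(f"shot {i}: prompt exceeds 512 characters")
        if not 1 <= shot.get("duration", 0) <= total_duration:
            problems.append(f"shot {i}: duration must be between 1 and {total_duration}")
    # The shot durations must add up to the task's total duration.
    if sum(shot.get("duration", 0) for shot in shots) != total_duration:
        problems.append("shot durations must sum to the total duration")
    return problems
```

For example, two shots of 4 and 6 seconds satisfy a 10-second task, while a single 4-second shot does not.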
sound
enum<string>
default:off

Whether to generate audio together with the video.

  • on: Enable audio generation
  • off: Disable audio generation (silent video)
Available options: on, off
mode
enum<string>
default:std

Video generation mode.

  • std: Standard Mode (720P), cost-effective.
  • pro: Professional Mode (1080P), higher-quality video output.

Available options: std, pro
aspect_ratio
enum<string>
default:16:9

The aspect ratio of the generated video frame (width:height).

Available options: 16:9, 9:16, 1:1
duration
integer
default:5

Video length in seconds (3-15).
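To make the documented defaults and allowed values for sound, mode, aspect_ratio, and duration concrete, here is a small sketch that fills in the defaults and rejects out-of-range values locally. The helper name with_defaults is hypothetical, and the sketch assumes that sending a default value explicitly behaves the same as omitting the field.

```python
# Allowed values and defaults as documented on this page.
ALLOWED = {
    "sound": {"on", "off"},
    "mode": {"std", "pro"},
    "aspect_ratio": {"16:9", "9:16", "1:1"},
}
DEFAULTS = {"sound": "off", "mode": "std", "aspect_ratio": "16:9", "duration": 5}


def with_defaults(body: dict) -> dict:
    """Return a copy of body with defaults filled in; raise ValueError on bad values."""
    merged = {**DEFAULTS, **body}
    for field, options in ALLOWED.items():
        if merged[field] not in options:
            raise ValueError(f"{field} must be one of {sorted(options)}")
    duration = merged["duration"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    return merged
```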

elements
object[] | null

Element definitions (max 3). Each element provides frontal and reference images. Reference an element in prompt with a placeholder such as <<<element_1>>>.

Required array length: 1 - 3 elements
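A prompt that references more elements than the request defines is an easy mistake to catch locally. The sketch below assumes that placeholders follow the pattern <<<element_N>>> with N counting from 1 in the order of the elements array; this page only shows <<<element_1>>>, so the numbering scheme and the function name unresolved_element_refs are assumptions.

```python
import re

# Assumed placeholder pattern, generalized from the documented <<<element_1>>>.
ELEMENT_REF = re.compile(r"<<<element_(\d+)>>>")


def unresolved_element_refs(prompt: str, element_count: int) -> list[int]:
    """Return referenced element numbers that fall outside 1..element_count."""
    numbers = {int(n) for n in ELEMENT_REF.findall(prompt)}
    return sorted(n for n in numbers if not 1 <= n <= element_count)
```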

Response

202 - application/json

Accepted - Task created successfully

task_info
object
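A 202 response only confirms that the task was created, so a client will typically keep the task ID for later use. The following sketch reads the id and status fields from the example response shown at the top of this page; the function name parse_task_info is hypothetical, and how a finished video is retrieved is not covered on this page.

```python
import json


def parse_task_info(response_text: str) -> tuple[str, str]:
    """Extract (task id, status) from a 202 response body."""
    task_info = json.loads(response_text)["task_info"]
    return task_info["id"], task_info["status"]


# Example response body copied from this page.
example = """
{
  "task_info": {
    "id": "8e1e315e-b50d-4334-a231-be7d19a372f4",
    "status": "pending",
    "created_at": "2026-02-14T00:00:00.000Z",
    "updated_at": "2026-02-14T00:00:00.000Z"
  }
}
"""
task_id, status = parse_task_info(example)
print(task_id, status)  # 8e1e315e-b50d-4334-a231-be7d19a372f4 pending
```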