Kling V3.0 Text to Video
Generate videos from text prompts using the Kling v3.0 model with multi-shot, audio generation, and extended duration support.
Documentation Index
Fetch the complete documentation index at: https://mulerouter.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Generate videos from text prompts using the Kling V3.0 model. V3.0 introduces:
- Multi-shot video: generate multi-scene videos in a single task via multi_shot and multi_prompt
- Audio generation: produce synchronized audio by setting sound to "on"
- Extended duration: 3 to 15 seconds in 1-second increments
- Standard / Professional modes: 720P (std) or 1080P (pro) output
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
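As a minimal sketch of the header format described above, the helper below builds the Authorization header from a token. The function name is hypothetical, and the Content-Type entry assumes the body is sent as JSON, which this page does not state.

```python
from __future__ import annotations


def build_auth_headers(token: str) -> dict[str, str]:
    """Return HTTP headers carrying the bearer token."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    return {
        # Documented form: "Bearer <token>".
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is JSON.
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.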
Body
First frame image (URL or Base64). Sets the opening frame of the generated video.
Last frame image (URL or Base64). Sets the closing frame of the generated video.
Positive text prompt. Cannot exceed 2500 characters.
Required when multi_shot is false or when shot_type is intelligence.
Negative text prompt. Cannot exceed 2500 characters.
Whether to generate a multi-shot video.
- true: Enable multi-shot mode. Shots are configured through shot_type; with customize, prompt is ignored in favor of multi_prompt.
- false: Single-shot mode (default).
Shot segmentation method. Required when multi_shot is true.
- customize: Custom shots, requires multi_prompt.
- intelligence: AI-generated shots, requires prompt.
Shot prompt list for multi-shot videos.
- Max 6 shots, min 1 shot.
- Each shot prompt max 512 characters.
- Each shot duration must not exceed total duration and must be >= 1.
- Sum of all shot durations must equal total task duration.
Required when multi_shot is true and shot_type is customize.
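The prompt and multi-shot rules above can be checked on the client before a task is submitted. The sketch below is one possible pre-flight check, not part of the API. It assumes each multi_prompt entry is a dict with "prompt" and "duration" keys (the shot schema is not shown on this page), and it takes the total duration as a separate argument because the duration field name is not shown either.

```python
from __future__ import annotations

MAX_PROMPT_CHARS = 2500
MAX_SHOT_PROMPT_CHARS = 512
MAX_SHOTS = 6


def validate_shot_config(body: dict, total_duration: int) -> list[str]:
    """Return a list of rule violations; an empty list means the body passes."""
    errors: list[str] = []
    prompt = body.get("prompt") or ""
    if len(prompt) > MAX_PROMPT_CHARS:
        errors.append("prompt exceeds 2500 characters")

    if not body.get("multi_shot", False):
        # Single-shot mode (the default): prompt is required.
        if not prompt:
            errors.append("prompt is required when multi_shot is false")
        return errors

    shot_type = body.get("shot_type")
    if shot_type == "intelligence":
        if not prompt:
            errors.append("prompt is required when shot_type is intelligence")
    elif shot_type == "customize":
        shots = body.get("multi_prompt") or []
        if not 1 <= len(shots) <= MAX_SHOTS:
            errors.append("multi_prompt must contain 1 to 6 shots")
        for i, shot in enumerate(shots, start=1):
            # Assumption: each shot is a dict with "prompt" and "duration".
            if len(shot.get("prompt", "")) > MAX_SHOT_PROMPT_CHARS:
                errors.append(f"shot {i}: prompt exceeds 512 characters")
            if not 1 <= shot.get("duration", 0) <= total_duration:
                errors.append(
                    f"shot {i}: duration must be between 1 and {total_duration}"
                )
        if shots and sum(s.get("duration", 0) for s in shots) != total_duration:
            errors.append("shot durations must sum to the total duration")
    else:
        errors.append(
            "shot_type must be customize or intelligence when multi_shot is true"
        )
    return errors
```

Collecting every violation in a list, rather than raising on the first one, lets a caller report all problems with a request at once.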
Generate audio simultaneously when generating videos.
- on: Enable audio generation.
- off: Disable audio generation (silent video).
Video generation mode.
- std: Standard Mode (720P), cost-effective.
- pro: Professional Mode (1080P), higher quality video output.
The aspect ratio of the generated video frame (width:height). Allowed values: 16:9, 9:16, 1:1.
Video length in seconds (3-15).
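The mode, aspect ratio, and duration constraints above can be expressed as a small check. This is a sketch only: the argument names mode, aspect_ratio, and duration are assumptions, since the field names are not shown on this page, and the whole-second requirement follows the "1-second increments" note in the overview.

```python
from __future__ import annotations

RESOLUTION_BY_MODE = {"std": "720P", "pro": "1080P"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def check_video_options(mode: str, aspect_ratio: str, duration: int) -> str:
    """Validate the three options and return the output resolution label."""
    if mode not in RESOLUTION_BY_MODE:
        raise ValueError(f"mode must be one of {sorted(RESOLUTION_BY_MODE)}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect ratio must be one of {sorted(ASPECT_RATIOS)}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be a whole number of seconds")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return RESOLUTION_BY_MODE[mode]
```

For example, check_video_options("pro", "16:9", 10) returns "1080P", while a duration of 16 raises ValueError.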
Element definitions. Max 3 elements, min 1 element.
Provide frontal and reference images.
Use <<<element_1>>> in prompt to reference elements.
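To illustrate the placeholder syntax, the sketch below builds element references and reports any reference in a prompt that does not match a defined element. It assumes elements are numbered from 1 in the order they are supplied; this page only shows <<<element_1>>>, so numbering beyond that is an assumption, and both function names are hypothetical.

```python
from __future__ import annotations

import re

# Matches placeholders such as <<<element_1>>> and captures the number.
ELEMENT_REF = re.compile(r"<<<element_(\d+)>>>")
MAX_ELEMENTS = 3


def element_ref(index: int) -> str:
    """Return the prompt placeholder for the element at a 1-based index."""
    return f"<<<element_{index}>>>"


def unknown_element_refs(prompt: str, element_count: int) -> list[int]:
    """Return referenced element numbers outside 1..element_count, sorted."""
    if not 0 <= element_count <= MAX_ELEMENTS:
        raise ValueError("at most 3 elements are allowed")
    referenced = {int(n) for n in ELEMENT_REF.findall(prompt)}
    return sorted(n for n in referenced if not 1 <= n <= element_count)
```

With a prompt that mentions elements 1 and 2, unknown_element_refs(prompt, 1) returns [2], which flags the reference to an element that was never defined.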
Response
Accepted - Task created successfully

