Kling Advanced Lip Sync

Advanced Lip Sync drives a person’s lip movements in a video to match provided speech. It supports two input modes: synthesize speech from text with the built-in TTS, or supply your own audio file. The endpoint requires a session_id from the Identify Face step, so always call Identify Face first.

Workflow overview

identify-face  →  session_id
advanced-lip-sync (session_id + speech input)  →  task_id
Poll GET /kling/v1/videos/advanced-lip-sync/{task_id}  →  video URL

Input modes

Text mode — built-in TTS

Provide text, voice_id, and voice_language. The platform synthesizes the audio using the specified voice and drives the lip movements.

curl https://www.anyfast.ai/kling/v1/videos/advanced-lip-sync \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "session_id": "YOUR_SESSION_ID",
      "text": "Hello, welcome to my channel.",
      "voice_id": "girlfriend_1_cn",
      "voice_language": "en"
    }
  }'

Audio mode — custom audio file

Provide audio_url to drive lip movements directly from an existing audio recording.

curl https://www.anyfast.ai/kling/v1/videos/advanced-lip-sync \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "session_id": "YOUR_SESSION_ID",
      "audio_url": "https://example.com/speech.mp3"
    }
  }'
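For readers calling the API from code rather than the shell, the two curl calls above can be mirrored in Python. This is a minimal sketch using only the standard library; the helper name build_lip_sync_request is hypothetical, and the URL, headers, and field values are copied from the curl examples. The request is built but not sent.

```python
import json
import urllib.request

# Endpoint copied from the curl examples above.
LIP_SYNC_URL = "https://www.anyfast.ai/kling/v1/videos/advanced-lip-sync"


def build_lip_sync_request(api_key: str, input_fields: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a lip sync task.

    Hypothetical helper: it only wraps the fields in {"input": ...} and adds
    the same two headers the curl examples use.
    """
    body = json.dumps({"input": input_fields}).encode("utf-8")
    return urllib.request.Request(
        LIP_SYNC_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Text mode: the platform synthesizes the speech from text.
text_request = build_lip_sync_request(
    "YOUR_API_KEY",
    {
        "session_id": "YOUR_SESSION_ID",
        "text": "Hello, welcome to my channel.",
        "voice_id": "girlfriend_1_cn",
        "voice_language": "en",
    },
)

# Audio mode: lip movements follow an existing recording.
audio_request = build_lip_sync_request(
    "YOUR_API_KEY",
    {
        "session_id": "YOUR_SESSION_ID",
        "audio_url": "https://example.com/speech.mp3",
    },
)
```

To submit either request, pass it to urllib.request.urlopen and parse the JSON body of the response.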

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `input.session_id` | string | Yes | Session ID from the Identify Face step |
| `input.face_image_url` | string | No | Reference face image URL for identity consistency |
| `input.text` | string | In text mode | Text for the character to speak |
| `input.voice_id` | string | In text mode | Voice ID for TTS synthesis. See the Voice ID reference for available voices with audio previews. |
| `input.voice_language` | string | In text mode | Language code: `zh` or `en` |
| `input.audio_url` | string | In audio mode | Public URL of an audio file |
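The mode-dependent requirements in the table can be checked on the client before a request is sent. The sketch below is one way to do that; build_input is a hypothetical helper, and rejecting a request that mixes text-mode fields with audio_url is an assumption, since this page does not say how the API treats that case.

```python
VOICE_LANGUAGES = ("zh", "en")  # language codes listed in the table


def build_input(session_id, *, text=None, voice_id=None, voice_language=None,
                audio_url=None, face_image_url=None):
    """Return the `input` object for text mode or audio mode.

    Raises ValueError when a required field for the chosen mode is missing.
    """
    if not session_id:
        raise ValueError("session_id is required; call Identify Face first")

    text_fields = {
        "text": text,
        "voice_id": voice_id,
        "voice_language": voice_language,
    }
    has_text_fields = any(value is not None for value in text_fields.values())

    # Assumption: a request uses exactly one mode, never both and never neither.
    if has_text_fields == (audio_url is not None):
        raise ValueError("provide either the text-mode fields or audio_url")

    payload = {"session_id": session_id}
    if audio_url is not None:
        payload["audio_url"] = audio_url
    else:
        missing = [name for name, value in text_fields.items() if value is None]
        if missing:
            raise ValueError("text mode is missing: " + ", ".join(missing))
        if voice_language not in VOICE_LANGUAGES:
            raise ValueError("voice_language must be 'zh' or 'en'")
        payload.update(text_fields)

    # Optional in both modes.
    if face_image_url is not None:
        payload["face_image_url"] = face_image_url
    return payload
```

The returned dictionary is the value of the "input" key in the request body.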

Polling

After the task is created, poll the Task Query endpoint, GET /kling/v1/videos/advanced-lip-sync/{task_id}, until the task finishes. The status moves from queued to processing and ends at either succeeded or failed. On success, the video download URL is available in data.data.task_result.videos[0].url.
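A polling loop for this endpoint might look like the sketch below. Several details are assumptions that this page does not confirm: that the query uses the same Bearer header as task creation, that data.data is a path from the root of the JSON body, and that the status lives in a field named task_status next to task_result. The function names, polling interval, and poll limit are illustrative only.

```python
import json
import time
import urllib.request

BASE_URL = "https://www.anyfast.ai"  # host copied from the curl examples


def fetch_task(task_id, api_key):
    """GET the task and return the parsed JSON body.

    Assumption: the query endpoint accepts the same Bearer token header.
    """
    request = urllib.request.Request(
        f"{BASE_URL}/kling/v1/videos/advanced-lip-sync/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_video(task_id, api_key, *, fetch=fetch_task, interval=5.0, max_polls=120):
    """Poll until the task succeeds or fails, then return the video URL."""
    for _ in range(max_polls):
        body = fetch(task_id, api_key)
        task = body["data"]["data"]  # assumption: path starts at the body root
        status = task["task_status"]  # assumption: field name is not documented here
        if status == "succeeded":
            return task["task_result"]["videos"][0]["url"]
        if status == "failed":
            raise RuntimeError(f"lip sync task {task_id} failed")
        time.sleep(interval)  # still queued or processing
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

The fetch parameter exists so the loop can be exercised with canned responses instead of live HTTP calls; in normal use, call wait_for_video(task_id, api_key) with the task_id returned when the task was created.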

Prerequisites: Identify Face

Call Identify Face first to obtain a session_id.

Voice ID Reference

Browse all available voice IDs with audio previews to choose the right voice for your lip sync.

API Reference

View the interactive API playground for Kling Advanced Lip Sync.