Key capabilities
| Feature | Description |
|---|---|
| Text-to-video | Generate video from text prompts |
| Image-to-video (first frame) | Use an image as the first frame |
| Image-to-video (first + last frame) | Use two images as first and last frames |
| Multimodal reference | Combine images, videos, and audio as references (1–9 images, up to 3 videos, up to 3 audio clips) |
| Video editing | Modify elements in an existing video using reference images |
| Video extension | Extend and concatenate reference videos |
| Audio generation | Auto-generate synchronized voice, sound effects, and background music |
| Web search | Enhance generation with real-time internet content (text-to-video only) |
| Return last frame | Retrieve the last frame of the generated video |
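The reference limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name `check_reference_counts` and the `example.com` URLs are hypothetical, and it assumes the limits apply per request and are counted by each item's `role`.

```python
from collections import Counter

# Upper bounds taken from the capabilities table. Treating them as
# per-request maximums counted by role is an assumption of this sketch.
MAX_REFERENCES = {
    "reference_image": 9,
    "reference_video": 3,
    "reference_audio": 3,
}


def check_reference_counts(content):
    """Return a list of human-readable problems (empty when within limits)."""
    counts = Counter(item.get("role") for item in content)
    return [
        f"{role}: {counts[role]} items exceeds the limit of {limit}"
        for role, limit in MAX_REFERENCES.items()
        if counts[role] > limit
    ]


# Hypothetical request content with four reference videos (one too many).
content = [{"type": "text", "text": "Blend these clips into one scene"}] + [
    {
        "type": "video_url",
        "video_url": {"url": f"https://example.com/clip{i}.mp4"},
        "role": "reference_video",
    }
    for i in range(4)
]
print(check_reference_counts(content))
```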
Output specifications
| Property | Value |
|---|---|
| Resolution | 480p, 720p |
| Aspect ratio | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive |
| Duration | 4–15 seconds |
| Format | mp4 |
Workflow
Examples
Text-to-video
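A text-to-video request needs only the model name and a single text item. The sketch below builds such a request body in Python using the field names from the Parameters table. The helper name `build_text_to_video_request` and the prompt are illustrative, and the endpoint and authentication are deliberately left out because they are not specified here.

```python
import json


def build_text_to_video_request(prompt, resolution="720p", ratio="adaptive", duration=5):
    """Build a text-to-video request body (hypothetical helper, not an SDK call)."""
    return {
        "model": "seedance-fast",
        "content": [{"type": "text", "text": prompt}],
        "resolution": resolution,
        "ratio": ratio,
        "duration": duration,
    }


payload = build_text_to_video_request(
    "A red kite drifting over a foggy harbor at dawn", ratio="16:9"
)
# Serialize the body; sending it is left to whichever HTTP client you use.
print(json.dumps(payload, indent=2))
```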
Multimodal reference (image + video + audio)
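A multimodal request mixes text with image, video, and audio items, each tagged with a `reference_*` role. The sketch below shows one plausible shape of the request body, based on the field names in the Parameters table. The helper `reference_item` and the `example.com` URLs are placeholders, not real assets.

```python
import json


def reference_item(kind, url):
    """Build one reference item; kind is "image", "video", or "audio"."""
    key = f"{kind}_url"
    return {"type": key, key: {"url": url}, "role": f"reference_{kind}"}


payload = {
    "model": "seedance-fast",
    "content": [
        {"type": "text", "text": "The character dances to the music, matching the clip's camera motion"},
        reference_item("image", "https://example.com/character.png"),
        reference_item("video", "https://example.com/camera-motion.mp4"),
        reference_item("audio", "https://example.com/theme.mp3"),
    ],
    "generate_audio": True,
}
print(json.dumps(payload, indent=2))
```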
Video editing
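For video editing, the Input modes table pairs a text instruction with one `reference_image` and one `reference_video`. The following sketch builds such a body. The prompt and URLs are hypothetical, and the exact editing behavior depends on the service.

```python
import json

payload = {
    "model": "seedance-fast",
    "content": [
        # The text item describes the edit to apply.
        {"type": "text", "text": "Replace the car in the video with the one in the reference image"},
        # The image supplies the new element.
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/new-car.png"},
            "role": "reference_image",
        },
        # The video is the clip being edited (mp4/mov, max 50 MB, 2-15s).
        {
            "type": "video_url",
            "video_url": {"url": "https://example.com/street.mp4"},
            "role": "reference_video",
        },
    ],
    "resolution": "720p",
}
roles = [item.get("role") for item in payload["content"]]
print(json.dumps(payload, indent=2))
```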
Web search enhanced (text-to-video only)
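Web search is enabled through the `tools` array and is limited to text-to-video requests. The sketch below builds such a body and adds a client-side guard that rejects media items when web search is requested. Both helper names are hypothetical, and the guard reflects this page's stated restriction rather than any documented server error.

```python
import json


def build_web_search_request(prompt):
    """Build a text-to-video body with web search enabled (hypothetical helper)."""
    return {
        "model": "seedance-fast",
        "content": [{"type": "text", "text": prompt}],
        "tools": [{"type": "web_search"}],
    }


def check_web_search_allowed(payload):
    """Raise ValueError if web search is combined with non-text content."""
    uses_search = any(t.get("type") == "web_search" for t in payload.get("tools", []))
    has_media = any(item["type"] != "text" for item in payload["content"])
    if uses_search and has_media:
        raise ValueError("web_search is only available for text-to-video requests")


payload = build_web_search_request("A news-style recap of this week's weather in Tokyo")
check_web_search_allowed(payload)  # passes: the content is text only
print(json.dumps(payload, indent=2))
```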
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | seedance-fast |
| content | array | Yes | Input content array (text, image_url, video_url, audio_url items) |
| content[].type | string | Yes | text, image_url, video_url, or audio_url |
| content[].text | string | Text items | Text prompt (max 500 Chinese chars / 1000 English words) |
| content[].image_url.url | string | Image items | Image URL, Base64 data URI, or asset://<ID> |
| content[].video_url.url | string | Video items | Video URL or asset://<ID> (mp4/mov, max 50 MB, 2–15s) |
| content[].audio_url.url | string | Audio items | Audio URL, Base64 data URI, or asset://<ID> (wav/mp3, max 15 MB) |
| content[].role | string | Conditional | first_frame, last_frame, reference_image, reference_video, reference_audio |
| generate_audio | boolean | No | Generate synchronized audio. Default: true |
| resolution | string | No | 480p or 720p. Default: 720p |
| ratio | string | No | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive. Default: adaptive |
| duration | integer | No | 4–15 seconds, or -1 for auto. Default: 5 |
| tools | array | No | [{"type": "web_search"}] for web search (text-to-video only) |
| watermark | boolean | No | Add watermark. Default: false |
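The optional parameters and their defaults can be collected and validated on the client before a request is built. This is a minimal sketch: the function name `normalize_options` is hypothetical, and the checks mirror only the values listed in the table, so the service may enforce additional rules.

```python
# Defaults and allowed values copied from the Parameters table.
DEFAULTS = {
    "generate_audio": True,
    "resolution": "720p",
    "ratio": "adaptive",
    "duration": 5,
    "watermark": False,
}
RESOLUTIONS = {"480p", "720p"}
RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}


def normalize_options(**overrides):
    """Merge overrides into the defaults and validate them (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    options = {**DEFAULTS, **overrides}
    if options["resolution"] not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if options["ratio"] not in RATIOS:
        raise ValueError(f"ratio must be one of {sorted(RATIOS)}")
    duration = options["duration"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer")
    if duration != -1 and not 4 <= duration <= 15:
        raise ValueError("duration must be 4-15 seconds, or -1 for auto")
    return options


print(normalize_options(resolution="480p", duration=-1))
```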
Input modes
| Mode | Content items | role values |
|---|---|---|
| Text-to-video | 1× text | — |
| Image-to-video (first frame) | text (optional) + 1× image_url | first_frame or omit |
| Image-to-video (first + last frame) | text (optional) + 2× image_url | first_frame + last_frame |
| Multimodal reference | text (optional) + image/video/audio | reference_image, reference_video, reference_audio |
| Video editing | text + image_url + video_url | reference_image + reference_video |
| Video extension | text + video_url(s) | reference_video |
Note: The first-frame, first + last frame, and multimodal reference modes are mutually exclusive; do not mix their role values in the same request.
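The mutual-exclusion rule can be enforced by inspecting the roles in `content`. The sketch below infers the mode from the roles and rejects mixed requests. The function name `detect_mode` and the returned mode labels are hypothetical, and treating a single image with no role as a first frame follows the "first_frame or omit" entry in the table.

```python
FRAME_ROLES = {"first_frame", "last_frame"}
REFERENCE_ROLES = {"reference_image", "reference_video", "reference_audio"}


def detect_mode(content):
    """Infer the input mode from item roles; raise ValueError on invalid mixes."""
    media = [item for item in content if item["type"] != "text"]
    if not media:
        return "text-to-video"
    roles = set()
    for item in media:
        role = item.get("role")
        if role is None:
            # Per the table, role may be omitted only for a lone first-frame image.
            if item["type"] == "image_url" and len(media) == 1:
                role = "first_frame"
            else:
                raise ValueError("role is required when several media items are sent")
        roles.add(role)
    if roles & FRAME_ROLES and roles & REFERENCE_ROLES:
        raise ValueError("frame roles and reference roles cannot be mixed")
    if roles == {"first_frame"}:
        return "first-frame"
    if roles == {"first_frame", "last_frame"}:
        return "first-last-frame"
    if roles <= REFERENCE_ROLES:
        return "reference"
    raise ValueError(f"unsupported role combination: {sorted(roles)}")


frame = {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}}
print(detect_mode([{"type": "text", "text": "Slow zoom in"}, frame]))
```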
Resolution pixel values
| Resolution | 16:9 | 4:3 | 1:1 | 3:4 | 9:16 | 21:9 |
|---|---|---|---|---|---|---|
| 480p | 864x496 | 752x560 | 640x640 | 560x752 | 496x864 | 992x432 |
| 720p | 1280x720 | 1112x834 | 960x960 | 834x1112 | 720x1280 | 1470x630 |
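The table can be turned into a lookup so that client code knows the output dimensions ahead of time. This is a sketch with a hypothetical helper name, `output_size`. It returns `None` for the adaptive ratio because the table lists no fixed size for it.

```python
# (width, height) in pixels, copied from the resolution table.
PIXELS = {
    "480p": {
        "16:9": (864, 496),
        "4:3": (752, 560),
        "1:1": (640, 640),
        "3:4": (560, 752),
        "9:16": (496, 864),
        "21:9": (992, 432),
    },
    "720p": {
        "16:9": (1280, 720),
        "4:3": (1112, 834),
        "1:1": (960, 960),
        "3:4": (834, 1112),
        "9:16": (720, 1280),
        "21:9": (1470, 630),
    },
}


def output_size(resolution, ratio):
    """Return (width, height) for a fixed ratio, or None for "adaptive"."""
    if ratio == "adaptive":
        return None  # no fixed size is listed for adaptive output
    return PIXELS[resolution][ratio]


width, height = output_size("720p", "9:16")
print(f"{width}x{height}")
```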
API Reference
View the interactive API playground for Seedance 2.0 Fast.