Seedance 2.0 is ByteDance’s latest video generation model, available through the Anyfast API. It supports text-to-video, image-to-video, multimodal reference input, video editing, video extension, and synchronized audio generation.
Key capabilities
| Feature | Description |
|---|---|
| Text-to-video | Generate video from text prompts |
| Image-to-video (first frame) | Use an image as the first frame |
| Image-to-video (first + last frame) | Use two images as first and last frames |
| Multimodal reference | Combine images, videos, and audio as references (1–9 images, up to 3 videos, up to 3 audio clips) |
| Video editing | Modify elements in an existing video using reference images |
| Video extension | Extend and concatenate reference videos |
| Audio generation | Auto-generate synchronized voice, sound effects, and background music |
| Web search | Enhance generation with real-time internet content (text-to-video only) |
| Return last frame | Retrieve the last frame of generated video |
Output specifications
| Property | Value |
|---|---|
| Resolution | 480p, 720p, 1080p |
| Aspect ratio | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive |
| Duration | 4–15 seconds |
| Format | mp4 |
| Frame rate | 24 fps |
Workflow
Asset management workflow
To use persistent image, video, or audio assets (such as a fixed character reference), upload them via the Asset Management API and reference them with asset://<ID> in generation requests.
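As a rough sketch of how an asset reference fits into a content item, the helper below classifies a source string and checks it against the source kinds the Parameters table lists for each item type (video items accept a URL or asset://<ID>, but no Base64 data URI). The function names, the `asset-123` ID, and the acceptance of plain `http://` URLs are illustrative assumptions, not part of the documented API.

```python
# Source kinds allowed per item type, taken from the Parameters table.
ALLOWED_SOURCES = {
    "image_url": {"url", "data_uri", "asset"},
    "video_url": {"url", "asset"},
    "audio_url": {"url", "data_uri", "asset"},
}

def source_kind(source: str) -> str:
    """Classify a source string as an asset reference, data URI, or plain URL."""
    if source.startswith("asset://"):
        return "asset"
    if source.startswith("data:"):
        return "data_uri"
    if source.startswith(("http://", "https://")):  # assumption: both schemes are accepted
        return "url"
    raise ValueError(f"unrecognized source: {source!r}")

def media_item(item_type: str, source: str, role=None) -> dict:
    """Build one content item, rejecting source kinds the table does not list."""
    kind = source_kind(source)
    if kind not in ALLOWED_SOURCES[item_type]:
        raise ValueError(f"{item_type} does not accept a {kind} source")
    item = {"type": item_type, item_type: {"url": source}}
    if role is not None:
        item["role"] = role
    return item

# "asset-123" is a hypothetical asset ID standing in for the one returned in Step 2.
face = media_item("image_url", "asset://asset-123", role="reference_image")
print(face)
```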
Step 1: Create an asset group
Create an asset group to get a group ID.
Step 2: Create an asset in the group
Use the group ID from Step 1 to upload an image asset (e.g., a character face reference photo).
Step 3: Generate a video using the asset
Reference the asset ID from Step 2 to generate a video. Generation is asynchronous.
Step 4: Poll for the result
Query the generation status using the task ID.
Notes
- Download links expire after 12 hours; re-fetch after expiry.
- If the task reaches 100% but returns an error, the content was likely blocked by the provider’s content moderation system (e.g., celebrity likeness or copyrighted content). Try modifying the prompt or replacing the reference image.
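The 12-hour expiry can be tracked on the client side. The sketch below assumes the clock starts when the link is returned by the status query, which the note above does not state explicitly; `link_expired` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

LINK_LIFETIME = timedelta(hours=12)  # from the note above

def link_expired(fetched_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once 12 hours have passed since the link was fetched."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - fetched_at >= LINK_LIFETIME

fetched = datetime(2025, 1, 1, 8, 0, tzinfo=timezone.utc)
print(link_expired(fetched, fetched + timedelta(hours=3)))   # False
print(link_expired(fetched, fetched + timedelta(hours=12)))  # True
```

When the check returns True, query the task again to obtain a fresh link before downloading.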
Examples
Text-to-video
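A minimal sketch of a text-to-video request body, built from the fields in the Parameters table below. Only the JSON body is constructed; the endpoint URL and authentication header are not given in this section, so the request itself is left out. The helper name and prompt are illustrative.

```python
import json

def text_to_video_payload(prompt: str, duration: int = 5,
                          resolution: str = "720p", ratio: str = "adaptive") -> dict:
    """Build a text-to-video body: a single text item plus output options."""
    if not 4 <= duration <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    return {
        "model": "seedance",
        "content": [{"type": "text", "text": prompt}],
        "duration": duration,
        "resolution": resolution,
        "ratio": ratio,
    }

payload = text_to_video_payload("A paper boat drifting down a rainy street", duration=8)
print(json.dumps(payload, indent=2))
```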
Multimodal reference (image + video + audio)
Reference assets in the prompt with @image1, @video1, @audio1, numbered per media type by their order in the content array.
Important: In the content array, after the text prompt, asset items must follow the order: image → video → audio.
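The ordering rule can be enforced by construction. The sketch below builds the content array as text, then images, then videos, then audio, and applies the upper limits from the Key capabilities table (9 images, 3 videos, 3 audio clips). The helper name and the example.com URLs are placeholders.

```python
def multimodal_payload(prompt: str, images=(), videos=(), audios=()) -> dict:
    """Build a multimodal reference body with items in image -> video -> audio order."""
    # Dict insertion order (image, video, audio) drives the order of the content array.
    groups = {"image": (images, 9), "video": (videos, 3), "audio": (audios, 3)}
    for kind, (urls, cap) in groups.items():
        if len(urls) > cap:
            raise ValueError(f"at most {cap} {kind} references are allowed")
    content = [{"type": "text", "text": prompt}]
    for kind, (urls, _) in groups.items():
        key = f"{kind}_url"
        for url in urls:
            content.append({"type": key, key: {"url": url}, "role": f"reference_{kind}"})
    return {"model": "seedance", "content": content}

payload = multimodal_payload(
    "The character from @image1 walks through the scene in @video1 with @audio1 as background",
    images=["https://example.com/hero.png"],
    videos=["https://example.com/street.mp4"],
    audios=["https://example.com/theme.mp3"],
)
print([item["type"] for item in payload["content"]])
```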
Video editing
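A sketch of a video editing body, following the Input modes table: one text item, one image with role reference_image, and one video with role reference_video. The helper name, prompt, and URLs are illustrative, and sending the request is omitted because this section does not give the endpoint.

```python
def video_editing_payload(prompt: str, image_url: str, video_url: str) -> dict:
    """Text + one reference image + one reference video, in image -> video order."""
    return {
        "model": "seedance",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}, "role": "reference_image"},
            {"type": "video_url", "video_url": {"url": video_url}, "role": "reference_video"},
        ],
    }

payload = video_editing_payload(
    "Replace the jacket worn in @video1 with the one shown in @image1",
    "https://example.com/jacket.png",
    "https://example.com/clip.mp4",
)
print([item.get("role") for item in payload["content"]])
```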
Web search enhanced (text-to-video only)
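A sketch of a web-search-enhanced body. It adds the documented `tools` value to a text-only content array, and includes a small check that a body does not combine web search with media items. Both function names are hypothetical.

```python
def web_search_payload(prompt: str) -> dict:
    """Text-to-video body with the web_search tool enabled."""
    return {
        "model": "seedance",
        "content": [{"type": "text", "text": prompt}],
        "tools": [{"type": "web_search"}],
    }

def web_search_allowed(payload: dict) -> bool:
    """Web search is text-to-video only: reject bodies that pair it with media items."""
    has_search = any(tool.get("type") == "web_search" for tool in payload.get("tools", []))
    text_only = all(item["type"] == "text" for item in payload["content"])
    return text_only or not has_search

payload = web_search_payload("A timelapse of this week's lunar eclipse over a city skyline")
print(web_search_allowed(payload))
```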
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | seedance |
| content | array | Yes | Input content array (text, image_url, video_url, audio_url items) |
| content[].type | string | Yes | text, image_url, video_url, or audio_url |
| content[].text | string | Text items | Text prompt (max 500 Chinese chars / 1000 English words) |
| content[].image_url.url | string | Image items | Image URL, Base64 data URI, or asset://<ID> |
| content[].video_url.url | string | Video items | Video URL or asset://<ID> (mp4/mov, max 50 MB, 2–15s) |
| content[].audio_url.url | string | Audio items | Audio URL, Base64 data URI, or asset://<ID> (wav/mp3, max 15 MB) |
| content[].role | string | Conditional | first_frame, last_frame, reference_image, reference_video, reference_audio |
| generate_audio | boolean | No | Generate synchronized audio. Default: true |
| resolution | string | No | 480p, 720p, or 1080p. Default: 720p |
| ratio | string | No | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive. Default: adaptive |
| duration | integer | No | 4–15 seconds. Default: 5 |
| tools | array | No | [{"type": "web_search"}] for web search (text-to-video only) |
| watermark | boolean | No | Add watermark. Default: false |
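The optional parameters can be validated before a request is sent. The sketch below mirrors the defaults and allowed values from the table above on the client side; it is an illustration of the documented constraints, not the service's own validation, and `with_defaults` is a hypothetical name.

```python
DEFAULTS = {
    "generate_audio": True,
    "resolution": "720p",
    "ratio": "adaptive",
    "duration": 5,
    "watermark": False,
}
RESOLUTIONS = {"480p", "720p", "1080p"}
RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}

def with_defaults(options: dict) -> dict:
    """Merge caller options over the documented defaults and check allowed values."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    merged = {**DEFAULTS, **options}
    if merged["resolution"] not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {merged['resolution']}")
    if merged["ratio"] not in RATIOS:
        raise ValueError(f"unsupported ratio: {merged['ratio']}")
    duration = merged["duration"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or not 4 <= duration <= 15:
        raise ValueError("duration must be an integer from 4 to 15")
    return merged

print(with_defaults({"duration": 10, "ratio": "9:16"}))
```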
Input modes
| Mode | Content items | role values |
|---|---|---|
| Text-to-video | 1× text | — |
| Image-to-video (first frame) | text (optional) + 1× image_url | first_frame or omit |
| Image-to-video (first + last frame) | text (optional) + 2× image_url | first_frame + last_frame |
| Multimodal reference | text (optional) + image/video/audio | reference_image, reference_video, reference_audio |
| Video editing | text + image_url + video_url | reference_image + reference_video |
| Video extension | text + video_url(s) | reference_video |
Note: First frame, first+last frame, and multimodal reference are mutually exclusive — do not mix them in the same request.
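The mutual-exclusion rule can be checked by inspecting the roles in a content array. In the sketch below, an image without a role is treated as the first frame (per the table), and any request using reference_* roles is reported as a single "reference" mode, which covers multimodal reference, video editing, and video extension. The mode labels and the function name are illustrative.

```python
FRAME_ROLES = {"first_frame", "last_frame"}
REFERENCE_ROLES = {"reference_image", "reference_video", "reference_audio"}

def detect_mode(content: list) -> str:
    """Classify a content array and reject frame/reference role mixes."""
    media = [item for item in content if item["type"] != "text"]
    if not media:
        return "text-to-video"
    # An image_url item with no role defaults to first_frame.
    roles = [
        item.get("role", "first_frame" if item["type"] == "image_url" else None)
        for item in media
    ]
    uses_frames = any(role in FRAME_ROLES for role in roles)
    uses_refs = any(role in REFERENCE_ROLES for role in roles)
    if uses_frames and uses_refs:
        raise ValueError("frame roles and reference roles cannot be mixed in one request")
    if uses_refs:
        return "reference"
    if roles == ["first_frame"]:
        return "first-frame"
    if len(roles) == 2 and set(roles) == FRAME_ROLES:
        return "first-and-last-frame"
    raise ValueError(f"unsupported role combination: {roles}")

print(detect_mode([{"type": "image_url", "image_url": {"url": "https://example.com/a.png"}}]))
```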
Referencing assets in prompts
Inside the text prompt, you can refer to media items using @<type><N> placeholders, numbered per media type by their order in the content array:
| Placeholder | Refers to |
|---|---|
| @image1, @image2, … | The 1st, 2nd, … image_url item |
| @video1, @video2, … | The 1st, 2nd, … video_url item |
| @audio1, @audio2, … | The 1st, 2nd, … audio_url item |
Example: "The character from @image1 walks through the scene in @video1 with @audio1 as background".
Note: Assets in the content array must follow the order: image → video → audio.
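To illustrate the numbering, the sketch below maps each placeholder found in the text prompt to the content item it refers to, and rejects arrays that break the image → video → audio order or prompts that mention a placeholder with no matching item. It is a client-side check under the rules stated above; the function name and URLs are hypothetical.

```python
import re

MEDIA_ORDER = ["image_url", "video_url", "audio_url"]

def resolve_placeholders(content: list) -> dict:
    """Map @imageN / @videoN / @audioN in the prompt to their content items."""
    media = [item for item in content if item["type"] in MEDIA_ORDER]
    ranks = [MEDIA_ORDER.index(item["type"]) for item in media]
    if ranks != sorted(ranks):
        raise ValueError("media items must be ordered image -> video -> audio")

    # Number items per media type: the first image_url item is @image1, and so on.
    counters, lookup = {}, {}
    for item in media:
        kind = item["type"].split("_")[0]  # "image_url" -> "image"
        counters[kind] = counters.get(kind, 0) + 1
        lookup[f"@{kind}{counters[kind]}"] = item

    prompt = " ".join(item["text"] for item in content if item["type"] == "text")
    resolved = {}
    for name in re.findall(r"@(?:image|video|audio)\d+", prompt):
        if name not in lookup:
            raise ValueError(f"{name} has no matching item in the content array")
        resolved[name] = lookup[name]
    return resolved

content = [
    {"type": "text", "text": "The character from @image1 walks through the scene in @video1"},
    {"type": "image_url", "image_url": {"url": "https://example.com/hero.png"}, "role": "reference_image"},
    {"type": "video_url", "video_url": {"url": "https://example.com/street.mp4"}, "role": "reference_video"},
]
print(sorted(resolve_placeholders(content)))
```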
Resolution pixel values
| Resolution | 16:9 | 4:3 | 1:1 | 3:4 | 9:16 | 21:9 |
|---|---|---|---|---|---|---|
| 480p | 864x496 | 752x560 | 640x640 | 560x752 | 496x864 | 992x432 |
| 720p | 1280x720 | 1112x834 | 960x960 | 834x1112 | 720x1280 | 1470x630 |
| 1080p | 1920x1080 | 1664x1248 | 1440x1440 | 1248x1664 | 1080x1920 | 2208x944 |
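The table above can be carried into client code as a lookup, for example to size a player or canvas before the video is ready. The values are copied from the table; `output_size` is a hypothetical helper, and the adaptive ratio is rejected because the table lists no fixed size for it.

```python
PIXELS = {
    "480p": {"16:9": (864, 496), "4:3": (752, 560), "1:1": (640, 640),
             "3:4": (560, 752), "9:16": (496, 864), "21:9": (992, 432)},
    "720p": {"16:9": (1280, 720), "4:3": (1112, 834), "1:1": (960, 960),
             "3:4": (834, 1112), "9:16": (720, 1280), "21:9": (1470, 630)},
    "1080p": {"16:9": (1920, 1080), "4:3": (1664, 1248), "1:1": (1440, 1440),
              "3:4": (1248, 1664), "9:16": (1080, 1920), "21:9": (2208, 944)},
}

def output_size(resolution: str, ratio: str) -> tuple:
    """Return (width, height) in pixels for a resolution and fixed aspect ratio."""
    if ratio == "adaptive":
        raise ValueError("adaptive has no fixed pixel size in the table")
    try:
        return PIXELS[resolution][ratio]
    except KeyError:
        raise ValueError(f"unsupported combination: {resolution} at {ratio}") from None

print(output_size("720p", "21:9"))  # (1470, 630)
```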
API Reference
View the interactive API playground for Seedance 2.0.