Seedance 2.0 is ByteDance’s latest video generation model, available through the Anyfast API. It supports text-to-video, image-to-video, multimodal reference input, video editing, video extension, and synchronized audio generation.

Key capabilities

| Feature | Description |
| --- | --- |
| Text-to-video | Generate video from text prompts |
| Image-to-video (first frame) | Use an image as the first frame |
| Image-to-video (first + last frame) | Use two images as first and last frames |
| Multimodal reference | Combine images, videos, and audio as references (1–9 images, up to 3 videos, up to 3 audio clips) |
| Video editing | Modify elements in an existing video using reference images |
| Video extension | Extend and concatenate reference videos |
| Audio generation | Auto-generate synchronized voice, sound effects, and background music |
| Web search | Enhance generation with real-time internet content (text-to-video only) |
| Return last frame | Retrieve the last frame of generated video |
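
The multimodal reference limits above can be checked on the client before a request is sent. The following is a minimal sketch, assuming content items shaped like the request examples on this page. The name `check_reference_counts` is hypothetical and not part of the Anyfast API, and only the upper bounds are checked.

```python
from collections import Counter

# Upper bounds from the "Multimodal reference" row:
# 1-9 images, up to 3 videos, up to 3 audio clips.
MAX_PER_TYPE = {"image_url": 9, "video_url": 3, "audio_url": 3}


def check_reference_counts(content):
    """Return a list of problems with reference item counts (empty if none)."""
    counts = Counter(
        item["type"]
        for item in content
        if item.get("role", "").startswith("reference_")
    )
    problems = []
    for media_type, limit in MAX_PER_TYPE.items():
        if counts[media_type] > limit:
            problems.append(
                f"{media_type}: {counts[media_type]} items exceeds limit of {limit}"
            )
    return problems
```

An empty list means the request stays within the documented limits; the server remains the authority on what is accepted.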

Output specifications

| Property | Value |
| --- | --- |
| Resolution | 480p, 720p, 1080p |
| Aspect ratio | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive |
| Duration | 4–15 seconds |
| Format | mp4 |
| Frame rate | 24 fps |

Workflow

1. POST /v1/video/generations  →  task_id
2. Poll GET /v1/video/generations/{task_id}  →  status
3. When status = "succeeded"  →  download video URL (valid 24 hours)
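
The polling step of this workflow can be sketched as a small loop. This is an illustration under assumptions, not the official client: the `status` field name and the `failed` and `cancelled` terminal states are guesses (only `succeeded` is stated on this page), and `fetch_status` is a hypothetical callable you supply that performs the GET request and returns the parsed JSON body.

```python
import time


def poll_until_done(fetch_status, task_id, interval=5.0, max_attempts=120,
                    sleep=time.sleep):
    """Poll a generation task until it succeeds, fails, or times out.

    fetch_status: callable taking task_id and returning the parsed JSON body
    of GET /v1/video/generations/{task_id} as a dict.
    """
    for _ in range(max_attempts):
        body = fetch_status(task_id)
        status = body.get("status")  # assumed field name
        if status == "succeeded":
            return body
        if status in ("failed", "cancelled"):  # assumed terminal states
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        sleep(interval)
    raise TimeoutError(f"task {task_id} still pending after {max_attempts} polls")
```

Passing the fetch function and `sleep` as arguments keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.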

Asset management workflow

To use persistent image, video, or audio assets (such as a fixed character reference), upload them via the Asset Management API and reference them with asset://<ID> in generation requests.

Step 1: Create an asset group

Create an asset group to get a group ID.
curl https://www.anyfast.ai/volc/asset/CreateAssetGroup \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "volc-asset",
    "Name": "your-custom-name"
}'
Response example:
{
    "Id": "group-20260427160000-xxxxx"
}

Step 2: Create an asset in the group

Use the group ID from Step 1 to upload an image asset (e.g., a character face reference photo).
curl https://www.anyfast.ai/volc/asset/CreateAsset \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "volc-asset",
    "GroupId": "group-20260427160000-xxxxx",
    "Name": "character-reference",
    "AssetType": "Image",
    "URL": "https://example.com/example.png"
}'
Response example:
{
    "Id": "asset-20260427160000-xxxxx"
}
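
Steps 1 and 2 can also be prepared from Python. The sketch below only builds the two requests with the standard library and does not send them. The helper name `build_asset_request` is hypothetical, and reading the `Id` field from the responses assumes the response shape shown in the examples above.

```python
import json
import urllib.request

BASE_URL = "https://www.anyfast.ai"


def build_asset_request(action, api_key, body):
    """Build (but do not send) a POST request for an Asset Management action."""
    payload = {"model": "volc-asset", **body}
    return urllib.request.Request(
        f"{BASE_URL}/volc/asset/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


group_req = build_asset_request(
    "CreateAssetGroup", "YOUR_API_KEY", {"Name": "your-custom-name"}
)
asset_req = build_asset_request(
    "CreateAsset",
    "YOUR_API_KEY",
    {
        "GroupId": "group-20260427160000-xxxxx",  # "Id" returned by CreateAssetGroup
        "Name": "character-reference",
        "AssetType": "Image",
        "URL": "https://example.com/example.png",
    },
)
# To send a request and read the returned ID:
#   with urllib.request.urlopen(group_req) as resp:
#       group_id = json.load(resp)["Id"]
```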

Step 3: Generate a video using the asset

Reference the asset ID from Step 2 to generate a video.
curl https://www.anyfast.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance",
    "content": [
      {
        "type": "text",
        "text": "The person from @image1 walks along a sunlit street, cinematic quality"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "asset://asset-20260427160000-xxxxx"
        },
        "role": "reference_image"
      }
    ],
    "resolution": "720p",
    "duration": 5
  }'
The API returns an asynchronous task ID (prefixed with asyn).
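
The same request body can be assembled programmatically once the asset ID is known. A minimal sketch, mirroring the cURL body above; `asset_reference_payload` is a hypothetical helper name.

```python
import json


def asset_reference_payload(asset_id, prompt, resolution="720p", duration=5):
    """Build a generation request body that uses a stored asset as a reference image."""
    return {
        "model": "seedance",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Stored assets are addressed with the asset:// scheme.
                "image_url": {"url": f"asset://{asset_id}"},
                "role": "reference_image",
            },
        ],
        "resolution": resolution,
        "duration": duration,
    }


payload = asset_reference_payload(
    "asset-20260427160000-xxxxx",
    "The person from @image1 walks along a sunlit street, cinematic quality",
)
print(json.dumps(payload, indent=2))
```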

Step 4: Poll for the result

Query the generation status using the task ID.
curl https://www.anyfast.ai/v1/video/generations/asynxxxx \
  -H "Authorization: Bearer YOUR_API_KEY"
Once the task completes, the response includes a pre-signed S3 download link.
Notes
  • Download links expire after 12 hours; re-fetch after expiry.
  • If the task reaches 100% but returns an error, the content was likely blocked by the provider’s content moderation system (e.g., celebrity likeness or copyrighted content). Try modifying the prompt or replacing the reference image.

Examples

Text-to-video

curl https://www.anyfast.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance",
    "content": [
      {
        "type": "text",
        "text": "A cat playing piano in a sunlit room, cinematic lighting"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 8
  }'

Multimodal reference (image + video + audio)

Reference assets in the prompt with @image1, @video1, @audio1 — numbered per media type by their order in the content array.
Important: In the content array, after the text prompt, asset items must follow the order: image → video → audio.
curl https://www.anyfast.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance",
    "content": [
      {
        "type": "text",
        "text": "The character from @image1 dances in the scene from @video1, with @audio1 as background music"
      },
      {
        "type": "image_url",
        "image_url": {"url": "https://example.com/character.jpg"},
        "role": "reference_image"
      },
      {
        "type": "video_url",
        "video_url": {"url": "https://example.com/clip.mp4"},
        "role": "reference_video"
      },
      {
        "type": "audio_url",
        "audio_url": {"url": "https://example.com/bgm.mp3"},
        "role": "reference_audio"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 11
  }'
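
The required image → video → audio order can be enforced with a stable sort before the request is sent. A minimal sketch, assuming items shaped as in the example above; `order_content` is a hypothetical helper.

```python
# Required order after the text prompt: image -> video -> audio.
TYPE_ORDER = {"text": 0, "image_url": 1, "video_url": 2, "audio_url": 3}


def order_content(content):
    """Return a copy of content sorted as text, image, video, audio.

    sorted() is stable, so items of the same type keep their relative order
    and the @image<N> / @video<N> / @audio<N> numbering does not change.
    """
    return sorted(content, key=lambda item: TYPE_ORDER[item["type"]])
```

An unknown `type` raises `KeyError`, which surfaces typos early instead of silently misplacing the item.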

Video editing

curl https://www.anyfast.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance",
    "content": [
      {
        "type": "text",
        "text": "Replace the water bottle in @video1 with the perfume bottle from @image1, keep camera movement unchanged"
      },
      {
        "type": "image_url",
        "image_url": {"url": "https://example.com/perfume.jpg"},
        "role": "reference_image"
      },
      {
        "type": "video_url",
        "video_url": {"url": "https://example.com/original.mp4"},
        "role": "reference_video"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 5
  }'

Web search enhanced (text-to-video only)

curl https://www.anyfast.ai/v1/video/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance",
    "content": [
      {
        "type": "text",
        "text": "Macro shot of cherry blossoms in spring, petals falling slowly"
      }
    ],
    "generate_audio": true,
    "ratio": "16:9",
    "duration": 11,
    "tools": [{"type": "web_search"}]
  }'
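
Because web search applies to text-to-video only, a client can refuse to add the tool when the request carries media items. A minimal sketch; `with_web_search` is a hypothetical helper, and how the API itself reacts to an invalid combination is not documented here.

```python
def with_web_search(payload):
    """Return a copy of payload with web search enabled (text-to-video only)."""
    non_text = [item["type"] for item in payload["content"] if item["type"] != "text"]
    if non_text:
        raise ValueError(f"web search is text-to-video only; found items: {non_text}")
    # Build a new dict so the caller's payload is left untouched.
    return {**payload, "tools": [{"type": "web_search"}]}
```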

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | seedance |
| content | array | Yes | Input content array (text, image_url, video_url, audio_url items) |
| content[].type | string | Yes | text, image_url, video_url, or audio_url |
| content[].text | string | Text items | Text prompt (max 500 Chinese chars / 1000 English words) |
| content[].image_url.url | string | Image items | Image URL, Base64 data URI, or asset://<ID> |
| content[].video_url.url | string | Video items | Video URL or asset://<ID> (mp4/mov, max 50 MB, 2–15s) |
| content[].audio_url.url | string | Audio items | Audio URL, Base64 data URI, or asset://<ID> (wav/mp3, max 15 MB) |
| content[].role | string | Conditional | first_frame, last_frame, reference_image, reference_video, reference_audio |
| generate_audio | boolean | No | Generate synchronized audio. Default: true |
| resolution | string | No | 480p, 720p, or 1080p. Default: 720p |
| ratio | string | No | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive. Default: adaptive |
| duration | integer | No | 4–15 seconds. Default: 5 |
| tools | array | No | [{"type": "web_search"}] for web search (text-to-video only) |
| watermark | boolean | No | Add watermark. Default: false |
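
The optional parameters and their defaults can be validated locally before a request is built. A minimal sketch using only the values listed in the table; `validate_options` is a hypothetical helper, and the server may apply further checks.

```python
RESOLUTIONS = {"480p", "720p", "1080p"}
RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}
DEFAULTS = {
    "generate_audio": True,
    "resolution": "720p",
    "ratio": "adaptive",
    "duration": 5,
    "watermark": False,
}


def validate_options(options):
    """Fill in documented defaults and reject values outside the documented sets."""
    merged = {**DEFAULTS, **options}
    if merged["resolution"] not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {merged['resolution']!r}")
    if merged["ratio"] not in RATIOS:
        raise ValueError(f"unsupported ratio: {merged['ratio']!r}")
    duration = merged["duration"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_int = isinstance(duration, int) and not isinstance(duration, bool)
    if not is_int or not 4 <= duration <= 15:
        raise ValueError(f"duration must be an integer from 4 to 15, got {duration!r}")
    return merged
```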

Input modes

| Mode | Content items | role values |
| --- | --- | --- |
| Text-to-video | 1× text | |
| Image-to-video (first frame) | text (optional) + 1× image_url | first_frame or omit |
| Image-to-video (first + last frame) | text (optional) + 2× image_url | first_frame + last_frame |
| Multimodal reference | text (optional) + image/video/audio | reference_image, reference_video, reference_audio |
| Video editing | text + image_url + video_url | reference_image + reference_video |
| Video extension | text + video_url(s) | reference_video |
Note: First frame, first+last frame, and multimodal reference are mutually exclusive — do not mix them in the same request.
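
The mutual-exclusion rule can be checked by inspecting the `role` values in a request. A minimal sketch, assuming the role names from the table above are the complete set; `check_roles` is a hypothetical helper.

```python
FRAME_ROLES = {"first_frame", "last_frame"}
REFERENCE_ROLES = {"reference_image", "reference_video", "reference_audio"}


def check_roles(content):
    """Return the set of roles used, raising ValueError on an invalid mix."""
    roles = {item["role"] for item in content if "role" in item}
    unknown = roles - FRAME_ROLES - REFERENCE_ROLES
    if unknown:
        raise ValueError(f"unknown role values: {sorted(unknown)}")
    # Frame-based modes and reference-based modes cannot share a request.
    if roles & FRAME_ROLES and roles & REFERENCE_ROLES:
        raise ValueError("first/last frame roles cannot be mixed with reference roles")
    return roles
```

Items without a `role` key are ignored, since the first-frame mode allows the role to be omitted.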

Referencing assets in prompts

Inside the text prompt, you can refer to media items using @<type><N> placeholders — numbered per media type by their order in the content array:
| Placeholder | Refers to |
| --- | --- |
| @image1, @image2, … | The 1st, 2nd, … image_url item |
| @video1, @video2, … | The 1st, 2nd, … video_url item |
| @audio1, @audio2, … | The 1st, 2nd, … audio_url item |
Example: "The character from @image1 walks through the scene in @video1 with @audio1 as background".
Note: Assets in the content array must follow the order: image → video → audio. @image<N>, @video<N>, and @audio<N> references are numbered per media type by their position in the array.
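
A prompt can be checked for placeholders that point at items missing from the content array. A minimal sketch based on the numbering rule above; `unresolved_placeholders` is a hypothetical helper, and the regular expression assumes placeholders are written exactly as `@image<N>`, `@video<N>`, or `@audio<N>`.

```python
import re
from collections import Counter

PLACEHOLDER = re.compile(r"@(image|video|audio)(\d+)")


def unresolved_placeholders(content):
    """Return placeholders in the text prompt that have no matching content item."""
    counts = Counter(item["type"] for item in content)
    prompt = " ".join(item["text"] for item in content if item["type"] == "text")
    missing = []
    for kind, number in PLACEHOLDER.findall(prompt):
        n = int(number)
        # Placeholders are 1-based and numbered per media type.
        if n < 1 or n > counts[f"{kind}_url"]:
            missing.append(f"@{kind}{number}")
    return missing
```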

Resolution pixel values

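| Resolution | 16:9 | 4:3 | 1:1 | 3:4 | 9:16 | 21:9 |
| --- | --- | --- | --- | --- | --- | --- |
| 480p | 864x496 | 752x560 | 640x640 | 560x752 | 496x864 | 992x432 |
| 720p | 1280x720 | 1112x834 | 960x960 | 834x1112 | 720x1280 | 1470x630 |
| 1080p | 1920x1080 | 1664x1248 | 1440x1440 | 1248x1664 | 1080x1920 | 2208x944 |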
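
The table can be kept as a lookup when a client needs the output dimensions ahead of time, for example to size a player. A minimal sketch that copies the values above; `output_size` is a hypothetical helper, and the `adaptive` ratio is rejected because the table gives no fixed size for it.

```python
# (width, height) in pixels, copied from the resolution table.
PIXELS = {
    "480p": {"16:9": (864, 496), "4:3": (752, 560), "1:1": (640, 640),
             "3:4": (560, 752), "9:16": (496, 864), "21:9": (992, 432)},
    "720p": {"16:9": (1280, 720), "4:3": (1112, 834), "1:1": (960, 960),
             "3:4": (834, 1112), "9:16": (720, 1280), "21:9": (1470, 630)},
    "1080p": {"16:9": (1920, 1080), "4:3": (1664, 1248), "1:1": (1440, 1440),
              "3:4": (1248, 1664), "9:16": (1080, 1920), "21:9": (2208, 944)},
}


def output_size(resolution, ratio):
    """Return (width, height) for a fixed resolution and aspect ratio."""
    try:
        return PIXELS[resolution][ratio]
    except KeyError:
        raise ValueError(f"no fixed size for {resolution!r} at {ratio!r}") from None
```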

API Reference

View the interactive API playground for Seedance 2.0.