About ElevenLabs Voice Config

I am using ElevenLabs audio with the /talks endpoint of D-ID's API.

Even though the voice config is set as shown in the request below, the ElevenLabs history shows Stability = 30% and Similarity = 70%, which does not match the values I sent.
Stability also seems poor in the actual video generated by the D-ID API.

How can I ensure that the ElevenLabs voice config is applied correctly?

Request body of /talks

POST https://api.d-id.com/talks

{
  "script": {
    "type": "text",
    "subtitles": "false",
    "provider": {
      "type": "elevenlabs",
      "voice_id": "{{voice_id}}",
      "model_id": "eleven_flash_v2_5",
      "voice_config": {
        "stability": 1,
        "similarity_boost": 1
      }
    },
    "input": "Hello, I am John. Pleased to meet you."
  },
  "config": {
    "logo": "false",
    "fluent": "false",
    "kool": "0.0",
    "stitch": "true"
  },
  "source_url": "https://xxx/sample.png"
}

Response body of GET talk

GET https://api.d-id.com/talks/{{talk_id}}

{
    "user": {
      {....}
    },
    "script": {
        "length": 38,
        "subtitles": false,
        "type": "text",
        "provider": {
            "type": "elevenlabs",
            "voice_id": "xxxxxx",
            "model_id": "eleven_flash_v2_5"
        }
    },
    ....
}
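
To make the mismatch easier to see, here is a minimal Python sketch that compares the provider block I sent with the one echoed back by GET talk. The helper name `missing_provider_keys` is my own and is not part of the D-ID or ElevenLabs API; the two dictionaries are copied by hand from the bodies above. It only shows which keys are absent from the echoed block. It does not prove whether D-ID forwarded the settings to ElevenLabs.

```python
def missing_provider_keys(requested: dict, echoed: dict) -> list[str]:
    """Return keys present in the requested provider block but absent from the echoed one."""
    return sorted(key for key in requested if key not in echoed)


# Provider block from the POST /talks request body above.
requested_provider = {
    "type": "elevenlabs",
    "voice_id": "{{voice_id}}",
    "model_id": "eleven_flash_v2_5",
    "voice_config": {"stability": 1, "similarity_boost": 1},
}

# Provider block from the GET talk response body above.
echoed_provider = {
    "type": "elevenlabs",
    "voice_id": "xxxxxx",
    "model_id": "eleven_flash_v2_5",
}

print(missing_provider_keys(requested_provider, echoed_provider))
# prints ['voice_config']
```

So `voice_config` is the only key that I sent but that does not appear in the response.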