Clips Streams Overview 📙

✴️ Streams Overview


The D-ID Clips Live Streaming API lets you use D-ID’s AI tools to generate videos of high-quality digital humans in real time. This opens up use cases such as virtual assistants, interactive broadcasting, online education and training, and more. This document gives an overview of the Live Streaming API's features and walks through the steps required to set up a real-time video streaming session. For a working example, see the Chat.D-ID app, which uses the real-time streaming API.

✴️ See a Working Example

Chat D-ID App
chat.d-id.com

Chat.D-ID is an interactive chatbot application that showcases the D-ID Live Streaming API. It features a real-time digital human that holds dynamic conversations with users: the avatar responds to your queries with personalized video messages streamed in real time. Try it at chat.d-id.com.

Tell us which type of SDK you would choose ⭐️
We plan to release an SDK designed to improve the developer experience. It will focus primarily on simplifying use of our streaming API, while also giving easy access to a wider range of features. Vote here.

✴️ When Not to Use Streaming

The Live Streaming API provides a dedicated /clips/streams endpoint designed specifically for real-time video streaming. If you need asynchronous video generation instead, where you submit input and receive the generated video as a downloadable file once it is ready, use the /clips endpoint.

✴️ Streaming Protocol

D-ID’s Live Streaming protocol is based on WebRTC (Web Real-Time Communication), a technology that enables real-time communication, including audio, video, and data streaming, directly between web browsers or other compatible applications. It establishes a peer-to-peer connection between the participants, allowing for efficient, low-latency streaming. To learn more about WebRTC and its underlying concepts, you can visit the WebRTC website. This document focuses on the key aspects of setting up a video streaming session.


✴️ Terminology

Each WebRTC concept maps to a step in this guide:

  • WebRTC: create a new stream
  • SDP Offer: start a stream
  • ICE Candidates: submit network information


WebRTC establishes a connection between two or more parties, allowing them to exchange audio, video, and data. This connection is peer-to-peer and is established using D-ID’s signaling server. Session Description Protocol (SDP) is used to negotiate and exchange session details between peers. The initiating peer sends an SDP offer containing its capabilities, and the receiving peer responds with an SDP answer that includes its own capabilities. Interactive Connectivity Establishment (ICE) is a technique used to determine the most suitable network path between peers. ICE candidates represent possible IP addresses and transport protocols that can be used for the connection.
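To make the idea of an ICE candidate more concrete, here is a minimal sketch that splits a candidate line into its main fields. The field order follows the candidate attribute grammar from RFC 8839. The function name parseIceCandidate and the sample candidate (which uses a documentation-range IP address) are made up for illustration and are not part of the D-ID API.

```javascript
// Sketch: split an ICE candidate line into its main fields.
// Layout (RFC 8839): candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseIceCandidate(candidateLine) {
  // Strip the optional "a=" SDP prefix and the "candidate:" label.
  const body = candidateLine.trim().replace(/^a=/, '').replace(/^candidate:/, '');
  const parts = body.split(/\s+/);
  if (parts.length < 8 || parts[6] !== 'typ') {
    throw new Error('Not a well-formed ICE candidate: ' + candidateLine);
  }
  // parts[6] is the literal "typ" keyword, so it is skipped here.
  const [foundation, component, transport, priority, address, port, , type] = parts;
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(), // usually "udp" or "tcp"
    priority: Number(priority),
    address,
    port: Number(port),
    type, // host, srflx, prflx, or relay
  };
}

// Hypothetical candidate, for illustration only.
const exampleCandidate = parseIceCandidate(
  'candidate:842163049 1 udp 1677729535 192.0.2.10 54321 typ srflx raddr 10.0.0.5 rport 54321'
);
console.log(
  exampleCandidate.transport,
  exampleCandidate.address,
  exampleCandidate.port,
  exampleCandidate.type
); // prints "udp 192.0.2.10 54321 srflx"
```

You do not need to parse candidates to use the API. The sketch only shows that each candidate carries a transport protocol, an address, and a port that the peers can try.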



✴️ Ready to Start

5 steps are all it takes


✴️ Step 1: Create a new stream

To initiate a video streaming session, make a POST request to the /clips/streams endpoint. In the request’s body, you must provide a presenter_id and a driver_id pointing to the presenter you wish to animate in the stream. To learn more about presenters, check out the Clips Overview page.

This request will provide you with a unique id (referred to as stream_id in other requests) and a session ID. The stream ID serves as a unique identifier for the streaming session, while the session ID needs to be included in subsequent requests' bodies to ensure they reach the correct server instance.

Here's an example of the request you should send:

const sessionResponse = await fetch(`https://api.d-id.com/clips/streams`, {
    method: 'POST',
    headers: {
      Authorization: `Basic {YOUR_DID_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      presenter_id: `{PRESENTER_ID}`,
      driver_id: `{DRIVER_ID}`
    }),
  });

And an example of the response you would get:

{
  "id": "your_stream_id",
  "session_id": "your_session_id",
  "offer": "your_sdp_offer",
  "ice_servers": [
    {
      "urls": ["stun:stun.example.com"]
    }
  ]
}

Make sure to extract and store both the stream ID (your_stream_id) and session ID (your_session_id) for further usage in subsequent steps.
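One way to keep these values together is to read them out of the parsed response body in a single place. The sketch below assumes the response shape shown above. The helper name readStreamSession and the returned property names are hypothetical and not part of the API.

```javascript
// Sketch: pull the fields needed by later steps out of the Step 1 response body.
function readStreamSession(responseBody) {
  const { id, session_id: sessionId, offer, ice_servers: iceServers } = responseBody;
  if (!id || !sessionId || !offer) {
    throw new Error('Stream creation response is missing id, session_id, or offer');
  }
  return {
    streamId: id, // used in the URL path of later requests
    sessionId, // sent in the body of later requests
    offer, // passed to setRemoteDescription() in Step 2
    // Can be passed as-is to new RTCPeerConnection(rtcConfig) in Step 2.
    rtcConfig: { iceServers: iceServers || [] },
  };
}

// Using the sample response from above.
const streamSession = readStreamSession({
  id: 'your_stream_id',
  session_id: 'your_session_id',
  offer: 'your_sdp_offer',
  ice_servers: [{ urls: ['stun:stun.example.com'] }],
});
console.log(
  streamSession.streamId,
  streamSession.sessionId,
  streamSession.rtcConfig.iceServers.length
); // prints "your_stream_id your_session_id 1"
```

In a real application you would call it as readStreamSession(await sessionResponse.json()).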


✴️ Step 2: Starting the stream

After receiving the SDP offer from the server in Step 1, you need to generate the SDP answer and send it back. To obtain the SDP answer, you can use WebRTC APIs or libraries that provide the necessary functionality. Here is a general outline of the steps involved:

  1. Create a WebRTC peer connection object in your application.
  2. Set the received SDP offer as the remote description of the peer connection using the setRemoteDescription() method.
  3. Generate the SDP answer by calling the createAnswer() method on the peer connection.
  4. Set the generated SDP answer as the local description of the peer connection using the setLocalDescription() method.

Once you have obtained the SDP answer, send it back to the server using the /clips/streams/{stream_id}/sdp endpoint, including the session ID in the request body.

Here’s an example of the answer creation code, taken from this example repository:

async function createPeerConnection(offer, iceServers) {
  if (!peerConnection) {
    peerConnection = new RTCPeerConnection({ iceServers });
    // Here we add event listeners for any events we want to handle
    peerConnection.addEventListener('icegatheringstatechange', onIceGatheringStateChange, true);
    peerConnection.addEventListener('icecandidate', onIceCandidate, true);
    peerConnection.addEventListener('iceconnectionstatechange', onIceConnectionStateChange, true);
    peerConnection.addEventListener('connectionstatechange', onConnectionStateChange, true);
    peerConnection.addEventListener('signalingstatechange', onSignalingStateChange, true);
    peerConnection.addEventListener('track', onTrack, true);
  }

  await peerConnection.setRemoteDescription(offer);
  const sessionClientAnswer = await peerConnection.createAnswer();
  await peerConnection.setLocalDescription(sessionClientAnswer);

  return sessionClientAnswer;
}
// ...
const sdpResponse = await fetch(`https://api.d-id.com/clips/streams/${streamId}/sdp`, {
  method: 'POST',
  headers: {
    Authorization: `Basic {YOUR_DID_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    answer: sessionClientAnswer,
    session_id: sessionId,
  }),
});


✴️ Step 3: Submit network information

Once the SDP answer is sent, you must gather ICE candidates and send them to the server to complete the WebRTC handshake. ICE candidates allow the peers to discover and establish an optimal network path for communication.

Listen for the icecandidate event on your peer connection object and send each ICE candidate to the server using the /clips/streams/{stream_id}/ice endpoint. Replace {stream_id} with the stream ID obtained in Step 1. From each ICE candidate you receive, send only the candidate, sdpMid, and sdpMLineIndex attributes.

Here’s an example of the icecandidate event handler from our streaming demo repository:

function onIceCandidate(event) {
  if (event.candidate) {
    const { candidate, sdpMid, sdpMLineIndex } = event.candidate;

    fetch(`https://api.d-id.com/clips/streams/${streamId}/ice`, {
      method: 'POST',
      headers: {
        Authorization: `Basic {YOUR_DID_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        candidate,
        sdpMid,
        sdpMLineIndex,
        session_id: sessionId,
      }),
    });
  }
}

Waiting for Connection Readiness:

After sending the SDP answer and the ICE candidates, you need to wait for the WebRTC connection to become ready. Listen for the iceconnectionstatechange event on your peer connection object and check its iceConnectionState property. When the state changes to connected or completed, the connection is ready to proceed. The onIceConnectionStateChange listener registered in Step 2 handles this event.
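The wait described above can be wrapped in a promise so that later steps can simply await it. This is a minimal sketch rather than code from the demo repository: the function name waitForIceConnection and the 10-second default timeout are assumptions, and it relies only on the standard iceConnectionState property and iceconnectionstatechange event.

```javascript
// Sketch: resolve once the peer connection reports "connected" or "completed";
// reject on "failed" or "closed", or when timeoutMs (an assumed default) elapses.
function waitForIceConnection(peerConnection, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    let timer = null;

    function cleanup() {
      clearTimeout(timer);
      peerConnection.removeEventListener('iceconnectionstatechange', onChange);
    }

    function onChange() {
      const state = peerConnection.iceConnectionState;
      if (state === 'connected' || state === 'completed') {
        cleanup();
        resolve(state);
      } else if (state === 'failed' || state === 'closed') {
        cleanup();
        reject(new Error('ICE connection ' + state));
      }
      // Other states ("new", "checking", "disconnected") keep waiting.
    }

    timer = setTimeout(() => {
      cleanup();
      reject(new Error('Timed out waiting for the ICE connection'));
    }, timeoutMs);

    peerConnection.addEventListener('iceconnectionstatechange', onChange);
    onChange(); // The connection may already be ready when this is called.
  });
}

// Usage inside an async function, after the SDP answer has been sent:
//   await waitForIceConnection(peerConnection);
//   // ...continue with Step 4
```

The timeout guards against a handshake that never completes, for example when no candidate pair can be reached.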


✴️ Step 4: Create a clip stream

With the connection established, you can now create a clip. Make a POST request to the /clips/streams/{stream_id} endpoint to request a video to be created and streamed over the established connection. Remember to include the session ID in the request body. In this request you can send the audio or text for the avatar to speak, along with additional configuration options for further customization.
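A request for this step might look like the sketch below. It reuses the audio script shape from the Talks starter code further down and assumes the Clips endpoint accepts the same shape. The function name createClipStream and its parameter names are hypothetical, and fetchImpl is injectable only so the function can be exercised without network access. Check the API reference for the full list of body fields.

```javascript
// Sketch: ask the server to render a clip and stream it over the open connection.
async function createClipStream(
  { apiUrl, apiKey, streamId, sessionId, script },
  fetchImpl = globalThis.fetch
) {
  const response = await fetchImpl(`${apiUrl}/clips/streams/${streamId}`, {
    method: 'POST',
    headers: {
      Authorization: `Basic ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      script, // what the avatar should say (assumed to match the Talks script shape)
      session_id: sessionId, // routes the request to the correct server instance
    }),
  });
  if (!response.ok) {
    throw new Error(`Clip stream request failed with status ${response.status}`);
  }
  return response;
}

// Usage inside an async function, once the connection is ready:
//   await createClipStream({
//     apiUrl: 'https://api.d-id.com',
//     apiKey: '{YOUR_DID_API_KEY}',
//     streamId,
//     sessionId,
//     script: {
//       type: 'audio',
//       audio_url: 'https://d-id-public-bucket.s3.us-west-2.amazonaws.com/webrtc.mp3',
//     },
//   });
```

The video itself is not returned in the HTTP response. It arrives over the WebRTC connection and is delivered through the track event registered in Step 2.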


✴️ Step 5: Closing the stream

To close the video streaming session, make a DELETE request to the /clips/streams/{stream_id} endpoint. This closes the connection and ends the session. If no messages are sent within the session for 5 minutes, the session is terminated automatically.

Here is an example of the request:

fetch(`https://api.d-id.com/clips/streams/${streamId}`, {
    method: 'DELETE',
    headers: {
      Authorization: `Basic {YOUR_DID_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ session_id: sessionId }),
  });
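Because of the 5-minute idle limit mentioned above, a client may want to track when it last sent a request, so it can tell whether the session has probably expired and a new stream is needed. The sketch below is one possible approach and not part of the API: createIdleTracker and its methods are hypothetical names, and the server remains the authority on when a session actually ends.

```javascript
// Sketch: client-side estimate of whether the session has hit the idle limit.
const IDLE_LIMIT_MS = 5 * 60 * 1000; // the 5-minute limit stated in this step

// "now" is injectable so the tracker can be checked without waiting in real time.
function createIdleTracker(now = Date.now) {
  let lastActivity = now();
  return {
    // Call after every request sent within the session.
    touch() {
      lastActivity = now();
    },
    // Milliseconds left before the session is expected to be dropped.
    remainingMs() {
      return Math.max(0, IDLE_LIMIT_MS - (now() - lastActivity));
    },
    // True once the idle limit has been reached on the client's clock.
    isExpired() {
      return now() - lastActivity >= IDLE_LIMIT_MS;
    },
  };
}

// Usage:
//   const idle = createIdleTracker();
//   ...after each request in the session: idle.touch();
//   if (idle.isExpired()) { /* create a new stream as in Step 1 */ }
```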

✴️ Developer Starter Code

For a code example demonstrating the entire process, visit the D-ID Live Streaming Demo repository. It provides starter code to help you implement the D-ID Streaming API in your own applications. By following these steps, you can use the D-ID Live Streaming API to create engaging, interactive video streaming experiences with real-time speaking digital avatars.

Initial setup:

  • Clone this repo from GitHub
  • Install express: open a terminal in the folder and run npm install express
  • Add your API key: edit the api.json file inside the project folder and add your key there
    • Make sure that the service parameter in api.json is set to clips

Start the example app:

  • Open the project folder (on macOS, Ctrl-click the folder in Finder)
  • Open a terminal in that folder and run node app.js
  • You should see this message: Server started on port localhost:3000
  • Open the app in the browser at localhost:3000
  • Connect: press Connect and wait until the connection shows as ready
  • Stream: press the Start button to start streaming
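
api.json:

{
    "key": "🤫",
    "url": "https://api.d-id.com",
    "service": "clips"
}

app.js:

const express = require('express');
const http = require('http');

// Serve the demo's static files from the project folder
const app = express();
app.use('/', express.static(__dirname));
const server = http.createServer(app);
server.listen(3000, () => console.log('Server started on port localhost:3000'));

Client script (runs in the browser; this listing calls the /talks endpoints, see the disclaimer under Video Tutorial for the Clips changes):

'use strict';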

import DID_API from './api.json' assert { type: 'json' };
if (DID_API.key == '🤫') alert('Please put your api key inside ./api.json and restart..')

const RTCPeerConnection = (window.RTCPeerConnection || window.webkitRTCPeerConnection || window.mozRTCPeerConnection).bind(window);

let peerConnection;
let streamId;
let sessionId;
let sessionClientAnswer;


const talkVideo = document.getElementById('talk-video');
talkVideo.setAttribute('playsinline', '');
const peerStatusLabel = document.getElementById('peer-status-label');
const iceStatusLabel = document.getElementById('ice-status-label');
const iceGatheringStatusLabel = document.getElementById('ice-gathering-status-label');
const signalingStatusLabel = document.getElementById('signaling-status-label');

const connectButton = document.getElementById('connect-button');
connectButton.onclick = async () => {
  if (peerConnection && peerConnection.connectionState === 'connected') {
    return;
  }

  stopAllStreams();
  closePC();

  const sessionResponse = await fetch(`${DID_API.url}/talks/streams`, {
    method: 'POST',
    headers: {'Authorization': `Basic ${DID_API.key}`, 'Content-Type': 'application/json'},
    body: JSON.stringify({
      source_url: "https://d-id-public-bucket.s3.amazonaws.com/or-roman.jpg"
    }),
  });

  
  const { id: newStreamId, offer, ice_servers: iceServers, session_id: newSessionId } = await sessionResponse.json()
  streamId = newStreamId;
  sessionId = newSessionId;
  
  try {
    sessionClientAnswer = await createPeerConnection(offer, iceServers);
  } catch (e) {
    console.log('error during streaming setup', e);
    stopAllStreams();
    closePC();
    return;
  }

  const sdpResponse = await fetch(`${DID_API.url}/talks/streams/${streamId}/sdp`,
    {
      method: 'POST',
      headers: {Authorization: `Basic ${DID_API.key}`, 'Content-Type': 'application/json'},
      body: JSON.stringify({answer: sessionClientAnswer, session_id: sessionId})
    });
};

const talkButton = document.getElementById('talk-button');
talkButton.onclick = async () => {
  // connectionState not supported in firefox
  if (peerConnection?.signalingState === 'stable' || peerConnection?.iceConnectionState === 'connected') {
    const talkResponse = await fetch(`${DID_API.url}/talks/streams/${streamId}`,
      {
        method: 'POST',
        headers: { Authorization: `Basic ${DID_API.key}`, 'Content-Type': 'application/json' },
        body: JSON.stringify({
          'script': {
            'type': 'audio',
            'audio_url': 'https://d-id-public-bucket.s3.us-west-2.amazonaws.com/webrtc.mp3',
          },
          'driver_url': 'bank://lively/',
          'config': {
            'stitch': true,
          },
          'session_id': sessionId
        })
      });
  }
};

const destroyButton = document.getElementById('destroy-button');
destroyButton.onclick = async () => {
  await fetch(`${DID_API.url}/talks/streams/${streamId}`,
    {
      method: 'DELETE',
      headers: {Authorization: `Basic ${DID_API.key}`, 'Content-Type': 'application/json'},
      body: JSON.stringify({session_id: sessionId})
    });

  stopAllStreams();
  closePC();
};

function onIceGatheringStateChange() {
  iceGatheringStatusLabel.innerText = peerConnection.iceGatheringState;
  iceGatheringStatusLabel.className = 'iceGatheringState-' + peerConnection.iceGatheringState;
}
function onIceCandidate(event) {
  console.log('onIceCandidate', event);
  if (event.candidate) {
    const { candidate, sdpMid, sdpMLineIndex } = event.candidate;
    
    fetch(`${DID_API.url}/talks/streams/${streamId}/ice`,
      {
        method: 'POST',
        headers: {Authorization: `Basic ${DID_API.key}`, 'Content-Type': 'application/json'},
        body: JSON.stringify({ candidate, sdpMid, sdpMLineIndex, session_id: sessionId})
      }); 
  }
}
function onIceConnectionStateChange() {
  iceStatusLabel.innerText = peerConnection.iceConnectionState;
  iceStatusLabel.className = 'iceConnectionState-' + peerConnection.iceConnectionState;
  if (peerConnection.iceConnectionState === 'failed' || peerConnection.iceConnectionState === 'closed') {
    stopAllStreams();
    closePC();
  }
}
function onConnectionStateChange() {
  // not supported in firefox
  peerStatusLabel.innerText = peerConnection.connectionState;
  peerStatusLabel.className = 'peerConnectionState-' + peerConnection.connectionState;
}
function onSignalingStateChange() {
  signalingStatusLabel.innerText = peerConnection.signalingState;
  signalingStatusLabel.className = 'signalingState-' + peerConnection.signalingState;
}
function onTrack(event) {
  const remoteStream = event.streams[0];
  setVideoElement(remoteStream);
}

async function createPeerConnection(offer, iceServers) {
  if (!peerConnection) {
    peerConnection = new RTCPeerConnection({iceServers});
    peerConnection.addEventListener('icegatheringstatechange', onIceGatheringStateChange, true);
    peerConnection.addEventListener('icecandidate', onIceCandidate, true);
    peerConnection.addEventListener('iceconnectionstatechange', onIceConnectionStateChange, true);
    peerConnection.addEventListener('connectionstatechange', onConnectionStateChange, true);
    peerConnection.addEventListener('signalingstatechange', onSignalingStateChange, true);
    peerConnection.addEventListener('track', onTrack, true);
  }

  await peerConnection.setRemoteDescription(offer);
  console.log('set remote sdp OK');

  const sessionClientAnswer = await peerConnection.createAnswer();
  console.log('create local sdp OK');

  await peerConnection.setLocalDescription(sessionClientAnswer);
  console.log('set local sdp OK');

  return sessionClientAnswer;
}

function setVideoElement(stream) {
  if (!stream) return;
  talkVideo.srcObject = stream;

  // safari hotfix
  if (talkVideo.paused) {
    talkVideo.play().then(_ => {}).catch(e => {});
  }
}

function stopAllStreams() {
  if (talkVideo.srcObject) {
    console.log('stopping video streams');
    talkVideo.srcObject.getTracks().forEach(track => track.stop());
    talkVideo.srcObject = null;
  }
}

function closePC(pc = peerConnection) {
  if (!pc) return;
  console.log('stopping peer connection');
  pc.close();
  pc.removeEventListener('icegatheringstatechange', onIceGatheringStateChange, true);
  pc.removeEventListener('icecandidate', onIceCandidate, true);
  pc.removeEventListener('iceconnectionstatechange', onIceConnectionStateChange, true);
  pc.removeEventListener('connectionstatechange', onConnectionStateChange, true);
  pc.removeEventListener('signalingstatechange', onSignalingStateChange, true);
  pc.removeEventListener('track', onTrack, true);
  iceGatheringStatusLabel.innerText = '';
  signalingStatusLabel.innerText = '';
  iceStatusLabel.innerText = '';
  peerStatusLabel.innerText = '';
  console.log('stopped peer connection');
  if (pc === peerConnection) {
    peerConnection = null;
  }
}

See on GitHub


✴️ Video Tutorial

Disclaimer: This video gives instructions on how to use Talks Streams. To use Clips streams, change the value of the service parameter in your api.json file to clips. That will make the demo send the requests to /clips endpoints and adjust request payloads accordingly, most notably replacing the source_url in the stream creation step with presenter_id and driver_id as seen here.

D-ID's API - Streams Endpoint
Live Coding Session


✴️ Support


Have any questions? We are here to help! Please leave your question in the Discussions section and we will be happy to answer shortly.
