Getting Started 🎉

✴️ D-ID's API for Developers


It's time to supercharge your product with the amazing Generative AI capabilities of D-ID's API. Using our state-of-the-art, battle-tested products, you can now create videos of Digital Humans at an unbelievable scale. Combine your ideas with our tech to create astonishing visuals that are high quality and customized to your needs. This API Integration Guide is specifically designed to help you incorporate the D-ID API into your applications with ease. This guide covers the essentials, such as authentication, API request construction, response handling, product descriptions, and use cases. A single image is all it takes to create a speaking digital human.

✴️ D-ID API Products

Talks (Speaking Portrait)
Create talking head videos from just text or audio, to make business content more cost-effective, engaging and human. Based on still photos.
See Talks endpoint

Clips (Premium Presenters)
Enabling FULL-HD photorealistic avatars using just text or audio as input. High quality, body & hands movements, based on video footage.
See Clips endpoint

Animations (Live Portrait)
Live Portrait breathes life into any still photo. The Live Portrait process uses a driver video to animate a person in a still photo to precisely match the driver’s facial movements.
See Animations endpoint

Agents ⭐️ New ⭐️
D-ID Agents redefine digital connections, making them more personal, engaging, and human. Select your Agent’s appearance, and provide its knowledge.
See Agents endpoint
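As a rough illustration of how a Talks request might be assembled from a single image and a text script, here is a minimal Python sketch. The endpoint URL, the `source_url` and `script` field names, and the form of the `Authorization` header are assumptions that should be checked against the Talks endpoint reference and the authentication guide. The helper names `build_talk_payload` and `create_talk` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm it against the Talks endpoint reference.
TALKS_URL = "https://api.d-id.com/talks"


def build_talk_payload(image_url: str, text: str) -> dict:
    """Build a request body for a talking-head video from one image and a text script."""
    if not image_url or not text:
        raise ValueError("image_url and text are both required")
    return {
        "source_url": image_url,                    # assumed field name
        "script": {"type": "text", "input": text},  # assumed script schema
    }


def create_talk(auth_header: str, image_url: str, text: str) -> dict:
    """POST the payload and return the decoded JSON response (makes a network call)."""
    body = json.dumps(build_talk_payload(image_url, text)).encode("utf-8")
    request = urllib.request.Request(
        TALKS_URL,
        data=body,
        method="POST",
        headers={
            # Assumed: pass the exact header value described in the authentication guide.
            "Authorization": auth_header,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the payload builder separate from the network call makes the request body easy to inspect or unit test before any credits are spent.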


✴️ New Updates

What's New?
Always stay up to date with new releases and updates

NEW: Agents API is here! ⭐️
By blending the smarts of advanced language models with the warmth of face-to-face communication, D-ID Agents redefine digital connections, making them more personal, engaging, and human. All you have to do is select your Agent’s appearance, choose its voice, describe how you want it to interact, and provide it with documents to augment and personalize its knowledge base. Within minutes, you’ll have a digital person you can speak with just as you would with a real human.

NEW: Discover dozens of new HQ Presenters now ready to use! ⭐️
HQ Presenters are Full-HD, photorealistic, medium-shot avatars with body and hand movements, generated from just text or audio input. You can also create a custom HQ Presenter in Full-HD resolution based on your own video footage.

NEW: HQ Presenters (Clips) are now streamable in real-time! ⭐️
D-ID's Clips Live Streaming API lets you use D-ID’s AI tools to generate videos of our high-quality digital humans in real time. This powerful functionality opens up various use cases, such as virtual assistants, interactive broadcasting, online education and training, and more.

NEW: Tailored API plans for developers! ⭐️
D-ID's tailored API plans are specifically designed to cater to your product's lifecycle. Whether you're in the build phase, scaling up, or ready to launch, we have a plan for you, ensuring that as your needs change, your costs remain predictable and manageable.


✴️ API Video Tutorial

D-ID's API - Step by Step
Live Coding Session



✴️ Live Streaming

D-ID’s API now supports real-time generation of talking head videos from an image and text or an audio file. Integrate it with your AI chatbot to create face-to-face customer experience (CX) conversations, use it to create real-time video call avatars, or add it to your character-based online game. The possibilities are endless!

Live Streaming
Examples


Giving a Face to Conversational AI
chat.D-ID is a web app that uses real-time face animation and advanced text-to-speech to create an immersive and human-like conversational AI experience. The free app lets you speak face-to-face with ChatGPT. Try it out live.
Adding a Human Touch to AI
The real-time capabilities of the tech can be integrated with both open and closed domain AI models, enabling businesses of all sizes to create a more personal connection with their clients, employees, and communities.
Real-time video streaming opens up a new world of possibilities
D-ID’s API is robust, massively scalable and super simple to use – integrate it in just four lines of code. It now also supports streaming generation of talking head videos from an image and audio file. Build a whole ecosystem around our platform. The possibilities are endless.

See Streams endpoint for more details


✴️ Superfast Performance

D-ID renders video at 100 FPS, which is 4X faster than real-time, making it the fastest text-to-video solution in the world. Generate your videos at scale: D-ID's API handles tens of thousands of requests in parallel, with unbeatable service and robust performance. Over 150 million videos have been generated to date.
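To make the two figures above concrete, the short sketch below works through the arithmetic. A rendering speed of 100 FPS that is 4X faster than real-time implies a playback rate of 25 frames per second. That rate is derived from the two quoted numbers and is not stated in the text. The function name is hypothetical, and the estimate covers rendering only, not queueing, network transfer, or speech synthesis.

```python
RENDER_FPS = 100.0       # rendering speed quoted above
REALTIME_SPEEDUP = 4.0   # "4X faster than real-time", quoted above

# Implied playback frame rate: 100 / 4 = 25 frames per second (derived, not stated).
PLAYBACK_FPS = RENDER_FPS / REALTIME_SPEEDUP


def estimated_render_seconds(video_seconds: float) -> float:
    """Estimate pure rendering time for a video of the given duration."""
    if video_seconds < 0:
        raise ValueError("video_seconds must be non-negative")
    frames = video_seconds * PLAYBACK_FPS  # total frames to render
    return frames / RENDER_FPS


# A 60-second video has 1500 frames, which take 15 seconds to render at 100 FPS.
print(estimated_render_seconds(60))  # 15.0
```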




✴️ Facial Expressions


NEW: Creating engaging visuals is all about capturing the attention of the viewer. With D-ID's API, you can take your visuals to the next level by controlling the expressions of your avatar. Adding expressions makes your avatar more engaging, fun, and lifelike, which can boost viewer engagement and increase the overall enjoyment of your visuals.

[Figure: a standard result with a neutral expression, compared with results using different facial expressions: Neutral, Happy, Surprise, Serious]
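As a hedged sketch of how an expression might be attached to a request, the helper below builds a configuration fragment for one of the four expressions named above. The `driver_expressions`, `start_frame`, and `intensity` field names, the lowercase expression values, and the 0 to 1 intensity range are all assumptions to verify against the expressions documentation. The function name `build_expression_config` is hypothetical.

```python
# The four expressions named in this section, lowercased (assumed API spelling).
ALLOWED_EXPRESSIONS = ("neutral", "happy", "surprise", "serious")


def build_expression_config(expression: str, intensity: float = 1.0, start_frame: int = 0) -> dict:
    """Build a config fragment that applies one expression from a given frame onward."""
    if expression not in ALLOWED_EXPRESSIONS:
        raise ValueError(f"expression must be one of {ALLOWED_EXPRESSIONS}")
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be between 0 and 1")  # assumed range
    if start_frame < 0:
        raise ValueError("start_frame must be non-negative")
    return {
        "driver_expressions": {  # assumed field name
            "expressions": [
                {
                    "start_frame": start_frame,  # assumed field name
                    "expression": expression,
                    "intensity": intensity,      # assumed field name
                }
            ]
        }
    }
```

Validating the expression name locally gives a clear error before the request is sent, rather than relying on the API's response.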


✴️ Learn more about D-ID

Meet the Natural User Interface (NUI) by D-ID. The interface that humanizes interactions with everything digital. Build interfaces users can talk to and that understand them. A face-to-face conversation with AI.

Natural User Interface (NUI)
www.d-id.com



✴️ Support


Have any questions? We are here to help! Please leave your question in the Discussions section and we will be happy to answer shortly.

Ask a question