D-ID
Reviews, test reports and deep-dive analysis
AI talking-head videos from a single photo, ideal for personalized content
Details
Pros
- Free tier available
- API available
- GDPR compliant
- EU server location
- Commercial license included
- Text to video generation
- Image to video conversion
- Lip sync support
Cons
- No camera control
- No video-to-video editing
Profile: D-ID
| Field | Value |
| --- | --- |
| Company | D-ID |
| Type | AI Video Generation |
| Founded | 2017 |
| Headquarters | Tel Aviv, Israel |
| Server Location | US, EU |
| GDPR Status | ✅ Compliant |
| Free Tier | Yes |
| Starting Price | $5.99/mo |
| Pricing Model | Credits |
| Website | d-id.com |
About D-ID
D-ID specializes in AI-generated talking avatars. It lets anyone create videos in which an AI-generated or photographed face speaks any text with realistic lip-sync. Founded in 2017 and headquartered in Tel Aviv, Israel, D-ID began as a privacy technology company focused on de-identification before pivoting to generative AI video.
The platform's core product, Creative Reality Studio, transforms a single photo into a speaking avatar. Upload a portrait, enter text or upload audio, and D-ID generates a video with natural head movements and lip-sync. The technology supports 119 languages with various voice options. For more advanced use cases, D-ID offers streaming avatars for real-time interactive applications like customer service and virtual assistants.
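The photo-plus-text workflow described above could be driven through the API roughly as follows. This is a minimal sketch, not D-ID's documented client: the endpoint URL, the `source_url` and `script` field names, and the Basic authorization scheme are assumptions that should be checked against the official API reference, and the helper names `build_talk_payload` and `build_talk_request` are hypothetical. The code only constructs the request and does not send it.

```python
import json
import urllib.request

# Assumed endpoint; confirm against D-ID's current API reference before use.
TALKS_URL = "https://api.d-id.com/talks"


def build_talk_payload(image_url: str, text: str) -> dict:
    """Build the JSON body: one portrait image plus the text it should speak.

    The field names used here are assumptions, not confirmed by this article.
    """
    return {
        "source_url": image_url,
        "script": {"type": "text", "input": text},
    }


def build_talk_request(api_key: str, image_url: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the payload."""
    body = json.dumps(build_talk_payload(image_url, text)).encode("utf-8")
    return urllib.request.Request(
        TALKS_URL,
        data=body,
        method="POST",
        headers={
            # Assumed auth scheme; the real header format may differ.
            "Authorization": f"Basic {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the prepared request to `urllib.request.urlopen` would submit it. Retrieving the finished video is a separate step that this sketch does not cover.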
A free trial includes 5 minutes of video. Paid plans start with Lite at $5.99/month for 10 minutes, followed by Pro at $46/month for 25 minutes and Advanced at $299/month for 65 minutes. Enterprise plans are available for custom requirements, and API plans are available for integrating D-ID into third-party applications.
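To compare the plans on a like-for-like basis, it can help to work out the price per included minute. The sketch below uses only the prices and minute allowances quoted above. The names `PLANS`, `cost_per_minute`, and `per_minute_table` are illustrative.

```python
# Plan figures as quoted above: (monthly price in USD, included minutes).
PLANS = {
    "Lite": (5.99, 10),
    "Pro": (46.00, 25),
    "Advanced": (299.00, 65),
}


def cost_per_minute(price_usd: float, minutes: int) -> float:
    """Return the monthly price divided by included minutes, rounded to cents."""
    return round(price_usd / minutes, 2)


def per_minute_table(plans: dict) -> dict:
    """Map each plan name to its cost per included minute."""
    return {name: cost_per_minute(price, minutes) for name, (price, minutes) in plans.items()}


if __name__ == "__main__":
    for name, rate in per_minute_table(PLANS).items():
        print(f"{name}: ${rate:.2f} per included minute")
```

On the quoted figures the per-minute cost rises with each tier, so it is worth checking the current pricing page before comparing plans on minutes alone.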
D-ID is primarily used for customer-facing interactions, personalized marketing, e-learning content, and accessibility applications. The ability to make historical photos or illustrations "speak" has also found creative uses in museums, education, and memorial services. D-ID is narrower in scope than general-purpose video generators and excels at the talking-head use case, which needs only a photo and a script or audio clip as input.