Watch My HeyGen Avatar Clone in Action: Studio Setup, Tech Stack, and First Results
Elizabeth Gearhart, Ph.D.
CMO, Gearhart Law · Founder, Gear Media Studios · AI Marketing Strategist
I've written about building my HeyGen avatar clone and the difference between avatar clones and thought clones. Now you can see the actual result. This short video shows my clone in action and explains the professional studio setup that produced it.
What You're Seeing
This is my HeyGen avatar clone — a digital version of me trained on 10 minutes of studio footage. The training session covered a full range of emotions and natural expressions: happy, serious, curious, laughing, and conversational moments. That emotional variety is what gives the AI enough material to produce a clone that feels authentic rather than robotic.
The clone isn't perfect yet — I'm still iterating — but the results from a professional recording chain are noticeably better than what I've seen from people who recorded directly through the HeyGen app.
The Studio Tech Stack
The recording chain made a significant difference. Here's exactly what I used and why each component matters for AI avatar quality:
| Equipment | Role | Why It Matters for HeyGen |
|---|---|---|
| Rodecaster Mixer | Audio capture and mixing | Broadcast-quality audio input for voice cloning accuracy |
| Blackmagic ATEM | Video switching & output | Combines video + audio into a single clean signal |
| 4K Camera | Video capture | High-resolution facial detail for accurate avatar rendering |
| SSD Drive | Recording destination | Lossless recording — no software compression artifacts |
| Professional Lighting | Even front illumination | Eliminates shadows that confuse facial recognition AI |
Key Insight: The SSD Recording Chain
Routing audio through the Rodecaster into the Blackmagic ATEM, then recording the combined output directly to an SSD, eliminated the compression artifacts that come from software-based recording or third-party export tools like CapCut. The result is lossless source material — which is exactly what AI avatar training needs.
Full Video Transcript
This is my clone. I recorded 10 minutes of video showing varying emotions and looking at the camera.
I recorded it in my studio with a lot of light, a good camera, and a high-quality mic.
I recorded the sound into the Rodecaster mixer, then took the output and fed it into a Blackmagic ATEM.
The video output with sound from the ATEM was recorded onto an SSD drive. That seemed to give the highest quality.
Other people I've spoken to who recorded using the HeyGen app didn't get results as good.
I don't think my clone is perfect yet, so I'll keep experimenting.
What's Next
I'm continuing to refine the clone and plan to use it for business content at scale — daily tips, podcast promotion clips, FAQ explainer videos, and LinkedIn thought leadership posts. Each iteration teaches me something new about what the AI responds to. I'll keep documenting the process here.
Frequently Asked Questions
What equipment did you use to record your HeyGen avatar training video?
The training video was recorded in a professional studio using a high-quality microphone fed into a Rodecaster mixer, with the audio output routed into a Blackmagic ATEM video switcher. The final video and audio output from the ATEM was recorded directly onto an SSD drive, which produced the highest quality source material for HeyGen.
Why is recording to an SSD better for HeyGen avatar training?
Recording directly to an SSD via a Blackmagic ATEM preserves the original video and audio quality without compression artifacts introduced by software recording or export tools. This gives HeyGen cleaner, higher-fidelity source material, which leads to a more accurate avatar and better voice cloning results.
Does using the HeyGen mobile app produce worse results than a professional setup?
Based on conversations with other creators, recording directly through the HeyGen app tends to produce lower-quality avatars compared to a dedicated professional recording setup. Factors like lighting, camera quality, audio fidelity, and recording chain all affect the final result.
How long did you record for the HeyGen training video?
The training session produced about 10 minutes of footage, near HeyGen's current upload limit. The footage included a variety of emotions and expressions — happy, serious, curious, laughing, and natural conversational moments — to give the AI a broad range to work with.
Is the HeyGen avatar clone perfect after the first attempt?
Not yet — and that's normal. Even with a professional studio setup, AI avatar clones often require iteration. Small refinements to the source footage, audio quality, and expression variety can meaningfully improve results over multiple attempts.
What is a Blackmagic ATEM and why is it useful for AI avatar recording?
The Blackmagic ATEM is a professional video switcher that can mix multiple video and audio sources and record the output at broadcast quality. For AI avatar recording, it ensures the final file has no compression loss from software encoding, giving tools like HeyGen the cleanest possible training data.