The One Reason Why I'll Never Use AI Text-to-Video Generators

I’ve tried several AI text-to-video generators, and while the technology is undeniably impressive, there’s always something about the final results that feels off. It took a while to pinpoint the issue, but I eventually realized it all comes down to one thing: the uncanny valley effect.

While I do use some AI-powered tools for visual effects in my video projects, I can’t bring myself to use AI to generate video footage because it just looks too…uncanny.

The Biggest Problem With AI Text-to-Video Generators

AI video generation has made huge strides in a short amount of time thanks to advancements in deep learning. If you were online in 2023 when AI video generation blew up, you might remember this clip of Will Smith eating spaghetti making the rounds. As groundbreaking as this type of tech was at the time, there’s no denying how unnatural and unsettling it looks.

In 2024, these generative AI video tools are becoming more polished, creating smoother visuals and more realistic movements. Take a look at the difference between the videos created with Runway Gen-2 in 2023 and the ones OpenAI unveiled in 2024 to introduce Sora AI. Sora isn’t available for public use yet, but this is the quality we’re being promised:

Despite the improvement, I’m still not sold. For one, Sora isn’t available to use yet, so we still have to use less refined generators that will produce the same creepy results as Will Smith’s spaghetti video.

Just look at this video I created with PixVerse using the prompt “A person walking through a park on a sunny day, smiling and waving at the camera. Birds are flying overhead, and trees are swaying gently in the breeze.”

The first two seconds look decent, until the person’s fingers, hair, and face start melting into the air! Even when more advanced generators like Sora do arrive and give us more accurate and beautiful videos, there’s still something unnerving about the AI-generated humans and scenery.

Whereas older models usually produce videos with clear AI giveaways, like those claymation-style visuals, the improvements from the newer generators almost look too perfect. When I watch those clips from Sora, it feels like the attempt to refine the results is moving into hyper-polished territory, where it looks so flawless it ends up feeling sterile and lifeless.

Unnatural, unsettling, sterile, and lifeless. This is exactly what the uncanny valley effect is—human-like, but not quite human.

No matter how good these generators get, the uncanny valley effect will always linger. Unless I’m going for an abstract aesthetic as surreal as what you’d only see in dreams, I won’t rely on an AI text-to-video generator for any of my video projects.
