Sora 2: Atlanta burns.
AI is redefining what it means to create. But at what cost?
A scene in the 1939 film Gone With The Wind shows retreating Confederate soldiers setting Atlanta's munitions ablaze to keep them from the advancing Union army, producing one of the most famous spectacles of classic Hollywood cinema.
But in the screenplay, this epic sequence was simply described as: “Atlanta burns.”
This apocryphal story has served for decades as a symbol of the hundreds of people, thousands of decisions, millions of dollars, and years of work it takes to translate words on a page into cinematic experiences.
So it’s ironic that we’ve arrived at a new moment in film history, powered by AI, that’s flipping the story of “Atlanta burns” on its head:
One person can create epic cinematic sequences on a laptop in a matter of minutes, for less than the cost of a movie ticket, with only a handful of words.
As I explore in this post, this has profound implications for filmmakers, the film industry at large, and most importantly, for what it means to create.
Enter Sora 2
Just last week, OpenAI, the company behind ChatGPT, dropped Sora 2, the second iteration of their text-to-video model. It's not only another big leap in prompt adherence and photorealism; it also shows an ability to tell a story, adding dialogue, sound design, music, and even shot sequences that aren't included in the original prompt.
While there's a certain delight in seeing what Sora 2 does with even your vaguest ideas, it's also a next-level example of just how much detail AI generates that you never specified. And there's no starker way to see this than to create something with the minimum amount of input and see what you get.
So, in the spirit of the Gone With The Wind screenplay, I ran a few experiments…
Prompt 1: “Atlanta burns.”
While the best use of any Gen-AI tool is to be as specific as possible, I thought it was only fitting to give the latest high-tech AI model these two iconic words from classic cinema. Below is a string of five sequences I got by running this simple prompt five different times:
With just two words and no context, the model chose a modern-day setting instead of Gone With The Wind's 19th-century period, and multiple re-runs produced a mix of narrative, documentary, and news formats (the last scene feels straight out of the ABC series 9-1-1).
Yes, it's weird, uncanny, and sloppy, but also think of all the decisions it made: casting, race, gender, locations, camera positions, shot lengths and sequences, dialogue, performance, lighting, time of day, and on and on. It's easy to forget – especially when AI can serve them up in a matter of minutes – that these are the hundreds of decisions filmmaking teams make over months of prep.
So, to further this experiment, I added one additional detail…
Prompt 2: “Atlanta burns, 1865.”
For my next run, I nudged the model toward the Civil War era in which Gone With The Wind is set. Adding just the year yielded very different outputs and set the AI on a new path of improvisation, resulting in this string of five sequences:
Perhaps because the setting is so far in the past, it chose to give me documentary narration with historical context, and it did re-create period-appropriate architecture, clothing, and transportation. But the dialogue is generic and the overall aesthetic is modern and cheap, making it feel like a low-budget re-creation for a bad TV show (albeit with impressive fire effects).
But again, it's unsettling how many details it strings together in a fairly coherent way from just a three-word prompt, showing that Sora 2 has not only learned aesthetic mimicry but is developing a sense of how all the pieces fit together.
And if we think about the leap generative AI has made in the last year toward more photo-real images and video, imagine what another year of training and refinement on narrative, dialogue, and story structure might yield.
Which leads us to my third and final experiment…
Prompt 3: “Atlanta burns from Gone With The Wind”
In my final pass, I decided to take the gloves off and be as specific as possible, pointing Sora 2 directly at the movie. And, as this string of five outputs shows below, it definitely knew what I was talking about:
It's notable how the color palette, music, sound effects, and camera movement all shifted to a vintage film aesthetic from 1939, when Gone With The Wind was made. Even the dialogue and performances – none of which come from the original film – have the theatrical rhythms and idioms of the era, and sound like an old recording.
It even got the scenario of the original film right – with Rhett and Scarlett escaping the city with their baby, and the tension around the flames reaching the depot – creating clips that feel like deleted scenes (though it often swapped the leads, casting a man as Scarlett and a woman as Rhett).
Yes, it's funky and sloppy. Still, it's surprising how relevant the shot sequences are from just a seven-word prompt, and with more detailed prompting I could likely get close to a shot-for-shot remake of the original sequence.
It’s also important to note that the outputs I got from referencing Gone With The Wind show that Sora 2 definitely trained on that movie, and likely thousands of other films from that and other eras.
Which brings up the elephant that is still very much in the room: massive amounts of quality training data are essential for AI to create anything worthwhile – data that has been scraped for free to build a for-profit tool that aims to disrupt the very industries its training data came from.
Final Thoughts
Admittedly, no self-respecting filmmaker is going to intentionally leave key decisions to AI. But there's no doubt that as these systems become more automated, their choices will grow more seductive, eroding the idea of human authorship and self-expression by filling in compelling details on their own.
I see this problem as the cinematic equivalent of what a recent MIT study discovered about students' use of ChatGPT. The researchers found that over-reliance on LLMs leads to the accumulation of what they call "cognitive debt": students who relied heavily on LLMs for writing consistently underperformed at neural, linguistic, and behavioral levels.
Put another way:
Outsourcing human expression to AI risks our very ability to express.
While AI is evolving on two separate tracks – one for professional filmmakers who will always bend Gen-AI to the will of their vision, and another that amounts to automatic content – both directions have serious implications for mentorship in film and advertising.
But regardless of how good AI gets at interpreting our prompts, generating photo-real clips, or understanding story, another fact is abundantly clear: aesthetic and narrative mimicry is not expression. And humans will (hopefully) always feel the difference.
Veo 3 Postscript
I focused on Sora 2* for this post as it’s the latest headline-grabbing tool, and in my estimation the most powerful one out there at the moment (until maybe next week).
But I’ll leave you with this clip from Google’s Veo 3, where I used the prompt: “A scene from the film Gone With The Wind”. It proves AI tools are only as good as the content they train on. (Spoiler: I don’t think Veo 3 has trained on Gone With The Wind... yet).
*I’ll cover more of Sora 2 in my next post, where I explore the rise of the AI-generated slop machine…



