
Friday, February 16, 2024

And then Yesterday Happened


This month I have been doing a lot of blogging because this time of year my gears are still churning out plans for the year ahead. In December it is not always easy to take stock of the past twelve months with all that is going on during the holidays, so this time of year is also about reflection. A lot happened last year with the rise of AI, and my recalibrating is an attempt to adapt to all the changes. I didn't achieve everything I wanted, but I did enough to not hate myself either. Adding AI images to a previously written book was just a few baby steps, but it was something. Preparations for a larger project, my first graphic novel, were well underway. And then yesterday happened.

Before I get into what happened yesterday, one hour after I published my last blog post, let me go over another writing project that has been percolating for a few years. Not only have I been laying the groundwork to use AI to create the images for my first graphic novel, but I've also been preparing to pitch a multi-story TV Series. Last year I created a podcast for that series, one that would not only stand alone as part of a transmedia pitch package but also take place smack-dab in the middle of the first season. Had I not created the fictional podcast, I would not have had the biggest idea of them all, something I have wanted to do since I was a kid. Back then I read several "Choose Your Own Adventure" stories that captivated my imagination because they were different from traditional books.

Over the past few years, there have been a few interactive stories like "Bandersnatch" that give you a couple of chances during the movie to change the course of the narrative. Netflix has also done a few other interactive shows, mainly geared towards children, like those books I read as a child, but that "Black Mirror" movie reawakened something within me. My taste in the stories I like to read and watch has changed since I was a kid, and so has the technology.

While most VR goggle devices have not really moved the needle, Apple's Vision Pro has blown people away. So, as I was preparing my pitch documents for this TV project, I realized that the multi-story series I have been working on is actually ideal for a "Choose Your Own Adventure" TV Series. I will pitch it as a multi-story series, but I will also present it as a candidate to be made into an interactive story. The fact that multiple stories are told within one season, along with the sci-fi, supernatural, and mystery aspects of the series and the podcast woven into the story, makes it a prime candidate to be a groundbreaking interactive TV series unlike anything that has come before.

I had planned to go into even more detail about this interactive story next week, but then OpenAI (the company that brought us ChatGPT and Dall-E) released a demo for its new text-to-video tool, Sora. Why is it a big deal? Because, as I stated in my blog yesterday one hour before the Sora news dropped, AI video had been stagnant for almost a year. If you could manage to get a stable render, you would only get 4 seconds of video. And so we have had to suffer through unstable, pieced-together 4-second clips that only people working in the same space could really appreciate. The rest of us would check out at the first sign of instability or once we got tired of the constant cuts. It had become annoying.


That is why Sora is such a big deal. While we do not have access to the tool as of yet, the crew at OpenAI released a few dozen videos that showcased Sora's capabilities, and at the same time ended the need for us to ever watch any of those dodgy 4-second videos again. Companies like Runway and Pika must have lost their damn minds yesterday, as Sora all but ended them, unless they have better models that they have been holding back. But I doubt they will come close to Sora.

OpenAI took the lead in LLM chatbots. While Google narrowed the gap yesterday in the chatbot field, the introduction of Sora hints that ChatGPT5 is also about to be released and will likely blow Google out of the water.

Sora will also challenge Midjourney for the image-creation title, as it can generate static images that look even better than what Dall-E 3 was creating. Dall-E 3 is impressive, but it has more limitations than MJ 6, which was released in December.

One of the problems, if you can call it a problem, with all these tools is that there are just so many of them. 2023 was all about ChatGPT4's domination and the multitude of fantastic image generators -- Midjourney, Dall-E 3, LeonardoAI, Stable Diffusion, Firefly, and several others. 

While the video generators of 2023 were amusing, they didn't move the needle as much as the chatbots and image generators. Runway, Pika, Leonardo, and a few others were all generating similar results: those pieced-together 4-second clips. Deforum is a bit different, as it creates longer videos in which the image is constantly morphing into something similar to what it was before. I liked all of these to varying degrees, but the Deforum content will likely rise above the rest because it is distinct both from the others just mentioned and from what I have seen so far from Sora.

Interestingly, over the holidays I started to think about video games because I was so frustrated with AI video's limitations. Unreal Engine 5, which is mostly used for gaming, has been used in shows like The Mandalorian, and Duncan Jones is using it to film Rogue Trooper. So, while I was thinking that UE5 might be something I needed to learn if I wanted to create more realistic AI video, I saw Sam Altman's tweet about Sora. As I looked through those videos, I noticed some similarities to UE5. Last evening, Tim Brooks of OpenAI, one of the people who worked on creating Sora, dropped some of the research in an article entitled "Video generation models as world simulators," and I realized that even though I am not a tech guy, I had been able to deduce with my limited knowledge that tools like UE5 were needed to take the next step in AI video.

I've already seen people groaning that UE5 may be part of Sora's training data, but it does make sense. Not all of the video that Sora creates has the look of a game, but you can see the influence in some of the videos. Let's just say that this is even more exciting to me than ChatGPT or Midjourney, tools that already let me make so much more content than I ever could before. But with Sora, there is a real hope that I, along with millions of others, will be able to make movies. Maybe even a TV show.

This takes me back to the 2000s when I was making short films. I stopped because I was paying for them out of my own pocket, and even though they were just short films, the time and money needed to create them took a lot out of me.

Anyone who knows me will understand what something like Sora could mean for people like me. There is a lot to learn about this new AI video tool. You can create up to a minute of steady video. One minute! I said in another blog post that until we could get 10-30 seconds of stable video, making a movie was impossible. Whether it is possible now will depend on how much control we have over the generations. Can we create a character in one video and have that character be consistent in the next? That is the big one. It has been the big one with AI image generators too, and only recently, with Scenario and other tools, has it become a much simpler thing to achieve. Consistency, stability of the videos, believability of what is created, and the ability to edit what has been created: those are the things that matter. I'm sure there are others I'm overlooking right now, and I'm sure there will be plenty of flaws that limit what can actually be created. But it is a time to hope and dream again about AI video.

Sora brings us closer to a truly immersive world like the one in "Ready Player One." We storytellers are going to have to step up our game to meet the challenge of creating these worlds because the tech is getting a lot closer to making it possible. I am trying to rise to that challenge with a possible interactive series, some of which may be augmented by content created with a tool like Sora. But I may have to aim even bigger than that, or maybe this TV Series can be altered into an even more immersive experience. Either way, I am attempting to adapt, but I still have to keep learning to keep up with these changes.

I know I am not the only one who cannot wait to get their hands on Sora. But I literally have a library of screenplays that are ready to be created. I know this tool will not be perfect, and I am not getting my hopes up too high because I have learned that is never a good idea, especially with these early AI tools. Even the best image generators still have major limitations. Sora will change the game of AI video generation, and we will likely see some amazing short movies as a result. However, based on the previews, some major buzz-kill issues may limit what we can do. Someone with more technical nous than me may be able to overcome those issues and even create enough excitement about their project to get a theater or large-platform streaming release of a movie before the end of the year. We'll see.

My being able to tackle my library of stories may not happen until we get a few more updates in, but we are getting closer. I hear Midjourney is close to showcasing its own video model. Exciting times to be a creative person, that is for sure. Now it's time to come back down to Earth and work on the projects in front of me until Sora actually releases. Then I'm not sure how I will be able to focus on anything else, but until then I have a TV Series to pitch and a graphic novel to create. Thanks for reading.
