
Thursday, June 20, 2024

The Rapid Rate of Progress


This past year and a half has been a blur of technology updates. Nvidia has launched into the stratosphere as AI companies scramble for its latest breakthroughs to keep up with AI's rapid growth and the public's appetite for a never-ending stream of new products and constant updates. It's almost dizzying at times. 

Last week, I was writing about Kling (an AI video model made in China and available only to those with a Chinese phone number) when Luma Dream Machine dropped and we in the States got access right away. Dream Machine is brilliant. Its output was at or near the quality of what I was seeing from the Kling and Sora videos. I quickly made a few myself. This is an AI image I generated last year; I used Dream Machine for the video and Pika for the background noise. Had I intended to use it for anything more than an example, I would have just snatched the audio and removed the Pika watermark. 


Then everything changed yet again when Runway showcased their new model, Gen-3. Based on what I have seen, it is more creator-friendly than Sora and on par with Sora's outputs, and even better than Dream Machine's, which I feel a little bad about because Dream Machine is also a wonderful AI video tool. Still, Runway will become my default model once we actually have access. That is, until someone else comes along and blows Gen-3 out of the water. Until then: not only does Gen-3 create video 1,000 times better than Gen-2, but it's faster, it can generate legible text, and there is also a lip-sync tool you can use with the video you generate. The video below is 100% generated with Gen-3 and the lip-sync tool. 


While it is not perfect, it may be good enough to serve as my starting point for taking AI video tools seriously. I am not sure you could make an entire full-length live-action movie using just Gen-3, but this is light-years better than what we all had to suffer through before we first saw Sora's outputs. Check out Runway's website for Gen-3 to take a look at some of the video samples.

The hits kept coming this week as Hedra, another AI video tool with lip-sync, was released. For my test run with this tool, I used one of the early DALL-E 3 generations of Thomas Edward Downs from my re-released illustrated version of Michaelmas. It's good; however, it feels like we will need to wait for the next update before it's a tool that can be used professionally. That said, I have seen much better examples from Hedra since its release than the one I created below. 


On Wednesday, ElevenLabs released Voiceover Studio. You can take a video without sound and add voiceovers and sound effects. And you can use those voices to create conversations without video, say for a podcast.

For most people, these are just cool new temporary distractions. New ways to create cute little TikTok videos for their followers. For me, someone who has been writing stories for twenty years, these tools are something else entirely. I am drawn back to a time when I thought I could create short films to sell larger stories. 

Over the past year, I watched countless sketchy AI clips and pieced-together dodgy AI short films. These people were trying to do what I had done all those years ago with real film. I could not chase that trend because the results were not good enough to put my name on. No one outside the AI video space would want to watch them. I grew to loathe these videos by late December, and by January the whole community had begun to show its frustration with the limitations. Then Sora videos started to drop in February, though OpenAI shared the tool only with people in Hollywood. Until last week, we had been stuck dreaming of Sora and lamenting the unrealistic AI video that dominated this space for well over a year. 

Dream Machine is amazing, but it is Runway's Gen-3 and ElevenLabs Voiceover Studio that have me dreaming like I am back in my short film days. So, what does that mean? 

What it means is that I am continuing to work on the two-part illustrated novel series and planning a graphic novel, but I now have the tools at my disposal to do so much more. Tools that make me rethink things.

Over the past few months, I spent time working on a pitch package for a TV series, then reached out to Hollywood about it. I am not dialed in like I was back when I was making short films. For the most part, I have kept my expectations in check since 2009. Up until last year, I thought most of the spec screenplays I had written over the past 20 years would go largely unread and that no films or TV series would ever come of them. 

That all changed last year, as a new hope started to take root in my imagination. New dreams began to blossom. I started to envision how things might go with the rapid expansion of AI. I even rewrote a TV series so it could adapt along with the changes in AI. Immersive entertainment is the way of the future, even if sales of the amazing yet ridiculously overpriced Apple Vision Pro have ground to a halt. 

While I have been unable to persuade anyone to option my TV series, that doesn't mean I'm going to toss it into the library along with all the other spec stories I have written. I see now that the market for buying anything has dried up in a big way, and this was not a good time to try to sell such an ambitious project with no pre-existing IP. The financial effects of the Covid years and all of last year's strikes have streamers clutching their purse strings, afraid to take chances. 

No worries. It's not like I haven't faced rejection before. Only now I no longer have to toss this story onto a dusty heap with all the others. Now I can start to develop the story myself. I had included an AI-generated storyboard for the pilot's opening sequence with the pitch deck I sent around, but I was reluctant to create an AI trailer using all that wobbly 4-second stuff that was available at the time. AI images have progressed a lot faster over the past year and a half than AI video had, until now.

With the advancements from Luma and Runway, I can do a lot. Can I create an entire episode using AI video? No, I don't think so. However, I know for a fact that I can create a fairly good trailer. I had been reluctant to jump into the AI video waters when the outputs were so poor, but these are 1,000 times better, and I can't not jump in now. 

This is exciting in several ways. Not only will I eventually have a trailer for my TV series, but by the time I am done I will also be adept at using the tools, and new tools will likely be ready as well. Who knows, maybe they will be good enough that I can just start work on doing the whole damn series. 

In addition, with ElevenLabs Voiceover Studio I can also create the entire pilot episode of the podcast that is part of the TV series. What do I mean? Well, I mentioned that I developed this series to be something that could be augmented by AI and the trend toward immersive entertainment. The podcast is hosted by two characters in the TV series. They are not main characters and play more of a background role in one of the three main stories that run throughout the first season. However, their podcast digs into the mysteries of the small town. So, I decided to develop a fictional podcast that could stand alone from the TV series while also enhancing it.

These are two exciting new AI projects that I look forward to starting work on. Initially, they are meant to help me try to sell the series and have it produced in a traditional way on film. However, times are changing, and if I can't sell this series to a streamer, then, if AI keeps advancing at its current pace, I may be able to create the whole damn thing by myself. I don't want to have to do this, but it is likely the future of storytelling. So, I am not afraid to be a pioneer in this new art form either. But it has to look and feel real. Over the past year, I have seen people dive into creating dodgy AI videos and fall so in love with what they were doing that they were blind to the actual limitations. If regular people can't enjoy it, then I don't want to waste my time on it. 

This is why a trailer using these updated tools and a 30-minute (audio-only) pilot episode for the podcast seem like an acceptable option at this time. This will not only be great practice for me with these tools, but the trailer and the pilot can be used to pitch the series. I may or may not make them publicly available because I will be contacting producers with them. 

I don't want to release anything to the public that is not of the highest quality. That's why I do not think even these amazing updated tools can capture and hold the public's attention for 30, let alone 120, minutes. Not yet. They bring us one step closer, but we are not quite at the point where I can work on my own with these tools to create a proper movie. 

However, one way some people will be able to use these updated tools is to augment a live-action production. I also think it may be possible to create an animated film that people will enjoy once Gen-3 is released. That would be HUGE. But live-action will take a little longer. I saw a few good animated shorts over the past few months that were made with last-gen AI video tools. I said last summer, and I still believe it today, that the first 100% AI-generated film people will watch and enjoy will be animated.

Do I have material that I could use in an animated film? Of course. Can I do the current TV series using animation? Yes, but... I would like that to be live action. However, there are three stories interwoven in the TV series. Maybe I could tell one of them through animation in the trailer. Yeah, I kind of like that idea. I have a lot to think about regarding the trailer before I start generating scenes. 

Exciting times. And as I am finishing this up, another AI update has dropped. While it's not another video model, Anthropic just released Claude 3.5 Sonnet. 



This will put more pressure on OpenAI, Google, Meta, xAI, and others to keep pace. Google and OpenAI each put on big shows over a month ago and have yet to release everything they promised. Tick tock: you're getting lapped on products you haven't even fully released yet. Good on Anthropic for keeping the pressure on and continuing to accelerate toward AGI. 

Last year, I made a promise to myself to try to keep up with all the AI advancements, and it has been like a rollercoaster. There was a bit of a break around the holidays, and even into the new year. However, over the past two months things have picked up speed again. TBH, I am having trouble keeping up while also trying to get work done. Once I get deeper into these multiple projects, it will only get harder, but I'll try nonetheless. 

To sum up: the tools are now here that people can use to begin building their so-called media empires. My own plans will start with stories that existed before AI was a twinkle in my eye. We'll see over the next few months how much progress I can make using these new and improved tools to create across a wide variety of media: illustrated novels, graphic novels, trailers, and a podcast. Thanks for reading. 
