
Tuesday, August 13, 2024

In Flux


Been doing some thinking over the past two weeks. I spent much of July focused on the current state of AI video, trying to determine its limitations and capabilities. That meant researching Gen-3, Dream Machine, and Kling. There are others, but those are the main three so far. There is great potential. No doubt. But to get the best results, you need to use Image-to-Video, preferably with one or two images. I can see generating between a series of single images across a longer sequence becoming a thing. Having one image at the start and one at the end of a 10-second clip is wicked cool, but I've heard of these tools handling 2-3 minute sequences. Imagine adding 120 single images, and the next thing you know you have a whole scene without the tedium of these shorter clips. Not all the models support these beginning and end frames yet, but they will soon. Oh, and FLUX is giving Midjourney a run for its money when it comes to realistic images. 

I love all of that. Imagining how it will progress is just as fascinating to me. It's amazing to watch. New use cases are rolling out every few weeks. For instance, you can shoot a video with your camera, pull a frame from it, generate a VFX sequence from that frame, and then edit the sequence back into the live shot with editing software. After I saw that, I realized I could go back and test it on old short films I made years ago. I keep discovering these brilliant techniques people are coming up with and testing them to find possible use cases. 

Like many people who have realized over the past year that we have entered a new era-- one that seems likely to change society and the way we live our lives-- I have been trying to determine how to pivot. While I am no longer a young man, I still have dreams and aspirations similar to those of that younger version of me who returned from California with a creative fire burning in his eyes 25 years ago. The stories are the key. They reveal the lessons learned along the way, and the possible futures based on the world as it is perceived.  

Give me an hour at a cafe with a cup of coffee, a good book, my phone, a notebook, and a pen, and I couldn't be happier. It's a pocket of time when I am free to let my mind wander. While some of my best story ideas come while I am out for a walk, many of them get fleshed out at a cafe: the constant change around me, people coming and going, as I sit there observing while looking inward, making connections, recalling the journey, and trying to predict and plan for what comes next. 

I have given myself until the end of this week to assess AI video models, see what I can learn, and then determine how that might impact me creatively in the near future. I still have another week, and there are likely many new use cases left to discover, but I feel I have already made up my mind.

I am not in this space to be the first guy to discover new techniques with these GenAI tools. I do not feel obligated to post content every day to keep my engagement metrics up. No, not yet at least. I want to see how others are using these tools and learn from them so that I can tell my stories in new ways. I think we are in a new frontier-- creatively speaking-- and I consider myself one of its pioneers. My goal is to create a multimedia company. Or a "media empire," as a friend recently joked after reading my most recent TV series bible, which lays out big plans for the series, including a few new ways of interacting with the content. 

Simply put, the goal is to work with GenAI tools to be able to do more. Two things came to mind last spring when I began to immerse myself in the GenAI space: How do I use these tools to help me creatively? And how do I use this technology to help others? 

I am by no means an altruistic saint who thinks only of helping his fellow man every second of every day. Far from it. We are a screwy species, and it is often best to mind our own damn business. But I do come from a long line of educators, so maybe that's where it comes from genetically. Anyway, an APP was one of the first things I thought of last spring after sitting down with GPT-4 for a month. I have been researching ever since. 

While I won't go into detail about the APP at this time, it is interesting that, beyond creating moving and still images to accompany my written words, my first instinct was to create an APP to help others. The idea just made sense. Even more so now. Not only can I help others with it, but I can help myself as well.

As I was assessing the current state of AI video tools last week, I realized something. 

If I am serious about starting a multimedia company, I can't expect AI video trailers, short films, or graphic novels to fund the way forward... yet. AI video has gotten a whole hell of a lot better than it was this time last year. However, it's still not easy to tell a substantial story. And while the trailer I am working on means a lot to me, it cannot be my main focus. These tools need to get a lot better. Right now you need to be a patient and persistent puzzle master to piece together a worthwhile 2-minute trailer, and you'll pay out the ears for all the tools needed to create something special. But it can be done. Within a few months, folks will start using those same methods to piece shorts together into longer works. We'll learn their process and cringe at how difficult it was. And yet that is the most difficult it will ever be. By this time next year, all of this will be so much easier.

A window has opened for AI video creators and for those like myself who are gradually learning more about it every day. Familiarity with the current tools, and proven results from using them, may help during the big content rush that will likely arise next year once AI video takes its next big leap forward. That leap should give these models the ability to take a script for a scene, ask clarifying questions to make sure they understand what you want, and then generate the scene. Once these models can communicate with us the way LLMs do, using chain of thought, we will see a massive explosion of AI-empowered storytelling. 

For the time being, AI video is still too unstable, both in its outputs and in the overall process. These tools have only been worth my time since June. Sora was announced in February, but that doesn't count because we still haven't gotten our hands on it. Again, I am not here to discover all the techniques and share them. The people doing that are amazing, and I thank them for what they are doing. Their work will be a road map for the rest of us. They are the OG pioneers, charting the path forward. 

As for the APP, I can help people with it while working in the background on the more creative side of things. I want to avoid the clickbait route of creating disposable content to feed a metric. I prefer substance, not only in my creative output but also in the APP. The goal is to provide a service people actually need. I want to create value for others, and I fully plan on offering a free version of the APP, which may be all most people ever need. And that is great! But I also see charging a monthly fee for premium features for those who need more than the basics. 

The decision to focus on the APP is not the one I wanted to make. If I were calling all the shots, I would have access to all the AI video tools being held back for the election. That might mean going full steam ahead on making movies and TV with AI tools. Something I may be able to do now with animation, which, as I have said before, has more room for error than lifelike AI content. But I am not in the animation mindset yet. Once I transition to the comic book series/graphic novel, I might be more open to focusing on AI animation. Thinking about it now, maybe I should focus on the comic book sooner rather than later. Food for thought. 

My evaluation, with one week to go in my AI video assessment period, is that with the publicly available tools you can make comics, illustrated novels, commercials, trailers, music videos, short live-action films, and longer animated projects that most people would never know are largely AI-generated. The VFX side of this can't be overlooked: those who have been filming live-action sequences but are strapped financially can actually do some amazing things right now with AI tools. That is all great, but these are not my main creative focus. While the company I am creating will include illustrated novels and comics, these current capabilities are still short of where I would need them to be to create realistic AI TV series and movies. In the meantime, though, I can focus on everything else I can create with AI video and audio tools, which is a lot. For me, it is all training for TV and movies.  

If I had access to all the tools being worked on behind the scenes, I would likely have a different take on things at this moment. I like to think I have some idea of what may be in the pipeline, but you never know for sure. The good thing is that these tools are highly likely to get better, and fast. So it makes sense to focus more of my attention on building the APP for the next few months. Once the dust settles after the election, it will be the perfect time to shift my main focus back to AI video. Not that I won't be working on AI video at all between now and then; I just need to prioritize the APP for now and try to make some headway before fall. 

This time last year, I thought we would be right about where we are now with video. A short scene is not a performance, though. Not yet, at least. Consistency and stability are nearly solved; performance will be the next big hurdle. Or at least I think it should be. I believe we may see an AI-animated movie by the end of the year that is indistinguishable from a traditionally animated one.

While I want to be able to do all of these things, I am not attempting to be the first. I want to keep learning about all of it because my goals are more intricate than just trying to be the first to create a proper AI movie. That said, I have thought about what that might be like-- the first AI movie that most people cannot tell was created using only AI tools. It could be a hybrid that includes some live-action. That seems likely to happen soon, and it will raise a lot of eyebrows. It may also open the door for the first all-AI movie that generates enough buzz to earn some acceptance and appreciation from the public. The Blair Witch Project always comes to mind when I think of this. 

Either way, these AI tools will continue to improve with each passing week. While my main focus will be on the APP, I will keep working on the trailer for the TV series. I won't share many details about the APP until it is ready for public testing. My goal is to have it ready for initial testing by November; rolling it out after the election is a good target. It seems likely that even better tools will be available by then, which will let me adapt the APP to any updates before final testing and release. 

I'll keep pushing on the trailer and the illustrated novel series in the background. If I am to create a proper multimedia company, I need a lot of content. I am also open to doing more with these tools in ways that may not be top of mind at this moment. I may get adept enough that others want my assistance with their projects. I could grow fond of creating commercials or fall in love with AI animation. Maybe I decide to create a video game. Who knows? 

The one thing I would stop everything to work on is a new form of storytelling entertainment. If the tools get good enough that I can do all I laid out in the bible for my recent TV series, that will become my top focus. In reality, I am building toward that anyway, so it is best to take this step-by-step approach to what feels like the inevitability of a more immersive entertainment experience. 

It is a process. A process guided by imagination and fueled by rapid technological change. Embracing it was much easier than expected. Some are vehemently against any use of AI in creative work, and I get it. However, the dreams I have had for over 20 years were stifled in the pre-AI era; my creative visions remained only partially realized through the written word. The chance to create more with these stories may allow me to fulfill the creative goals I began setting for myself when I returned from LA at the turn of the century.

I am an independent artist, and over the years I have come to value my artistic freedom more than any urge to sacrifice it for someone else's idea of success. I just love to create stories. And with AI, I will be able to create all the worlds I've ever imagined while maintaining my artistic autonomy. And that's all that matters. Thanks for reading.