AI PERSONAL ROADMAP

My background is in filmmaking and video post-production, so naturally my first foray into AI was finding ways to enhance footage. A decade or more ago, shooting digital came with many limitations. A lot of what we produced before the arrival of HD cameras did not age well, and I had always wanted to remaster a few of my favorite edits. One of them was the final student film of my friend (and now commercial director) King Palisoc, Mahal Ko Si Direk! (I Love My Director!)

This movie was shot on a Canon XL2. While that was the top of the line for MiniDV cameras at the time (2005!), its image resolution was a measly 720x480. I was able to recapture the original footage from tape, upscale it (using software I've unfortunately been unable to recall), and re-edit and color it in 1920x1080.

Nowadays it’s easier to upscale footage, with many options online. I would recommend Topaz Video Enhance, but there are free options as well.
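For a sense of the mechanics, here is a minimal sketch of a free approach in Python with OpenCV: a plain (non-AI) Lanczos upscale, frame by frame. AI upscalers like Topaz go further by actually reconstructing detail. The file paths are placeholders, and the resize assumes 16:9 anamorphic DV footage.

```python
import cv2

# Plain Lanczos upscale from 720x480 DV to 1920x1080, frame by frame.
# Assumes a 16:9 anamorphic source (as on a widescreen XL2 shoot); 4:3
# footage would need pillarboxing instead of a straight resize.
cap = cv2.VideoCapture("source_sd.mp4")   # placeholder input path
fps = cap.get(cv2.CAP_PROP_FPS)
out = cv2.VideoWriter("upscaled_hd.mp4",  # placeholder output path
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (1920, 1080))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    out.write(cv2.resize(frame, (1920, 1080),
                         interpolation=cv2.INTER_LANCZOS4))

cap.release()
out.release()
```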


Another AI video enhancement technique I looked into was frame interpolation: synthesizing intermediate frames between existing ones to add motion data, a.k.a. synthetic slow motion.
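As a rough illustration of the idea (not any particular tool's method), here is a minimal Python sketch that doubles a clip's frame count. It simply blends neighboring frames, where real AI interpolators such as RIFE or DAIN estimate motion instead; file names are placeholders.

```python
import cv2

# Doubles a clip's frame count by inserting a blended frame between each
# pair of originals. The 50/50 blend stands in for the motion-compensated
# in-between frame a real AI interpolator (e.g. RIFE, DAIN) would predict.
cap = cv2.VideoCapture("input.mp4")  # placeholder path
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("interpolated.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps * 2, (w, h))

ok, prev = cap.read()
while ok:
    ok, curr = cap.read()
    if not ok:
        out.write(prev)  # flush the final frame
        break
    out.write(prev)
    out.write(cv2.addWeighted(prev, 0.5, curr, 0.5, 0))  # synthetic in-between
    prev = curr

cap.release()
out.release()
```

Writing at the original fps instead of fps * 2 would play the doubled frames back as 2x slow motion.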

I also experimented with AI solutions for applying one image's style onto another. As interesting as this approach was, I was looking for a video-centric solution, which brought me to a more dependable, widely used (albeit non-AI) option: EbSynth, which transforms a video by painting over a single frame. Still, I wanted something that I could run on my own hardware and that gave me more control.
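For context, the single-image style transfer I'd been experimenting with can be sketched in a few lines. This example assumes TensorFlow Hub's Magenta arbitrary-stylization model rather than any specific tool I used at the time, and the file names are placeholders.

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
from PIL import Image

# Single-image style transfer with Google's Magenta arbitrary-stylization
# model from TF Hub; purely illustrative of the technique.
def load_image(path, size):
    img = tf.image.decode_image(tf.io.read_file(path), channels=3,
                                dtype=tf.float32)
    return tf.image.resize(img, (size, size))[tf.newaxis, ...]  # add batch dim

model = hub.load(
    "https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2")
# The model takes a content image and a style image (256x256 recommended).
stylized = model(load_image("content.jpg", 512), load_image("style.jpg", 256))[0]
Image.fromarray((stylized[0].numpy() * 255).astype(np.uint8)).save("stylized.jpg")
```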

Just over a year ago I started playing around with Runway.ml's text2image model. It's interesting to see how far we've come in such a short span of time.

A year on, we were getting even closer, with Wombo able to generate noticeably more coherent images.

When Midjourney first came out, my immediate objective was to see how far one could push the algorithm, prompting a variety of scenes and compositions. True enough, especially compared to the images above, Midjourney produced images with a level of coherence we had never seen before. Of course, apps from other companies had been operating for a while already, but there was an ease of use to Midjourney that none of them matched.

After a month of using the platform, however, I noticed that while the images were getting better and better every week, Midjourney seemed to heavily favor certain styles and subjects. And of course, it didn't do animation, so it wasn't a viable long-term solution for me.

Things changed when Stability.ai released Stable Diffusion to the public. Suddenly anyone could build applications on top of an open, publicly available model. Within a few short days, several options opened up to everyone, including me.
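As a minimal sketch of what that openness made possible, here is how one could run the model locally with Hugging Face's diffusers library, assuming an NVIDIA GPU; the checkpoint name, prompt, and output path are illustrative.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public Stable Diffusion checkpoint and generate an image locally.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # assumes an NVIDIA GPU with enough VRAM

image = pipe("a film still of a director on set, 35mm, cinematic lighting").images[0]
image.save("generated.png")
```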
