TL;DR
I used DemoFusion image-to-image generation and Real-ESRGAN-NCNN to upscale and enhance some sprite sheets of the game Little Fighter 2, which was released in 1999. You can find the results and the code on my GitHub. So far I have enhanced the full sprite sheets of the characters LouisEX and Freeze and partial sprites of Firen, Dennis and John. I also enhanced some level graphics. I may continue with the other characters in the future.
Introduction
I have been playing Little Fighter 2 since I was a kid - it’s a classic for me. LF2 is a freeware 2D fighting game developed by Marti Wong and Starsky Wong and released in 1999. It’s a very fun game and I still play it from time to time. It was also one of the first games where I not only played but also dug into the game files and tried to understand how things work under the hood. For this purpose the game is excellent because its sprite sheets are not encrypted and are stored in the game’s folder as simple .bmp files. This makes it very easy to tinker with the game’s graphics and create mods for new characters, levels, etc.
The character data and the sprite sheet descriptions are also very straightforward: they are stored in .dat files, which are encrypted but easily decryptable with a known key:
string key = "odBearBecauseHeIsVeryGoodSiuHungIsAGo";
credits to https://github.com/ahmetsait/LF2.IDE
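The decryption itself is simple: after a short junk header, every byte is shifted by the corresponding byte of the key. Here is a minimal C# sketch of the idea - the 123-byte header length and the byte-wise subtraction are assumptions based on how I understand LF2.IDE, so double-check the reference implementation there:

using System;
using System.IO;
using System.Text;

static string DecryptDat(string path)
{
    const string key = "odBearBecauseHeIsVeryGoodSiuHungIsAGo";
    const int headerLength = 123; // assumption: the first 123 bytes are junk and get skipped
    byte[] raw = File.ReadAllBytes(path);
    byte[] keyBytes = Encoding.ASCII.GetBytes(key);
    var sb = new StringBuilder(raw.Length - headerLength);
    for (int i = headerLength; i < raw.Length; i++)
    {
        // subtract the (cyclically repeated) key byte from each data byte
        sb.Append((char)(byte)(raw[i] - keyBytes[(i - headerLength) % keyBytes.Length]));
    }
    return sb.ToString();
}

Console.WriteLine(DecryptDat("data/louisEX.dat")); // .dat files live in the game's data folder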
Here is a sample of the .dat file of the LouisEX character:
<bmp_begin>
The decrypted .dat file contains descriptors for the frames in the sprite sheets and paths to the corresponding .bmp files, along with info about the sprite size, number of rows and columns, etc.
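To give a rough idea, a single sprite sheet entry in the decrypted data looks roughly like this (illustrative values, not copied verbatim from louisEX.dat):

file(0-69): sprite\sys\louisEX_0.bmp  w: 79  h: 79  row: 10  col: 7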
The sprite sheet graphics are stored in the game’s folder as bmp files. Here is a sample of the sprite sheet of LouisEX:
As you can see, the sprite sheets are very low resolution. The game was released in 1999 and the sprite sheets were designed to be displayed at a 640x480 resolution. The game is still very fun to play, so I thought it would be cool to try to enhance the sprite sheets with AI. As a first attempt I tried to process whole sprite sheets, but quickly realized that the results were not good. So I started to extract single sprites from the sprite sheets and process them individually.
I started with a C# script that parses the .dat files, pulls out the single images from the sprite sheets, renders them on a black background and saves them as .png files.
//extracted from louisEX.dat
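The cropping step itself is straightforward: the w, h, row and col values from the descriptor define a grid, and each cell gets cut out and saved. A rough SkiaSharp sketch (assuming a 1px gap between the cells, which may not match the real sheets exactly):

using System.IO;
using SkiaSharp;

static void ExtractSprite(SKBitmap sheet, int col, int row, int w, int h, string outPath)
{
    using var sprite = new SKBitmap(w, h);
    using (var canvas = new SKCanvas(sprite))
    {
        canvas.Clear(SKColors.Black); // render the sprite on a black background
        // assumption: cells are separated by a 1px gap in the sheet
        var src = SKRect.Create(col * (w + 1), row * (h + 1), w, h);
        canvas.DrawBitmap(sheet, src, SKRect.Create(0, 0, w, h));
    }
    using var image = SKImage.FromBitmap(sprite);
    using var data = image.Encode(SKEncodedImageFormat.Png, 100);
    using var stream = File.OpenWrite(outPath);
    data.SaveTo(stream);
}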
These are the extracted sprites of LouisEX:
For all characters and game objects this resulted in 7600+ images.
I started with some research on which tools I could use to enhance the sprites and did some experiments with Magnific.ai, Upscayl and DemoFusion.
Magnific.ai
Over the last few months, tweets like this one have been popping up on my timeline:
https://twitter.com/doganuraldesign/status/1735600747788873808
And I thought it would be cool to try something similar with video game sprite sheets. In the post above the author used a tool called Magnific.ai, a commercial tool for image upscaling and enhancement using generative AI. I did some experimenting with Magnific and was very impressed with the results; the details the tool was able to add to the images were amazing. The UX is really good - everyone should be able to use the tool without any prior knowledge of generative AI.
Pretty impressive stuff. Without any knowledge of the underlying tech, I assume that the tool also uses a combination of image upscaling and generative AI to enhance the image - you can set the level of hallucination, which is similar to the guidance scale in image-to-image generation. One thing that I missed in Magnific was the ability to enhance batches of images - for my use case I extracted single sprites from the sprite sheets and wanted to enhance them individually, so I needed a tool that could enhance multiple images at once or at least automate the enhancement process.
Upscayl
After a quick web search about batch image upscaling, I found Upscayl, an open-source tool that uses Real-ESRGAN to upscale and enhance images. I had already worked with GANs before, when I trained StyleGAN3 on 10,000 Bored Apes, so I was curious to see how this tool and the models behind it would perform. The UI/UX of Upscayl is great. The process is very straightforward - select an image, pick a model, and start the upscale. That’s it. Everything runs locally and you can also select a folder with images to batch process them - awesome! There are six default models that come with the tool and you can also add custom models if you want to. I went with the included models and started to experiment with the tool based on this 79x79 image and a 4x upscale factor.
I think the results are okay, but not comparable to the results of Magnific, especially when it comes to enhancing images. Adding details and new content to the images is something that I was not able to achieve with Upscayl. I haven’t looked into custom models yet, so there may be a way to improve the results further. For pure upscaling the results are nice though. Here is a comparison of the original image and the 4x upscaled image:
Real-ESRGAN-NCNN-vulkan
I wanted to pack all my steps into a notebook file, so I started to check the Upscayl code to see how I could run it from a script. Under the hood the tool uses Real-ESRGAN-NCNN-vulkan, which can be run from the command line like this (there are binaries for Windows, Linux and Mac):
# upscale images from input folder and save them to output folder
realesrgan-ncnn-vulkan.exe -i ./input -o ./output -n realesrgan-x4plus -s 4
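To call the binary from the notebook instead of typing the command by hand, a plain child process works fine. A small C# sketch (folder names are placeholders):

using System.Diagnostics;

var psi = new ProcessStartInfo
{
    FileName = "realesrgan-ncnn-vulkan.exe", // path to the binary
    Arguments = "-i ./sprites_png -o ./sprites_4x -n realesrgan-x4plus -s 4",
    UseShellExecute = false
};
using var process = Process.Start(psi)!;
process.WaitForExit(); // the tool processes the whole input folder on its own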
DemoFusion
After some more research I stumbled upon DemoFusion, a fairly recent project with the paper published on December 15th, 2023. DemoFusion combines a generative AI model with various other mechanisms to enhance images. For a detailed explanation please refer to the DemoFusion website. The default way to run DemoFusion is text-to-image, which takes a prompt and generates a high-resolution image with a high level of detail. There is also an image-to-image pipeline, which is what I used for the sprite sheet enhancement. Compared to Upscayl, which you can just download and run (easy to understand and very good for beginners), DemoFusion is a bit more complicated to set up. But fear not, all the steps are documented in the repository. In my case I wanted to do image-to-image generation, so I started with the gradio_demo_img2img.py script. The script needs a trained model to work - I used stabilityai/stable-diffusion-xl-base-1.0. After the model download and the installation of the dependencies I was able to run the script and generate some images.
I was very impressed with the results - going from a pixelated low-resolution image to a 2048x2048 image with a lot of detail is amazing. My first results were not quite as pretty as the output of Magnific, but still very good, and of course running the open-source model locally is a big plus. This is a comparison of the original 79x79 image and the 2048x2048 image generated by DemoFusion:
After some test images I decided to continue with a combination of Upscayl and DemoFusion: I used Upscayl with the ultrasharp model to 4x the images and then fed them into DemoFusion to enhance them further. The 4x upscaling was done directly via the Upscayl UI. A nice thing about DemoFusion is that it comes with a gradio interface, so it was very easy to experiment with the parameters in a browser and run the model via a local API.
Here is a comparison with the ultrasharp + DemoFusion results:
For the parameters I used the following values:
{
For the first character with a full set of sprites I tried to stick to the original style of the graphics, so I didn’t use higher values for the guidance scale. Also, the controlnet conditioning was not changed drastically, to keep the original pose of the characters. To show what influence a higher guidance scale and a lower controlnet conditioning value can have, check out these samples:
As you can see, increasing the guidance scale gives the character a greater level of detail and a different style. A lower value for the controlnet conditioning allows the model to go beyond the composition of the input image, which alters the pose of the character quite a bit. I think it would be interesting to experiment with these parameters further and see how the results change.
Putting it all together
After tweaking the parameters of the model and batch-upscaling the sprites (which took about 1 hour via realesrgan-ncnn-vulkan.exe), I was ready to start the further processing in DemoFusion. On average each DemoFusion run took between 150 and 250 seconds on my hardware, so processing all >7000 images locally was not an option. I also tried to run the script on a Hugging Face space and on Replicate, but without a bigger budget it was not really feasible to process all sprites. The Replicate bill for one character (~140 sprites) was about $20 on an A100 (80GB) GPU, with an average of 100 seconds per image.
I quickly decided to just start with a few characters locally and check how the results would look in the game. I started by processing the sprites of the character LouisEX and the level elements of the first stage. After I had these assets processed, I combined them with the following script to create a full sprite sheet again:
var original = SKBitmap.Decode("input sprite file from .dat content");
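As a rough sketch of how such a recombination can look (the file naming, the 1px gap and the scale factor are illustrative, not the exact script), each enhanced sprite gets scaled down to a common cell size and drawn back into the grid of a new, larger sheet:

using System.IO;
using SkiaSharp;

static void RebuildSheet(string enhancedDir, int cols, int rows, int w, int h, int scale, string outPath)
{
    // build the new sheet at scale x the original cell size (e.g. scale = 4)
    using var sheet = new SKBitmap(cols * (w + 1) * scale, rows * (h + 1) * scale);
    using var canvas = new SKCanvas(sheet);
    canvas.Clear(SKColors.Black);

    for (int row = 0; row < rows; row++)
    {
        for (int col = 0; col < cols; col++)
        {
            // illustrative naming: one enhanced .png per frame index
            string spritePath = Path.Combine(enhancedDir, $"{row * cols + col}.png");
            if (!File.Exists(spritePath)) continue;

            using var enhanced = SKBitmap.Decode(spritePath); // e.g. the 2048x2048 DemoFusion output
            // scale the enhanced sprite down to the cell size of the new sheet
            using var resized = enhanced.Resize(new SKImageInfo(w * scale, h * scale), SKFilterQuality.High);
            canvas.DrawBitmap(resized, col * (w + 1) * scale, row * (h + 1) * scale);
        }
    }

    using var image = SKImage.FromBitmap(sheet);
    using var data = image.Encode(SKEncodedImageFormat.Png, 100);
    using var stream = File.OpenWrite(outPath);
    data.SaveTo(stream);
}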
After this process I had the original sprite sheet and the enhanced sprite sheet side by side and I was able to compare them.
Little Fighter 2K
Before the enhanced sprite sheets can be used in the game they need to be resized back to the original size, which of course causes some fidelity loss, but I first wanted to try it without upscaling everything else in the game - that would be a follow-up project, I guess. I replaced the original sprite sheets with the enhanced ones and started the game. I guess at the small resolution the results are not that impressive anymore, but it’s still cool to see the difference between the original and the enhanced sprite sheets. Especially the level elements like the parallax background look much better.
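For reference, scaling a whole sheet back down is just a resize in SkiaSharp - a sketch with placeholder file names (the result is saved as .png here; converting it back to the .bmp format the game loads is a separate step):

using System.IO;
using SkiaSharp;

using var enhancedSheet = SKBitmap.Decode("louisEX_0_enhanced.png");   // placeholder: the recombined enhanced sheet
using var originalSheet = SKBitmap.Decode("sprite/sys/louisEX_0.bmp"); // used only to get the target dimensions
using var gameSized = enhancedSheet.Resize(
    new SKImageInfo(originalSheet.Width, originalSheet.Height), SKFilterQuality.High);

using var image = SKImage.FromBitmap(gameSized);
using var data = image.Encode(SKEncodedImageFormat.Png, 100);
using var stream = File.OpenWrite("louisEX_0_gamesize.png");
data.SaveTo(stream);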
One thing that I noticed is the difference in style between separate sprites. Even though all parameters, including the seed of the prediction, were the same for all images, there are some significant differences in the style of the sprites. I will need to investigate this further.
In the three sprites that make up the standing animation of LouisEX you can see that the style changes from frame to frame.
For the next character I tried to use a specific style that I passed in as a prompt to the model - I also increased the guidance scale to see its influence on the results and get an even higher level of detail and more creativity. I decided to continue with the character Freeze. Here is the prompt I used:
videogame rendered human+++ martial arts fighter,
The results are pretty cool and much more creative than the images of LouisEX. I think the guidance scale of 20 is a bit too high though, because some frames contain a lot of hallucinations and weird artifacts. There are also still quite a few style differences between the frames. And there are some super weird and funny results: especially when the input image has an effect like fire or ice, or an exotic pose, the model imagines some crazy stuff going beyond the prompt.
I definitely have to tweak the parameters and the prompt further, but for now I’m happy with my results and what I have learned from this project. I will probably continue with other characters and game elements and check whether I can modify the sprite sheet descriptors to take higher-resolution images as inputs. But that’s it for now. Here is some gameplay of LouisEX and Freeze with their enhanced sprite sheets:
Check out the code and the results on my GitHub