Neural networks are a tool, not a magic button: ZiMAD on using AI for mobile game graphics

How do you find a common language with AI, and which parts of the work can really be delegated to neural networks without loss of quality? Andrey Novikov, an artist at ZiMAD, explains in his column for App2Top.

Andrey Novikov

When I first started working as a designer, there was much less software on the market, and its functionality was limited.

Today, everything is different. For example, some tools let you minimize monotonous work.

There is a downside: the software itself is becoming harder to master, so you have to keep learning constantly.

In my work I now mainly use a combination of Photoshop with its built-in AI and Stable Diffusion.

I started studying the latter with online manuals. A year ago, you could not even install and configure the neural network without working through them (there were no ready-made packages). In the end, I even took a training course on working with Stable Diffusion.

Do I recommend doing the same?

Definitely. If you have the chance to learn from people who have already figured the topic out and are ready to share their experience, take it. A course can shorten a two-week journey to two days: it gives you all the ins and outs, the foundation you need to start working with the software right away.

Stable Diffusion is set up locally on your computer: you install Python, run the setup script, and get a link that opens the web interface when clicked.

What sets Stable Diffusion apart is its variety of settings. To work with it, you need to know what each one does and how. These are not the usual Photoshop buttons, where you can guess a tool's purpose from its icon.

You can also work through other interfaces; it comes down to the individual user's preference. Other options include Easy Diffusion, Vlad Diffusion, and NMKD Stable Diffusion GUI.

The neural network produces a result almost instantly. This is not a render, during which you have time to go to the kitchen for coffee and exchange a few words with colleagues at the cooler. But to work quickly with AI, it is important to customize everything for yourself.

AI Training

Working with a neural network at production pace is the result of training.

For example, I needed the neural network to be able to generate an interior in a style that suits me. To do this, I did the following:

I took images from one of ZiMAD's projects (Puzzle Villa);
I cut all the images into small parts so that each fragment contained an interior element;
I asked the neural network to describe what it sees in the images (at this stage it is important to check the accuracy of the descriptions and correct them where necessary; if you skip this, the neural network will keep producing erroneous results later on; for example, if it mistakes a chest of drawers for a person, then a request for a chest of drawers will produce a person);
after training, the neural network had learned the required style.
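
The cutting and caption-review steps above can be sketched in a few lines. This is a minimal illustration, not the workflow actually used at ZiMAD: it assumes a regular grid of fragments (in practice the crops were probably chosen by hand around each interior element), and the function names and file names are hypothetical. The boxes use the (left, top, right, bottom) order that Pillow's `Image.crop` accepts.

```python
def tile_boxes(width, height, tile):
    """Return (left, top, right, bottom) crop boxes covering a width x height image."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are clipped to the image bounds.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def correct_captions(captions, corrections):
    """Replace auto-generated captions with reviewed ones where a correction exists."""
    return {name: corrections.get(name, text) for name, text in captions.items()}


# Hypothetical example: a 1024x768 source image cut into 512-pixel fragments.
boxes = tile_boxes(1024, 768, 512)

# Hypothetical auto-captions; the first one repeats the mistake described above.
auto = {"fragment_0.png": "a person", "fragment_1.png": "a floor lamp"}
reviewed = correct_captions(auto, {"fragment_0.png": "a chest of drawers"})
```

Keeping the corrections in a separate mapping makes the manual review step explicit: anything not listed there is accepted as the network described it.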

By the way, the style can be shared with colleagues to keep the final images as uniform as possible. All they need is a key: the embedding.

As a result, working with AI comes down to three stages for me:

I show the neural network 50+ images, describe them, and get a new style;
I pass the images that need finishing through the style;
an illustrator refines the resulting image (adds or removes details).

1 — the original image; 2 — the image passed through the style; 3 — the image modified by the illustrator.

Prompts are the key to interacting with AI

If you want high-quality results from a neural network, you need to get good at working with prompts, that is, text queries. The final result depends on how they are worded.

Each prompt is unique, and any small detail can affect the image, even the word order.

However, I do not recommend relying on prompts alone. You can also communicate with a neural network through images. For me personally, that is easier than explaining everything in text. For example, say you need an image based on a specific cat.

I load the cat image and write that I want a cat in a hat. But the probability that the AI gives the desired result on the first try is close to zero. To speed things up, I sketch a hat in Photoshop.

Next, in the AI interface, I mask the area the neural network should work on (the inpaint tool) and set the necessary parameters: the number of variations, the level of detail and, most importantly, Denoising strength, which controls how closely the neural network follows my sketch (0 means it makes no changes at all, and 1.0 means it ignores my sketch completely and draws whatever it wants).

I then get several variations of the cat in a hat and choose the one that suits me.
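
One way to picture Denoising strength: in a common image-to-image implementation (for example, the Hugging Face diffusers pipelines), the value decides how much noise is added to the input and, as a consequence, how many of the sampling steps are actually run on it. The sketch below assumes that behavior; the function name is hypothetical, and other interfaces may map the value differently.

```python
def effective_steps(total_steps, denoising_strength):
    """Number of sampling steps actually applied to the input image.

    Assumes the common rule steps_run = int(total_steps * strength):
    0.0 runs no steps (the sketch is returned unchanged),
    1.0 runs all of them (the sketch is fully replaced by noise first).
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0.0 and 1.0")
    return min(int(total_steps * denoising_strength), total_steps)


for strength in (0.0, 0.25, 0.5, 1.0):
    print(strength, effective_steps(40, strength))
```

Under this assumption, a mid-range value leaves enough of the hand-drawn hat for the network to follow while still giving it room to redraw the details.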

Another example of how a neural network speeds up the artist's work is presented below.

A 2D artist needed to create a unique UI for an event. He made a sketch, and I passed it through the neural network, adding all of the artist's wishes to the prompt. As a result, we almost immediately got an excellent base to refine.

Important: the boxes with apples, ladders and sheep in the illustration were generated separately. This was done for a more predictable result and further flexible refinement.

Another very powerful area is 3D. Right now I have tasks that involve generating small rooms. The neural network handles interiors well, but the perspective keeps shifting between generations, and I need a specific angle. I solve the problem with ControlNet in just a couple of steps.

Stage one. I model the scene in 3D, place the objects, position the camera, render the depth map, and get this image.

Stage two. I upload the depth map to ControlNet and configure it so that the neural network generates images based on this depth map, preserving the camera angle and the placement of objects. Then I write in the prompt that the AI should produce a dark, cluttered attic with a roof window and a cardboard box in the center of the frame. I get the following image.

After that, I refine the atmosphere in graphics editors according to the brief. It was supposed to be an image of a sad cat asking for help from an attic on a rainy night during a severe thunderstorm. I end up with an image that I can hand over to the animator for further work.
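
A depth map is simply the distance from the camera to each surface, stored as a grayscale value. As a rough sketch of what the renderer does at this stage, the function below maps raw distances to 0-255. It assumes the near-is-bright convention that depth-conditioned ControlNet models are commonly fed; the function name and sample values are hypothetical, and it is worth checking which convention your 3D package and model expect.

```python
def depth_to_gray(depths):
    """Map camera distances to 0-255 gray values: nearest = 255, farthest = 0."""
    near, far = min(depths), max(depths)
    if near == far:
        # A flat scene has no depth variation; treat every pixel as nearest.
        return [255 for _ in depths]
    return [round(255 * (far - d) / (far - near)) for d in depths]


# Hypothetical distances in scene units for three pixels.
print(depth_to_gray([0.0, 1.0, 4.0]))  # [255, 191, 0]
```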

The value of AI for artists

A neural network is a tool that helps you build a base that you can then improve as you see fit. Simply put, neural networks let artists devote more time to developing the image rather than to technical work.

A year or two ago, if I had been given the task of drawing 20 avatars, the process would have looked like this:

searching for references;
preparing sketches;
drawing each avatar from scratch;
refining each avatar.

Today, the neural network has taken over most of the work, significantly reducing the time spent on a routine task.

If the project style is ready, generating an avatar for a game, for example, may take me only 5-7 minutes, and in that time I get 20-30 images to choose from. Generating a dozen avatars with the same number of variations takes about an hour.

Using the power of AI, I can single-handedly handle many more tasks than before. That is why it is a great tool. And it is a tool, not a magic button that supposedly threatens to take work away from artists.

Neural networks complement each other perfectly. For example, Stable Diffusion works better with casual graphics, while Photoshop's neural network is good at realistic images. However, its results are not as predictable as Stable Diffusion's, because there are no additional settings. This is offset by the user-friendly interface.

For example, if an interesting picture is awkwardly cropped, you can fill in the missing part directly in Photoshop. It takes just a moment. In Stable Diffusion you can get the same result, but with a little more effort.
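
The estimate is easy to check with the figures from this paragraph: 5-7 minutes and 20-30 variations per avatar. A small sketch follows; the function name is hypothetical.

```python
def batch_range(count, low_each, high_each):
    """Scale a per-item (low, high) range to a batch of `count` items."""
    return count * low_each, count * high_each


minutes = batch_range(12, 5, 7)      # time for a dozen avatars
variations = batch_range(12, 20, 30)  # candidate images to choose from

print(f"{minutes[0]}-{minutes[1]} minutes")        # 60-84 minutes
print(f"{variations[0]}-{variations[1]} images")   # 240-360 images
```

So a dozen avatars lands between an hour and just under an hour and a half, with a few hundred candidates to pick from.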

Midjourney can produce beautiful fantasy pictures, but unfortunately their resolution leaves much to be desired. So at ZiMAD we use Midjourney together with Stable Diffusion to increase the resolution. The great thing is that you can keep the image the same, improving only the resolution or adding details where needed.

For example, a colleague made a beautiful picture for Magic Jigsaw Puzzles with Midjourney, and I increased its resolution to 4K in Stable Diffusion.

Here the Script field helps. I choose one of the available upscalers (in our case, LDSR), set the Scale Factor parameter, which determines how many times the image is enlarged, and make sure to set Denoising strength to near-minimum values. If this parameter is too high, the neural network will seriously redraw the picture; for upscaling we use values of 0.1-0.2.

If you choose a good denoising value, the neural network only refines the details and does not change the picture much. For example, this is how the wolf got well-defined fur and lost its pixelation.

Of course, there are disadvantages to working with Stable Diffusion. Architecture and text are still difficult for it. Buildings can come out chaotic, and lettering can be completely unreadable.
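
The two upscaling parameters can be worked out ahead of time. The sketch below is an illustration only: the 1024x1024 source size and the reading of "4K" as 3840 pixels on the long side are assumptions, and the function names are hypothetical. The 0.1-0.2 range is the one quoted above.

```python
def scale_factor_for(width, height, target_long_side):
    """Scale Factor needed for the longer side to reach the target size."""
    return target_long_side / max(width, height)


def upscaled_size(width, height, scale_factor):
    """Output resolution after enlarging by scale_factor."""
    return round(width * scale_factor), round(height * scale_factor)


def is_safe_for_upscale(denoising_strength, low=0.1, high=0.2):
    """True if the value stays in the range that refines detail without redrawing."""
    return low <= denoising_strength <= high


factor = scale_factor_for(1024, 1024, 3840)  # assumed source size
size = upscaled_size(1024, 1024, factor)
print(factor, size, is_safe_for_upscale(0.15))
```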

But I am sure that these shortcomings will be resolved over time.

Despite the difficulties that arise when working with neural networks, artists and game teams need to learn how to work with them today. Otherwise, they are unlikely to be able to remain competitive in the market.

For example, I currently have a task to prepare 50 icons. Creating them manually would take me at least a week. With a neural network, it is unlikely to take more than two days.

So, it is better not to resist new technologies, but to learn how to interact with them effectively.