The realm of AI art is a fascinating landscape of endless creativity. Much like the natural world, where no two leaves are identical, AI-generated art thrives on uniqueness: a single prompt can yield many distinct images.
This uniqueness, while captivating, presents challenges, particularly for creators of storybooks or picture books who seek to maintain a consistent character appearance throughout their works.
The quest for character consistency in AI art isn’t a novel challenge. Traditional methods such as fixing the seed number and uploading reference images have been the go-to solutions.
However, with the evolution of platforms like Midjourney, newer, more effective techniques have emerged.
The core objective of this article is to unveil a novel method designed to enhance character consistency. This technique is not only effective for photorealistic characters but is equally applicable to anime-style creations.
1. Creating a Character’s Photo Album
The initial step involves creating a comprehensive photo album for your character. This album should display various angles and expressions of the character. Here’s an example of the prompt I used:
Prompt: wide 12-frame photo sheet, young woman with pixie-cut brunette hair, white background, diverse angles and expressions
I usually run this prompt in DALL-E because of its standardized grid layout, which offers numerous clever applications (see my previous article). Below is an image generated by DALL-E:
Comparatively, Midjourney’s layout seems a bit cluttered, and character consistency is slightly compromised:
Interestingly, Midjourney can achieve a stable grid layout by utilizing an image generated by DALL-E as a reference.
This synergy allows for the creation of additional avatars through techniques like panning and upscaling.
However, it’s prudent to exercise caution here. My current recommendation is to avoid overextending this feature: the more grid cells you add, the less stable Midjourney’s output tends to become.
2. Screenshot and Avatar Upload
After crafting the character album, we can now use it as a robust reference tool.
Select a few headshots that best fit your scene, capture them via screenshots, and upload them to Midjourney. This creates a versatile reference image repository, adaptable to various scenes and character expressions.
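If the album has a regular grid, the screenshot step can be approximated with a little arithmetic. The sketch below is only an illustration: it assumes a uniform 4 x 3 layout and an example canvas of 1792 x 1024 pixels, neither of which is guaranteed for a generated sheet, and the function name grid_crop_boxes is my own. It returns one crop box per cell, which an image library could then use to cut out each headshot.

```python
def grid_crop_boxes(width, height, cols=4, rows=3, margin=0):
    """Return one (left, upper, right, lower) box per grid cell, row by row.

    Assumes the sheet is a uniform cols x rows grid (an assumption; real
    sheets may have uneven cells). `margin` trims pixels off every side
    of each cell to drop borders between frames.
    """
    cell_w = width / cols
    cell_h = height / rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = round(c * cell_w) + margin
            upper = round(r * cell_h) + margin
            right = round((c + 1) * cell_w) - margin
            lower = round((r + 1) * cell_h) - margin
            boxes.append((left, upper, right, lower))
    return boxes


# Example canvas size (an assumption, not taken from the article's image).
boxes = grid_crop_boxes(1792, 1024)
for index, box in enumerate(boxes, start=1):
    print(f"frame {index:2d}: {box}")
```

Each tuple follows the (left, upper, right, lower) convention that Pillow's Image.crop accepts, so the boxes can be passed to it directly if Pillow is installed. When the cells are uneven, manual screenshots remain the more reliable option.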
For efficiency, I recommend saving the uploaded image links in Notion, where they are easier to find and copy than in Discord.
3. Scene Creation and Face Swapping
Creating the scene is next. I wrote a prompt that describes the character’s general features, so the first output starts as close to the character as possible.
Prompt: outdoor photography, young woman, pixie-cut brunette hair, riding bicycle along park path, morning light on face, trees and sunlight filtering through the leaves
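One way to keep the character description identical across the album prompt and every scene prompt is to store it once and compose the prompts from it. This is a minimal sketch, not part of any Midjourney or DALL-E tooling; the names CHARACTER, sheet_prompt, and scene_prompt are hypothetical, and the album wording is slightly normalized ("young woman, pixie-cut brunette hair" rather than "young woman with pixie-cut brunette hair").

```python
# Single source of truth for the character description.
CHARACTER = "young woman, pixie-cut brunette hair"


def sheet_prompt(character, frames=12):
    """Compose the photo-album prompt from the shared description."""
    return (
        f"wide {frames}-frame photo sheet, {character}, "
        "white background, diverse angles and expressions"
    )


def scene_prompt(character, style, details):
    """Compose a scene prompt: style first, then character, then scene details."""
    return ", ".join([style, character, *details])


scene = scene_prompt(
    CHARACTER,
    "outdoor photography",
    [
        "riding bicycle along park path",
        "morning light on face",
        "trees and sunlight filtering through the leaves",
    ],
)
print(sheet_prompt(CHARACTER))
print(scene)
```

Changing CHARACTER in one place then updates every prompt, which avoids small wording drifts between scenes.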
Midjourney’s output was as follows:
Notice the disparity between this output and our intended character. We now employ Midjourney’s inpainting feature to rectify this. The process involves selecting the character’s face and using a headshot from our album that matches the angle and expression needed.
We then input this image link into Midjourney for localized repainting, adjusting for lighting as necessary.
Choosing the most suitable image from Midjourney’s output is crucial. Sometimes, the face-swapping may appear rigid, but don’t worry – this can be refined. If the similarity isn’t quite there, multiple inpaintings might be required.
Further fine-tuning involves using Midjourney’s “Vary” feature. Both “Vary Strong” and “Vary Subtle” are effective here. Make sure Remix mode is activated: it lets you edit the prompt before the variation runs, which matters because most of the original prompt was deleted during the earlier inpainting.
Re-enter the initial prompt in the dialog box that appears, and include a reference image to maintain facial feature consistency.
The reference image can be either your most recent output or one from the headshot album, with the image weight set to 2:
https://s.mj.run/1KlOcmgDtbs outdoor photography, young woman, pixie-cut brunette hair, riding bicycle along park path, morning light on face, trees and sunlight filtering through the leaves --iw 2
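The structure of that line is: image link first, then the text prompt, then the image weight parameter written with two hyphens. A small helper can assemble it so the order stays right every time. This is a sketch for building the text you paste into Midjourney; with_reference is a hypothetical name, and it does not call any Midjourney API.

```python
def with_reference(image_url, prompt, image_weight=2):
    """Build 'URL prompt --iw N' for pasting into Midjourney.

    The image link goes first, the text prompt second, parameters last.
    """
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be a full http(s) link")
    # ':g' prints 2 as '2' and 1.5 as '1.5', avoiding a trailing '.0'.
    return f"{image_url} {prompt.strip()} --iw {image_weight:g}"


prompt = with_reference(
    "https://s.mj.run/1KlOcmgDtbs",
    "outdoor photography, young woman, pixie-cut brunette hair, "
    "riding bicycle along park path, morning light on face, "
    "trees and sunlight filtering through the leaves",
)
print(prompt)
```

The allowed range for the image weight depends on the Midjourney model version, so the helper deliberately does not enforce an upper limit.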
In my case, the third image appeared more aligned with our character, showcasing distinct features like short hair and arched eyebrows.
This refining process is iterative. Improved character headshots can be added to the album, enriching your reference gallery and enhancing future consistency.
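To keep a growing album manageable, it can help to tag each headshot link with its angle and expression and look up the closest match when inpainting a new scene. The following is a rough sketch under my own assumptions: the Headshot and Album classes, the tag vocabulary, and the example.com links are all hypothetical, and weighting angle above expression is just one plausible choice.

```python
from dataclasses import dataclass, field


@dataclass
class Headshot:
    url: str
    angle: str        # e.g. "front", "three-quarter", "profile"
    expression: str   # e.g. "neutral", "smiling"


@dataclass
class Album:
    shots: list = field(default_factory=list)

    def add(self, url, angle, expression):
        """Add a new (or improved) headshot to the reference album."""
        self.shots.append(Headshot(url, angle, expression))

    def best_match(self, angle, expression):
        """Return the headshot closest to the wanted angle and expression.

        A matching angle counts double (an assumption: pose is harder to
        fix later than expression). Returns None for an empty album.
        """
        def score(shot):
            return 2 * (shot.angle == angle) + (shot.expression == expression)

        return max(self.shots, key=score, default=None)


album = Album()
# Placeholder links; replace with your own uploaded image links.
album.add("https://example.com/front-neutral.png", "front", "neutral")
album.add("https://example.com/tq-smiling.png", "three-quarter", "smiling")
album.add("https://example.com/profile-smiling.png", "profile", "smiling")

pick = album.best_match("three-quarter", "smiling")
print(pick.url)
```

The same tags could just as well live in a Notion table; the point is only that each link carries enough metadata to choose the right reference quickly.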