
Setting up and utilizing InstantID in Automatic1111/ComfyUI

Many image-generation models have been released, but InstantID stands out. It builds on the concepts of IP-Adapter and ControlNet.

The model has quickly become popular on GitHub. It can serve as an alternative to face-swapping tools such as Roop and ReActor, or to style-generation methods such as LoRA. It produces more accurate images than those approaches, and it does so without any model training.

Installing into Automatic1111:

1. Open Automatic1111 and go to the “Extensions” tab. Click “Check for Updates”, select the ControlNet extension from the list if an update is available, and then click “Apply and Restart UI” to restart Automatic1111.

If you run into an error while installing, close the command prompt and Automatic1111, then start them again.

2. InstantID needs two ControlNet units working together, so both are set up in the steps that follow.

3. Download the required models (the IP-Adapter model and the ControlNet model) from the GitHub link. Rename the IP-Adapter file to “ip_adapter_instant_id_sdxl” and the ControlNet file to “control_instant_id_sdxl”.

Windows users need to enable “File name extensions” in the File Explorer View menu; otherwise the files cannot be renamed together with their extensions.

Then put both files into the “{Automatic1111 root folder}/models/ControlNet” folder. InstantID needs these two models to work in Automatic1111.

4. Restart Automatic1111 for the changes to take effect. The first start takes a while, because about 300 MB of prerequisites for the model are downloaded in the background. Then open the “txt2img” tab, where you will see the ControlNet Unit 0 [InstantID] and ControlNet Unit 1 [InstantID] tabs.

5. In the first ControlNet unit, set the preprocessor to “instant_id_face_embeddings” and the model to “ip_adapter_instant_id_sdxl”.

6. In the second ControlNet unit, set the preprocessor to “instant_id_face_keypoints” and the model to “control_instant_id_sdxl”.
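
The renaming and file placement from step 3 can also be scripted. The following is a minimal sketch, not an official installer: the function name and the idea of passing in the two downloaded file paths are assumptions for illustration, and the file extension is kept exactly as downloaded.

```python
import shutil
import tempfile
from pathlib import Path

# Target base names taken from the steps above; the extension is kept as downloaded.
IP_ADAPTER_NAME = "ip_adapter_instant_id_sdxl"
CONTROLNET_NAME = "control_instant_id_sdxl"


def install_instantid_models(ip_adapter_file, controlnet_file, webui_root):
    """Copy both downloads into <webui_root>/models/ControlNet under the new names.

    Hypothetical helper: the arguments are whatever paths your downloads ended up at.
    """
    target_dir = Path(webui_root) / "models" / "ControlNet"
    target_dir.mkdir(parents=True, exist_ok=True)

    installed = []
    for source, new_name in ((ip_adapter_file, IP_ADAPTER_NAME),
                             (controlnet_file, CONTROLNET_NAME)):
        source = Path(source)
        destination = target_dir / (new_name + source.suffix)
        shutil.copy2(source, destination)  # copy, so the original download stays intact
        installed.append(destination)
    return installed


if __name__ == "__main__":
    # Dry run with placeholder files in a temporary folder (file names are examples only).
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        ip_file = tmp / "ip-adapter.bin"
        cn_file = tmp / "diffusion_pytorch_model.safetensors"
        ip_file.write_bytes(b"placeholder")
        cn_file.write_bytes(b"placeholder")
        for path in install_instantid_models(ip_file, cn_file, tmp / "webui"):
            print(path.name)
```

To use it for real, call `install_instantid_models` with the paths of your two downloads and your Automatic1111 root folder, then restart the UI as described in step 4.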

Installing in ComfyUI:

0. To use InstantID, you first need ComfyUI installed on your machine.

1. Update ComfyUI by opening the ComfyUI Manager and clicking “Update ComfyUI”.

2. Clone the repository: go to your “ComfyUI/custom_nodes” folder and open a command prompt there by typing “cmd” into the folder’s address bar.

Then paste the following command into the command prompt and press Enter:

git clone https://github.com/cubiq/ComfyUI_InstantID.git

3. Download the InsightFace antelopev2 model (not the classic buffalo_l) from Hugging Face:

https://huggingface.co/MonsterMMORPG/tools/tree/main

Move the downloaded model files into the “ComfyUI/models/insightface/models/antelopev2” folder.

Now download the IP-Adapter model from here:
https://huggingface.co/InstantX/InstantID/resolve/main/ip-adapter.bin?download=true

Save the file into the “ComfyUI/models/instantid” folder.

4. Finally, download the ControlNet model: https://huggingface.co/InstantX/InstantID/resolve/main/ControlNetModel/diffusion_pytorch_model.safetensors?download=true

Move the downloaded file into the “ComfyUI/models/controlnet” folder.

5. Restart and refresh ComfyUI for the changes to take effect.
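
The two direct-download links above can be fetched with a short script. This is a hedged sketch rather than part of the official setup: the function names are made up for illustration, the ControlNet destination assumes ComfyUI’s usual “models/controlnet” folder, and the antelopev2 link is left out because it points to a repository page rather than to a single file.

```python
import urllib.request
from pathlib import Path

# Direct-download URLs quoted from the steps above.
IP_ADAPTER_URL = (
    "https://huggingface.co/InstantX/InstantID/resolve/main/ip-adapter.bin?download=true"
)
CONTROLNET_URL = (
    "https://huggingface.co/InstantX/InstantID/resolve/main/"
    "ControlNetModel/diffusion_pytorch_model.safetensors?download=true"
)


def planned_downloads(comfy_root):
    """Map each URL to the file it should become inside the ComfyUI folder."""
    root = Path(comfy_root)
    return {
        IP_ADAPTER_URL: root / "models" / "instantid" / "ip-adapter.bin",
        # Assumption: ComfyUI's standard ControlNet folder is models/controlnet.
        CONTROLNET_URL: root / "models" / "controlnet" / "diffusion_pytorch_model.safetensors",
    }


def download_all(comfy_root):
    """Download every planned file, skipping files that already exist."""
    for url, destination in planned_downloads(comfy_root).items():
        destination.parent.mkdir(parents=True, exist_ok=True)
        if destination.exists():
            continue
        urllib.request.urlretrieve(url, str(destination))


if __name__ == "__main__":
    # Only prints the plan; call download_all("ComfyUI") to actually fetch the files.
    for url, destination in planned_downloads("ComfyUI").items():
        print(destination)
```

The files are large, so the download can take a while; restart ComfyUI afterwards as described in the last step.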

Generating Consistent Images using InstantID:

Let’s try something interesting now.

Here, we took a target image of actress Scarlett Johansson (shown below): a front-facing portrait in a 1:1 aspect ratio in which she wears a white jacket.

For the second image (shown below), we took a random AI-generated model with short blue hair in a side pose. Upload the first (target) image into ControlNet0 and the second (reference) image into ControlNet1.

Use an image with a single face that is not blurry or obstructed; otherwise InstantID’s face detection will not work. Images with multiple faces also work, but only the largest face is detected.
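
To illustrate what “only the largest face” means, here is a small sketch that picks the largest bounding box from a list of detected faces. It is an illustration of the idea only, not InstantID’s actual code; the `(x1, y1, x2, y2)` box format and the function name are assumptions.

```python
def largest_face(boxes):
    """Return the (x1, y1, x2, y2) box with the largest area, or None if there are none."""
    def area(box):
        x1, y1, x2, y2 = box
        # Clamp to zero so malformed boxes never count as large.
        return max(0, x2 - x1) * max(0, y2 - y1)

    return max(boxes, key=area, default=None)


if __name__ == "__main__":
    # Two hypothetical detections: a small background face and a large foreground face.
    detections = [(10, 10, 60, 70), (100, 40, 400, 420)]
    print(largest_face(detections))
```

If the person you care about is not the largest face in the picture, crop the image first so that their face dominates the frame.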

Set the preprocessor resolution to a value between 512 and 1024.

Set the Stable Diffusion checkpoint (in our case, SDXL Turbo).

Set SD VAE to Automatic.

Enter positive and negative prompts that describe the style you want, whether that is anime, 2D art, or simply adding an object.

- Sampling method: DPM++ Karras

- CFG scale: 4

- Sampling steps: 8

- Control Weight for both ControlNets: 0.5

- Ending Control Step for both ControlNets: 0.5

These are the recommended settings; if you are not satisfied with the results, adjust them further.
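
If you drive Automatic1111 through its web API instead of the UI, the same settings can be collected into a request body. The sketch below only builds the dictionary and sends nothing. The field names reflect our understanding of the txt2img API and the ControlNet extension’s API and should be checked against your installed versions. The helper names are hypothetical. The sampler name is left as a parameter because the exact label depends on your installation, and the model names may need the hash suffix shown in your UI.

```python
import json


def controlnet_unit(module, model, image_b64=None,
                    weight=0.5, guidance_end=0.5, processor_res=512):
    """Describe one ControlNet unit (field names are assumed, see the note above)."""
    unit = {
        "enabled": True,
        "module": module,                # preprocessor
        "model": model,
        "weight": weight,                # Control Weight
        "guidance_end": guidance_end,    # Ending Control Step
        "processor_res": processor_res,  # preprocessor resolution, 512 to 1024
    }
    if image_b64 is not None:
        unit["image"] = image_b64        # base64-encoded input image
    return unit


def build_txt2img_payload(prompt, negative_prompt, sampler_name,
                          target_image_b64, reference_image_b64=None):
    """Collect the recommended settings into a txt2img request body."""
    units = [
        controlnet_unit("instant_id_face_embeddings",
                        "ip_adapter_instant_id_sdxl", target_image_b64),
        controlnet_unit("instant_id_face_keypoints",
                        "control_instant_id_sdxl", reference_image_b64),
    ]
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": sampler_name,
        "cfg_scale": 4,
        "steps": 8,
        "alwayson_scripts": {"controlnet": {"args": units}},
    }


if __name__ == "__main__":
    # Placeholder strings stand in for real base64 image data and a real sampler label.
    payload = build_txt2img_payload("wearing sunglasses", "", "<your sampler>",
                                    "<target image>", "<reference image>")
    print(json.dumps(payload, indent=2))
```

Keeping the two units in one helper makes it easy to try other Control Weight or Ending Control Step values while leaving the rest of the request unchanged.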

This is the target image.

This is the reference image for the pose.

We want to generate an image of Scarlett Johansson wearing sunglasses in the pose from the second image, so we used the positive prompt “wearing sunglasses”.

This is the result. The generated image is good and InstantID detected the face pose, but the shoulder is not at the same angle as in the reference image.

So we tried again and got the result we wanted: the face pose is detected correctly and the image is generated with sunglasses.

Now let’s try something different. We took an image of one of our X (Twitter) followers. This time we use only the target image, without a reference image, so we uploaded the target image into ControlNet0 and entered the prompts.

This is the uploaded target image.

This is the first result in different clothing.

Prompt used: pink short hair, blue eyes, flora tattoos on arms

This is the second result in different clothing.

Prompt used: pink short hair, blue eyes, flora tattoos on arms, grey background

This is the third result in 2d art.

Prompt used: pink short hair, blue eyes, flora tattoos on arms,2d art

Note: Please use this model responsibly and with the consent of the people depicted; otherwise you may face legal issues.

Extra Tips:

Here are some extra tips for using InstantID that can pay off in the long run.

A good base model produces noticeably better results with InstantID, so if generation is slow or the results are poor, try a different base model such as Turbo DiffusionXL Turbo (from CivitAI).

After a lot of testing, we concluded that setting the CFG scale lower than usual for the base model gives quite satisfactory results. Image resolution also matters a lot: you can see how the results differ as the resolution changes.

Sometimes the model takes longer than usual to generate an image. This happens because its cache is regenerated every time you generate an image. Setting the cache size to less than 1 fixes it.

InstantID currently relies on the InsightFace models, so for now it is not available for commercial use, although researchers and educators are welcome to use it.

Conclusion:

InstantID proved to be a powerful model when compared with alternatives such as Roop, ReActor, or LoRA models, where fine-tuning and training can be required to generate images in different poses.

InstantID can be used as an extension in Stable Diffusion web UIs such as Automatic1111 or ComfyUI to generate images in any style or pose.

admage