Forge WebUI: Boost Your Speed by 6x

How to install Forge WebUI


Forge for Stable Diffusion has been released. It is built on top of Automatic1111, which is based on the Gradio Python library. The name is inspired by “Minecraft Forge”, the popular modding platform for Minecraft.

The developer of Forge (who also created Fooocus and ControlNet) has promised that this WebUI will eventually be turned into an extension for Automatic1111, so that you can enable it as an optional feature with one click.

Let’s see how to install Forge WebUI, and then compare it with Automatic1111 in tests on several NVIDIA GPUs.

Comparison between different GPU benchmarks:

| GPU Type | VRAM | Inference Speedup | GPU Memory Peak Drop | Max Diffusion Resolution Increase | Max Diffusion Batch Size Increase |
|---|---|---|---|---|---|
| Common (8GB VRAM) | 8GB | 30~45% | 700MB to 1.3GB | 2x to 3x | 4x to 6x |
| Less Powerful (6GB VRAM) | 6GB | 60~75% | 800MB to 1.5GB | 3x | 4x |
| Powerful (24GB VRAM) | 24GB | 3~6% | 1GB to 1.4GB | 1.6x | 2x |
| ControlNet (SDXL) | N/A | 30~45% (with SDXL) | N/A | N/A | 2x (ControlNet count) |

These are the results reported on the official Forge page; we will test with different GPUs to see how much of an improvement we actually get. But first, let’s do the installation.

Installing Forge WebUI:

The steps are grouped by type of user, so you can go straight to the part of the tutorial that fits your needs.

A. Upgrade Automatic1111 to Forge UI

If you already use Automatic1111 and want to take advantage of Forge, follow these steps to upgrade.
1. Download the package (zip file) provided by the official GitHub repository and extract it with 7-Zip or WinRAR.
This package bundles Python, Git, CUDA 12.1, PyTorch 2.3.1, and the other dependencies.
2. We are using WinRAR for extraction. After extraction, open the “webui_forge…” folder and double-click the “update.bat” file to download and update the necessary files. You can follow the progress in the Command Prompt window.

It’s recommended to update first; otherwise you may end up running outdated files with known bugs, which can cause multiple errors.

3. When the update-complete message appears in the Command Prompt, the upgrade is done. Close the Command Prompt and double-click the “run.bat” file to launch Forge WebUI in your browser.
4. Set up the webui-user.bat file: right-click it, open it for editing, and set the path to your existing “Automatic1111” directory. To enable dark mode, add “--theme dark” to the command-line arguments.

Pay attention to ‘\’ (backslash) versus ‘/’ (forward slash) when copying and pasting the directory path.

5. Open Forge UI by copying and pasting the local URL into the browser.
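
Mixed-up slashes are a common reason why a pasted directory path (step 4) is not recognized. As a quick check outside the WebUI, a small Python sketch like the one below can clean up a pasted Windows path. The function name and the folder used here are hypothetical examples, not part of Forge.

```python
from pathlib import PureWindowsPath


def normalize_windows_path(raw: str) -> str:
    """Strip surrounding spaces and quotes, then return the path with backslashes."""
    cleaned = raw.strip().strip('"')
    return str(PureWindowsPath(cleaned))


def to_forward_slashes(raw: str) -> str:
    """Same cleanup, but return the path with forward slashes."""
    cleaned = raw.strip().strip('"')
    return PureWindowsPath(cleaned).as_posix()


# Hypothetical example path; replace it with your own Automatic1111 folder.
pasted = '"C:/Users/example/stable-diffusion-webui"'
print(normalize_windows_path(pasted))  # C:\Users\example\stable-diffusion-webui
print(to_forward_slashes(pasted))      # C:/Users/example/stable-diffusion-webui
```

PureWindowsPath only manipulates the text of the path, so this works on any operating system and does not check whether the folder actually exists.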

If you run into errors with this process, try the fresh installation described below.

B. Install Forge WebUI directly

If you are a new user who hasn’t installed Automatic1111 and wants to use Forge WebUI directly, follow this workflow.
1. Download the package (zip file) provided by the official GitHub repository and extract it with a tool such as 7-Zip or WinRAR. The package bundles all the dependencies, including Python, Git, CUDA 12.1, and PyTorch 2.3.1.
2. After extraction, open the “webui_forge…” folder and double-click the “update.bat” file. This performs the necessary installation in the background; wait for it to complete.
You can follow the progress in real time in the Command Prompt window.
3. When the installation finishes, you will see the “update complete” message. Close the Command Prompt and double-click the “run.bat” file to launch Forge WebUI in your browser.

C. Run both Automatic1111 and Forge

Before using either method, make sure you have Python and Git installed and are comfortable using them.
Method 1:
This method lets you switch an existing Automatic1111 installation to Forge and back again.
1. Open the “stable-diffusion-webui” folder, click the address bar in File Explorer, type cmd, and press Enter to open a Command Prompt in that folder.
Now run these Git commands one at a time by copying and pasting them into the Command Prompt:

git remote add forge https://github.com/lllyasviel/stable-diffusion-webui-forge
git branch lllyasviel/main
git checkout lllyasviel/main
git fetch forge
git branch -u forge/main
git pull
This adds Forge as an additional Git remote and switches your existing “stable-diffusion-webui” folder to a branch that tracks Forge’s main branch, so you can keep using the checkpoints and extensions you previously installed for Automatic1111.
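
If you prefer to script the switch rather than paste each command, the same sequence can be wrapped in Python. This is only a sketch under stated assumptions: the function names are hypothetical, the folder name is an example, and by default it only prints the commands (a dry run) so you can review them before anything is executed.

```python
import subprocess

FORGE_URL = "https://github.com/lllyasviel/stable-diffusion-webui-forge"


def forge_switch_commands(remote_name="forge", local_branch="lllyasviel/main"):
    """Build the Git commands from this tutorial as argument lists."""
    return [
        ["git", "remote", "add", remote_name, FORGE_URL],
        ["git", "branch", local_branch],
        ["git", "checkout", local_branch],
        ["git", "fetch", remote_name],
        ["git", "branch", "-u", f"{remote_name}/main"],
        ["git", "pull"],
    ]


def switch_to_forge(repo_dir, dry_run=True):
    """Print each command; run it inside repo_dir only when dry_run is False."""
    for cmd in forge_switch_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, cwd=repo_dir, check=True)


# Dry run: prints the six commands without touching any repository.
switch_to_forge("stable-diffusion-webui")
```

Calling switch_to_forge with dry_run=False would execute the commands in the given folder, so back up your installation before trying that.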
To switch back to Automatic1111, use one of these commands, depending on the branch you were using:
git checkout master 
or 
git checkout dev

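The following commands change into the sd-webui-animatediff extension folder and check out its forge/master branch; they apply only if you have that extension installed:
cd extensions/sd-webui-animatediff
git checkout forge/master

Method 2: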

You can also run both by using a symbolic link (symlink) so that the two installations share the same models. Open the Command Prompt; mklink usually needs to be run as administrator.
Let’s assume you have Forge in C:\Users\example\Desktop\forge and A1111 in C:\Users\example\Desktop\a1111.
If there is a ‘models’ directory inside C:\Users\example\Desktop\forge, delete it first (move any files you want to keep elsewhere).
Now run this command in the Command Prompt (change the paths to match your setup):
mklink /d C:\Users\example\Desktop\forge\models C:\Users\example\Desktop\a1111\models
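
The same link can also be created from Python, which may be handy if you want a check that refuses to overwrite an existing ‘models’ folder. This is a minimal sketch, not part of Forge: the function name is hypothetical, and the demonstration runs in a throwaway temporary folder instead of your real installation. On Windows, creating symlinks may still require administrator rights or Developer Mode.

```python
import os
import tempfile
from pathlib import Path


def link_models(forge_dir: Path, a1111_dir: Path) -> Path:
    """Make forge_dir/models a directory symlink pointing at a1111_dir/models."""
    link = forge_dir / "models"
    target = a1111_dir / "models"
    if link.exists() or link.is_symlink():
        raise FileExistsError(f"{link} already exists; remove it first")
    # target_is_directory only matters on Windows and is ignored elsewhere.
    os.symlink(target, link, target_is_directory=True)
    return link


# Demonstration with a hypothetical layout inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    forge = Path(tmp) / "forge"
    a1111 = Path(tmp) / "a1111"
    forge.mkdir()
    (a1111 / "models").mkdir(parents=True)
    (a1111 / "models" / "example.safetensors").touch()
    link = link_models(forge, a1111)
    # The file stored under a1111 is now visible through the Forge folder.
    print(sorted(p.name for p in link.iterdir()))
```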

Testing with Forge and Automatic1111

Let’s compare the two WebUIs on different machines, using the same prompt on both for each machine. We used the SDXL base model to generate 1024 by 1024 images. After multiple test runs, these are our results:

  1. Tested with RTX3050 4GB VRAM
     
    Prompt: a girl with a neon rainbow colored hair holding a black cat, realistic anime, uhd image, cartoonish characters, violet and azure, nightcore, depth of layers, cartoon-like characters, rainbow colors, hyper realism , 8k, octane render 

    [Image: generated using Automatic1111]

    [Image: generated using Forge WebUI]

    Generation time on Automatic1111: 10 min
    Generation time on Forge: 2 min

  2. Tested with RTX3070 8GB VRAM

    Prompt: front profile full body Photography, in front of black wall, a punk 80’s British model woman with 50’s haircut, in a blue and white turtleneck dress and large sunglasses, 80 degree view, art by Sergio Lopez , Natalie Shau, james jean and salvador dali

    [Image: generated using Automatic1111]

    [Image: generated using Forge]

    Generation time on Automatic1111: 10 sec
    Generation time on Forge: 5 sec

  3. Tested with RTX3060 6GB VRAM

    Prompt: statue of liberty made up of  ice in northlights heaven illustration, portrait, hyper realistic , ultra detailed, 8k

    [Image: Statue of Liberty, generated using Forge]

    [Image: Statue of Liberty, generated using Automatic1111]

    Generation time on Automatic1111: 23 sec
    Generation time on Forge: 15 sec

  4. Tested with GTX 1070 6GB VRAM

    Prompt: girl wearing headphones, super rough abstract aquarell amazing neon painting style on black 

    [Image: a girl wearing headphones, generated using Automatic1111]

    [Image: a girl wearing headphones, generated using Forge]

    Generation time on Automatic1111: 55 sec
    Generation time on Forge: 15 sec


  5. Tested with RTX4060 16GB VRAM

    Prompt: a beautiful indian bride, wearing heavy glass makeup, brown eyes, intricate red sari, portrait, shining golden jewellery, professional photoshoot, low angle photography, 8k,highly detailed, octane render

    [Image: Indian bride photoshoot, generated using Automatic1111]

    [Image: Indian bride photoshoot, generated using Forge]

    Generation time on Automatic1111: 15.5 sec
    Generation time on Forge: 14.5 sec


Performance comparison with Automatic1111 and Forge
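In our tests Forge was faster on every machine. The gain was largest on the low-VRAM cards (10 min down to 2 min on the 4GB card) and almost negligible on the 16GB card (15.5 sec versus 14.5 sec). We also noticed that SDXL models benefit much more than SD1.5 models when it comes to VRAM allocation.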
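
To put the timings in perspective, the speedup for each test is simply the Automatic1111 time divided by the Forge time. The short Python sketch below does this arithmetic with the numbers reported in the tests; the dictionary labels are just shorthand for the five test machines.

```python
# Generation times from the five tests, in seconds: (Automatic1111, Forge).
timings = {
    "Test 1 (4GB VRAM)": (10 * 60, 2 * 60),
    "Test 2 (8GB VRAM)": (10, 5),
    "Test 3 (6GB VRAM)": (23, 15),
    "Test 4 (6GB VRAM)": (55, 15),
    "Test 5 (16GB VRAM)": (15.5, 14.5),
}


def speedup(a1111_seconds: float, forge_seconds: float) -> float:
    """How many times faster Forge was than Automatic1111."""
    return a1111_seconds / forge_seconds


for label, (a1111, forge) in timings.items():
    print(f"{label}: {speedup(a1111, forge):.2f}x faster on Forge")
```

This gives roughly 5x, 2x, 1.5x, 3.7x, and 1.07x for the five tests. These are single measurements on different prompts and machines, so treat them as rough indications rather than a benchmark.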

Conclusion:

Forge WebUI is a life-saver for people who were struggling to run SDXL models in Automatic1111 with low VRAM. It also includes Stable Video Diffusion and Zero123 tabs. However, you will lose some extensions such as Roop, although ReActor can be used instead.