
Stable Diffusion 3: Install and Launch ComfyUI

After a long wait, Stable Diffusion 3 Medium, a model with 2 billion parameters, was released as an open model on June 12, 2024. For non-commercial use, you first need to accept Stability AI’s terms and conditions.

For commercial use, you get 23 free credits; each image generation costs 6.5 credits with SD3 and 4 credits with SD3 Turbo, and after that you need to pay. Right now only ComfyUI is supported, and StabilityAI does not appear to favor Automatic1111 anymore, so Automatic1111 users will need to wait some time for support.
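As a rough illustration of the credit figures above, the number of images the free credits cover can be worked out as follows. The names in this sketch are ours, and the prices are simply the ones quoted in this article, so treat them as assumptions that may change.

```python
import math

# Figures quoted in this article; treat them as assumptions that may change.
FREE_CREDITS = 23
COST_PER_IMAGE = {"SD3": 6.5, "SD3 Turbo": 4.0}


def images_covered(credits: float, cost_per_image: float) -> int:
    """Return how many whole images a credit balance can pay for."""
    return math.floor(credits / cost_per_image)


if __name__ == "__main__":
    for model, cost in COST_PER_IMAGE.items():
        print(f"{model}: {images_covered(FREE_CREDITS, cost)} free images")
```

With these figures, the free credits cover 3 images with SD3 or 5 images with SD3 Turbo.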

You can learn more about this model in our Stable Diffusion 3 tutorial, where we compared it with other image generation models such as Midjourney V6, DALL·E 3, Stable Diffusion XL 1.0, and Stable Cascade.

Now, let’s start the step-by-step installation process.

Installing ComfyUI locally:

1. Install ComfyUI on your machine and learn the basics of ComfyUI.

2. Now download the Stable Diffusion 3 Medium model from StabilityAI’s official Hugging Face repository. To do this, you have to create a Hugging Face account and accept the terms and conditions for non-commercial usage.

For better optimization with NVIDIA TensorRT, you can download the optimized version of the model from its official repository. Here, we are downloading the general one.

For commercial purposes, it’s recommended to obtain StabilityAI’s creator license or enterprise license. After considerable confusion in the community, StabilityAI updated its community license, which can be accessed from StabilityAI’s page.

3. Move to the “Files and Versions” tab. Here, you will get four versions of the Stable Diffusion 3 model:

sd3_medium.safetensors (CLIP models not included)
sd3_medium_incl_clips.safetensors (CLIP models included)
sd3_medium_incl_clips_t5xxlfp16.safetensors (CLIP models and T5-XXL in fp16 included)
sd3_medium_incl_clips_t5xxlfp8.safetensors (CLIP models and T5-XXL in fp8 included)

The first file listed above doesn’t include the text encoders (clip_g.safetensors, clip_l.safetensors, the fp8 T5-XXL encoder, etc.), which need to be downloaded from the “text encoders” folder of the repository.

Save them inside the “ComfyUI/models/clip” folder.

The other three files already include the clip models, so you don’t need to download them separately. Which file you use depends on your preference, the file size, and your requirements. The basic “sd3_medium.safetensors” works very well, but for illustration we are downloading all of them one by one.

4. After downloading the required models, save them inside the “ComfyUI/models/checkpoints” folder.

5. Now download the ComfyUI workflows (.json files) from the “comfy_example_workflows” folder of the repository and drag and drop them onto the ComfyUI canvas.

You can also get ideas for Stable Diffusion 3 prompts from “sd3_demo_prompt.txt” inside the repository.

6. Finally, if some nodes appear in red as missing, open the ComfyUI Manager and select “Install missing custom nodes” to install them, then “Restart” and “Refresh” ComfyUI for the changes to take effect.
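The download and placement in steps 2 to 4 can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed, that the repository ID is `stabilityai/stable-diffusion-3-medium`, that you have already accepted the terms on the model page, and that you pass your own access token. The helper names `destination_for` and `download_sd3_file` are ours, and the rule for recognizing text encoder files by their name prefix is an assumption.

```python
import shutil
from pathlib import Path

REPO_ID = "stabilityai/stable-diffusion-3-medium"  # assumed repository ID


def destination_for(filename: str, comfy_root: Path) -> Path:
    """Map a repository file to the ComfyUI folder described in the steps above."""
    name = Path(filename).name
    # Assumption: text encoder files start with "clip_" or "t5xxl";
    # everything else is treated as a checkpoint.
    is_text_encoder = name.startswith(("clip_", "t5xxl"))
    folder = "clip" if is_text_encoder else "checkpoints"
    return comfy_root / "models" / folder / name


def download_sd3_file(filename: str, comfy_root: Path, token: str) -> Path:
    """Download one file from the gated repository and copy it into ComfyUI."""
    # Imported here so the mapping helper works without the package installed.
    from huggingface_hub import hf_hub_download

    cached = hf_hub_download(repo_id=REPO_ID, filename=filename, token=token)
    target = destination_for(filename, comfy_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(cached, target)  # copies out of the local Hugging Face cache
    return target
```

For example, `download_sd3_file("sd3_medium.safetensors", Path("ComfyUI"), token)` would place the checkpoint in “ComfyUI/models/checkpoints”. Note that copying keeps a second copy of the file in the Hugging Face cache, which matters for files of this size.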

On the first run, you will get a “Prompt outputs failed validation” error. This means you have not chosen your Stable Diffusion 3 model as the checkpoint.

Simply navigate to the Load checkpoint node and select your downloaded SD3 model.

Use the “TripleCLIPLoader” node to load the downloaded clip models.

By default, the seed is set to fixed. For random output, choose “randomize” from the drop-down list.

Set the image dimensions in the “EmptySD3LatentImage” node. Put your positive and negative prompts into “CLIP Text Encode (Prompt)” and “CLIP Text Encode (Negative Prompt)” respectively.

As instructed by Emad Mostaque (Former CEO of StabilityAI), use these settings:

- Sampler: DPMPP_2M (not Karras, etc.)

- Steps: 32

- CFG: 4-6

As usual, the KSampler node is where we set the sampling steps, CFG, sampler name, scheduler, and denoise value. After the image has rendered, right-click on it if you want to save it.
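The recommended sampler settings above can also be applied to a workflow that has been exported in ComfyUI’s API format, which is a JSON object mapping node IDs to a `class_type` and its `inputs`. This is a minimal sketch under that assumption. The helper name `apply_sampler_settings` is ours, and the `normal` scheduler is our own choice of a non-Karras scheduler, not part of the quoted recommendation.

```python
import copy


def apply_sampler_settings(workflow: dict, steps: int = 32, cfg: float = 5.0,
                           sampler_name: str = "dpmpp_2m",
                           scheduler: str = "normal") -> dict:
    """Return a copy of an API-format workflow with every KSampler node updated."""
    if not 4.0 <= cfg <= 6.0:
        raise ValueError("CFG is outside the 4-6 range suggested above")
    updated = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in updated.values():
        if node.get("class_type") == "KSampler":
            node["inputs"].update(
                steps=steps,
                cfg=cfg,
                sampler_name=sampler_name,
                scheduler=scheduler,  # assumption: any non-Karras scheduler
            )
    return updated


if __name__ == "__main__":
    demo = {"3": {"class_type": "KSampler",
                  "inputs": {"steps": 20, "cfg": 8.0, "sampler_name": "euler",
                             "scheduler": "karras", "denoise": 1.0}}}
    print(apply_sampler_settings(demo)["3"]["inputs"])
```

Other inputs of the node, such as the seed and denoise value, are left as they were in the exported workflow.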

Prompt: “Good Morning” written with foam in coffee cup, clear, high details

The results are not cherry-picked, though we did make multiple attempts to get a good result. Some of the results we generated were not as satisfying as claimed, but it’s too early to judge, because the model is still in development and more fine-tuned versions are on the way.

We are using an NVIDIA RTX 4090, and rendering took 6 seconds.

To save the image, create a new “Save Image” node by double-clicking on a clean area of the canvas, and connect it to the “VAE Decode” node.

Installing using API in ComfyUI:

1. First, you should have ComfyUI installed on your machine.

2. Move to the “ComfyUI\custom_nodes” folder. Click in the address bar, remove the folder path, and type “cmd” to open the command prompt.

3. Copy the command below, which clones the GitHub repository onto your machine. This will download the Stable Diffusion 3 API custom node to your machine:

git clone https://github.com/ZHO-ZHO-ZHO/ComfyUI-StableDiffusion3-API.git

4. Paste the copied command and press Enter to start the installation. Wait a moment for the custom node to be installed, then restart ComfyUI.

5. Now, the most important part: create an account with your email on StabilityAI’s official page and create your API key.

6. Move inside the “ComfyUI\custom_nodes\ComfyUI-StableDiffusion3-API” folder. Copy your API key from StabilityAI’s dashboard and paste it into the “config.json” file using any editor (Notepad++ or Notepad, for example). We are using VS Code to edit it.

When editing, place your API key properly inside the double quotation marks; otherwise it will not work and you will get a 401 credentials error. We have noticed many people facing this issue. Then save the file and restart ComfyUI for the change to take effect.

The API key shown above is just an example for illustration purpose and not the actual API key, so don’t use it.
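Quoting mistakes in “config.json” can be avoided by letting a JSON library write the file. The sketch below assumes the file holds a single JSON object with exactly one field for the key. Since the field name is not stated here, the helper reuses whatever single field the file already contains. The helper name `set_api_key` is ours.

```python
import json
from pathlib import Path


def set_api_key(config_path: Path, api_key: str) -> str:
    """Store api_key in the only field of config.json and return that field's name."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if not isinstance(config, dict) or len(config) != 1:
        raise ValueError("expected a JSON object with exactly one field")
    field = next(iter(config))
    # Strip stray whitespace that often comes along when copying from a dashboard.
    config[field] = api_key.strip()
    # json.dumps adds the surrounding double quotes and any escaping for us.
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return field
```

Run it once with the path to the file and your own key, then restart ComfyUI as described above.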

Basic Workflow:

To create the SD3 node, double-click on a clear area of the canvas and search for “stable diffusion 3” in the search bar. Select it to add it as a new node, then connect it to the Save Image node. Enter your prompts and the other required parameters, and click “Queue Prompt” to generate.

Stable Diffusion 3 node features:

Prompt box- You get two boxes, for positive and negative conditioning, where you put your positive and negative prompts (negative prompts are not supported by the SD3 Turbo model).
Aspect Ratio- You can generate an image in these ratios: “21:9”, “16:9”, “5:4”, “3:2”, “1:1”, “2:3”, “4:5”, “9:16” and “9:21”.
Mode- This option is for working with different types of workflows. Right now the model is in the development stage, so only the text-to-image mode works.
Model- Two models are available: Stable Diffusion 3 and Stable Diffusion 3 Turbo.
Seed- Set any number as your seed value for generation.
Control After Generate- This sets how the seed changes between runs: fixed, increment, decrement, or randomize.
Strength (optional)- The default is 1.0; change it if you need to.
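To make the options above concrete, here is a small sketch that checks a set of parameters against that list before you generate. The function name, the field names, and the model identifiers `sd3` and `sd3-turbo` are hypothetical; they only mirror the node’s labels and are not taken from the node’s source code.

```python
ASPECT_RATIOS = ("21:9", "16:9", "5:4", "3:2", "1:1", "2:3", "4:5", "9:16", "9:21")
MODELS = ("sd3", "sd3-turbo")  # hypothetical identifiers for the two models


def build_sd3_params(prompt: str, negative_prompt: str = "",
                     aspect_ratio: str = "1:1", model: str = "sd3",
                     seed: int = 0) -> dict:
    """Validate the node options listed above and return them as a dict."""
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    params = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "model": model,
        "seed": seed,
        "mode": "text-to-image",  # the only mode reported as working
    }
    # Negative prompts are not supported by the Turbo model, so drop them there.
    if negative_prompt and model != "sd3-turbo":
        params["negative_prompt"] = negative_prompt
    return params
```

Dropping the negative prompt silently for the Turbo model is one possible design choice; raising an error instead would make the limitation more visible.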

Some Examples:

Here are some images generated using Stable Diffusion 3. We tested different typography styles, and the results were quite interesting.

Prompt: a beautiful girl, holding a cardboard written “SD3”, low lighting effect, 8k

Prompt: A building painted with “SD3” written, on display billboard, night city, 8k

Prompt: a beautiful girl standing, near neon board, written “Stable Diffusion”, red lightings, uhd, 8k

The typography is really great with Stable Diffusion 3. The detailing is good and the colors are rich, but fingers are again a problem. To get correct fingers, we had to generate multiple times.

Conclusion:

Compared to other diffusion models, Stable Diffusion 3 generates more refined results. But it’s too early to say that it is an improved model overall, because people are complaining about bad generations. Remember that other popular models like Stable Diffusion 1.5 and Stable Diffusion XL were not perfect in their early stages either.

The most important improvements are in typography and prompt understanding. The model has been released openly, and we can now access it for non-commercial purposes.

admage