Major changes to the wiki to improve clarity, compared to the README and the separate features repository.
16
Change-UI-Defaults.md
Normal file
@@ -0,0 +1,16 @@
The default values in the web UI can be changed by editing `ui-config.json`, which appears in the base directory containing `webui.py` after the first run.

The changes are only applied after restarting the web UI.

```json
{
    "txt2img/Sampling Steps/value": 20,
    "txt2img/Sampling Steps/minimum": 1,
    "txt2img/Sampling Steps/maximum": 150,
    "txt2img/Sampling Steps/step": 1,
    "txt2img/Batch count/value": 1,
    "txt2img/Batch count/minimum": 1,
    "txt2img/Batch count/maximum": 32,
    "txt2img/Batch count/step": 1,
    "txt2img/Batch size/value": 1,
    "txt2img/Batch size/minimum": 1,
    ...
```
@@ -1,6 +1,6 @@
To install custom scripts, drop them into the `scripts` directory and restart the web UI.

-## Advanced prompt matrix
+# Advanced prompt matrix
https://github.com/GRMrGecko/stable-diffusion-webui-automatic/blob/advanced_matrix/scripts/advanced_prompt_matrix.py

It allows a matrix prompt as follows:

@@ -8,7 +8,7 @@ It allows a matrix prompt as follows:

Does not actually draw a matrix, just produces pictures.

-## Wildcards
+# Wildcards
https://github.com/jtkelm2/stable-diffusion-webui-1/blob/master/scripts/wildcards.py

Script support so that prompts can contain wildcard terms (indicated by surrounding double underscores), with values instantiated randomly from the corresponding `.txt` file in the folder `/scripts/wildcards/`. For example:
35
Dependencies.md
Normal file
@@ -0,0 +1,35 @@
# Required Dependencies
1. Python 3.10.6 and Git:
   - Windows:
     - [Python](https://www.python.org/downloads/windows/)
     - [Git](https://git-scm.com)
   - Linux (Debian-based):
     ```bash
     sudo apt install wget git python3 python3-venv
     ```
   - Linux (Red Hat-based):
     ```bash
     sudo dnf install wget git python3
     ```
   - Linux (Arch-based):
     ```bash
     sudo pacman -S wget git python3
     ```
2. The stable-diffusion-webui code may be cloned by running:
   ```bash
   git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
   ```
3. The Stable Diffusion model checkpoint `model.ckpt` needs to be placed in the base directory, alongside `webui.py`:
   - [Official download](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original)
   - [File storage](https://drive.yerf.org/wl/?id=EBfTrmcCCUAGaQBXVIj5lJmEhjoP1tgl)
   - Torrent (magnet:?xt=urn:btih:3a4a612d75ed088ea542acac52f9f45987488d1c&dn=sd-v1-4.ckpt&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337)

# Optional Dependencies
## GFPGAN (Improve Faces)
GFPGAN can be used to improve faces, requiring the [model](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth) to be placed in the base directory.

## ESRGAN (Upscaling)
ESRGAN models, such as those from the [Model Database](https://upscale.wiki/wiki/Model_Database), may be placed into the `ESRGAN` directory.
A file will be loaded as a model if it has a `.pth` extension, and it will show up with its name in the UI.
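As a sketch, assuming a model file named `4x_foolhardy_Remacri.pth` downloaded from the database (the filename is only an example):

```bash
# create the ESRGAN directory next to webui.py and move the model into it
mkdir -p ESRGAN
mv ~/Downloads/4x_foolhardy_Remacri.pth ESRGAN/
```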

> Note: RealESRGAN models are not ESRGAN models; they are not compatible. Do not download RealESRGAN models, and do not place RealESRGAN models into the directory with ESRGAN models.
347
Features.md
Normal file
@@ -0,0 +1,347 @@
This is a feature showcase page for [Stable Diffusion web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui).

All examples are non-cherrypicked unless specified otherwise.

# Outpainting

Outpainting extends the original image and inpaints the created empty space.

Example:

| Original | Outpainting | Outpainting again |
|------------------------------|------------------------------|------------------------------|
| ![](images/outpainting-1.png) | ![](images/outpainting-2.png) | ![](images/outpainting-3.png) |

Original image by Anonymous user from 4chan. Thank you, Anonymous user.

You can find the feature in the img2img tab at the bottom, under Script -> Poor man's outpainting.

Outpainting, unlike normal image generation, seems to benefit very much from a large step count. A recipe for good outpainting
is a good prompt that matches the picture, sliders for denoising and CFG scale set to max, and a step count of 50 to 100 with
the Euler ancestral or DPM2 ancestral samplers.

| 81 steps, Euler A | 30 steps, Euler A | 10 steps, Euler A | 80 steps, DPM2 A |
|-------------------------------------|---------------------------------------|--------------------------------------|-------------------------------------|
| ![](images/inpainting-81-euler-a.png) | ![](images/inpainting-30-euler-a.png) | ![](images/inpainting-10-euler-a.png) | ![](images/inpainting-80-dpm2-a.png) |

# Inpainting
In the img2img tab, draw a mask over a part of the image, and that part will be in-painted.

![](images/inpainting.png)

Options for inpainting:
- draw a mask yourself in the web editor
- erase a part of the picture in an external editor and upload a transparent picture. Any even slightly transparent areas will become part of the mask. Be aware that [some editors](https://docs.krita.org/en/reference_manual/layers_and_masks/split_alpha.html#how-to-save-a-png-texture-and-keep-color-values-in-fully-transparent-areas) save completely transparent areas as black by default.
- change the mode (to the bottom right of the picture) to "Upload mask" and choose a separate black and white image for the mask (white=inpaint).

## Masked content
The masked content field determines what content is placed into the masked regions before they are inpainted.

| mask | fill | original | latent noise | latent nothing |
|-------------------------------------------------|-------------------------------------------------|-----------------------------------------------------|---------------------------------------------------------|-----------------------------------------------------------|
| ![](images/inpainting-initial-content-mask.png) | ![](images/inpainting-initial-content-fill.png) | ![](images/inpainting-initial-content-original.png) | ![](images/inpainting-initial-content-latent-noise.png) | ![](images/inpainting-initial-content-latent-nothing.png) |

## Inpaint at full resolution
Normally, inpainting resizes the image to the target resolution specified in the UI. With Inpaint at full resolution
enabled, only the masked region is resized, and after processing it is pasted back into the original picture.
This allows you to work with large pictures, and to render the inpainted object at a much larger resolution.

| Input | Inpaint normal | Inpaint at full resolution |
|-------------------------------------|----------------------------------|-----------------------------------|
| ![](images/inpaint-whole-mask.png) | ![](images/inpaint-whole-no.png) | ![](images/inpaint-whole-yes.png) |

## Masking mode
There are two options for masking mode:
- Inpaint masked - the region under the mask is inpainted
- Inpaint not masked - the region under the mask is unchanged, and everything else is inpainted

## Alpha mask

| Input | Output |
|------------------------------|-------------------------------|
| ![](images/inpaint-mask.png) | ![](images/inpaint-mask2.png) |

# Prompt matrix
Separate multiple prompts using the `|` character, and the system will produce an image for every combination of them.
For example, if you use the prompt `a busy city street in a modern city|illustration|cinematic lighting`, four combinations are possible (the first part of the prompt is always kept):

- `a busy city street in a modern city`
- `a busy city street in a modern city, illustration`
- `a busy city street in a modern city, cinematic lighting`
- `a busy city street in a modern city, illustration, cinematic lighting`

Four images will be produced, in this order, all with the same seed and each with the corresponding prompt:
![](images/prompt-matrix.png)

Another example, this time with 5 prompts and 16 variations:
![](images/prompt_matrix.jpg)

You can find the feature at the bottom, under Script -> Prompt matrix.

# Stable Diffusion upscale
Upscale the image using RealESRGAN/ESRGAN and then go through the tiles of the result, improving them with img2img.
There is also an option to let you do the upscaling part yourself in an external program, and just go through the tiles with img2img.

Original idea by: https://github.com/jquesnelle/txt2imghd. This is an independent implementation.

To use this feature, tick the checkbox in the img2img interface. The input image will be upscaled to twice the original
width and height, and the UI's width and height sliders specify the size of individual tiles. Because of overlap,
the size of a tile can be very important: a 512x512 image needs nine 512x512 tiles (because of overlap), but only
four 640x640 tiles.
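A minimal sketch of that arithmetic, assuming a tile overlap of 64 pixels (the overlap value is an assumption, not taken from this page):

```python
import math

def tiles_per_side(image_size, tile_size, overlap=64):
    # each tile overlaps its neighbour by `overlap` pixels,
    # so the effective stride is tile_size - overlap
    stride = tile_size - overlap
    return math.ceil((image_size - overlap) / stride)

upscaled = 512 * 2  # a 512x512 input is upscaled to 1024x1024
for tile in (512, 640):
    print(tile, tiles_per_side(upscaled, tile) ** 2)  # 512 -> 9, 640 -> 4
```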

Recommended parameters for upscaling:
- Sampling method: Euler a
- Denoising strength: 0.2, can go up to 0.4 if you feel adventurous

| Original | RealESRGAN | Topaz Gigapixel | SD upscale |
|-------------------------------------------|---------------------------------------------|---------------------------------------------------------|---------------------------------------------|
| ![](images/sd-upscale-robot-original.png) | ![](images/sd-upscale-robot-realesrgan.png) | ![](images/sd-upscale-robot-esrgan-topaz-gigapixel.png) | ![](images/sd-upscale-robot-sd-upscale.png) |
| ![](images/sd-upscale-castle-original.png) | ![](images/sd-upscale-castle-realesrgan.png) | ![](images/sd-upscale-castle-esrgan-topaz-gigapixel.png) | ![](images/sd-upscale-castle-sd-upscale.png) |
| ![](images/sd-upscale-city-original.png) | ![](images/sd-upscale-city-realesrgan.png) | ![](images/sd-upscale-city-esrgan-topaz-gigapixel.png) | ![](images/sd-upscale-city-sd-upscale.png) |

# Attention
Using `()` in the prompt increases the model's attention to the enclosed words, and `[]` decreases it. You can combine
multiple modifiers:

![](images/attention-3.jpg)

# Loopback
A checkbox for img2img that automatically feeds the output image as input for the next batch. Equivalent to
saving the output image and replacing the input image with it. The Batch count setting controls how many iterations of
this you get.

Usually, when doing this, you would choose one of many images for the next iteration yourself, so the usefulness
of this feature may be questionable, but I've managed to get some very nice outputs with it that I wasn't able
to get otherwise.

Example: (cherrypicked result)

![](images/loopback.jpg)

Original image by Anonymous user from 4chan. Thank you, Anonymous user.

# X/Y plot
Creates a grid of images with varying parameters. Select which parameters should be shared by rows and columns using
the X type and Y type fields, and input those parameters separated by commas into the X values/Y values fields. For integer
and floating point numbers, ranges are supported. Examples:

- `1-5` = 1, 2, 3, 4, 5
- `1-5 (+2)` = 1, 3, 5
- `10-5 (-3)` = 10, 7
- `1-3 (+0.5)` = 1, 1.5, 2, 2.5, 3

![](images/xy_grid-medusa.png)

Here are the settings that create the graph above:

![](images/xy_grid-medusa-ui.png)

# Textual Inversion
Allows you to use pretrained textual inversion embeddings.
See the original site for details: https://textual-inversion.github.io/.
I used lstein's repo for training embeddings: https://github.com/lstein/stable-diffusion; if
you want to train your own, I recommend following the guide on his site.

To make use of pretrained embeddings, create an `embeddings` directory in the root dir of Stable
Diffusion and put your embeddings into it. They must be .pt files, about 5 KB in size, each with only
one trained embedding, and the filename (without .pt) will be the term you'd use in the prompt
to get that embedding.

As an example, I trained one for about 5000 steps: https://files.catbox.moe/e2ui6r.pt; it does
not produce very good results, but it does work. Download and rename it to `Usada Pekora.pt`,
put it into the `embeddings` dir, and use Usada Pekora in your prompt.
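For instance, a sketch of those steps on Linux, using the example embedding above:

```bash
# from the web UI root directory: fetch the example embedding and give it
# the name that will be used as the term in the prompt
mkdir -p embeddings
wget -O "embeddings/Usada Pekora.pt" https://files.catbox.moe/e2ui6r.pt
```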

![](images/inversion.png)

# Resizing
There are three options for resizing input images in img2img mode:

- Just resize - simply resizes the source image to the target resolution, resulting in an incorrect aspect ratio
- Crop and resize - resize the source image preserving aspect ratio so that it occupies the entire target resolution, and crop parts that stick out
- Resize and fill - resize the source image preserving aspect ratio so that it fits entirely within the target resolution, and fill empty space with rows/columns from the source image

Example:
![](images/resizing.jpg)

# Sampling method selection
Pick from multiple sampling methods for txt2img:

![](images/sampling.jpg)

# Seed resize
This function allows you to generate images from known seeds at different resolutions. Normally, when you change the resolution,
the image changes entirely, even if you keep all other parameters including the seed. With seed resizing, you specify the resolution
of the original image, and the model will very likely produce something looking very similar to it, even at a different resolution.
In the example below, the leftmost picture is 512x512, and the others are produced with the exact same parameters but with a larger vertical
resolution.

| Info | Image |
|---------------------------|-------------------------------|
| Seed resize not enabled | ![](images/seed-noresize.png) |
| Seed resized from 512x512 | ![](images/seed-resize.png) |

Ancestral samplers are a little worse at this than the rest.

You can find this feature by clicking the "Extra" checkbox near the seed.

# Variations
A Variation strength slider and Variation seed field allow you to specify how much the existing picture should be altered to look
like a different one. At maximum strength, you will get a picture with the Variation seed; at minimum, a picture with the original Seed (except
when using ancestral samplers).

![](images/seed-variations.jpg)

You can find this feature by clicking the "Extra" checkbox near the seed.

# Styles
Press the "Save prompt as style" button to write your current prompt to styles.csv, the file holding your collection of styles. A dropdown to
the right of the prompt will allow you to choose any previously saved style and automatically append it to your input.
To delete a style, manually delete it from styles.csv and restart the program.

# Negative prompt

Allows you to use a second prompt of things the model should avoid when generating the picture. This works by using the
negative prompt for unconditional conditioning in the sampling process, instead of an empty string.

| Original | Negative: purple | Negative: tentacles |
|-------------------------------|---------------------------------|------------------------------------|
| ![](images/negative-base.png) | ![](images/negative-purple.png) | ![](images/negative-tentacles.png) |

# CLIP interrogator

Originally by: https://github.com/pharmapsychotic/clip-interrogator

The CLIP interrogator allows you to retrieve a prompt from an image. The prompt won't allow you to reproduce that
exact image (and sometimes it won't even be close), but it can be a good start.

![](images/CLIP-interrogate.png)

The first time you run the CLIP interrogator, it will download a few gigabytes of models.

The CLIP interrogator has two parts: one is a BLIP model that creates a text description from the picture.
The other is a CLIP model that picks a few lines relevant to the picture out of a list. By default, there
is only one list - a list of artists (from `artists.csv`). You can add more lists by doing the following:

- create an `interrogate` directory in the same place as the web UI
- put text files in it with a relevant description on each line

For examples of what text files to use, see https://github.com/pharmapsychotic/clip-interrogator/tree/main/data.
In fact, you can just take files from there and use them - just skip artists.txt, because you already have a list of
artists in `artists.csv` (or use that too, who's going to stop you). Each file adds one line of text to the final description.
If you add ".top3." to a filename, for example `flavors.top3.txt`, the three most relevant lines from this file will be
added to the prompt (other numbers also work).
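As a sketch, setting up the directory with two hypothetical category files (file names and contents are only illustrations):

```bash
mkdir interrogate
# one candidate description per line
printf 'oil painting\nwatercolor\n3d render\n' > interrogate/mediums.txt
# ".top3." in the filename asks for the three most relevant lines from this file
printf 'highly detailed\nminimalist\nsurreal\n' > interrogate/flavors.top3.txt
```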

There are settings relevant to this feature:
- `Interrogate: keep models in VRAM` - do not unload Interrogate models from memory after using them. For users with a lot of VRAM.
- `Interrogate: use artists from artists.csv` - adds artists from `artists.csv` when interrogating. Can be useful to disable when you have your own list of artists in the `interrogate` directory.
- `Interrogate: num_beams for BLIP` - parameter that affects how detailed the descriptions from the BLIP model are (the first part of the generated prompt)
- `Interrogate: minimum description length` - minimum length for the BLIP model's text
- `Interrogate: maximum description length` - maximum length for the BLIP model's text
- `Interrogate: maximum number of lines in text file` - the interrogator will only consider this many first lines in a file. If set to 0, the default of 1500 is used, which is about as much as a 4GB video card can handle.

# Interrupt

Press the Interrupt button to stop the current processing.

# 4GB videocard support
Optimizations for GPUs with low VRAM. This should make it possible to generate 512x512 images on video cards with 4GB memory.

`--lowvram` is a reimplementation of an optimization idea by [basujindal](https://github.com/basujindal/stable-diffusion).
The model is separated into modules, and only one module is kept in GPU memory; when another module needs to run, the previous
one is removed from GPU memory. The nature of this optimization makes processing run slower -- about 10 times slower
compared to normal operation on my RTX 3090.

`--medvram` is another optimization that should reduce VRAM usage significantly by not processing conditional and
unconditional denoising in the same batch.

This implementation of optimization does not require any modification to the original Stable Diffusion code.
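For example, launching with either flag (using the launch command shown elsewhere on this wiki):

```bash
python launch.py --medvram
# or, for the most aggressive memory savings:
python launch.py --lowvram
```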
# Face restoration
Lets you improve faces in pictures using either GFPGAN or CodeFormer. There is a checkbox in every tab to use face restoration,
and also a separate tab that just allows you to use face restoration on any picture, with a slider that controls how visible
the effect is. You can choose between the two methods in settings.

| Original | GFPGAN | CodeFormer |
|-------------------------|--------------------------------|------------------------------------|
| ![](images/facefix.png) | ![](images/facefix-gfpgan.png) | ![](images/facefix-codeformer.png) |

# Saving
Click the Save button under the output section, and generated images will be saved to a directory specified in settings;
generation parameters will be appended to a CSV file in the same directory.

# Correct seeds for batches
If you use a seed of 1000 to generate two batches of two images each, the four generated images will have seeds `1000, 1001, 1002, 1003`.
Previous versions of the UI would produce `1000, x, 1001, x`, where x is an image that can't be generated by any seed.

# Loading
Gradio's loading graphic has a very negative effect on the processing speed of the neural network.
My RTX 3090 makes images about 10% faster when the tab with Gradio is not active. By default, the UI
now hides the loading progress animation and replaces it with static "Loading..." text, which achieves
the same effect. Use the `--no-progressbar-hiding` commandline option to revert this and show loading animations.

# Prompt validation
Stable Diffusion has a limit on input text length. If your prompt is too long, you will get a
warning in the text output field, showing which parts of your text were truncated and ignored by the model.

# Png info
Adds information about generation parameters to the PNG as a text chunk. You
can view this information later using any software that supports viewing
PNG chunk info, for example: https://www.nayuki.io/page/png-file-chunk-inspector
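The same text chunks can also be read programmatically; a minimal sketch using Pillow (the filename is hypothetical):

```python
from PIL import Image

im = Image.open("00000-1000.png")  # hypothetical generated image
# Pillow exposes PNG text chunks as a dict of strings
for key, value in im.text.items():
    print(f"{key}: {value}")
```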
# Settings
A tab with settings, allowing you to use the UI to edit more than half of the parameters that previously
were commandline. Settings are saved to the config.js file. Settings that remain as commandline
options are the ones that are required at startup.

# User scripts
If the program is launched with the `--allow-code` option, an extra text input field for script code
is available at the bottom of the page, under Scripts -> Custom code. It allows you to input Python
code that will do the work with the image.

In code, access parameters from the web UI using the `p` variable, and provide outputs for the web UI
using the `display(images, seed, info)` function. All globals from the script are also accessible.

A simple script that would just process the image and output it normally:

```python
import modules.processing

processed = modules.processing.process_images(p)

print("Seed was: " + str(processed.seed))

display(processed.images, processed.seed, processed.info)
```
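A slightly extended sketch along the same lines, adjusting parameters before processing (the `prompt` and `steps` attributes on `p` are assumed here to mirror the UI fields):

```python
import modules.processing

# hypothetical tweaks: append style keywords to the prompt and override the step count
p.prompt = p.prompt + ", illustration, cinematic lighting"  # assumed attribute
p.steps = 20  # assumed attribute

processed = modules.processing.process_images(p)
display(processed.images, processed.seed, processed.info)
```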
# UI config
You can change parameters for UI elements:
- radio groups: default selection
- sliders: default value, min, max, step

The file is ui-config.json in the webui dir, and it is created automatically if you don't have one when the program starts.

Some settings will break processing, like a step not divisible by 64 for width and height, and some, like changing the default
function on the img2img tab, may break the UI. I do not have plans to address those in the near future.

# ESRGAN
It's possible to use ESRGAN models on the Extras tab, as well as in SD upscale.

To use ESRGAN models, put them into the ESRGAN directory in the same location as webui.py.
A file will be loaded as a model if it has a .pth extension. Grab models from the [Model Database](https://upscale.wiki/wiki/Model_Database).

Not all models from the database are supported. All 2x models are most likely not supported.

# img2img alternative test
- see [this post](https://www.reddit.com/r/StableDiffusion/comments/xboy90/a_better_way_of_doing_img2img_by_finding_the/) on reddit.com for context
- find it in the scripts section
- put a description of the input image into the Original prompt field
- use Euler only
- recommended: 50 steps, low cfg scale between 1 and 2
- denoising and seed don't matter
- decode cfg scale between 0 and 1
- decode steps 50
- the original blue-haired woman nearly reproduces with cfg scale=1.8
11
Home.md
@@ -1 +1,10 @@
Welcome to the stable-diffusion-webui wiki!
**Stable Diffusion web UI** is a browser interface for Stable Diffusion based on the Gradio library.

- [Features](Features)
- [Dependencies](Dependencies)
- [Installation and run on NVidia GPUs](Install-and-Run-on-NVidia-GPUs)
- [Installation and run on AMD GPUs](Install-and-Run-on-AMD-GPUs)
- [Running with custom parameters](Run-with-Custom-Parameters)
- [Changing UI defaults](Change-UI-Defaults)
- [Custom scripts from users](Custom-Scripts-from-Users)
- [Troubleshooting](Troubleshooting)
114
Install-and-Run-on-NVidia-GPUs.md
Normal file
@@ -0,0 +1,114 @@
Before attempting to install, make sure all the required [dependencies](Dependencies) are met.

# Automatic Installation
## Windows
Run `webui-user.bat` from Windows Explorer as a normal, ***non-administrator***, user.

## Linux
To install in the default directory `/home/$(whoami)/stable-diffusion-webui/`, run:
```bash
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)
```

In order to customize the installation, clone the repository into the desired location, change the required variables in `webui-user.sh`, and run:
```bash
bash webui.sh
```

# Almost Automatic Installation and Launch
To install the required packages via pip without creating a virtual environment, run:
```bash
python launch.py
```

Command line arguments may be passed directly, for example:
```bash
python launch.py --opt-split-attention --ckpt ../secret/anime9999.ckpt
```

# Manual Installation
The following process installs everything manually on both Windows and Linux (the latter requiring `dir` to be replaced by `ls`):
```bash
# install torch with CUDA support. See https://pytorch.org/get-started/locally/ for more instructions if this fails.
pip install torch --extra-index-url https://download.pytorch.org/whl/cu113

# check if torch supports GPU; this must output "True". You need CUDA 11 installed for this. You might be able to use
# a different version, but this is what I tested.
python -c "import torch; print(torch.cuda.is_available())"

# clone web ui and go into its directory
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

# clone repositories for Stable Diffusion and (optionally) CodeFormer
mkdir repositories
git clone https://github.com/CompVis/stable-diffusion.git repositories/stable-diffusion
git clone https://github.com/CompVis/taming-transformers.git repositories/taming-transformers
git clone https://github.com/sczhou/CodeFormer.git repositories/CodeFormer
git clone https://github.com/salesforce/BLIP.git repositories/BLIP

# install requirements of Stable Diffusion
pip install transformers==4.19.2 diffusers invisible-watermark --prefer-binary

# install k-diffusion
pip install git+https://github.com/crowsonkb/k-diffusion.git --prefer-binary

# (optional) install GFPGAN (face restoration)
pip install git+https://github.com/TencentARC/GFPGAN.git --prefer-binary

# (optional) install requirements for CodeFormer (face restoration)
pip install -r repositories/CodeFormer/requirements.txt --prefer-binary

# install requirements of web ui
pip install -r requirements.txt --prefer-binary

# update numpy to latest version
pip install -U numpy --prefer-binary

# (outside of command line) put stable diffusion model into web ui directory
# the command below must output something like: 1 File(s) 4,265,380,512 bytes
dir model.ckpt

# (outside of command line) put the GFPGAN model into web ui directory
# the command below must output something like: 1 File(s) 348,632,874 bytes
dir GFPGANv1.3.pth
```

The installation is finished. To start the web UI, run:
```bash
python webui.py
```

# Windows 11 WSL2 instructions
To install under a Linux distro in Windows 11's WSL2:
```bash
# install conda (if not already done)
wget https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh
chmod +x Anaconda3-2022.05-Linux-x86_64.sh
./Anaconda3-2022.05-Linux-x86_64.sh

# clone webui repo
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

# create and activate conda env
conda env create -f environment-wsl2.yaml
conda activate automatic

# (optional) download the GFPGAN model (face restoration)
wget https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth
```

At this point, the instructions for the Manual installation may be applied, starting at step `# clone repositories for Stable Diffusion and (optionally) CodeFormer`.

# Troubleshooting
- Although support will only be offered for Python 3.10.6, other versions should work.
- The installer creates a Python virtual environment, so none of the installed modules will affect existing system installations of Python.
- To use the system's Python rather than creating a virtual environment, use the custom parameter `set VENV_DIR=-`.
- To reinstall from scratch, delete the directories `venv` and `repositories`.

## Windows
- If the desired version of Python is not in PATH, modify the line `set PYTHON=python` in `webui-user.bat` with the full path to the python executable.
  - Example: `set PYTHON=B:\soft\Python310\python.exe`
  - This won't work with git.
- `webui.bat` installs requirements from `requirements_versions.txt`, which lists versions for modules specifically compatible with Python 3.10.6. If this doesn't work with other versions of Python, setting the custom parameter `set REQS_FILE=requirements.txt` may help.
19
Run-with-Custom-Parameters.md
Normal file
@@ -0,0 +1,19 @@
# webui-user
The recommended way to customize how the program is run is by editing `webui-user.bat` (Windows) or `webui-user.sh` (Linux):
- `set PYTHON` allows for setting a custom Python path
    - Example: `set PYTHON=b:/soft/Python310/Python.exe`
- `set VENV_DIR` allows for setting a custom virtual environment path:
    - Example: `set VENV_DIR=-` runs the program using the system's Python
- `set COMMANDLINE_ARGS` sets the command line arguments `webui.py` is run with
    - Example: `set COMMANDLINE_ARGS=--ckpt a.ckpt` uses the model `a.ckpt` instead of `model.ckpt`
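Put together, an illustrative `webui-user.bat` might look like this (the values are only examples, and the trailing `call webui.bat` line is an assumption about the stock file's layout):

```bat
@echo off

set PYTHON=b:/soft/Python310/Python.exe
set VENV_DIR=venv
set COMMANDLINE_ARGS=--medvram --ckpt a.ckpt

call webui.bat
```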

# Command Line Arguments
## Creating Large Images
Use the `--opt-split-attention` parameter. It slows down sampling a tiny bit, but allows you to make gigantic images.

## Running online
Use the `--share` option to run online. You will get a xxx.gradio.app link. This is the intended way to use the program in Colabs. You may set up authentication for the shared Gradio instance with the flag `--gradio-auth username:password`, optionally providing multiple sets of usernames and passwords separated by commas.

Use `--listen` to make the server listen to network connections. This will allow computers on the local network to access the UI, and if you configure port forwarding, also computers on the internet.

Use `--port xxxx` to make the server listen on a specific port, xxxx being the wanted port. Remember that all ports below 1024 need root/admin rights; for this reason it is advised to use a port above 1024. Defaults to port 7860 if available.
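For example, a sketch combining these options (the values are illustrative):

```bash
python launch.py --listen --port 8080 --gradio-auth user:password
```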
14
Troubleshooting.md
Normal file
@@ -0,0 +1,14 @@
# Low VRAM Video-cards
When running on video cards with a low amount of VRAM (<=4GB), out of memory errors may arise.
Various optimizations may be enabled through command line arguments, sacrificing some/a lot of speed in favor of using less VRAM (see the example after this list):
- If you have 4GB VRAM and want to make 512x512 (or maybe up to 640x640) images, use `--medvram`.
- If you have 4GB VRAM and want to make 512x512 images, but you get an out of memory error with `--medvram`, use `--medvram --opt-split-attention` instead.
- If you have 4GB VRAM and want to make 512x512 images, and you still get an out of memory error, use `--lowvram --always-batch-cond-uncond --opt-split-attention` instead.
- If you have 4GB VRAM and want to make images larger than you can with `--medvram`, use `--lowvram --opt-split-attention`.
- If you have more VRAM and want to make larger images than you can usually make (for example 1024x1024 instead of 512x512), use `--medvram --opt-split-attention`. You can also use `--lowvram`, but the effect will likely be barely noticeable.
- Otherwise, do not use any of those.
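For example, applying the second recipe above via `webui-user.bat` (see [Running with custom parameters](Run-with-Custom-Parameters)):

```bat
set COMMANDLINE_ARGS=--medvram --opt-split-attention
```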

# Green or Black screen
When running on video cards which don't support half precision floating point numbers (a known issue with 16xx cards), a green or black screen may appear instead of the generated pictures.
This may be fixed by using the command line arguments `--precision full --no-half`, at a significant increase in VRAM usage, which may require `--medvram`.
1
_Footer.md
Normal file
@@ -0,0 +1 @@
This is the _Stable Diffusion web UI_ wiki. [Wiki Home](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki)
BIN
images/CLIP-interrogate.png
Normal file
After Width: | Height: | Size: 183 KiB |
BIN
images/GFPGAN.png
Normal file
After Width: | Height: | Size: 677 KiB |
BIN
images/attention-3.jpg
Normal file
After Width: | Height: | Size: 944 KiB |
BIN
images/facefix-codeformer.png
Normal file
After Width: | Height: | Size: 164 KiB |
BIN
images/facefix-gfpgan.png
Normal file
After Width: | Height: | Size: 163 KiB |
BIN
images/facefix.png
Normal file
After Width: | Height: | Size: 150 KiB |
BIN
images/inpaint-mask.png
Normal file
After Width: | Height: | Size: 472 KiB |
BIN
images/inpaint-mask2.png
Normal file
After Width: | Height: | Size: 466 KiB |
BIN
images/inpaint-whole-mask.png
Normal file
After Width: | Height: | Size: 604 KiB |
BIN
images/inpaint-whole-no.png
Normal file
After Width: | Height: | Size: 431 KiB |
BIN
images/inpaint-whole-yes.png
Normal file
After Width: | Height: | Size: 431 KiB |
BIN
images/inpainting-10-euler-a.png
Normal file
After Width: | Height: | Size: 801 KiB |
BIN
images/inpainting-30-euler-a.png
Normal file
After Width: | Height: | Size: 835 KiB |
BIN
images/inpainting-80-dpm2-a.png
Normal file
After Width: | Height: | Size: 840 KiB |
BIN
images/inpainting-81-euler-a.png
Normal file
After Width: | Height: | Size: 838 KiB |
BIN
images/inpainting-initial-content-fill.png
Normal file
After Width: | Height: | Size: 364 KiB |
BIN
images/inpainting-initial-content-latent-noise.png
Normal file
After Width: | Height: | Size: 455 KiB |
BIN
images/inpainting-initial-content-latent-nothing.png
Normal file
After Width: | Height: | Size: 375 KiB |
BIN
images/inpainting-initial-content-mask.png
Normal file
After Width: | Height: | Size: 569 KiB |
BIN
images/inpainting-initial-content-original.png
Normal file
After Width: | Height: | Size: 428 KiB |
BIN
images/inpainting.png
Normal file
After Width: | Height: | Size: 1.3 MiB |
BIN
images/inversion.png
Normal file
After Width: | Height: | Size: 678 KiB |
BIN
images/loopback.jpg
Normal file
After Width: | Height: | Size: 465 KiB |
BIN
images/negative-base.png
Normal file
After Width: | Height: | Size: 1.9 MiB |
BIN
images/negative-purple.png
Normal file
After Width: | Height: | Size: 1.9 MiB |
BIN
images/negative-tentacles.png
Normal file
After Width: | Height: | Size: 1.6 MiB |
BIN
images/outpainting-1.png
Normal file
After Width: | Height: | Size: 733 KiB |
BIN
images/outpainting-2.png
Normal file
After Width: | Height: | Size: 947 KiB |
BIN
images/outpainting-3.png
Normal file
After Width: | Height: | Size: 1.4 MiB |
BIN
images/prompt-matrix.png
Normal file
After Width: | Height: | Size: 1.8 MiB |
BIN
images/prompt_matrix.jpg
Normal file
After Width: | Height: | Size: 1.2 MiB |
BIN
images/resizing.jpg
Normal file
After Width: | Height: | Size: 453 KiB |
BIN
images/sampling.jpg
Normal file
After Width: | Height: | Size: 1.5 MiB |
BIN
images/sd-upscale-castle-esrgan-topaz-gigapixel.png
Normal file
After Width: | Height: | Size: 1.8 MiB |
BIN
images/sd-upscale-castle-original.png
Normal file
After Width: | Height: | Size: 462 KiB |
BIN
images/sd-upscale-castle-realesrgan.png
Normal file
After Width: | Height: | Size: 1.1 MiB |
BIN
images/sd-upscale-castle-sd-upscale.png
Normal file
After Width: | Height: | Size: 1.7 MiB |
BIN
images/sd-upscale-city-esrgan-topaz-gigapixel.png
Normal file
After Width: | Height: | Size: 1.4 MiB |
BIN
images/sd-upscale-city-original.png
Normal file
After Width: | Height: | Size: 388 KiB |
BIN
images/sd-upscale-city-realesrgan.png
Normal file
After Width: | Height: | Size: 1.3 MiB |
BIN
images/sd-upscale-city-sd-upscale.png
Normal file
After Width: | Height: | Size: 1.3 MiB |
BIN
images/sd-upscale-robot-esrgan-topaz-gigapixel.png
Normal file
After Width: | Height: | Size: 1008 KiB |
BIN
images/sd-upscale-robot-original.png
Normal file
After Width: | Height: | Size: 287 KiB |
BIN
images/sd-upscale-robot-realesrgan.png
Normal file
After Width: | Height: | Size: 1.1 MiB |
BIN
images/sd-upscale-robot-sd-upscale.png
Normal file
After Width: | Height: | Size: 959 KiB |
BIN
images/seed-noresize.png
Normal file
After Width: | Height: | Size: 2.2 MiB |
BIN
images/seed-resize.png
Normal file
After Width: | Height: | Size: 2.1 MiB |
BIN
images/seed-variations.jpg
Normal file
After Width: | Height: | Size: 863 KiB |
BIN
images/xy_grid-medusa-ui.png
Normal file
After Width: | Height: | Size: 1.0 MiB |
BIN
images/xy_grid-medusa.png
Normal file
After Width: | Height: | Size: 4.3 MiB |