Lpw_stable_diffusion pipeline (longer prompts, prompt weights!)

gadicc · December 7, 2022, 7:54pm

~~Warning: this is an experimental feature under active development.~~ This has been running successfully on kiri.art since December, 2022.

This is the first implementation of the new community pipeline support (in dev; see https://banana-forums.dev/t/all-your-pipelines-are-belong-to-us/83), starting with the lpw_stable_diffusion pipeline. Quoting their intro:

Long Prompt Weighting Stable Diffusion

Features of this custom pipeline:

Input a prompt without the 77 token length limit.

Includes text2img, img2img, and inpainting pipelines.

Emphasize/weigh part of your prompt with parentheses as so: a baby deer with (big eyes)

De-emphasize part of your prompt as so: a [baby] deer with big eyes

Precisely weigh part of your prompt as so: a baby deer with (big eyes:1.3)

Prompt weighting equivalents:

a baby deer with == (a baby deer with:1.0)

(big eyes) == (big eyes:1.1)

((big eyes)) == (big eyes:1.21)

[big eyes] == (big eyes:0.91)
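The equivalents above boil down to simple arithmetic: each pair of parentheses multiplies the weight by 1.1, each pair of square brackets divides it by 1.1, and an explicit `:number` sets the multiplier directly. Here is a minimal sketch of that rule in Python. It is a simplified illustration only, not the parser used by lpw_stable_diffusion (it ignores escaped brackets, for instance), and the function name `parse_weights` is made up for this example.

```python
import re

PAREN_FACTOR = 1.1        # one pair of ( ) multiplies the weight by 1.1
BRACKET_FACTOR = 1 / 1.1  # one pair of [ ] divides the weight by 1.1

# Tokens: "(", "[", ":<number>)", ")", "]", a run of plain text, or a lone ":".
TOKEN_RE = re.compile(r"\(|\[|:(\d+(?:\.\d+)?)\)|\)|\]|[^()\[\]:]+|:")


def parse_weights(prompt):
    """Return a list of (text, weight) pairs for a weighted prompt (hypothetical helper)."""
    pieces = []          # mutable [text, weight] entries, in prompt order
    paren_starts = []    # index into pieces where each open "(" began
    bracket_starts = []  # index into pieces where each open "[" began

    def scale(start, factor):
        # Apply a factor to every piece inside the group that just closed.
        for piece in pieces[start:]:
            piece[1] *= factor

    for match in TOKEN_RE.finditer(prompt):
        token = match.group(0)
        if token == "(":
            paren_starts.append(len(pieces))
        elif token == "[":
            bracket_starts.append(len(pieces))
        elif match.group(1) is not None and paren_starts:
            scale(paren_starts.pop(), float(match.group(1)))  # explicit (text:1.3)
        elif token == ")" and paren_starts:
            scale(paren_starts.pop(), PAREN_FACTOR)
        elif token == "]" and bracket_starts:
            scale(bracket_starts.pop(), BRACKET_FACTOR)
        else:
            pieces.append([token, 1.0])  # plain text (or an unmatched bracket)

    return [(text, round(weight, 2)) for text, weight in pieces]


for example in ["a baby deer with (big eyes)", "((big eyes))", "[big eyes]", "(big eyes:1.3)"]:
    print(example, "->", parse_weights(example))
```

Nesting works because a closing bracket rescales everything that was added since its matching opening bracket, which is how `((big eyes))` ends up at 1.1 × 1.1 = 1.21.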

If you see "Token indices sequence length is longer than the specified maximum sequence length for this model (*** > 77). Running this sequence through the model will result in indexing errors.", do not worry, it is normal.
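For intuition on why that warning is harmless: a generic way to get past a 77-token window is to split the prompt's token ids into windows of 75, then wrap each window in its own start and end tokens before encoding. The sketch below shows only that splitting step on plain Python lists. It is an illustration of the general idea under stated assumptions, not the pipeline's actual code (see lpw_stable_diffusion.py for that); the function name, the placeholder token ids, and the choice to pad with the end token are all assumptions.

```python
def chunk_token_ids(token_ids, bos_id, eos_id, window=77):
    """Split token ids into windows of `window` tokens, each with its own BOS/EOS.

    Hypothetical helper: `token_ids` must NOT already contain BOS/EOS markers.
    """
    body = window - 2  # room left in each window after BOS and EOS
    chunks = []
    # max(..., 1) makes sure an empty prompt still yields one (all padding) window.
    for start in range(0, max(len(token_ids), 1), body):
        piece = token_ids[start:start + body]
        padded = piece + [eos_id] * (body - len(piece))  # assumption: pad with EOS
        chunks.append([bos_id] + padded + [eos_id])
    return chunks


BOS, EOS = -1, -2  # placeholder ids for illustration, not real vocabulary entries
chunks = chunk_token_ids(list(range(100)), BOS, EOS)
print(len(chunks), [len(chunk) for chunk in chunks])
```

Each resulting window fits the text encoder's limit, so the encoder never sees an over-long sequence even though the tokenizer complained about the full prompt.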

Massive props to SkyTNT for contributing this awesome pipeline to diffusers!

~~This is super new and not well tested… please send feedback!~~
~~Make sure you’re running the latest commits from the **dev** branch.~~

No special build args are required!

Test like this (width and height are only needed if using stabilityai/stable-diffusion-2; a fixed seed is useful for experimenting with weight changes; see the notes below for custom_pipeline_method):

python test.py txt2img \
  --call-arg PIPELINE="lpw_stable_diffusion" \
  --call-arg custom_pipeline_method="text2img" \
  --model-arg width=768 \
  --model-arg height=768 \
  --model-arg seed=1 \
  --model-arg prompt="a baby deer with (big eyes)"
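If you call the container directly instead of going through test.py, the same options end up in the request body. The sketch below builds such a body in Python. It assumes that --call-arg values land under a `callInputs` key and --model-arg values under a `modelInputs` key; those key names and the helper `build_request` are assumptions for illustration, so check them against your deployment before relying on them.

```python
import json


def build_request(call_args, model_args):
    """Assemble a request body from call and model arguments (hypothetical helper)."""
    # Assumption: --call-arg maps to "callInputs", --model-arg to "modelInputs".
    return {"callInputs": dict(call_args), "modelInputs": dict(model_args)}


payload = build_request(
    call_args={
        "PIPELINE": "lpw_stable_diffusion",
        "custom_pipeline_method": "text2img",  # "text", not "txt"
    },
    model_args={
        "width": 768,   # only if using stabilityai/stable-diffusion-2
        "height": 768,  # only if using stabilityai/stable-diffusion-2
        "seed": 1,      # fixed seed, useful when comparing weight changes
        "prompt": "a baby deer with (big eyes)",
    },
)

print(json.dumps(payload, indent=2))
```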

Further explanations:

PIPELINE="lpw_stable_diffusion"

Specifies this community pipeline. Currently this is the only one supported, but we may well support more or all out of the box if this is a success.

custom_pipeline_method="text2img"

lpw_stable_diffusion supports text2img (“text”, not “txt”), img2img and inpaint, so you get to enjoy this awesomeness for all these use-cases!

width, height

Make sure to match (or stay close to) the trained image size for best results. stabilityai/stable-diffusion-2 is native 768x768; stabilityai/stable-diffusion-2-base (and all the older models) are native 512x512. Asking for a 512x512 image from a 768x768 model gives incredibly poor results!

seed

Setting this to a constant value (e.g. 1, or any number) can be useful to understand how changing prompt weights (and only the weights) affects the “same” image.

prompt

See the examples above from the pipeline author for a good explanation.
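To make the seed note concrete, here is a small sketch that prepares several model inputs differing only in the weight inside the prompt, with the seed held constant so the resulting images stay comparable. The helper name `weight_sweep` is made up for this example, and the dictionary keys simply mirror the --model-arg names used above.

```python
def weight_sweep(template, weights, seed=1):
    """Build one set of model inputs per weight, all sharing the same seed.

    Hypothetical helper: `template` must contain a "{weight}" placeholder.
    """
    return [
        {"prompt": template.format(weight=weight), "seed": seed}
        for weight in weights
    ]


inputs = weight_sweep("a baby deer with (big eyes:{weight})", [0.8, 1.0, 1.3])
for model_inputs in inputs:
    print(model_inputs)
```

Running each entry through the pipeline and comparing the outputs side by side shows what the weight alone contributes.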

Further information is available in the lpw_stable_diffusion.py source, particularly the doc comments for text2img(), img2img() and inpaint() towards the end of the file (there are also some interesting comments higher up on how the prompt weightings work, etc).