Lpw_stable_diffusion pipeline (longer prompts, prompt weights!)

Warning: this is an experimental feature under active development. This has been running successfully on kiri.art since December, 2022.

This is the first implementation of the new https://banana-forums.dev/t/all-your-pipelines-are-belong-to-us/83 support (in dev), starting with the lpw_stable_diffusion pipeline. Quoting their intro:

Long Prompt Weighting Stable Diffusion

Features of this custom pipeline:

  • Input a prompt without the 77 token length limit.
  • Includes text2img, img2img, and inpainting pipelines.
  • Emphasize/weigh part of your prompt with parentheses as so: a baby deer with (big eyes)
  • De-emphasize part of your prompt as so: a [baby] deer with big eyes
  • Precisely weigh part of your prompt as so: a baby deer with (big eyes:1.3)

Prompt weighting equivalents:

  • a baby deer with == (a baby deer with:1.0)
  • (big eyes) == (big eyes:1.1)
  • ((big eyes)) == (big eyes:1.21)
  • [big eyes] == (big eyes:0.91)
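
The equivalences above are consistent with a simple multiplicative rule: each pair of parentheses multiplies the weight by 1.1, and each pair of square brackets divides it by 1.1. Here is a minimal sketch of that arithmetic only. The function name effective_weight and the two constants are hypothetical and inferred from the list above; the real parsing lives in lpw_stable_diffusion.py and may differ in detail.

```python
# Multipliers inferred from the equivalences listed above (assumptions, not
# copied from the pipeline source).
ROUND_BRACKET_MULTIPLIER = 1.1       # (big eyes) == (big eyes:1.1)
SQUARE_BRACKET_MULTIPLIER = 1 / 1.1  # [big eyes] == (big eyes:0.91)


def effective_weight(round_depth: int = 0, square_depth: int = 0) -> float:
    """Weight implied by wrapping a phrase in nested () and [] pairs."""
    return (ROUND_BRACKET_MULTIPLIER ** round_depth) * (
        SQUARE_BRACKET_MULTIPLIER ** square_depth
    )


if __name__ == "__main__":
    examples = [
        ("big eyes", 0, 0),
        ("(big eyes)", 1, 0),
        ("((big eyes))", 2, 0),
        ("[big eyes]", 0, 1),
    ]
    for label, rounds, squares in examples:
        print(f"{label:14} -> {effective_weight(rounds, squares):.2f}")
```

An explicit weight such as (big eyes:1.3) bypasses this rule and uses the number as given.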

If you see the warning "Token indices sequence length is longer than the specified maximum sequence length for this model (*** > 77). Running this sequence through the model will result in indexing errors", do not worry, it is normal.

Massive props to SkyTNT for contributing this awesome pipeline to diffusers! :pray:

This is super new and not well tested… please send feedback!
Make sure you’re running the latest commits from the dev branch.

No special build args are required! :tada:

Test like this:

# custom_pipeline_method: see notes below.
# width/height of 768 apply when using stabilityai/stable-diffusion-2.
# A fixed seed is useful for experimenting with weight changes.
python test.py txt2img \
  --call-arg PIPELINE="lpw_stable_diffusion" \
  --call-arg custom_pipeline_method="text2img" \
  --model-arg width=768 \
  --model-arg height=768 \
  --model-arg seed=1 \
  --model-arg prompt="a baby deer with (big eyes)"

Further explanations:

  • PIPELINE="lpw_stable_diffusion"

    Specifies this community pipeline. Currently this is the only one supported, but we
    may well support more or all out of the box if this is a success.

  • custom_pipeline_method="text2img"

    lpw_stable_diffusion supports text2img (“text”, not “txt”), img2img and inpaint, so you get to enjoy this awesomeness for all these use-cases!

  • width,height

    Make sure to match (or stay close to) the image size the model was trained on for best results. stabilityai/stable-diffusion-2 is natively 768x768; stabilityai/stable-diffusion-2-base (and all the older models) are natively 512x512. Asking a 768x768 model for a 512x512 image gives incredibly poor results!

  • seed

    Setting this to a constant value (e.g. 1, or any number) can be useful to understand how
    changing prompt weights (and only the weights) affects the “same” image.

  • prompt

    See the examples above from the pipeline author for a good explanation.
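
The width, height and seed notes above can be combined into a small script that prints one test command per weight, keeping everything else constant. This is only a sketch: the helper names build_test_command and weight_sweep and the NATIVE_SIZE table are hypothetical, the 512 fallback for unlisted models is an assumption based on the "older models" remark, and the model id is used here only to pick the image size (how test.py selects the model is not covered).

```python
import shlex

# Native resolutions stated in the notes above. Any model not listed is
# assumed (hypothetically) to be 512x512, like the older models.
NATIVE_SIZE = {
    "stabilityai/stable-diffusion-2": 768,
    "stabilityai/stable-diffusion-2-base": 512,
}


def build_test_command(model_id: str, prompt: str, seed: int = 1) -> list[str]:
    """Build the argv for test.py using the flags shown in this post."""
    size = NATIVE_SIZE.get(model_id, 512)
    return [
        "python", "test.py", "txt2img",
        "--call-arg", "PIPELINE=lpw_stable_diffusion",
        "--call-arg", "custom_pipeline_method=text2img",
        "--model-arg", f"width={size}",
        "--model-arg", f"height={size}",
        "--model-arg", f"seed={seed}",
        "--model-arg", f"prompt={prompt}",
    ]


def weight_sweep(model_id: str, weights=(0.9, 1.1, 1.3)) -> list[list[str]]:
    """Same prompt and seed each time; only the weight on 'big eyes' changes."""
    return [
        build_test_command(model_id, f"a baby deer with (big eyes:{w})")
        for w in weights
    ]


if __name__ == "__main__":
    for cmd in weight_sweep("stabilityai/stable-diffusion-2"):
        # shlex.join quotes the prompt so each line can be pasted into a shell.
        print(shlex.join(cmd))
```

Because the seed stays fixed across the sweep, differences between the resulting images should come from the weight change alone.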

Further information is available in the lpw_stable_diffusion.py source, particularly the doc comments for text2img(), img2img() and inpaint() towards the end of the file. There are also some interesting comments higher up on how the prompt weighting works.