Much faster init() times! For runwayml/stable-diffusion-v1-5:
Previously: 4.0s, now: 2.4s (40% less time)
Much faster inference() times! Particularly from the 2nd inference onwards.
Here’s a brief comparison of inference average times (for 512x512 x50 steps):
Upgrade to Diffusers v0.7.0. There is a lot of fun stuff in this release,
but notably for docker-diffusers-api TODAY (more fun stuff coming next week!),
we have much faster init times (via fast_load) and the greatly anticipated
support for the Euler schedulers (a1ea8c0).
We now use the full scheduler name for callInputs.SCHEDULER. "LMS", "DDIM", "PNDM" all still work fine for now but give a deprecation warning
and will stop working in a future update. The full list of supported schedulers
is: LMSDiscreteScheduler, DDIMScheduler, PNDMScheduler, EulerAncestralDiscreteScheduler, EulerDiscreteScheduler. These cover the
most commonly used / requested schedulers, but we already have code in place to
support every scheduler provided by diffusers, which will work in a later
diffusers release when they have better defaults.
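As a rough illustration of the naming change, here is a minimal sketch of how a client could map the old short names to the full ones before building a request. The alias table, the `resolve_scheduler_name` helper, and the `modelInputs` key are assumptions made for this example; only `callInputs.SCHEDULER` and the scheduler names themselves come from the notes above.

```python
import warnings

# Hypothetical alias table: deprecated short name -> full scheduler class name.
# The pairing is assumed from the names listed in the changelog.
DEPRECATED_ALIASES = {
    "LMS": "LMSDiscreteScheduler",
    "DDIM": "DDIMScheduler",
    "PNDM": "PNDMScheduler",
}


def resolve_scheduler_name(name):
    """Return the full scheduler name, warning if a deprecated alias was used."""
    if name in DEPRECATED_ALIASES:
        full_name = DEPRECATED_ALIASES[name]
        warnings.warn(
            f"SCHEDULER={name!r} is deprecated, use {full_name!r} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return full_name
    # Any other name is passed through unchanged.
    return name


# Example request body; "modelInputs" is an assumed key name for this sketch.
payload = {
    "modelInputs": {"prompt": "hello"},
    "callInputs": {
        "SCHEDULER": resolve_scheduler_name("EulerAncestralDiscreteScheduler"),
    },
}
```

Unknown names are passed through on purpose, since the wrapper is described as accepting any scheduler that diffusers can initialise with default arguments.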
DPMSolverMultistepScheduler. Docker-diffusers-api is simply a wrapper
around diffusers. We support all the included schedulers out of the box,
as long as they can init themselves with default arguments. So, the above
scheduler was already working, but we didn’t mention it before. I’ll just
quote diffusers:
DPMSolverMultistepScheduler is the 🧨 diffusers implementation
of DPM-Solver++, a state-of-the-art scheduler that was contributed by one
of the authors of the paper. This scheduler is able to achieve great
quality in as few as 20 steps. It’s a drop-in replacement for the default
Stable Diffusion scheduler, so you can use it to essentially half
generation times.
Storage Class / S3 support. We now have a generic storage class, which
allows for special URLs anywhere you can usually specify a URL,
e.g. CHECKPOINT_URL, dest_url (after dreambooth training), and the new MODEL_URL (see below). URLs like “s3:///bucket/filename” will work how
you expect, but definitely read docs/storage.md to understand the format better.
Note in particular the triple forward slash (“///”) at the beginning to use
the default S3 endpoint.
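To make the triple slash concrete, here is a small sketch of how such a URL splits into its parts with Python's standard `urllib.parse`. It assumes that the host position of the URL carries the S3 endpoint and that an empty host means "use the default"; the `parse_s3_url` helper is hypothetical and is not the project's storage class, so docs/storage.md remains the authority on the real format.

```python
from urllib.parse import urlparse


def parse_s3_url(url):
    """Split an s3:// URL into endpoint, bucket and key (illustrative only)."""
    parts = urlparse(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3 URL: {url!r}")

    # Assumption: an empty host ("s3:///...") selects the default endpoint.
    endpoint = parts.netloc or None

    # The first path segment is the bucket, the remainder is the object key.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError(f"missing bucket in {url!r}")

    return {"endpoint": endpoint, "bucket": bucket, "key": key}


print(parse_s3_url("s3:///bucket/filename"))
# {'endpoint': None, 'bucket': 'bucket', 'key': 'filename'}
```

With only two slashes, `s3://bucket/filename` would put `bucket` in the host position, which is why the third slash matters.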
Dreambooth training, working but still in development. See this forum post
for more info.
PRECISION build var, defaults to "fp16", set to "" to use the model
defaults (generally fp32).
CHECKPOINT_URL conversion:
Crash / stop build if conversion fails (rather than unclear errors later on)
Force cpu loading even for models that would otherwise default to GPU.
This fixes certain models that previously crashed in build stage (where GPU
is not available).
Pass --extract-ema on conversion, since the EMA weights are the more
important ones for inference.
CHECKPOINT_CONFIG_URL now lets you specify a config file for
conversion, to use instead of SD’s default v1-inference.yaml.
MODEL_URL. If your model is already in diffusers format, but you don’t
host it on HuggingFace, you can now have it downloaded at build time. At
this stage, it should be a .tar.zst file. This is an alternative to CHECKPOINT_URL which downloads a .ckpt file and converts to diffusers.
test.py:
New --banana arg to run the test on banana. Set environment variables BANANA_API_KEY and BANANA_MODEL_KEY first.
You can now add to and override a test’s default json payload with:
--model-arg prompt="hello"
--call-arg MODEL_ID="my-model"
Support for extra timing data (e.g. dreambooth sends train
and upload timings).
Quit after inference errors, don’t keep looping.
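For anyone curious how `key=value` overrides like the ones above can be folded into a default payload, here is a minimal sketch. It is not the code from test.py: the `parse_overrides` and `build_payload` helpers, the `modelInputs` / `callInputs` key names, and the choice to decode values as JSON where possible are all assumptions made for illustration.

```python
import argparse
import json


def parse_overrides(pairs):
    """Turn ["key=value", ...] into a dict, decoding JSON values when possible."""
    overrides = {}
    for pair in pairs or []:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        try:
            # Lets 20, true or [1, 2] arrive as typed values.
            overrides[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything that is not valid JSON is kept as a plain string.
            overrides[key] = raw
    return overrides


def build_payload(defaults, argv):
    """Merge repeated --model-arg / --call-arg options over a default payload."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-arg", action="append", default=[])
    parser.add_argument("--call-arg", action="append", default=[])
    args = parser.parse_args(argv)
    return {
        "modelInputs": {
            **defaults.get("modelInputs", {}),
            **parse_overrides(args.model_arg),
        },
        "callInputs": {
            **defaults.get("callInputs", {}),
            **parse_overrides(args.call_arg),
        },
    }


defaults = {
    "modelInputs": {"prompt": "a default prompt"},
    "callInputs": {"MODEL_ID": "runwayml/stable-diffusion-v1-5"},
}
payload = build_payload(
    defaults,
    ["--model-arg", "prompt=hello", "--call-arg", "MODEL_ID=my-model"],
)
```

Because each option uses `action="append"`, it can be repeated, and later values override both the defaults and earlier values for the same key.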
Dev: better caching solution. No more unruly root-cache directory. See CONTRIBUTING.md for more info.
While the above has been working great for me and others for a while now, main branch enjoys wider use, so please do report anything unusual after upgrading!
TL;DR; There is a new main-v0-final branch tracking the current last commit in main before some breaking changes early next year. Use it for any future builds until you’re ready to upgrade to the new architecture (I’ll post a full upgrade guide when everything is ready, but it will be pretty easy).
I’ve been working really hard on getting some good continuous integration (CI) testing with automatic releases (with semantic versioning) in place. Still need to add lots more tests but the infrastructure is now all in place!
Hope everyone has lots of fun tonight and we’ll be in touch in the New Year!