Like to live on the bleeding edge? I’ll post particularly interesting updates to the dev branch here, for early adopters, before they’re merged to main. Issues are more likely and feedback is especially appreciated at this stage to help reach a stable release.
Watch this post to get emailed on updates. The post is locked so it will only be updates from me, and no discussion.
New storage class; S3 support. We now have a generic storage class, which
allows for special URLs anywhere you can usually specify a URL,
e.g. CHECKPOINT_URL, dest_url (after dreambooth training), and the new MODEL_URL (see below). URLs like “s3:///bucket/filename” will work how
you expect, but definitely read docs/storage.md to understand the format better. Note in particular the triple forward slash (“///”) at the beginning, which selects the default S3 endpoint.
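To make the triple slash concrete, here is a minimal Python sketch of how such a URL splits into parts. It is only an illustration: it assumes the general shape `s3://<endpoint>/<bucket>/<key>`, the helper name `split_s3_url` is hypothetical, and docs/storage.md remains the authority on the real format.

```python
from urllib.parse import urlsplit


def split_s3_url(url):
    """Split an s3:// URL into (endpoint, bucket, key).

    Hypothetical helper for illustration only. It assumes the shape
    s3://<endpoint>/<bucket>/<key>, where an empty endpoint (the "///"
    form) means "use the default S3 endpoint".
    """
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3 URL: {url}")
    endpoint = parts.netloc or None  # None -> default endpoint
    bucket, _, key = parts.path.lstrip("/").partition("/")
    return endpoint, bucket, key


print(split_s3_url("s3:///bucket/filename"))
print(split_s3_url("s3://s3.example.com/bucket/models/model.tar.zst"))
```

With only two slashes, "bucket" would be read as the endpoint rather than the bucket, which is why the third slash matters.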
Dreambooth training, working but still in development. See this forum post for more info.
PRECISION build var, defaults to "fp16", set to "" to use the model defaults (generally fp32).
CHECKPOINT_URL conversion:
Crash / stop build if conversion fails (rather than unclear errors later on)
Force cpu loading even for models that would otherwise default to GPU.
This fixes certain models that previously crashed in build stage (where GPU
is not available).
--extract-ema on conversion since these are the more important weights for
inference.
CHECKPOINT_CONFIG_URL now lets you specify a config file for
conversion, to use instead of SD’s default v1-inference.yaml.
MODEL_URL. If your model is already in diffusers format, but you don’t
host it on HuggingFace, you can now have it downloaded at build time. At
this stage, it should be a .tar.zst file. This is an alternative to CHECKPOINT_URL, which downloads a .ckpt file and converts it to diffusers format.
test.py:
New --banana arg to run the test on banana. Set environment variables BANANA_API_KEY and BANANA_MODEL_KEY first.
You can now add to and override a test’s default json payload with:
--model-arg prompt="hello"
--call-arg MODEL_ID="my-model"
Support for extra timing data (e.g. dreambooth sends train
and upload timings).
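As a rough picture of what these overrides do, the sketch below merges `key=value` pairs into a default payload. It is not the actual test.py code: the helper names, the default values, and the choice to parse values as JSON are all assumptions made for illustration. Only the `--model-arg` / `--call-arg` flags and the `modelInputs` / `callInputs` names come from this thread.

```python
import argparse
import json


def parse_pairs(pairs):
    """Turn ['key=value', ...] into a dict. Values are parsed as JSON when
    possible (so 20 becomes an int), otherwise kept as plain strings."""
    result = {}
    for pair in pairs or []:
        key, _, raw = pair.partition("=")
        try:
            result[key] = json.loads(raw)
        except json.JSONDecodeError:
            result[key] = raw
    return result


def build_payload(defaults, argv):
    """Overlay command-line overrides on top of a test's default payload."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-arg", action="append", dest="model_args")
    parser.add_argument("--call-arg", action="append", dest="call_args")
    args = parser.parse_args(argv)
    return {
        "modelInputs": {**defaults.get("modelInputs", {}), **parse_pairs(args.model_args)},
        "callInputs": {**defaults.get("callInputs", {}), **parse_pairs(args.call_args)},
    }


# Hypothetical defaults for one test.
defaults = {
    "modelInputs": {"prompt": "a default prompt", "num_inference_steps": 50},
    "callInputs": {"MODEL_ID": "default-model"},
}
payload = build_payload(
    defaults,
    ["--model-arg", "prompt=hello", "--model-arg", "num_inference_steps=20",
     "--call-arg", "MODEL_ID=my-model"],
)
print(json.dumps(payload, indent=2))
```

Note that the shell strips the quotes in `prompt="hello"` before Python sees the argument, so the sketch receives `prompt=hello`.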
Dev: better caching solution. No more unruly root-cache directory. See CONTRIBUTING.md for more info.
Diffusers models are out. They haven’t been officially announced yet and this is still very early days, but they’re there. Note that it’s a 768x768 model, so you’ll get better results at this native resolution (the 512x512 model is probably being built as we speak; the checkpoint is out already). LMSDiscreteScheduler got really bad results for me, while DDIMScheduler worked great.
I have to stress that this is the bleeding edge… it’s had very limited testing, diffusers is still catching up… but… still, very usable. It definitely sounds like txt2img has had more testing than other pipelines, which might require a bit of work; however, if you look at the diffusers repo, work is happening there at a rapid pace.
Works great with the latest dev from docker-diffusers-api, built with build arg MODEL_ID="stabilityai/stable-diffusion-2", and called with e.g.
DPMSolverMultistepScheduler. Docker-diffusers-api is simply a wrapper
around diffusers. We support all the included schedulers out of the box,
as long as they can init themselves with default arguments. So, the above
scheduler was already working, but we didn’t mention it before. I’ll just
quote diffusers:
DPMSolverMultistepScheduler is the 🧨 diffusers implementation
of DPM-Solver++, a state-of-the-art scheduler that was contributed by one
of the authors of the paper. This scheduler is able to achieve great
quality in as few as 20 steps. It’s a drop-in replacement for the default
Stable Diffusion scheduler, so you can use it to essentially half
generation times.
All schedulers work great now with SDv2 (most didn’t before the final diffusers release)!
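For anyone wanting to try this, here is a hedged sketch of what a request body selecting the scheduler might look like. The `SCHEDULER` key name is an assumption on my part (check the README for the exact call inputs), and `make_request` is just a hypothetical helper. The 20 steps and 768x768 size follow the notes above.

```python
import json


def make_request(prompt, scheduler="DPMSolverMultistepScheduler", steps=20):
    """Build a request body for a 768x768 SDv2 call.

    Assumption: the scheduler is chosen via a "SCHEDULER" call input.
    """
    return {
        "modelInputs": {
            "prompt": prompt,
            "num_inference_steps": steps,
            "width": 768,   # native resolution of stable-diffusion-2
            "height": 768,
        },
        "callInputs": {
            "MODEL_ID": "stabilityai/stable-diffusion-2",
            "SCHEDULER": scheduler,  # assumed key name
        },
    }


body = make_request("an astronaut riding a horse")
print(json.dumps(body, indent=2))
```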
There’ll be a proper changelog a little further down the line, but in the meantime, I thought some people might be interested in the following new features (in experimental status) that have been highly requested.
We also:
diffusers: which has a number of useful fixes (to be detailed later)
tests: default to DPMSolverMultistepScheduler and 20 steps
s3: show progress indicator for uploads/downloads
base image: can be configured via FROM_IMAGE build-arg
dreambooth: now default to fp16 mixed_precision training
Latest diffusers, SDv2.1. All the latest goodness, and upgraded some
dependencies too. Models are:
stabilityai/stable-diffusion-2-1-base (512x512)
stabilityai/stable-diffusion-2-1 (768x768)
ALL THE PIPELINES. We no longer load a list of hard-coded pipelines
in init(). Instead, we init and cache each on first use (for faster
first calls on cold boots), and, all pipelines, both official diffusers
and community pipelines, are available. Full details
Dreambooth: Enable mixed_precision training, default to fp16.
[Experimental] Runtime downloads (Dreambooth only for now, more on the way)
S3: Add upload/download progress indicators.
Stable Diffusion has standardized on image instead of init_image for
all pipelines. Using init_image now shows a deprecation warning, and
support for it will be removed in a future release.
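If you have client code that still sends init_image, a small shim like the following could ease the transition. It is a sketch only: `normalize_image_arg` is a hypothetical helper, not something the project ships.

```python
import warnings


def normalize_image_arg(model_inputs):
    """Return a copy of model_inputs with a legacy `init_image` renamed to `image`."""
    inputs = dict(model_inputs)
    if "init_image" in inputs:
        warnings.warn(
            "init_image is deprecated; use image instead",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy = inputs.pop("init_image")
        inputs.setdefault("image", legacy)  # an explicit `image` wins
    return inputs


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = normalize_image_arg({"prompt": "hello", "init_image": "<base64 data>"})

print(result)
print([str(w.message) for w in caught])
```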
Changed sd-base to diffusers-api as the default tag / name used
in the README examples and optional ./build script.
Much better error handling. We now try...except both the pipeline
run and entire inference() call, which will save you a trip to banana’s
logs which don’t always even show these errors and sometimes just leave
you with an unexplained stuck instance. These kinds of errors are almost
always a result of problematic callInputs and modelInputs used for the
pipeline call, so finding them will be a lot easier now.
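The two layers of try...except can be pictured like this. It is a simplified sketch under stated assumptions: `run_pipeline` is a stand-in for the real pipeline call, and the shape of the error response (the `error` key and its fields) is invented for illustration rather than taken from the project.

```python
import traceback


def error_response(code, err):
    """Describe a failure in the response body (this shape is an assumption)."""
    return {
        "error": {
            "code": code,
            "name": type(err).__name__,
            "message": str(err),
            "stack": "".join(
                traceback.format_exception(type(err), err, err.__traceback__)
            ),
        }
    }


def run_pipeline(model_inputs):
    """Hypothetical stand-in for the real pipeline call."""
    if "prompt" not in model_inputs:
        raise ValueError("modelInputs is missing 'prompt'")
    return {"image": "<generated>"}


def inference(all_inputs):
    """Return errors to the caller instead of letting the instance crash."""
    try:
        model_inputs = all_inputs["modelInputs"]
        try:
            return run_pipeline(model_inputs)
        except Exception as err:  # failure inside the pipeline run
            return error_response("PIPELINE_ERROR", err)
    except Exception as err:  # anything else in the inference() call
        return error_response("INFERENCE_ERROR", err)


print(inference({"modelInputs": {}})["error"]["message"])
print(inference({})["error"]["code"])
```

The inner handler catches bad modelInputs at the pipeline call, and the outer one catches everything else, so the caller sees the error in the response rather than having to dig through logs.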
Hey all, it seems the “much better error handling” feature above has been breaking optimization on banana. I would have picked this up sooner, but optimization in general has been quite flaky the last few weeks, so I wasn’t sure about this until earlier today. I’m working on this, but since banana’s optimization is a bit of a black box, this basically means trying something, waiting 1hr+ for build + optimization to complete to see the result, then repeating. But at least through this process I was finally able to figure out which commit broke it.
[1e97c12] chore(readme): remove python3 server.py from docker run command ← works!
The breaking commit above is:
README.md:
- 1. `docker run -it --gpus all -p 8000:8000 diffusers-api python3 server.py`
+ 1. `docker run -it --gpus all -p 8000:8000 diffusers-api`
and somehow that one line change in a non-code text file breaks optimization.
I’m honestly at a loss. If someone wants to make a quick $50, I will happily hand it over if someone beats me to solving this, but I haven’t given up yet! I promise to keep everyone posted, but I also have family visiting from overseas, so time is more limited.
TL;DR; There is a new dev-v0-final branch tracking the current last commit in main before some breaking changes early next year. Use it for any future builds until you’re ready to upgrade to the new architecture (I’ll post a full upgrade guide when everything is ready, but it will be pretty easy).
I’ve been working really hard on getting some good continuous integration (CI) testing with automatic releases (with semantic versioning) in place. Still need to add lots more tests but the infrastructure is now all in place!
Hope everyone has lots of fun tonight and we’ll be in touch in the New Year!
So, last Friday Jan 27th, I merged all the “v1” code to the dev and main branches. v1 is still not “officially” released (hence no announcement in the official releases thread, yet), but has been running in production on kiri.art for over a month now, all working great.
The main reason it’s already on the main branch, even before a formal release, is to take advantage of our new release pipeline. On every merge to main, a suite of unit and integration tests is run, and if they all pass, the commit history is analysed to automatically publish a new semantic version release (bumping major version for breaking changes, minor for new features, and patch for bug fixes). Since we’re publishing actual docker images now, this gives you a lot more ease and control to pin to specific versions and upgrade or downgrade as desired.
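To illustrate how a version bump can be derived from commit history, here is a small sketch. It assumes Conventional Commits-style subjects (like the `chore(readme): …` commit quoted earlier in this thread), where `feat` means a new feature, `fix` a bug fix, and a `!` or a "BREAKING CHANGE" note a breaking change. The function names are hypothetical, and the real release tooling may apply different rules.

```python
import re

# Assumed convention: subjects such as "feat(s3): ...", "fix: ...",
# or "feat!: ..." for a breaking change.
SUBJECT = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?:")


def bump_kind(messages):
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    kind = None
    for message in messages:
        match = SUBJECT.match(message)
        if "BREAKING CHANGE" in message or (match and match.group("bang")):
            return "major"
        if match and match.group("type") == "feat":
            kind = "minor"
        elif match and match.group("type") == "fix" and kind is None:
            kind = "patch"
    return kind


def next_version(current, messages):
    """Apply the highest-priority bump found in `messages` to `current`."""
    major, minor, patch = (int(part) for part in current.split("."))
    kind = bump_kind(messages)
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # nothing releasable, e.g. only chore/docs commits


print(next_version("1.2.3", ["fix: handle empty prompt"]))
print(next_version("1.2.3", ["fix: a", "feat(s3): progress indicator"]))
print(next_version("1.2.3", ["feat!: new split architecture"]))
```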
The following guide explains how to upgrade. Feedback from early adopters will be a great help on the road to making v1 “official”. Again, it’s been running in production on kiri.art for over a month, but your use case may be different, and although we have great testing infrastructure in place now, we are still quite far away from 100% code coverage. So give things a spin and let us know:
The CHANGELOG is pretty big, but here are some highlights:
v1 architecture: new “split” architecture, our own optimization code, separate MODEL_REVISION and MODEL_PRECISION, and more (see the “UPGRADING” post above for more details).
Upgraded Diffusers to v0.12.0 (actually, to a bug fix commit shortly after), and merged in latest changes to the dreambooth training script. Upgraded most dependencies in requirements.txt to latest versions.
README improvements (better explanations of model/call inputs, how to use SEND_URL, etc), with more on the way.