Here. I created a gist with a Dreambooth Jupyter notebook for Colab that already implements captions for finetuning. https://gist.github.com/jochemstoel/139b47f4ea6510bd0667961355fcd38f
Ah interesting, in the Colab he pulls his fork of diffusers from his updt branch… which has a modified train_dreambooth.py that both adds LoRA support (in official diffusers, LoRA lives in a separate script) and adds flags like --image_captions_filename, --external_captions, and --captions_dir, plus the related code.
I hope maybe he’ll submit these changes back to upstream diffusers, but either way I’m sure I’d be able to add a similar feature based off his code. In short: very possible, and we’ll definitely get it, eventually. Probably a good thing to work on after I finish the initial LoRA support. Nice find, thanks!
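(For context, the idea behind --image_captions_filename is to derive each training image’s caption from its filename. A toy sketch of the concept, not the fork’s actual code:)

from pathlib import Path

# Toy sketch of the --image_captions_filename idea: use each training
# image's filename as its caption. The fork's real implementation may
# differ; this just illustrates the concept.
def caption_from_filename(image_path: str) -> str:
    return Path(image_path).stem.replace("_", " ").strip()

print(caption_from_filename("a_photo_of_sks_dog.png"))
# -> "a photo of sks dog"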
Hey, is it possible to do LoRA inference already? It’s so stupid, but I think I remember a notification where you said it is possible in the dev branch, and yet I can’t find a post about it anywhere here on kiri.
(or does it not require any changes in the diffusers api repo?)
EDIT: I asked the following on Banana, which is essentially what I want to know:
I did a LoRA training run and now have a .safetensors file. Now what? Can I use this file by providing it as the MODEL_URL environment variable in the docker-diffusers-api repository by gadicc or does LoRA have certain requirements? I want to do inference. I trained it on Replicate but their API for inference has the stupid safety_checker enabled.
No, not yet. This shouldn’t be too difficult to add though (the training will take longer, but the ability to use LoRAs on existing models should be quick).
And looks like lucataco is doing great work there, as usual. Thanks for the exact ref which will be helpful.
I have an international flight today but I’ll take a look at this tomorrow and maybe we can get it out to dev same day.
Thanks
Ok, thanks. I hope so. This is frustrating.
Was a bit delayed by my travels, but working on this today.
Ok, here’s a first look in dev.
$ python test.py txt2img \
`# Common options, but notably:` \
`# MODEL_ID should match base model that was fine-tuned with LoRA` \
--call-arg MODEL_ID="runwayml/stable-diffusion-v1-5" \
--call-arg MODEL_PRECISION="fp16" \
--call-arg MODEL_URL="s3://" \
--model-arg prompt="A picture of a sks dog in a bucket" \
--model-arg seed=1 \
`# Specify the LoRA model` \
--call-arg attn_procs="patrickvonplaten/lora_dreambooth_dog_example" \
`# Optional; specify interpolation of LoRA with base model; 0.0 to 1.0 (default)` \
--model-arg cross_attention_kwargs='{"scale": 0.5}'
Outputs: (with scale 0.0, 0.5, 1.0, respectively)
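For reference, here’s roughly the same thing done directly against diffusers (a minimal sketch; unet.load_attn_procs and the cross_attention_kwargs scale are the diffusers APIs this builds on):

import torch
from diffusers import StableDiffusionPipeline

# Minimal sketch: the equivalent of the test.py call above, done
# directly with diffusers (same base model, LoRA repo, and seed).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")

# Load the LoRA attention processors on top of the base UNet.
pipe.unet.load_attn_procs("patrickvonplaten/lora_dreambooth_dog_example")

# scale interpolates between the base model (0.0) and the full LoRA (1.0).
image = pipe(
    "A picture of a sks dog in a bucket",
    generator=torch.Generator("cuda").manual_seed(1),
    cross_attention_kwargs={"scale": 0.5},
).images[0]
image.save("sks_dog_scale_0.5.png")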
Limitations:
- Currently only models hosted on huggingface are supported. S3 / URL support will come next.
- No way to replace the text_encoder; that will come later (currently custom text_encoders are pretty rare… the diffusers team is still working out how best to deal with them, probably with a separate HF repo for just the text_encoder).
Wins:
- Works with xformers (a fix for this landed in diffusers 4 days ago)
- The LoRA can be switched or disabled without reloading the entire base model (see the sketch below).
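A rough sketch of what that means in practice, continuing from the snippet above (the second repo id is hypothetical, purely for illustration):

# Swap: load a different set of LoRA attention processors onto the same
# UNet; the multi-GB base weights stay loaded in memory.
pipe.unet.load_attn_procs("some-user/another-lora")  # hypothetical repo id

# Disable: interpolate fully back to the base model for a given call.
image = pipe(
    "A picture of a sks dog in a bucket",
    cross_attention_kwargs={"scale": 0.0},
).images[0]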
Feedback welcome. Hope you have a good weekend; I’ll probably continue this early next week based on any feedback.
I will wait until loading models from a URL is supported before running it on Banana.
Shweet! Can’t promise but very likely this will be out tomorrow sometime.
Is it possible to determine from a .safetensors file what version of Stable Diffusion was used? This might be a useful feature if that information is unknown or convenient when users are uploading their own files.
Unfortunately not, at least not from a plain .safetensors or .bin file. However, when the LoRA is saved with diffusers, you get an adapter_config.json which stores this as base_model_name_or_path (example; see the sketch after the caveats below) - not sure if that’s in the current diffusers release yet or still being worked on.
Caveats:
- Could contain a directory name instead of a HuggingFace user/repo id.
- User might choose to send just the safetensors/bin without the other data.
- If accepting files from users, make sure to only accept safetensors (as the bin files can contain arbitrary code).
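For what it’s worth, reading that field is trivial when the adapter_config.json is present. A minimal sketch (filename and key per the convention above):

import json

# Minimal sketch: recover the base model id recorded when the LoRA was
# saved (assumes an adapter_config.json sits alongside the weights, per
# the convention above; it's absent for bare .safetensors/.bin uploads).
with open("adapter_config.json") as f:
    adapter_config = json.load(f)

print(adapter_config.get("base_model_name_or_path"))
# e.g. "runwayml/stable-diffusion-v1-5", or a local directory name (caveat 1)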
File download code is coming along nicely but still needs a bit more work. Couldn’t publish the current dev because there was no capacity on Lambda to run the prerequisite integration tests. HTTP downloads work for single files (so probably the S3 code does too); I still need some work on the code to download archives (.tar.zst for diffusers format, etc). More updates soon.
You might like knowing that Banana is running on 40GB GPUs now. They haven’t officially announced it yet, but I am a special agent who infiltrated their enterprise.
So you might be able to run your tests on Banana.
Iiiinteresting. We are very honoured to have such an esteemed special agent in the forums here!
That will open up a good bunch of fun possibilities, I think. But for the automated tests, it’s nice to have a full system where we can quickly relaunch the container with different environment variables, host our own S3-compatible storage, etc. On the whole it works pretty well, but currently I manually specify the GPU type and geographical location in the script; I need to make this more flexible in the future, in case the requested system is not available.
In any event:
- Bumped diffusers to latest version
- Fixed a bug with model downloads that showed up in the automated tests.
- Tested S3 lora downloads locally and it indeed works as expected (since it goes through our “storage” library anyways).
- Skipping the “archive” code for now, since I’m not sure that diffusers team has settled on a final format for it yet. However, when I do lora training code, I’ll of course make sure that we can save/load archives of the data without going through HuggingFace.
Automated tests are running now (should be done in about 10m, but I need to go), and assuming those pass as expected, there’ll be a new :dev release that you can experiment with. Hopefully it will work first time; let me know of any issues / feedback. Thanks!
Could you demonstrate for us how exactly to load a LoRA model and do inference?
Yeah, sure. I gave an example using test.py above, but the JSON equivalent would be:
{
"callInputs": {
// Typical, common options, but
// MODEL_ID should match base model that was fine-tuned with LoRA
"MODEL_ID": "runwayml/stable-diffusion-v1-5",
"MODEL_PRECISION": "fp16",
"MODEL_REVISION": "fp16",
// Specify the LoRA model
"attn_procs": "patrickvonplaten/lora_dreambooth_dog_example",
},
"modelInputs": {
"prompt": "A picture of a sks dog in a bucket",
"seed": 1 // To get same pictures as above,
// Optional; specify interpolation of LoRA with base model; 0.0 to 1.0 (default)
"cross_attention_kwargs": { "scale": 0.5 },
},
}
The “new” options are the attn_procs callInput, and then the ability to (optionally) tell the model how to use those weights with the cross_attention_kwargs modelInput. It will download the LoRA at runtime (from HuggingFace in the above example, but an http or s3 URL to a .bin file can be given too, with .safetensors support for diffusers coming soon).
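If it helps, here’s a minimal sketch of sending that payload from Python to a locally running container (the localhost URL/port is an assumption for a local dev setup; on Banana you’d go through their SDK instead):

import requests

# Minimal sketch: POST the JSON payload above to a running
# docker-diffusers-api container. The localhost URL/port is an
# assumption for a local dev setup; adjust for your deployment.
payload = {
    "callInputs": {
        "MODEL_ID": "runwayml/stable-diffusion-v1-5",
        "MODEL_PRECISION": "fp16",
        "MODEL_REVISION": "fp16",
        "attn_procs": "patrickvonplaten/lora_dreambooth_dog_example",
    },
    "modelInputs": {
        "prompt": "A picture of a sks dog in a bucket",
        "seed": 1,
        "cross_attention_kwargs": {"scale": 0.5},
    },
}

result = requests.post("http://localhost:8000/", json=payload).json()
# Response includes base64-encoded image data; exact keys may vary by version.
print(list(result.keys()))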
Don’t hesitate to ask about anything that’s not clear so we can get nice, super clear docs for everyone. This is still a little new for diffusers too, so things could change, but currently that’s how it all works.
Meh, it has to be a bin file.
What needs to be done to get my self hosted safetensors to work? Can I help in any way?
We just need the safetensors support from diffusers… it shouldn’t need any more changes in docker-diffusers-api (which can already download the necessary files), and it should “just work” as soon as they have the support on their side and I bump the version.
The PR I linked to previously has the code done and has been approved, but I think it’s still waiting for final feedback from the team before they merge it. It looks pretty close; I guess we’re a few days away, give or take.
I was wrong about the above
Well, actually, it depends.
- The :dev release has the latest diffusers, with the safetensors support, and a workaround for the regression when loading non-safetensors files, BUT:
- I couldn’t get it to work with LoRAs from CivitAI. It seems there are a few different formats a LoRA can be in (even in safetensors), and diffusers can’t load them all. There are some issues open for this, but the direction isn’t clear yet, at least that I could see (see the format-peeking sketch below the list).
- However, I recall your LoRA was trained on a colab somewhere… so it’s possible it might work, depending on the format. Don’t get your hopes up, but it’s worth a shot.
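If you want to peek at which flavour a given file is before trying it, the tensor names are a decent hint. A hedged sketch (the kohya-style “lora_unet_” prefix is what I’ve seen reported for CivitAI files):

from safetensors import safe_open

# Hedged sketch: inspect a LoRA's tensor names to guess its format.
# Kohya-ss style files (common on CivitAI) typically use keys like
# "lora_unet_..." / "lora_te_...", which diffusers can't load yet.
with safe_open("my_lora.safetensors", framework="pt", device="cpu") as f:
    keys = list(f.keys())

print(keys[:5])
if any(k.startswith(("lora_unet_", "lora_te_")) for k in keys):
    print("Kohya-style LoRA; probably won't load in diffusers yet.")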
Important note: it will load as safetensors only if the filename in the URL includes ".safetensors"; otherwise you should specify the callInput { "attn_procs_from_safetensors": true } to force safetensors loading for other filenames.
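For example, in payload form (the URL below is a hypothetical stand-in for a file whose name lacks the extension):

# Hypothetical example: the URL's filename lacks ".safetensors", so we
# force the safetensors loader explicitly via the callInput flag.
payload = {
    "callInputs": {
        "MODEL_ID": "runwayml/stable-diffusion-v1-5",
        "attn_procs": "https://example.com/downloads/my-lora",  # no extension
        "attn_procs_from_safetensors": True,
    },
    "modelInputs": {"prompt": "A picture of a sks dog in a bucket"},
}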