Dreambooth training [first look]

I haven’t tried recently, but optimization has been a big and constant pain point for me. I plan to experiment with some homegrown alternatives and, if that works, hope we’ll get a way to opt out of Banana’s optimization completely for faster builds. But do make sure you report it on the Discord if you haven’t already, even if others have too… Also, it can be worth pushing a new dummy commit to trigger a rebuild; sometimes (but not always) it will just start working again on its own (or after they fix something that didn’t affect existing stuck builds).

In latest dev:

  • :white_check_mark: Dreambooth models now saved with safetensors
  • :white_check_mark: Loading of safetensors models works great :smiley:

I still have more work planned with safetensors, and will post more next week, hopefully with timing comparisons too :tada:

That’s cool! You’re awesome. Thank you very much!

My pleasure. If you’ve done any speed tests let us know, I haven’t had a chance yet (but I do have this planned… just working on a few related other fun things :grin:).

Also, I missed it before, but in the latest dev commit I’ve set SAFETENSORS_FAST_GPU=1, which should result in even faster loads.

WARNING: Image Optimization Failed - cold boots may be slow

Very short lived, you were lucky :smiley:

Thanks @gadicc for this great work!

A few comments

  • it seems this is now on main, so you can remove the reference to the dev branch
  • can you copy this post into your docs folder? It’s a bit hard to find
  • Is it now required to use stabilityai/stable-diffusion-2? When I use 1.5 I get an error that the container only contains v2
  • if you’re still searching for a solution to Banana’s awful log handling: How about offering to send it to a log service? E.g. I’m using https://cloud.axiom.co/ - that’s just a simple POST
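To make the "simple POST" idea concrete, here is a minimal sketch using only the Python standard library. The ingest URL, the bearer-token header, and the helper names `build_log_request` and `send_logs` are assumptions for illustration; check your log service's documentation for the real endpoint and auth scheme.

```python
import json
import urllib.request


def build_log_request(ingest_url, api_token, events):
    """Build (but do not send) a POST request carrying log events as JSON.

    ingest_url and the Bearer auth header are assumptions; adjust them to
    whatever your log service actually expects.
    """
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        ingest_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def send_logs(ingest_url, api_token, events, timeout=5):
    """Send the events; a logging failure must never break the real request."""
    request = build_log_request(ingest_url, api_token, events)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except OSError as error:  # URLError and socket timeouts subclass OSError
        print(f"log shipping failed: {error}")
        return None
```

Keeping request construction separate from sending makes it easy to check the payload locally before pointing it at a real service.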

Hey, thanks!

There’s an initial early release in main from when I last merged dev but a lot of work was / is still happening in dev so I haven’t advertised it on main yet :sweat_smile: I would still only use the dev release for dreambooth but the next merge is planned soon (as soon as I can debug a banana optimization issue).

That’s the plan… right now it’s purposefully only here, as it was being updated very frequently through user input… happy to say that things do seem to have stabilised now and yeah, it’s totally going to be moved to its own doc. If anything is still unclear though please let me know :slight_smile: It’s improved a loooot through feedback in this thread, as intended.

You can build it with any model (just set the MODEL_ID build arg). The container always asserts that it is running the requested model id. However, in the latest dev you can now leave out the MODEL_ID call_input and it will just run with whatever you built it with (and return a $meta object in the result showing which defaults it used, in case you want to assert on your side).
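As a rough illustration of that behaviour (not the actual docker-diffusers-api code), the check could look something like the sketch below. The function name `resolve_model_id` and the exact shape of the returned meta dictionary are assumptions.

```python
def resolve_model_id(call_inputs, built_model_id):
    """Return (model_id, meta) for a request against a single-model container.

    Hypothetical helper: the real project may structure this differently.
    """
    meta = {}
    requested = call_inputs.get("MODEL_ID")
    if requested is None:
        # Nothing requested: fall back to the build-time model and record
        # that a default was used so the caller can assert on it.
        meta["defaults"] = {"MODEL_ID": built_model_id}
        return built_model_id, meta
    if requested != built_model_id:
        raise ValueError(
            f"Requested MODEL_ID {requested!r}, but this container was built "
            f"with {built_model_id!r}"
        )
    return requested, meta
```

On the caller's side you could then compare the reported default against the model you expected and fail early on a mismatch.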

Oh and actually there was an issue at some point where the MODEL_ID in the Dockerfile and test.py didn’t match… maybe I only fixed that in dev :sweat_smile: Will be merged to main soon! :sweat_smile: :sweat_smile:

Thanks, that’s probably a great solution. I do all my dev locally but this is mostly an issue with people who can’t dev locally and are trying for the first time on banana, and then things are failing and they have no idea why :joy: Any chance you’d like to create a post about it? :slight_smile: No pressure, and thanks for raising it either way!

Also, in dev, we now try...except EVERYTHING, so those unexplained 500s are a thing of the past.
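For anyone curious what "try...except everything" can look like in practice, here is a minimal sketch. The wrapper name `safe_inference` and the `$error` payload shape are assumptions for illustration, not necessarily what the dev branch returns.

```python
import traceback


def safe_inference(handler, all_inputs):
    """Run handler(all_inputs); on any exception return an error payload.

    Hypothetical wrapper: the "$error" key and its fields are illustrative.
    """
    try:
        return handler(all_inputs)
    except Exception as error:  # deliberately broad so nothing escapes as a bare 500
        return {
            "$error": {
                "name": type(error).__name__,
                "message": str(error),
                "stack": traceback.format_exc(),
            }
        }
```

The point of the design is that the caller always gets a JSON-serializable result with the failure reason, rather than an unexplained server error.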

In any event, thanks for kind words and all the feedback… agree on your points and this is very close to being merged to main with docs :pray: Wishing you some happy diffusing! :raised_hands:

Thank you for your super-fast response @gadicc!

I got some very nightmarish results and was wondering if the v1.5 vs v2.0 thing might be the culprit. But I guess I’ll just try the dev branch then. It’s really a shame you can’t select a branch in Banana…

I can definitely write something up once I’ve added Axiom to my container (currently it’s only in my other code).

Wishing you happy upcoming holidays!

Shweet!

Trying to think back to “nightmarish” results lol, what comes to mind is:

  • It’s possible in main we’re still using the diffusers release from right around when SDv2 came out, where some schedulers would produce really bad results.

  • Be careful of resolutions… asking a 768 native model for 512 results, and vice versa, can produce very poor results.
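On the resolution point, a small guard on your side can catch mismatches before you spend GPU time. This is only a sketch: `check_resolution` is a hypothetical helper, and you have to supply the model's native size yourself (e.g. 768 or 512, as in the example above).

```python
def check_resolution(width, height, native_size):
    """Return a list of warnings if the request differs from the native size.

    Hypothetical helper: native_size must be supplied by the caller.
    """
    warnings = []
    for name, value in (("width", width), ("height", height)):
        if value != native_size:
            warnings.append(
                f"{name}={value} differs from native size {native_size}; "
                "output quality may suffer"
            )
    return warnings


# Example: a 512x512 request against a model assumed to be 768-native.
for message in check_resolution(512, 512, 768):
    print(message)
```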

That’s all that comes to mind but you might find stuff I don’t recall in the thread above. But yeah first just try dev and see how it compares (and let us know, I’m interested too).

In all my banana deploys I just have the docker-diffusers-api repo set as upstream, and git merge upstream/dev etc. as needed. Not sure if you’re familiar with this flow but I made a post about it at [HOWTO] Keeping your fork up-to-date. But yeah, choosing branches in banana would be a big help too (as would the ability to redeploy after changing build vars without needing to push another commit), and a bunch of other things :sweat_smile:

That would be amazing… thanks so much. It’s definitely a… sticky point… for a lot of banana users :sweat_smile:

Thanks so much! Happy Holidays! :santa: :christmas_tree: