Issues deploying non-official models

coffeeorgreentea · December 13, 2022, 1:17am

Today I’ve been going through and updating my models now that optimization is working.
I’m having some issues deploying some finetuned models, and no issues with official ones.

Working:
All SD 2.1, 2.0, 1.5, and 1.4 models.
Linaqruf/anything-v3.0
dreamlike-art/dreamlike-diffusion-1.0
jinofcoolnes/sammod
Envvi/Inkpunk-Diffusion

Not working:
prompthero/openjourney
nitrosocke/classic-anim-diffusion
nitrosocke/mo-di-diffusion
PublicPrompts/Synthwave
ogkalu/Superhero-Diffusion
dreamlike-art/dreamlike-photoreal-1.0

I’ve tried using HF to download the model and directly linking to the .ckpt.

example log:

Downloading: 100%|██████████| 1.99G/1.99G [00:52<00:00, 40.4MiB/s]
Traceback (most recent call last):
  File "/api/download.py", line 59, in <module>
    download_model()
  File "/api/download.py", line 23, in download_model
    os.mkdir(MODEL_ID)
FileNotFoundError: [Errno 2] No such file or directory: 'PublicPrompts/Synthwave'
ERROR conda.cli.main_run:execute(47): `conda run /bin/bash -c python3 download.py` failed. (See above for error)
error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1

Any thoughts? I’m a little stumped trying to figure out what’s making only some go wrong.

edit: I should note that nitrosocke/classic-anim-diffusion and nitrosocke/mo-di-diffusion were both deploying normally about a month ago
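
A note on the traceback in the log above: os.mkdir creates only the last path component, so a model ID containing a slash, like 'PublicPrompts/Synthwave', raises FileNotFoundError when the parent directory ('PublicPrompts') does not exist yet. The sketch below shows the standard-library alternative, os.makedirs. The helper name ensure_model_dir and its base_dir parameter are hypothetical and are not taken from docker-diffusers-api's download.py. It only addresses the immediate exception, not why the script reached this code path for some models and not others.

```python
import os


def ensure_model_dir(model_id: str, base_dir: str = ".") -> str:
    """Create the directory for an 'org/name' style model ID and return its path.

    Hypothetical helper for illustration; not part of docker-diffusers-api.
    """
    path = os.path.join(base_dir, model_id)
    # Unlike os.mkdir, os.makedirs also creates missing parent directories
    # (the 'org' part). exist_ok=True makes repeated calls harmless.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling ensure_model_dir("PublicPrompts/Synthwave") in an empty working directory would create both levels, where a plain os.mkdir on the same string fails as in the log.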

coffeeorgreentea · December 13, 2022, 2:02am

i figured it out nevermind lol

gadicc · December 13, 2022, 6:11am

Well go on then, share with the group

I’ll take a stab out of curiosity… maybe something with RUNTIME_DOWNLOADS? Which is currently only developed for / tested on those model.tar.zst files.

Thanks for letting me know optimization is back up btw… I hope… already see some people saying optimization is failing but maybe it’s just them (deploying now so let’s see). I’m done with banana’s optimization, have something in the works which I hope can replace it entirely, but need to finish the code and do timing tests… and have an international flight today, so not sure when I’ll next get to work on it.

coffeeorgreentea · December 13, 2022, 10:51am

I was moving too quick and mixed up

ARG MODEL_URL=""

with

ARG CHECKPOINT_URL=""

completely ignoring your comment warning about that.
Although I guess there is still some unexpected behavior, because I still couldn’t get them loaded using HF.

I got around 10 optimized deployments today so it seems stable.
Very excited to see what you have in the works for optimization. I wouldn’t even know where to start with that.

Good luck with your travels!

gadicc · December 13, 2022, 11:10am

Haha great, thanks for reporting back… nothing I haven’t done before! Maybe moving forward MODEL_URL can be smart enough to realize that a URL ending with .ckpt should behave like CHECKPOINT_URL
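
The idea of a "smarter" MODEL_URL could be sketched roughly as follows. This is only an illustration of the suggestion, not code from docker-diffusers-api: the function name split_model_url is hypothetical, and it assumes both values arrive as plain strings with an empty string meaning "not set", as in the ARG lines quoted above.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_model_url(model_url: str, checkpoint_url: str = "") -> Tuple[str, str]:
    """Return (model_url, checkpoint_url), rerouting a .ckpt link given as MODEL_URL.

    Hypothetical helper for illustration only.
    """
    # Inspect only the URL path, so a query string such as '?download=1'
    # does not hide the file extension.
    path = urlparse(model_url).path
    # Reroute only when no explicit CHECKPOINT_URL was given, so an
    # explicit setting is never overridden.
    if path.lower().endswith(".ckpt") and not checkpoint_url:
        return "", model_url
    return model_url, checkpoint_url
```

With this, a .ckpt link passed as MODEL_URL would come back in the checkpoint slot, while a model.tar.zst link would pass through unchanged.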

WARNING: Image Optimization Failed - cold boots may be slow

Very short lived, glad you squeezed in all your deploys before it went down again!

Re everything else, thanks for the kind wishes, and will keep you posted. Most of the foundational code is done, now it’s just a matter of automated tests as there are now so many different ways you can run docker-diffusers-api, need to make sure everything is still working in all of them, and in various cases.

And re optimization, well, banana’s optimization is a bit of a black box, their private IP, but diffusers loading has gotten a lot faster, I’ve spent a lot of time optimizing loads too in docker-diffusers-api, and now there’s this new safetensors stuff for even faster loads. So, won’t know how it compares until it’s up and we can do some comparison tests against banana’s stuff, but there’s a good chance it will be comparable if not better. Also, once I’m no longer beholden to “not breaking optimization”, there’s a lot of other performance stuff I’ve been wanting to look into for a long time (notably, Facebook / Meta’s AITemplate) - but I’ll finish up on everything else first.

Shweet! Chat soon