Hey! So firstly, @grf, welcome to the forums, and especially… welcome to your awesome DOTT avatar. Great memories! (And how did everything go so downhill since the 90s? :))
Also, thanks for the detailed report, logs, etc., which make such a big difference!
It does indeed look like there's an upper limit on Banana runtime logs, since, as you pointed out, training actually finished with the `{ message: 'success', modelOutputs: { /* ... */ } }` response. I wonder if Banana would be willing to raise the limit; otherwise I guess I can just make it log less.
It's telling you that nothing was uploaded. Now, you might ask: why would we ever train a model without uploading it anywhere? Which is a very fair question. It's useful for making sure everything works, and for running timing tests on training alone. But I think it's fair that I add a big warning somewhere that this is what's happening.
In short (and again, I'll make this clearer), you didn't actually specify anything to do with the model post-training. For S3, you need a

```
{
  callInputs: {
    dest_url: "s3:///bucket/model-filename.tar.zst"
  }
}
```

(pay special attention to the triple `///` at the beginning). I know that `dest_url` doesn't appear in `test.py` (because in testing we really are just checking that training works, not uploading), but in the example section you'll see we call `test.py dreambooth --call-arg dest_url="s3:///bucket/filename.tar.zst"` to add it. Hope that's clear! I'll add some comments to `test.py` too for those learning from there.
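Since the triple slash is easy to miss, here's a tiny sketch of a check you could run on the URL before kicking off training. The helper name is made up for illustration; it's not part of docker-diffusers-api:

```python
def check_dest_url(url: str) -> None:
    # Hypothetical helper, not part of docker-diffusers-api.
    # The "s3:///" form has an empty host part (default endpoint),
    # so there are three slashes before the bucket name.
    if not url.startswith("s3:///"):
        raise ValueError(
            f"dest_url should start with 's3:///' (triple slash), got: {url}"
        )

check_dest_url("s3:///bucket/model-filename.tar.zst")  # passes silently
```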
Indeed, only the new fine-tuned model gets uploaded to S3. Then you can deploy that model to another instance (with optimized cold starts, etc.) to do inference of new images with the new fine-tuned model. With docker-diffusers-api, just set the `MODEL_URL` build-arg to the same `s3:///` URL, and it will download it for you at build time.
Yes, exactly. I’ll make it clearer in the logs when training is called with no destination given. And if there was an error, it would indeed be in the logs, not on the dashboard.
Now on to the banana questions…
Yeah, that is correct. The model results don't stick around after you've consumed them. I'm not sure how long they stick around after inference (or in this case, training) finishes without being consumed.
So it's basically:

- `start` - start inference with the given options
- `check` - check if inference is done and return the results
- `run` = `start` + `check`

so calling `run()` absolutely should start everything for you. This is in banana's SDK at least. I decided to use their REST API directly in `test.py`, and that's definitely a bit more work. It can be useful for long-running tasks though (in kiri.art, we call `start` in a serverless function with credentials, and then have the user's web browser keep doing the `check`s for the results).
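To make the `run` = `start` + `check` relationship concrete, here's a rough Python sketch of the polling loop. The function bodies are stand-ins (the real calls hit banana's endpoints with their own payloads); only the control flow is the point:

```python
import time

# Stand-ins for the real REST calls; banana's actual endpoints and
# payloads differ -- only the control flow here is what matters.
def start(model_inputs):
    # Kick off inference/training and return a call ID right away.
    return {"callID": "abc123"}

_polls = {"n": 0}

def check(call_id):
    # Fake a task that takes a few polls to complete: report
    # "running" twice, then a finished result.
    _polls["n"] += 1
    if _polls["n"] < 3:
        return {"message": "running"}
    return {"message": "success", "modelOutputs": [{"ok": True}]}

def run(model_inputs, poll_interval=0.01):
    # run() = start() + repeated check() until the task finishes.
    call_id = start(model_inputs)["callID"]
    while True:
        result = check(call_id)
        if result["message"] == "success":
            return result
        time.sleep(poll_interval)
```

This is also why splitting the calls can be handy: whoever holds the credentials does the `start`, and any client can take over the `check` loop.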
Also, just wanted to double-check that you saw the guide at
https://banana-forums.dev/t/dreambooth-training-first-look/36/2
Hope that’s all clear! Let me know how it goes. Don’t hesitate to ask for any further clarifications especially as we shore up the docs for everyone.