Train_text_encoder

jochemstoel · February 16, 2023, 12:25pm

My GPU provider has powerful enough hardware to train the text encoder too. I’m used to setting the variable train_text_encoder to true or false in my various notebooks. However, what I don’t understand is that the source code of this repository says:

Whether to train the text encoder. If set, the text encoder should be float32 precision.

    "train_text_encoder": None

What does it mean that it should be float32 precision?

gadicc · February 17, 2023, 11:03am

Ah, yeah… I copied and pasted that comment directly from the description in diffusers, but I guess it’s not very clear. In short:

Yes, train_text_encoder is a boolean that can be set to True.
The text_encoder that you’re fine-tuning does need to be in float32 precision, but that’s pretty much a given, since dreambooth training throws an error anyway if you try to train on an fp16 model. And an fp32 “model” is made up of an fp32 text_encoder, an fp32 unet, etc.

So, set { "train_text_encoder": True } and let me know if you hit any errors. I haven’t tried this before, but it should work out of the box.
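To make the precision requirement concrete, here is a minimal sketch of the kind of check involved. It is not the actual code from diffusers or docker-diffusers-api: the helper name check_text_encoder_precision is hypothetical, and dtypes are passed as plain strings so the snippet runs without torch installed.

```python
def check_text_encoder_precision(train_text_encoder, text_encoder_dtype):
    """Hypothetical helper (not part of diffusers or docker-diffusers-api).

    Returns True if text encoder training is enabled and the weights are
    float32, False if training is disabled, and raises otherwise.
    """
    # The option defaults to None, which is treated the same as False.
    if not train_text_encoder:
        return False
    # Assumption: the dtype is given as a string such as "float32" or "float16".
    if text_encoder_dtype != "float32":
        raise ValueError(
            "train_text_encoder is set, but the text encoder is "
            f"{text_encoder_dtype}; load the model in float32 before training."
        )
    return True


# Enabling the option with an fp32 model passes the check.
model_inputs = {"train_text_encoder": True}
enabled = check_text_encoder_precision(model_inputs["train_text_encoder"], "float32")
print(enabled)
```

With an fp16 model and the option enabled, the same call raises a ValueError, which mirrors the "throws an error" behaviour described above.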

jochemstoel · March 5, 2023, 9:30pm

Hereby reporting back to confirm that training the text encoder using docker-diffusers-api works perfectly.

gadicc · March 6, 2023, 9:01am

Fantastic. Thanks for taking the time to report back.