ControlNet pipelines

I have been exploring the docker-diffusers-api library and find it to be a valuable tool for diffusion-based pipelines. However, I noticed that ControlNet pipelines are not currently supported.

I am interested in extending the capabilities of the library to include ControlNet pipeline support, as it would greatly benefit my use case. Before diving into the implementation, I wanted to reach out and inquire about the best way to approach this addition.

Firstly, I would appreciate any insights you can provide regarding the feasibility of integrating ControlNet pipelines into the docker-diffusers-api library. Are there any specific challenges or considerations that I should be aware of before starting this development effort?

Additionally, I would be grateful if you could share any guidelines, suggestions, or recommended approaches for adding ControlNet pipeline support to the library. Are there any design principles or existing code components that I should leverage or build upon? Any tips or best practices would be immensely helpful in ensuring a smooth integration.

I believe that adding ControlNet pipeline support to docker-diffusers-api would enhance its versatility and make it even more valuable to the community. I am enthusiastic about contributing to this project and would be more than willing to collaborate closely with you to ensure a successful implementation.

I am new to Python, so if I take this on, I don’t mind paying expert Python developers on http://upwork.com to get it done.

Edit: It looks like some of the community ControlNet pipelines (stable_diffusion_controlnet_img2img, stable_diffusion_controlnet_inpaint, stable_diffusion_controlnet_inpaint_img2img) are already supported.

Hey @aroop, great to see you here on the forums.

Firstly, thanks for all the kind words. And re ControlNet, we’re in total agreement! This is something I’ve wanted to add support for for a while already; I’m just completely swamped with other work :confused:

ControlNet support is very doable, since upstream diffusers already supports it, as you noted in your edit. I think the best resource for it is possibly the diffusers release notes at “Releases · huggingface/diffusers · GitHub” (which unfortunately don’t rank well on Google). You could read them bottom up: ControlNet was first introduced in v0.14 on Mar 3, and later releases include more features, models and info. There is also the “ControlNet in 🧨 Diffusers” blog post.

I was very happy to read your message, both for the excitement around ControlNet and for your desire to contribute to the project (even at your own financial expense) and your intention to work together closely on implementing this feature. That’s greatly appreciated.

I think, give me a week to look over things, and I should be able to provide some good guidance. In theory it’s just a few extra steps:

  1. Ability to send a picture to be used for the controlnet input (very similar to what we do already with img2img / inpaint).
  2. Ability to specify which controlnet model to use
  3. Call the pipeline with the above.

Obviously some of those steps might involve a little bit of complicated code, but hopefully not too much :slight_smile:
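
A minimal sketch of those three steps using upstream diffusers directly, not docker-diffusers-api's actual code. The function names and the `base_model_id` parameter are made up for illustration, and it assumes torch, Pillow, diffusers and a CUDA GPU are available:

```python
import base64
import io


def decode_base64_bytes(b64_string: str) -> bytes:
    """Decode a base64 payload into raw image bytes (stdlib only)."""
    return base64.b64decode(b64_string)


def run_controlnet(prompt, image_b64, controlnet_id, base_model_id):
    """Hypothetical helper: one ControlNet, one conditioning image."""
    # Heavy imports live here so the helper above stays usable on its own.
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Step 1: the picture used as the ControlNet input (e.g. an edge or pose map).
    raw = decode_base64_bytes(image_b64)
    control_image = Image.open(io.BytesIO(raw)).convert("RGB")

    # Step 2: which ControlNet model to use.
    controlnet = ControlNetModel.from_pretrained(
        controlnet_id, torch_dtype=torch.float16
    )

    # Step 3: call the pipeline with the above.
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")
    return pipe(prompt, image=control_image).images[0]
```

In a real server the pipeline would presumably be loaded once and reused between requests rather than rebuilt on every call as it is here.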

Anyway, let’s chat in a week after I’ve had more time to look at it. Maybe I can even provide some starter code for you to work with instead of having to outsource it, but let’s see how much time I have to look at it this week first.

Thanks again and looking forward to working together :sparkles:

I’m glad to hear that you found my message exciting and appreciated my willingness to contribute to the project. I understand that implementing ControlNet support is a priority for you, and I’m happy to work closely with you on this.

I’ve gone through the diffusers release notes on GitHub as you suggested, and it seems like a valuable resource for understanding ControlNet. I’ll read them from the bottom up to get a comprehensive understanding of the feature’s evolution. I’ll also check out the ControlNet section in the Diffusers blog post.

While you consider the optimal approach to implementing this feature, I’d like to share the initial steps I plan to take in the meantime. These steps will be added to a fork of the project:

  1. Given the specific requirements of ControlNet, particularly with Multi-ControlNet, it appears that custom pipelines specific to different applications will be necessary. Therefore, I intend to incorporate support for these custom pipelines.
  2. I will introduce a flag that enables the ControlNet feature. This flag will allow users to easily activate or deactivate ControlNet functionality as needed.
  3. Additionally, I will enhance the capability to download multiple ControlNet features based on user input, taking into account both runtime and build time considerations. This enhancement will ensure that users have access to the ControlNet features they require, tailored to their specific needs.

By implementing these steps, we can lay a foundation for the integration of ControlNet into the project. Please take your time to consider these approaches, and I’m open to any further suggestions or adjustments you may have.
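
As a rough illustration of steps 2 and 3, the flag and the build-time model list could be read from environment variables. The variable names `USE_CONTROLNET` and `CONTROLNET_MODEL_IDS` are hypothetical and are not existing docker-diffusers-api settings:

```python
import os


def controlnet_enabled(env=os.environ) -> bool:
    """Hypothetical flag: ControlNet stays off unless explicitly enabled."""
    return env.get("USE_CONTROLNET", "0").strip().lower() in {"1", "true", "yes"}


def controlnet_model_ids(env=os.environ) -> list:
    """Hypothetical comma-separated list of models baked in at build time."""
    raw = env.get("CONTROLNET_MODEL_IDS", "")
    return [m.strip() for m in raw.split(",") if m.strip()]


def models_to_download(requested, env=os.environ) -> list:
    """At runtime, return the requested models that were not prebuilt."""
    if not controlnet_enabled(env):
        return []
    prebuilt = set(controlnet_model_ids(env))
    return [m for m in requested if m not in prebuilt]
```

With this split, anything listed at build time is skipped at runtime, and only models the user asks for on top of that list would need to be fetched.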

Is there any specific reason you locked the diffusers package to a specific commit? I am thinking of using the latest version in my fork.

I guess you already answered the above question in “Is there a plan to update to v0.15.1 for diffusers?”

Hey @aroop. Thanks as usual for your great post. Re your questions:

  1. Happy to report that custom pipelines are already supported and working great. See e.g. “ALL YOUR PIPELINES (are belong to us)” and “Lpw_stable_diffusion pipeline (longer prompts, prompt weights!)”.

  2. We’re on the same page. Maybe something like:

{
  callInputs: {
    controlnet: [
      "lllyasviel/sd-controlnet-openpose",
      "lllyasviel/sd-controlnet-canny",
    ],
    // and MODEL_ID, pipeline, etc.
  },
  modelInputs: {
    image: [
      "...base64encoded...",
      "...base64encoded...",
    ],
    controlnet_conditioning_scale: [1.0, 0.8],
  }
}

where the array values could also be a single string (non-array). But those are just my thoughts from glancing through the blog post; let’s see what you encounter when actually implementing this.

  3. Sounds great! Thanks for thinking of this ahead of time.
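
For point 2, accepting either a single value or an array could be handled with a small normalization step before the pipeline is called. This is only a sketch: the helper names are invented here, and defaulting the conditioning scale to 1.0 is an assumption.

```python
def as_list(value):
    """Wrap a single value in a list; pass lists and tuples through."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def normalize_controlnet_inputs(call_inputs: dict, model_inputs: dict):
    """Return (model_ids, images, scales) as equal-length lists."""
    model_ids = as_list(call_inputs.get("controlnet"))
    images = as_list(model_inputs.get("image"))
    scales = as_list(model_inputs.get("controlnet_conditioning_scale"))
    if not scales:
        # Assumed default: full-strength conditioning for every ControlNet.
        scales = [1.0] * len(model_ids)
    if not (len(model_ids) == len(images) == len(scales)):
        raise ValueError(
            "controlnet, image and controlnet_conditioning_scale "
            "must have the same number of entries"
        )
    return model_ids, images, scales
```

Failing early on mismatched lengths gives the caller a clear error instead of a confusing one from deep inside the pipeline.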

Re version locking, yeah, you found the relevant posts, thanks for searching. You can use the latest version in your fork. I’ll get :dev up to date later this week and we can address any merge conflicts later. (I like to at least skim through the upstream commit log when bumping versions to look out for anything that could affect us, and make sure all tests pass too, of course.)

Thanks again @aroop, looking forward to working together on this! :raised_hands:

Just finished this.

  • Bumped to v0.17.1 + latest commits to date (ce55049).
  • Reviewed upstream commit history to this point.
  • Merged all upstream changes into our version of train_dreambooth.py, which broke on the upgrade.
  • All tests are now passing.

(Unrelated to your work, but I’ll just mention that there are some nice LoRA improvements in the above releases / commits, including, finally, the ability to load (most?) A1111-trained LoRAs, which are very common on CivitAI. I’ll work on that next when I have a chance; hopefully it just needs small changes to the existing DDA LoRA code.)

Let me know if you need anything else from me; otherwise we’ll chat again on your next update. Thanks!

Hey @aroop, just wanted to check in and see how things are going or if you have any more questions I can help with.

Thanks for checking in! I appreciate your support. I’ve been making progress on my project, although I have to admit that I’ve been using somewhat hacky solutions. For instance, I had to utilize multiple stable diffusion models—one for inpainting and another for normal purposes. Additionally, due to bugs in the controlnet reference community pipeline, I had to resort to using a forked version of the diffusers project.

On a positive note, I’m planning to extract the code that can be pushed upstream. This way, it can be contributed back to the main project and help improve its stability and functionality.

If you have any suggestions or further questions, please let me know. I really appreciate your assistance throughout this process!

Ah thanks for the update! And my pleasure. I’m uh… so happy you’re dealing with all this mess instead of me :sweat_smile: It all sounds hectic but yeah that will be absolutely amazing if you contribute it all back upstream and then we get to use it cleanly. Thanks so much for all your work and efforts here!