This document outlines the new “split” architecture used in v1 of the repo.
Feedback welcome.
Goals:
- Faster / easier development
- Quicker builds / deployments
New image divisions:
- Base image: Conda environments (Python, PyTorch, CUDA, xformers) take a while to set up (download, dependency solving), so I’ve moved this into its own container. Available tags:
  docker pull gadicc/diffusers-api-base:python3.9-pytorch1.12.1-cuda11.6-xformers
  - In the future: an ARM variant that can work with Apple Silicon.
- Main image: This will be the new main image. It will default to RUNTIME_DOWNLOADS=1 and can be run straight from Docker, downloading what it needs at runtime. It’s much quicker to build and rebuild, since there’s no baked-in model: it basically installs all the necessary Python packages and diffusers, and includes the docker-diffusers-api wrapper code (see the example run command after this list).
- Build-time download image: This is a small wrapper over the above (see the Dockerfile sketch further below) that:
  - Sets any necessary ENV vars via build-args
  - Downloads the model at build time and stores it in the image.
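
As a quick illustration of running the main image with runtime downloads, here is a minimal sketch. The image name (gadicc/diffusers-api), the port, and the MODEL_ID / HF_AUTH_TOKEN variables are assumptions for illustration only; RUNTIME_DOWNLOADS is the setting described above.

```bash
# Sketch only: image name, port and the extra env vars are assumptions.
docker run --gpus all -p 8000:8000 \
  -e RUNTIME_DOWNLOADS=1 \
  -e MODEL_ID="runwayml/stable-diffusion-v1-5" \
  -e HF_AUTH_TOKEN="$HF_AUTH_TOKEN" \
  gadicc/diffusers-api
```

With RUNTIME_DOWNLOADS=1 the model is fetched at runtime rather than baked into the image, which is what keeps builds and rebuilds quick.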
There’s also a RunPod variant of the above, tailored for RunPod’s Serverless AI offering.
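
To make the build-time download image more concrete, here’s a rough Dockerfile sketch of what such a wrapper could look like. The main image name, the download.py script, and switching RUNTIME_DOWNLOADS off are assumptions for illustration, not confirmed details of the repo.

```Dockerfile
# Sketch of a build-time download wrapper; names below are assumptions.
FROM gadicc/diffusers-api

# Set any necessary ENV vars via build-args.
ARG MODEL_ID="runwayml/stable-diffusion-v1-5"
ENV MODEL_ID=${MODEL_ID}

# Assumed: with the model baked in, runtime downloads are no longer needed.
ENV RUNTIME_DOWNLOADS=0

# Download the model at build time so it's stored in the image.
RUN python download.py
```

It could then be built with a command along the lines of: docker build --build-arg MODEL_ID=<your model> -t my-model-image .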
I think this will be a big help for anyone wanting to get more involved with development or play around locally, as well as for people wanting more deployment options.