SEND_URL is missing progress

When setting the SEND_URL environment variable, it does not send inference progress (steps) to the endpoint, only start and end times. I’ve been looking for a way to send progress events too, but I don’t know how to assign a callback function to the StableDiffusionPipeline.

This is something I’ve wanted to do for a while; it’s on my list. I’ll probably start off by sending the current progress percentage every 1s, but let me know if you have anything different in mind.

You don’t have to do that every second. The StableDiffusionPipeline call takes two arguments for this, callback and callback_steps: callback is a callable that does the send(), and callback_steps determines every how many steps it gets called. Based on the step numbers sent to Banana you can easily calculate a progress percentage.
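
Something like this, roughly (just a sketch straight against diffusers, outside of docker-diffusers-api; the POST body here is only illustrative, not your send() format):

import os, time
import requests
import torch
from diffusers import StableDiffusionPipeline

SEND_URL = os.environ.get("SEND_URL")  # e.g. https://webhook.site/<id>
NUM_INFERENCE_STEPS = 50

def progress_callback(step, timestep, latents):
    # diffusers calls this every callback_steps steps with a 0-based step index
    percent = round((step + 1) / NUM_INFERENCE_STEPS * 100)
    if SEND_URL:
        # illustrative payload only, not the project's send() format
        requests.post(SEND_URL, json={"status": "progress", "step": step + 1, "percent": percent})

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_auth_token=os.environ.get("HF_AUTH_TOKEN"),
).to("cuda")

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=NUM_INFERENCE_STEPS,
    callback=progress_callback,
    callback_steps=1,  # run the callback every step
).images[0]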

Ok, sure, that’s pretty easy then :slight_smile:

By default it does nothing (i.e. the existing behaviour). But if you specify the modelInput { callback_steps: 1 } (or another desired number of steps), it activates and sends something like this after every callback_steps steps (note status: "progress" and payload.step):

{
  'type': 'inference',
  'status': 'progress',
  'container_id': '[...]',
  'time': 1676012624907,
  't': 7285,
  'tsl': 0,
  'payload': {
    'startRequestId': None,  # if specified as a callInput
    'step': 1                # 1..num_inference_steps, every callback_steps steps
  }
}
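
If it helps, here’s a rough sketch of what a receiving endpoint could do with those events to turn payload.step into a percentage (Flask is just an example here; you already know num_inference_steps from your own request):

from flask import Flask, request

app = Flask(__name__)
NUM_INFERENCE_STEPS = 50  # whatever you passed in your modelInputs

@app.route("/progress", methods=["POST"])
def progress():
    event = request.get_json()
    if event.get("type") == "inference" and event.get("status") == "progress":
        step = event["payload"]["step"]  # 1..num_inference_steps
        percent = round(step / NUM_INFERENCE_STEPS * 100)
        print(f"inference progress: {percent}%")
    return "", 200

if __name__ == "__main__":
    app.run(port=8080)

Point SEND_URL at wherever that’s running and you’ll see the percentage climb as the steps come in.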

It’s in the dev branch, your favourite :slight_smile: However, with the docker images it’s now pretty easy to try out: just change e.g. FROM gadicc/diffusers-api to FROM gadicc/diffusers-api:dev.

But a reminder that this is still in dev, and v1 still isn’t officially released. It’s nice to try out, but I wouldn’t use it in production yet. Outside of this post I haven’t documented this feature yet, and its usage could change based on user feedback (including yours).

Currently this only applies to inference… will add it to dreambooth training next.

It would actually be better if SEND_URL could be given as an argument in the request rather than in the Dockerfile. That way I can create a task on my server and have the container post updates to the right callback URL, e.g. https://domain.com/cb/taskId

Seems reasonable :slight_smile:

No automated tests for this yet, but I did test manually with https://webhook.site/ and it seems to work well. Integration tests are running on the cloud now, and if they pass it will be published in about 20m or so to the docker registry with the :dev tag like before (and my previous comments on using dev releases still stand).

Obviously this won’t work for the init() sends, since that function is called before we actually receive the request (to inference()), but it should work everywhere else.

Also, I guess I should document it better, but you can also already pass a callInput called startRequestId, which is sent along with any send()’s for inference. It doesn’t currently get sent for “sub” requests (like realtime downloads that happen as part of the request), but I’ll probably add that too now that I’ve done the code for per-request URLs.
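
For reference, a request that sets callback_steps (as a modelInput) and startRequestId (as a callInput) might look roughly like this (a sketch against a locally running container; the exact route and port depend on your setup, but the modelInputs/callInputs split is the part that matters):

import requests

resp = requests.post("http://localhost:8000/", json={
    "modelInputs": {
        "prompt": "a photo of an astronaut riding a horse",
        "num_inference_steps": 50,
        "callback_steps": 10,  # send a progress event every 10 steps
    },
    "callInputs": {
        "startRequestId": "task-1234",  # echoed back in payload.startRequestId
    },
})
print(resp.json())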

Anyway, if you’re playing around with any of the dev releases, let me know how it goes and share any feedback.

Your dev repo doesn’t work. I set SEND_URL in the Dockerfile but it didn’t send anything. I also tried providing it as a property of the callInputs, but that didn’t work either. My server is not receiving any events; not even the init events are being sent anymore like they used to be.

Hey, can you send your Dockerfile (with any private info removed)? And likewise the request with the callInputs?

I tried a bunch of different things, with both the regular repo locally and the -build-download repo deployed to banana, and everything worked… for me, at least… so I need to work out what assumptions I’m making about your setup.

Can you also maybe try sending to an address on https://webhook.site/, just in case there’s a networking / DNS issue in connecting to your server from banana? Thanks!

# Banana requires Cuda version 11+.  Below is banana default:
# FROM pytorch/pytorch:1.11.0-cuda11.3-cudnn8-devel as base
# xformers available precompiled for:
#   Python 3.9 or 3.10, CUDA 11.3 or 11.6, and PyTorch 1.12.1
#   https://github.com/facebookresearch/xformers/#getting-started
# Below: pytorch base images only have Python 3.7 :(
FROM pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime as base
# Below: our ideal image, but Optimization fails with it.
#FROM continuumio/miniconda3:4.12.0 as base

# Note, docker uses HTTP_PROXY and HTTPS_PROXY (uppercase)
# We purposefully want those managed independently, as we want docker
# to manage its own cache.  This is just for pip, models, etc.
ARG http_proxy
ENV http_proxy=${http_proxy}
ARG https_proxy
ENV https_proxy=${https_proxy}
RUN if [ -n "$http_proxy" ] ; then \
    echo quit \
    | openssl s_client -proxy $(echo ${https_proxy} | cut -b 8-) -servername google.com -connect google.com:443 -showcerts \
    | sed 'H;1h;$!d;x; s/^.*\(-----BEGIN CERTIFICATE-----.*-----END CERTIFICATE-----\)\n---\nServer certificate.*$/\1/' \
    > /usr/local/share/ca-certificates/squid-self-signed.crt ; \
    update-ca-certificates ; \
  fi
ENV REQUESTS_CA_BUNDLE=${http_proxy:+/usr/local/share/ca-certificates/squid-self-signed.crt}

ENV DEBIAN_FRONTEND=noninteractive
#RUN apt-get install gnupg2
#RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC
RUN apt-get update && apt-get install -yqq git

# This would have been great but Python is via conda,
# and conda doesn't support python >= 3.7 for base.
#RUN apt install -yqq software-properties-common
#RUN add-apt-repository ppa:deadsnakes/ppa
#RUN apt update
#RUN apt-get install -yqq python3.10
#RUN ln -sf /usr/bin/python3.10 /usr/bin/python3
#RUN ln -sf /usr/bin/python3.10 /usr/bin/python

FROM base AS patchmatch
ARG USE_PATCHMATCH=0
WORKDIR /tmp
COPY scripts/patchmatch-setup.sh .
RUN sh patchmatch-setup.sh

FROM base as output
RUN mkdir /api
WORKDIR /api

## XXXX playing around a lot.
# pip installs pytorch 1.13 and uninstalls 1.12 (needed by xformers)
# recomment conda update; didn't help.  need to solve above issue.

RUN conda update -n base -c defaults conda
# We need python 3.9 or 3.10 for xformers
# Yes, we install pytorch twice... will switch base image in future
RUN conda create -n xformers python=3.10
SHELL ["/opt/conda/bin/conda", "run", "--no-capture-output", "-n", "xformers", "/bin/bash", "-c"]
RUN python --version
RUN conda install -c pytorch -c conda-forge cudatoolkit=11.6 pytorch=1.12.1
RUN conda install xformers -c xformers/label/dev

# Install python packages
# RUN pip3 install --upgrade pip
RUN https_proxy="" REQUESTS_CA_BUNDLE="" conda install pip
ADD requirements.txt requirements.txt
RUN pip install -r requirements.txt

# Not needed anymore, but, may be needed again in the future :D
# Turing: 7.5 (RTX 20s, Quadro), Ampere: 8.0 (A100), 8.6 (RTX 30s)
# https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
# ENV TORCH_CUDA_ARCH_LIST="7.5 8.0 8.6"

RUN git clone https://github.com/huggingface/diffusers
WORKDIR /api/diffusers
RUN git checkout v0.9.0
WORKDIR /api
RUN pip install -e diffusers

# We add the banana boilerplate here
ADD server.py .
EXPOSE 8000
 
# Dev: docker build --build-arg HF_AUTH_TOKEN=${HF_AUTH_TOKEN} ...
# Banana: currently, comment out ARG and set by hand ENV line.
ARG HF_AUTH_TOKEN
ENV HF_AUTH_TOKEN=********************

# MODEL_ID, can be any of:
# 1) Hugging face model name
# 2) A directory containing a diffusers model
# 3) Your own unique model id if using CHECKPOINT_URL below.
# 4) "ALL" to download all known models (useful for dev)
# "runwayml/stable-diffusion-v1-5", "runwayml/stable-diffusion-inpainting"
# "CompVis/stable-diffusion-v1-4", "hakurei/waifu-diffusion",
# "stabilityai/stable-diffusion-2",
# "stabilityai/stable-diffusion-2-inpainting" etc.
ARG MODEL_ID="runwayml/stable-diffusion-v1-5"
ENV MODEL_ID=${MODEL_ID}

# "" = model default.
ARG PRECISION=""
ENV PRECISION=${PRECISION}
ADD precision.py .
 
# ARG PIPELINE="StableDiffusionInpaintPipeline"
ARG PIPELINE="ALL"
ENV PIPELINE=${PIPELINE}

ARG USE_DREAMBOOTH=1
ENV USE_DREAMBOOTH=${USE_DREAMBOOTH}

ARG AWS_ACCESS_KEY_ID
ARG AWS_SECRET_ACCESS_KEY
ARG AWS_DEFAULT_REGION
ARG AWS_S3_ENDPOINT_URL
ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
ENV AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
ENV AWS_S3_ENDPOINT_URL=${AWS_S3_ENDPOINT_URL}

COPY utils utils

# Download diffusers model from somewhere else (see Storage docs)
# Don't use this for checkpoints (.ckpt)!  Use CHECKPOINT_URL for that.
ARG MODEL_URL=""
ENV MODEL_URL=${MODEL_URL}
# If set, it will be downloaded and converted to diffusers format, and
# saved in a directory with same MODEL_ID name to be loaded by diffusers.
ARG CHECKPOINT_URL=""
ENV CHECKPOINT_URL=${CHECKPOINT_URL}
ARG CHECKPOINT_CONFIG_URL=""
ENV CHECKPOINT_CONFIG_URL=${CHECKPOINT_CONFIG_URL}

ADD download-checkpoint.py .
RUN python3 download-checkpoint.py
ARG _CONVERT_SPECIAL
ENV _CONVERT_SPECIAL=${_CONVERT_SPECIAL}
ADD convert-to-diffusers.py .
RUN python3 convert-to-diffusers.py
# RUN rm -rf checkpoints

# Add your model weight files 
# (in this case we have a python script)
ADD getScheduler.py .
ADD loadModel.py .
ADD download.py .
RUN python3 download.py

# Deps for RUNNING (not building) earlier options
ARG USE_PATCHMATCH=0
RUN if [ "$USE_PATCHMATCH" = "1" ] ; then apt-get install -yqq python3-opencv ; fi
COPY --from=patchmatch /tmp/PyPatchMatch PyPatchMatch

RUN if [ "$USE_DREAMBOOTH" = "1" ] ; then \
    # By specifying the same torch version as conda, it won't download again.
    # Without this, it will upgrade torch, break xformers, make bigger image.
    pip install -r diffusers/examples/dreambooth/requirements.txt bitsandbytes torch==1.12.1 ; \
  fi
RUN if [ "$USE_DREAMBOOTH" = "1" ] ; then apt-get install git-lfs ; fi

# Add your custom app code, init() and inference()
ADD train_dreambooth.py .
ADD send.py .
ADD app.py .

ARG SEND_URL="https://keenml.com/echo"
ENV SEND_URL=${SEND_URL}
ARG SIGN_KEY
ENV SIGN_KEY=${SIGN_KEY}

CMD python3 -u server.py

Whoa, I’m so confused… where is that Dockerfile from? :sweat_smile: If you’re talking about a dev release you downloaded a few months back, then no, that won’t have it; I only added the new code a few days ago, so you need to update.

Also, for banana, you probably want to use this repo as a base: GitHub - kiri-art/docker-diffusers-api-build-download: Builds diffusers-api with a pre-downloaded model. But see also [WIP] Upgrading from v0 to v1 if you haven’t already.

I’m confused too! Your new Dockerfile doesn’t even have a model name or HuggingFace token ENV variable, so I used the old Dockerfile I had.

Yeah, the main project repo now defaults to the “runtime downloads” behaviour, and the -build-download repo builds on top of that to download the model at build time and include it in the image. Please see the two links I gave before, which should explain all this better, but do let me know if anything is unclear so I can improve it for everyone. Also, uh… a lot can change between versions, so mixing files from different versions is never a good idea :sweat_smile:

If you’re using the -build-download variant and want to use a dev release, change the top line in the Dockerfile (or override with banana build arg vars, but I think that still requires a new push anyway to trigger the rebuild):

ARG FROM_IMAGE="gadicc/diffusers-api"      # from this
ARG FROM_IMAGE="gadicc/diffusers-api:dev"  # to this

I’m probably better off creating my own repo, because this one keeps changing a lot and I have no idea how to use it now.

Sure, makes sense :slight_smile:

I think there is a bug in GitHub - kiri-art/docker-diffusers-api-build-download: Builds diffusers-api with a pre-downloaded model

I entered a checkpoint URL in the Dockerfile but it tried to download a model from HuggingFace instead.

# Unlikely you'll ever want to change this.
ARG FROM_IMAGE="gadicc/diffusers-api"
FROM ${FROM_IMAGE} as base
ENV FROM_IMAGE=${FROM_IMAGE}

# Added by Super Jochem
ARG HF_AUTH_TOKEN="*****************"
ENV HF_AUTH_TOKEN=${HF_AUTH_TOKEN}


# Model id, precision, etc.
ARG MODEL_ID="bewbs"
ENV MODEL_ID=${MODEL_ID}
ARG HF_MODEL_ID=""
ENV HF_MODEL_ID=${HF_MODEL_ID}
ARG MODEL_PRECISION=""
ENV MODEL_PRECISION=${MODEL_PRECISION}
ARG MODEL_REVISION=""
ENV MODEL_REVISION=${MODEL_REVISION}
#ARG MODEL_URL="s3://"
ARG MODEL_URL=""
ENV MODEL_URL=${MODEL_URL}

# To use a .ckpt file, put the details here.
ARG CHECKPOINT_URL="https://keenml.com/models/sd/bewbs.ckpt"
ENV CHECKPOINT_URL=${CHECKPOINT_URL}
ARG CHECKPOINT_CONFIG_URL=""
ENV CHECKPOINT_CONFIG_URL=${CHECKPOINT_CONFIG_URL}

ARG PIPELINE="ALL"
ENV PIPELINE=${PIPELINE}

# AWS / S3-compatible storage (see docs)
ARG AWS_ACCESS_KEY_ID
ARG AWS_SECRET_ACCESS_KEY
# AWS, use "us-west-1" for banana; leave blank for Cloudflare R2.
ARG AWS_DEFAULT_REGION
ARG AWS_S3_DEFAULT_BUCKET
# Only if your non-AWS S3-compatible provider told you exactly what
# to put here (e.g. for Cloudflare R2, etc.)
ARG AWS_S3_ENDPOINT_URL

ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
ENV AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
ENV AWS_S3_DEFAULT_BUCKET=${AWS_S3_DEFAULT_BUCKET}
ENV AWS_S3_ENDPOINT_URL=${AWS_S3_ENDPOINT_URL}

# Download the model
ENV RUNTIME_DOWNLOADS=0
RUN python3 download.py

# Send (optionally signed) status updates to a REST endpoint
ARG SEND_URL
ENV SEND_URL=${SEND_URL}
ARG SIGN_KEY
ENV SIGN_KEY=${SIGN_KEY}

# Override only if you know you need to turn this off
ARG SAFETENSORS_FAST_GPU=1
ENV SAFETENSORS_FAST_GPU=${SAFETENSORS_FAST_GPU}

CMD python3 -u server.py

Hey, sorry, I thought I’d answered this; didn’t realise it was two different posts. Hopefully you saw my answer on GitHub.