As suggested in Running on other cloud providers - #8 by Raj_Dhakad.
My response there:
We’d probably need to bypass a lot of the Replicate-specific features, so no cog file, and probably no typed arguments (we know all the types we take for `callInputs`, but the `modelInputs` are for the most part not touched at all and passed straight to the appropriate diffusers pipeline); instead we’ll just take a long JSON string with everything, which should be ok.
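For illustration, here’s a rough sketch of what that single JSON string could carry; the exact keys (`MODEL_ID`, `prompt`, etc.) are just plausible examples, not a fixed schema:

```python
import json

# Hypothetical payload -- everything goes into one JSON string.
# callInputs: the options whose types we know in advance.
# modelInputs: passed straight through to the diffusers pipeline.
payload = json.dumps({
    "callInputs": {"MODEL_ID": "runwayml/stable-diffusion-v1-5"},
    "modelInputs": {"prompt": "an astronaut riding a horse"},
})
```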
I’d love to do this but with current obligations I’m not sure when I’ll have a chance. If you or someone else would like to take this on, I’ll be around for guidance. From a quick perusal of the docs, I think the rough steps would be:
Reference: Push your own model - Replicate docs
Create a `predict.py` something along the lines of:
```python
from cog import BasePredictor, Input
from app import init, inference
import json


class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        init()

    def predict(
        self,
        inputs: str = Input(
            description="docker-diffusers-api '{ callInputs: {}, modelInputs: {} }' JSON string"
        ),
    ) -> str:
        """Run a single prediction on the model"""
        output = inference(json.loads(inputs))
        return output
```
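Once that exists, something like `cog predict -i inputs='{"callInputs": {}, "modelInputs": {"prompt": "..."}}'` should (if I’m reading the docs right) let us test a prediction locally before pushing.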
And a `cog.yaml`:
```yaml
build:
  # we don't use cog for the build, hope that's ok :)
predict: "predict.py:Predictor"
```
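Then, per the Replicate docs referenced above, it should just be a matter of `cog login` followed by `cog push r8.im/<username>/<model-name>` to publish it.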
But I don’t have any prior experience with Replicate.
Anyway, let’s see what happens. Regardless, I’ll try to get this working when I have a chance, but that will probably only be next month at the earliest.