As suggested in Running on other cloud providers - #8 by Raj_Dhakad.
My response there:
We’d probably need to bypass a lot of the Replicate-specific features, so no cog file, and probably no typed arguments (we know all the types we take for `callInputs`, but the `modelInputs` are for the most part not touched at all and passed straight to the appropriate diffusers pipeline); instead we’ll just take a long JSON string with everything, which should be ok.
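For illustration, here’s a rough sketch of what that single JSON string could carry; the exact keys (`MODEL_ID`, `prompt`, etc.) are just plausible examples, not a fixed schema:

```python
import json

# Hypothetical payload -- everything goes into one JSON string.
# callInputs: the options whose types we know in advance.
# modelInputs: passed straight through to the diffusers pipeline.
payload = json.dumps({
    "callInputs": {"MODEL_ID": "runwayml/stable-diffusion-v1-5"},
    "modelInputs": {"prompt": "an astronaut riding a horse"},
})
```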
I’d love to do this but with current obligations I’m not sure when I’ll have a chance. If you or someone else would like to take this on, I’ll be around for guidance. From a quick perusal of the docs, I think the rough steps would be:
Reference: Push your own model - Replicate docs
Create a `predict.py` something along the lines of:
```python
from cog import BasePredictor, Input
from app import init, inference
import json


class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        init()

    def predict(
        self,
        inputs: str = Input(
            description="docker-diffusers-api '{ callInputs: {}, modelInputs: {} }' JSON string"
        ),
    ) -> str:
        """Run a single prediction on the model"""
        output = inference(json.loads(inputs))
        return output
```
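Once that exists, something like `cog predict -i inputs='{"callInputs": {}, "modelInputs": {"prompt": "..."}}'` should (if I’m reading the docs right) let us test a prediction locally before pushing.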
And a `cog.yaml`:
```yaml
build:
  # we don't use cog for the build, hope that's ok :)
predict: "predict.py:Predictor"
```
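Then, per the Replicate docs referenced above, it should just be a matter of `cog login` followed by `cog push r8.im/<username>/<model-name>` to publish it.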
But I don’t have any prior experience with Replicate.
Anyway, let’s see what happens. Regardless, I’ll try to get this working when I have a chance, but that will probably only be next month at the earliest.