Running on other cloud providers

If you don’t have a CUDA-compatible GPU on your dev machine, these providers are well worth considering for development. Changing code and waiting for a Banana deploy after every change is a total productivity killer! (In theory these instructions could help you run in production on these providers too, but you won’t get Banana’s serverless scaling magic.)

  • Banana.Dev

    • Serverless; shared A100 with 16GB RAM, $0.00051992/second
    • 10-40% volume discounts, 1 hour of free credits
    • Forum guide to deploy docker-diffusers-api on Banana.
  • Brev.dev

    • Dev-environment focused, includes Banana deployment pipeline
    • NVIDIA T4G 16GB for $0.20 / hr (“g5g.xlarge”)
  • LambdaLabs.com

    • Cheapest A100 pricing at $1.10 / hr
    • Forum guide to get started in 5 copy-paste commands.
  • RunPod.io (referral link)

    • Both hourly server and serverless offerings
    • Signing up with the referral link above directly supports docker-diffusers-api development!
    • Forum guide for a quick start.
  • Vast.ai (referral link)

    • $0.292 / hr for an RTX 3090 24GB! (price adjusts based on demand)
    • Signing up with the referral link above directly supports docker-diffusers-api development!

What else have you used, and what are your experiences so far?

I used to use runpod.io back when Disco Diffusion was all the rage. Great experience, but after checking a moment ago it seems their prices have gone up. I swear they used to have the cheapest A100, but Lambda has them beat now.

I’ve been using LambdaLabs for fine-tuning Dance Diffusion lately and it’s a smooth experience.

I’ve heard of brev.dev (I think on this forum, actually) and I’ve been wanting to check them out. Maybe I’ll try them this weekend.

Thanks, @coffeeorgreentea, this is great feedback and I think it will really help others too! :ok_hand:

Here is another serverless provider:

  • $0.0005833/second ($2.10/hr) on an A100
  • $30 free credit

Oh great, thanks! Any experience with it?

Inference takes about the same time. You can store your HF key on their server; the rest is about the same. They have documentation with a fair amount of examples, but it’s still very immature. Their Slack channel is open to the public, so you can get a direct response from the team.

Can we have a Replicate.com implementation as well? :sweat_smile:

Lately Banana has been having a lot of outage issues, and RunPod’s cold start is around ~16 seconds with inference around ~20 seconds, so despite being cheaper per second, RunPod ends up costing more due to the slow speed. The other, hourly options aren’t suitable for a production app since traffic is mostly inconsistent.
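
To make the cost comparison concrete, here’s a quick back-of-envelope sketch: serverless billing is roughly (cold start + inference) × per-second rate, so a cheaper rate can still cost more per request when it’s slower. The per-second rates and the 5s cold start below are made-up placeholders (only the ~16s/~20s latencies and Banana’s $0.00051992/s figure come from this thread), so plug in current pricing before drawing conclusions:

```python
# Effective per-request cost on a serverless GPU provider, assuming
# you're billed for the whole duration (cold start + inference).
def cost_per_request(rate_per_sec: float, cold_start_s: float, inference_s: float) -> float:
    return rate_per_sec * (cold_start_s + inference_s)

# RunPod-style request with the ~16s cold start / ~20s inference quoted above;
# the $0.00044/s rate is a placeholder, not a real quote.
print(f"${cost_per_request(0.00044, 16, 20):.4f}")    # ≈ $0.0158 per request

# Banana's posted $0.00051992/s rate, with a hypothetical ~5s cold start.
print(f"${cost_per_request(0.00051992, 5, 20):.4f}")  # ≈ $0.0130 per request
```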

Replicate seems like a stable option, and their NVIDIA T4 has an excellent price for its speed.

Any thoughts?

Hey @Raj_Dhakad, welcome to the forums! Sorry for the delay while I’ve been travelling, and thanks for that interesting feedback on the other providers.

Yeah, I think we could adapt docker-diffusers-api to Replicate too. We’d probably need to bypass a lot of the Replicate-specific features: no cog file, and probably no typed arguments (we know all the types we take for callInputs, but the modelInputs are for the most part not touched at all and passed straight to the appropriate diffusers pipeline). Instead, we’d just take a long JSON string with everything, which should be OK.
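
For illustration, here’s a minimal sketch of what that “single JSON string” interface could look like as a cog predictor, if we keep a thin cog layer rather than dropping it entirely. This is not docker-diffusers-api’s actual code; the model ID, the `all_inputs` name, and the output path are all assumptions:

```python
import json

import torch
from cog import BasePredictor, Input, Path
from diffusers import StableDiffusionPipeline


class Predictor(BasePredictor):
    def setup(self):
        # Load the pipeline once at container start (hypothetical model choice).
        self.pipeline = StableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
        ).to("cuda")

    def predict(
        self,
        all_inputs: str = Input(description="JSON string with callInputs and modelInputs"),
    ) -> Path:
        inputs = json.loads(all_inputs)
        call_inputs = inputs.get("callInputs", {})    # typed options we know about
        model_inputs = inputs.get("modelInputs", {})  # passed straight to the pipeline
        # call_inputs would drive pipeline/scheduler selection here; omitted for brevity.
        image = self.pipeline(**model_inputs).images[0]
        out = "/tmp/output.png"
        image.save(out)
        return Path(out)
```

The client would then send something like `json.dumps({"callInputs": {...}, "modelInputs": {"prompt": "..."}})` as the one argument.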

So, the short answer is that this is very possible, but unfortunately I’m a bit pressed for time at the moment and can’t say when I’ll have a chance to work on it. I’ve added it to my list, though, and will keep you posted here. If it’s something you’d like to try on your own, I’ll be around for guidance. I’ve opened a new topic, [WIP] Can we get this working on Replicate?, where we can discuss further :pray:

Hi @gadicc, I’m trying to get this working on Replicate, but the cog file is having issues with docker-diffusers-api. Thanks for the suggestion that we need to bypass those features. I’m trying both Replicate and pipeline.ai and will add details if either works. :slightly_smiling_face:
