Can we have a Replicate.com implementation as well?
Lately Banana has been having a lot of outages. Runpod's cold start is around 16 seconds and inference is around 20 seconds, so despite the cheaper rate, Runpod ends up costing more because it is slow. The other, hourly-billed options are not suitable for a production app, since traffic is mostly inconsistent.
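To illustrate the cost point, here is a rough sketch of the arithmetic. It assumes per-second billing that includes the cold start, and that a request hits a cold start (the worst case). The 16 s and 20 s figures are the Runpod timings above; both rates and the timings for the faster provider are made-up placeholders, not real prices or benchmarks.

```python
def cost_per_request(rate_per_second: float, inference_s: float,
                     cold_start_s: float = 0.0) -> float:
    """Cost of one request when cold start and inference are both billed per second."""
    return rate_per_second * (cold_start_s + inference_s)


# Runpod timings from above; the rate is a placeholder, not a real price.
slow_cheap = cost_per_request(rate_per_second=0.0002, inference_s=20, cold_start_s=16)

# Hypothetical faster provider at twice the rate; all three numbers are placeholders.
fast_pricier = cost_per_request(rate_per_second=0.0004, inference_s=8, cold_start_s=2)

print(f"slow but cheap rate: ${slow_cheap:.4f} per request")
print(f"fast but higher rate: ${fast_pricier:.4f} per request")
```

With these placeholder numbers, the provider with the higher per-second rate still comes out cheaper per request, because it bills far fewer seconds.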
Replicate seems like a stable option, and their NVIDIA T4 offers excellent speed for the price.
Any thoughts?