On-prem S3-compatible Storage

Motivation

Particularly with workloads such as DreamBooth training, we lose a lot of time uploading the trained model to external storage, only to download it back again later. This hurts user experience, inflates our bills (we pay for GPU seconds spent purely on upload/download time), and is presumably also a waste of Banana's bandwidth.

Proposal

An on-prem MinIO instance. I feel like this is low-hanging fruit, as deployment is so easy (Banana is already running Kubernetes).

It could be firewalled for local-net-only access as the easiest way to prevent abuse and high costs.

Stage 1

Private test with select customers, who are manually provided with API credentials.

Stage 2

Credentials are provisioned automatically and exposed to containers as build/env vars, e.g.

BANANA_S3_ACCESS_KEY_ID
BANANA_S3_SECRET_ACCESS_KEY
BANANA_S3_ENDPOINT_URL

and are then available for use with no further setup.
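
As a rough sketch of what "no further setup" could look like from inside a container, the snippet below maps the three proposed variables onto the arguments that boto3's S3 client accepts. Only the variable names come from this proposal. The helper names `s3_client_kwargs` and `upload_model` are hypothetical, and using boto3 is an assumption (any S3-compatible client should work the same way).

```python
import os

# Variable names are taken from the proposal; everything else is illustrative.
REQUIRED_VARS = (
    "BANANA_S3_ACCESS_KEY_ID",
    "BANANA_S3_SECRET_ACCESS_KEY",
    "BANANA_S3_ENDPOINT_URL",
)


def s3_client_kwargs(env=os.environ):
    """Map the proposed BANANA_S3_* variables to boto3 client arguments."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": env["BANANA_S3_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["BANANA_S3_SECRET_ACCESS_KEY"],
        "endpoint_url": env["BANANA_S3_ENDPOINT_URL"],
    }


def upload_model(path, bucket, key):
    """Upload a trained model file to the on-prem store (hypothetical helper)."""
    # boto3 is a third-party package, imported lazily so that
    # s3_client_kwargs() stays usable without it installed.
    import boto3

    client = boto3.client("s3", **s3_client_kwargs())
    client.upload_file(path, bucket, key)
```

A training script would then call something like `upload_model("model.ckpt", "my-bucket", "runs/model.ckpt")`, where the file, bucket, and key names are placeholders. Failing early when a variable is missing makes a misprovisioned container easier to diagnose than a later authentication error.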

Stage 3

Quotas and billing for excessive storage.