I want to minimize the cost involved in running a Qdrant vector database on Google Cloud.
Why Google Cloud Run for running a vector database?
Running on Google Cloud Run offers the following advantages:
- Autoscaling with load.
- We can set --min-instances to 0 so that the service stops running when there is no load.
The challenge is that storage on Cloud Run containers is ephemeral. So what can serve as common persistent storage for Qdrant?
Luckily, with the gen2 execution environment, Cloud Run supports mounting a Cloud Storage bucket as a volume, which is accessible across Cloud Run instances.
Here are the commands to create the storage bucket and to deploy or update the Cloud Run service.
# set the project ID, region and bucket name
export PROJECT_ID=project_xyz
export REGION=asia-southeast1
export BUCKET_NAME=bucket_name
# switch to the project
gcloud config set project $PROJECT_ID
# create a cloud storage bucket
gsutil mb -l $REGION gs://$BUCKET_NAME
# deploy the cloud run service
# (assumes QDRANT__SERVICE__API_KEY is already exported in the shell)
gcloud beta run deploy qdrant --image qdrant/qdrant:latest \
--region $REGION \
--project $PROJECT_ID \
--execution-environment gen2 \
--add-volume=name=qdrant_storage,type=cloud-storage,bucket=$BUCKET_NAME \
--add-volume-mount=volume=qdrant_storage,mount-path=/qdrant/storage \
--allow-unauthenticated \
--update-env-vars QDRANT__SERVICE__HTTP_PORT=8080,QDRANT__SERVICE__API_KEY=$QDRANT__SERVICE__API_KEY \
--max-instances 1 \
--memory 512Mi
# to update the cloud run service
gcloud beta run services update qdrant \
--project $PROJECT_ID \
--region asia-southeast1 \
--max-instances 1 \
--memory 512Mi
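
Once the service is deployed, a quick smoke test can confirm that Qdrant is reachable and that the API key is being used. The sketch below is not part of the original setup: the helper names qdrant_url and qdrant_list_collections are hypothetical, and it assumes PROJECT_ID, REGION and QDRANT__SERVICE__API_KEY are still exported in the current shell. It relies on Qdrant reading the key from the api-key request header and on its GET /collections endpoint.

```shell
# Print the HTTPS URL that Cloud Run assigned to the qdrant service.
qdrant_url() {
  gcloud run services describe qdrant \
    --project "$PROJECT_ID" \
    --region "$REGION" \
    --format 'value(status.url)'
}

# List collections; Qdrant expects the key in the "api-key" header.
# --fail makes curl exit non-zero on an HTTP error response.
qdrant_list_collections() {
  local url
  url="$(qdrant_url)" || return 1
  curl --silent --show-error --fail \
    --header "api-key: $QDRANT__SERVICE__API_KEY" \
    "$url/collections"
}
```

Calling qdrant_list_collections right after a fresh deploy should return a JSON response listing no collections. With min-instances at 0, the first request after an idle period will likely be slower because the container has to start and mount the bucket.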