# Deploy AI Models on DigitalOcean

Complete guide to deploying AI models on DigitalOcean with App Platform, Droplets, and Kubernetes (DOKS).
DigitalOcean provides simple, cost-effective infrastructure for deploying AI models.
## Prerequisites
- DigitalOcean account
- doctl CLI installed
- Basic Kubernetes knowledge
- Docker installed
## Deployment Options

### 1. App Platform

The simplest option: define the app in a spec file and App Platform builds and runs it from your repository.
```yaml
# .do/app.yaml
name: ai-model-app
services:
  - name: api
    github:
      repo: your-username/ai-model
      branch: main
    build_command: pip install -r requirements.txt
    run_command: uvicorn app:app --host 0.0.0.0 --port 8080
    envs:
      - key: HUGGING_FACE_TOKEN
        scope: RUN_TIME
        type: SECRET
    instance_count: 2
    instance_size_slug: professional-s
```
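The spec can be deployed with `doctl apps create --spec .do/app.yaml`. At run time, App Platform injects `SECRET`-scoped envs like `HUGGING_FACE_TOKEN` into the container, so the app reads them like any other environment variable. A minimal sketch of that pattern (the helper name is ours, not part of any API):

```python
import os

def get_hf_token() -> str:
    # App Platform injects RUN_TIME/SECRET envs into the container;
    # the key matches the one declared in .do/app.yaml.
    token = os.environ.get("HUGGING_FACE_TOKEN")
    if not token:
        raise RuntimeError("HUGGING_FACE_TOKEN is not set")
    return token
```

Failing fast at startup when the secret is missing beats a cryptic download error later.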
### 2. Droplets (VMs)

For full control over the environment:
```bash
# Create a droplet
doctl compute droplet create ai-model \
  --image ubuntu-22-04-x64 \
  --size g-2vcpu-8gb \
  --region nyc1 \
  --ssh-keys your-ssh-key-id

# SSH into the droplet
doctl compute ssh ai-model

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Run the model (vLLM reads HUGGING_FACE_HUB_TOKEN for gated models)
docker run -d -p 8000:8000 \
  -e HUGGING_FACE_HUB_TOKEN=your-token \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.1-8B-Instruct
```
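vLLM serves an OpenAI-compatible API on port 8000, so once the container is up the droplet can be queried with plain HTTP. A hedged sketch using only the standard library (the IP address is a placeholder for your droplet's public IP):

```python
import json
import urllib.request

def build_chat_request(base_url: str, prompt: str,
                       model: str = "meta-llama/Llama-3.1-8B-Instruct") -> urllib.request.Request:
    # vLLM's OpenAI-compatible server exposes /v1/chat/completions.
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

# Placeholder IP -- substitute your droplet's address:
# req = build_chat_request("http://203.0.113.10:8000", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```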
### 3. Kubernetes (DOKS)

For production workloads:
```bash
# Create a cluster (nodes sized to fit the 8Gi pod requests in the manifest)
doctl kubernetes cluster create ai-cluster \
  --region nyc1 \
  --node-pool "name=worker-pool;size=s-8vcpu-16gb;count=3"

# Save the kubeconfig locally
doctl kubernetes cluster kubeconfig save ai-cluster

# Deploy the manifests
kubectl apply -f deployment.yaml
```
## Kubernetes Deployment

The `deployment.yaml` referenced above pairs a Deployment with a LoadBalancer Service:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-model
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ai-model
  template:
    metadata:
      labels:
        app: ai-model
    spec:
      containers:
        - name: model
          image: vllm/vllm-openai:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              memory: "8Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "2"
---
apiVersion: v1
kind: Service
metadata:
  name: ai-model-service
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8000
  selector:
    app: ai-model
```
## Spaces (Object Storage)

Store model weights in Spaces (S3-compatible object storage) and pull them at startup:
```python
import boto3

# Spaces is S3-compatible; point boto3 at the regional endpoint
s3 = boto3.client(
    "s3",
    endpoint_url="https://nyc3.digitaloceanspaces.com",
    aws_access_key_id="your-key",
    aws_secret_access_key="your-secret",
)

# Download the model
s3.download_file("my-space", "models/llama.bin", "/app/model.bin")
```
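Model binaries are large and rarely change, so it is worth skipping the download when a local copy already exists. A minimal caching wrapper around the client above (the helper name is ours):

```python
import os

def ensure_model(s3, space: str, key: str, local_path: str) -> str:
    # Only hit Spaces when the file is missing locally; this avoids
    # repeated egress and slow startups on container restarts.
    if not os.path.exists(local_path):
        s3.download_file(space, key, local_path)
    return local_path
```

Taking the client as a parameter also makes the helper easy to exercise with a stub in tests.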
## Monitoring

### Built-in Monitoring

Droplets ship with metrics for:

- CPU usage
- Memory usage
- Disk I/O
- Network traffic
### Custom Alerts

```bash
doctl monitoring alert create \
  --type v1/insights/droplet/cpu \
  --compare GreaterThan \
  --value 80 \
  --window 5m \
  --entities droplet-id
```
## Cost Optimization

- Right-size droplets against observed load rather than guessing
- Enable cluster autoscaling on DOKS
- Use Spaces for model storage
- Cache downloaded model weights instead of re-pulling them on every restart
- Use reserved instances for predictable workloads
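When right-sizing, a back-of-the-envelope monthly estimate helps compare configurations. DigitalOcean bills hourly up to a monthly cap; the sketch below uses a placeholder rate, not current DigitalOcean pricing:

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(hourly_rate: float, count: int = 1) -> float:
    # Rough estimate: rate x hours x instance count; check current
    # pricing before relying on any concrete number.
    return round(hourly_rate * HOURS_PER_MONTH * count, 2)

# Hypothetical $0.012/hr node, 3-node pool:
# monthly_cost(0.012, count=3)
```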
## Backup Strategy

```bash
# Snapshot the droplet
doctl compute droplet-action snapshot ai-model --snapshot-name ai-model-backup-$(date +%Y%m%d)

# Snapshot attached block storage volumes
doctl compute volume-action snapshot volume-id --snapshot-name volume-backup-$(date +%Y%m%d)
```
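The same `date +%Y%m%d` naming convention can be reproduced when driving snapshots from a script, e.g. via the API or a cron wrapper. A minimal sketch (the helper name is ours):

```python
from datetime import date

def snapshot_name(prefix: str) -> str:
    # Mirrors the ai-model-backup-$(date +%Y%m%d) pattern above,
    # so shell- and script-driven backups sort and prune consistently.
    return f"{prefix}-{date.today():%Y%m%d}"
```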
## Production Checklist
- [ ] Set up load balancer
- [ ] Configure firewall rules
- [ ] Add monitoring and alerts
- [ ] Set up automated backups
- [ ] Configure auto-scaling
- [ ] Add custom domain
- [ ] Enable SSL/TLS
- [ ] Set up logging
- [ ] Implement rate limiting
- [ ] Document deployment
## Related Guides

- **Deploy AI Models on AWS**: Complete guide to deploying open-source AI models on Amazon Web Services
- **Deploy AI Models on Google Cloud Platform**: Complete guide to deploying open-source AI models on GCP
- **Deploy AI Models on Microsoft Azure**: Complete guide to deploying open-source AI models on Azure
- **Deploy AI Models with Docker**: Complete guide to containerizing and deploying AI models with Docker