Storage Backends
PutPlace supports multiple storage backends for file content. File metadata is always stored in MongoDB, while actual file content can be stored in different backends.
Overview
Available Backends
Local Filesystem - Store files on local disk
AWS S3 - Store files in Amazon S3
Storage Architecture
┌──────────────┐
│    Client    │
└──────┬───────┘
       │
       │ 1. Send metadata
       ▼
┌──────────────┐     ┌──────────────┐
│   PutPlace   │────▶│   MongoDB    │
│     API      │     │  (metadata)  │
└──────┬───────┘     └──────────────┘
       │
       │ 2. Upload content
       ▼
┌──────────────┐
│   Storage    │
│   Backend    │
└──────┬───────┘
       │
   ┌───┴────┐
   │        │
   ▼        ▼
┌──────┐  ┌────┐
│Local │  │ S3 │
└──────┘  └────┘
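The two-step flow in the diagram can be modelled in a few lines of Python. This is an illustrative sketch only, not PutPlace's actual code: the `StorageBackend` protocol, the `InMemoryBackend` class, and the `upload` function are hypothetical names, and a plain dictionary stands in for MongoDB.

```python
import hashlib
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical interface: anything that can store and fetch content by hash."""

    def store(self, sha256: str, content: bytes) -> None: ...
    def retrieve(self, sha256: str) -> bytes: ...


class InMemoryBackend:
    """Stand-in for the local or S3 backend, used only for illustration."""

    def __init__(self) -> None:
        self._blobs = {}

    def store(self, sha256: str, content: bytes) -> None:
        self._blobs[sha256] = content

    def retrieve(self, sha256: str) -> bytes:
        return self._blobs[sha256]


def upload(metadata_db: dict, backend: StorageBackend, name: str, content: bytes) -> str:
    """Record metadata first (step 1), then hand the content to the backend (step 2)."""
    digest = hashlib.sha256(content).hexdigest()
    metadata_db[digest] = {"name": name, "size": len(content)}  # step 1: metadata
    backend.store(digest, content)  # step 2: content
    return digest
```

Because the caller only depends on the small interface, swapping the local backend for S3 would not change the metadata step.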
Local Filesystem Storage
Overview
Store files on local disk. Best for:
Development and testing
Single-server deployments
Fast local access
No cloud dependencies
Configuration
Using ppserver.toml (recommended):
[storage]
backend = "local"
path = "/var/putplace/files"
Environment variables:
STORAGE_BACKEND=local
STORAGE_PATH=/var/putplace/files
In .env file (legacy):
# Storage Backend
STORAGE_BACKEND=local
STORAGE_PATH=/var/putplace/files
Directory Structure
Files are distributed across 256 subdirectories based on the first two characters of their SHA256 hash:
/var/putplace/files/
├── 00/
│   ├── 00a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6...
│   └── 00f9e8d7c6b5a4938271605948372615049382716050483726...
├── 01/
│   └── 01b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5...
├── 02/
│   └── ...
├── ...
└── ff/
    └── ffa1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4...
Why 256 subdirectories?
Prevents too many files from accumulating in a single directory
Keeps directory lookups and listings fast on most filesystems
Distributes files evenly (SHA256 output is effectively uniformly distributed)
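The layout above can be expressed as a small path helper. This is a minimal sketch, assuming the file name is the full hexadecimal SHA256 digest; `local_path_for` is a hypothetical name, not a PutPlace function.

```python
from pathlib import Path


def local_path_for(sha256_hex: str, storage_path: str = "/var/putplace/files") -> Path:
    """Return <storage_path>/<first two hex chars>/<full digest>."""
    digest = sha256_hex.lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("expected a 64-character hexadecimal SHA256 digest")
    return Path(storage_path) / digest[:2] / digest
```

Two hexadecimal characters give 16 × 16 = 256 possible subdirectory names, which is where the figure of 256 comes from.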
Setup
1. Create Storage Directory
# Create directory
sudo mkdir -p /var/putplace/files
# Set ownership
sudo chown $USER:$USER /var/putplace/files
# Set permissions
chmod 755 /var/putplace/files
2. Verify Permissions
# Test write access
touch /var/putplace/files/test
rm /var/putplace/files/test
# Check permissions
ls -ld /var/putplace/files
# Should show: drwxr-xr-x
3. Configure PutPlace
# In .env file
STORAGE_BACKEND=local
STORAGE_PATH=/var/putplace/files
4. Start Server
uvicorn putplace.main:app
You should see:
INFO: Initialized local storage backend at /var/putplace/files
Disk Space Management
Check Disk Usage
# Total storage used
du -sh /var/putplace/files
# Per-directory usage
du -h --max-depth=1 /var/putplace/files | sort -hr
# Number of files
find /var/putplace/files -type f | wc -l
Cleanup Old Files
# Find files older than 90 days
find /var/putplace/files -type f -mtime +90
# Delete files older than 90 days (BE CAREFUL!)
find /var/putplace/files -type f -mtime +90 -delete
Monitor Disk Space
Add to cron for daily monitoring:
#!/bin/bash
# /usr/local/bin/check-putplace-disk.sh
THRESHOLD=80 # Alert at 80% usage
# -P forces single-line POSIX output, so long device names cannot wrap the row
USAGE=$(df -P /var/putplace/files | awk 'NR==2 {print $5}' | tr -d '%')
if [ "$USAGE" -gt "$THRESHOLD" ]; then
echo "WARNING: PutPlace storage at ${USAGE}% usage"
# Send alert (e.g., email, Slack, etc.)
fi
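The same check can be done from Python without parsing `df` output. This is a sketch rather than part of PutPlace: `usage_percent` and `check_storage` are hypothetical helpers, and the percentage is computed as used/total, which can differ slightly from the figure `df` reports on filesystems with reserved blocks.

```python
import shutil
from typing import Optional


def usage_percent(path: str) -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def check_storage(path: str, threshold: float = 80.0) -> Optional[str]:
    """Return a warning message when usage exceeds the threshold, else None."""
    pct = usage_percent(path)
    if pct > threshold:
        return f"WARNING: PutPlace storage at {pct:.0f}% usage"
    return None
```

A caller would pass the storage path, for example `check_storage("/var/putplace/files")`, and forward any returned message to its alerting channel.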
Performance Considerations
Pros:
Very fast (no network latency)
Simple setup
No external dependencies
No usage costs
Cons:
Limited to single server
No built-in replication
Manual backup required
Disk space limited
Optimization tips:
Use SSD for storage path
Monitor disk I/O
Consider RAID for redundancy
Regular backups essential
AWS S3 Storage
Overview
Store files in Amazon S3. Best for:
Multi-server deployments
Cloud-native infrastructure
Scalability requirements
High durability needs (99.999999999%)
Configuration
Using ppserver.toml (recommended):
[storage]
backend = "s3"
s3_bucket_name = "my-putplace-bucket"
s3_region_name = "us-east-1"
s3_prefix = "files/"
[aws]
# Use IAM role (recommended) or AWS profile
profile = "putplace"
Environment variables:
STORAGE_BACKEND=s3
S3_BUCKET_NAME=my-putplace-bucket
S3_REGION_NAME=us-east-1
S3_PREFIX=files/
S3_STORAGE_CLASS=STANDARD
# Optional: AWS credentials (recommended to use IAM roles instead)
AWS_PROFILE=putplace
# OR
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
In .env file (legacy):
# Storage Backend
STORAGE_BACKEND=s3
S3_BUCKET_NAME=my-putplace-bucket
S3_REGION_NAME=us-east-1
S3_PREFIX=files/
S3_STORAGE_CLASS=STANDARD
# AWS Credentials (use IAM role if on EC2/ECS)
AWS_PROFILE=putplace
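To make the relationship between these variables concrete, here is a sketch of how a loader might read and validate them. It is not PutPlace's actual configuration code: `StorageSettings` and `load_storage_settings` are hypothetical names, and the defaults for the backend and the prefix, as well as the choice of which S3 variables are required, are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StorageSettings:
    backend: str
    path: Optional[str] = None
    s3_bucket_name: Optional[str] = None
    s3_region_name: Optional[str] = None
    s3_prefix: str = "files/"  # assumed default
    s3_storage_class: str = "STANDARD"


def load_storage_settings(env: Mapping[str, str] = os.environ) -> StorageSettings:
    """Read the storage variables shown above from a mapping such as os.environ."""
    backend = env.get("STORAGE_BACKEND", "local")  # assumed default
    if backend == "local":
        return StorageSettings(backend=backend, path=env.get("STORAGE_PATH"))
    if backend == "s3":
        # Assumption: bucket and region are mandatory for the S3 backend.
        missing = [k for k in ("S3_BUCKET_NAME", "S3_REGION_NAME") if not env.get(k)]
        if missing:
            raise ValueError("missing required settings: " + ", ".join(missing))
        return StorageSettings(
            backend=backend,
            s3_bucket_name=env["S3_BUCKET_NAME"],
            s3_region_name=env["S3_REGION_NAME"],
            s3_prefix=env.get("S3_PREFIX", "files/"),
            s3_storage_class=env.get("S3_STORAGE_CLASS", "STANDARD"),
        )
    raise ValueError(f"unknown STORAGE_BACKEND: {backend!r}")
```

Failing at startup on a missing bucket name is usually preferable to discovering the problem on the first upload.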
S3 Key Structure
Files are stored with keys following the same distribution pattern:
s3://my-putplace-bucket/
└── files/
    ├── 00/
    │   ├── 00a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6...
    │   └── 00f9e8d7c6b5a4938271605948372615049382716050483726...
    ├── 01/
    │   └── 01b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5...
    └── ...
Key format: {prefix}{sha256[:2]}/{sha256}
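The key format translates directly into code. The following is a minimal sketch; `s3_key_for` is a hypothetical helper name, and it hashes the content itself on the assumption that the digest in the key is the SHA256 of the file's bytes.

```python
import hashlib


def s3_key_for(content: bytes, prefix: str = "files/") -> str:
    """Build {prefix}{sha256[:2]}/{sha256} for the given content."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{prefix}{digest[:2]}/{digest}"
```

The prefix is concatenated as-is, so it needs its trailing slash (`files/`, not `files`) to produce the layout shown above.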
Setup
1. Create S3 Bucket
# Create bucket
aws s3 mb s3://my-putplace-bucket --region us-east-1
# Enable versioning (optional, for recovery)
aws s3api put-bucket-versioning \
--bucket my-putplace-bucket \
--versioning-configuration Status=Enabled
# Enable encryption
aws s3api put-bucket-encryption \
--bucket my-putplace-bucket \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {
"SSEAlgorithm": "AES256"
}
}]
}'
2. Configure IAM Policy
For EC2/ECS (using IAM role):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::my-putplace-bucket/files/*"
},
{
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::my-putplace-bucket",
"Condition": {
"StringLike": {
"s3:prefix": "files/*"
}
}
}
]
}
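If the bucket name or prefix differs from the example, the policy can be generated rather than edited by hand. This sketch reproduces the policy document above for arbitrary values; `build_s3_policy` is a hypothetical helper, not part of PutPlace or the AWS tooling.

```python
import json


def build_s3_policy(bucket: str, prefix: str = "files/") -> dict:
    """Return the policy shown above, scoped to the given bucket and key prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
        ],
    }


def policy_json(bucket: str, prefix: str = "files/") -> str:
    """Serialize the policy for use with the AWS console or CLI."""
    return json.dumps(build_s3_policy(bucket, prefix), indent=2)
```

Object-level actions are granted on the object ARN (`bucket/prefix*`), while `s3:ListBucket` is granted on the bucket ARN and narrowed with the `s3:prefix` condition.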
For IAM user:
Attach the same policy to the IAM user, then create access keys for that user.
3. Configure Credentials
Option A: IAM Role (Recommended for EC2/ECS)
No credential configuration is needed. PutPlace automatically obtains credentials from the instance metadata:
# In .env - no AWS credentials needed
STORAGE_BACKEND=s3
S3_BUCKET_NAME=my-putplace-bucket
S3_REGION_NAME=us-east-1
Option B: AWS Profile
# Configure AWS CLI
aws configure --profile putplace
# Enter access key, secret key, region
# In .env
STORAGE_BACKEND=s3
S3_BUCKET_NAME=my-putplace-bucket
S3_REGION_NAME=us-east-1
AWS_PROFILE=putplace
Option C: Environment Variables
# In .env
STORAGE_BACKEND=s3
S3_BUCKET_NAME=my-putplace-bucket
S3_REGION_NAME=us-east-1
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
See SECURITY.md for detailed credential setup.
4. Test Connection
# Test S3 access
aws s3 ls s3://my-putplace-bucket --profile putplace
# Test upload
echo "test" > /tmp/test.txt
aws s3 cp /tmp/test.txt s3://my-putplace-bucket/test.txt --profile putplace
# Test download
aws s3 cp s3://my-putplace-bucket/test.txt /tmp/test-download.txt --profile putplace
# Cleanup
aws s3 rm s3://my-putplace-bucket/test.txt --profile putplace
rm /tmp/test.txt /tmp/test-download.txt
5. Start Server
uvicorn putplace.main:app
You should see:
INFO: Initialized S3 storage backend: bucket=my-putplace-bucket, region=us-east-1
Storage Classes
S3 offers different storage classes for cost optimization:
STANDARD (Default)
S3_STORAGE_CLASS=STANDARD
Use case: Frequently accessed files
Durability: 99.999999999% (11 nines)
Availability: 99.99%
Cost: $0.023/GB/month (us-east-1)
STANDARD_IA (Infrequent Access)
S3_STORAGE_CLASS=STANDARD_IA
Use case: Accessed less than monthly
Durability: 99.999999999%
Availability: 99.9%
Cost: $0.0125/GB/month + retrieval fees
INTELLIGENT_TIERING
S3_STORAGE_CLASS=INTELLIGENT_TIERING
Use case: Unknown or changing access patterns
Automatically moves between tiers
Cost: $0.023/GB/month + monitoring fee
GLACIER
S3_STORAGE_CLASS=GLACIER
Use case: Archival, rarely accessed
Retrieval time: Minutes to hours
Cost: $0.004/GB/month + retrieval fees
Lifecycle Policies
Automatically transition files to cheaper storage classes:
# Create lifecycle policy
aws s3api put-bucket-lifecycle-configuration \
--bucket my-putplace-bucket \
--lifecycle-configuration '{
"Rules": [{
"ID": "TransitionToIA",
"Status": "Enabled",
"Filter": {"Prefix": "files/"},
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER"
}
]
}]
}'
Effect (transitions are based on object age, not on how often a file is accessed):
First 30 days after upload: STANDARD
After 30 days: STANDARD_IA
After 90 days: GLACIER
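The schedule can be written as a small lookup that maps an object's age to its expected storage class. This is an illustrative model of the rule above, not an AWS API; `storage_class_at` is a hypothetical helper, and because S3 applies lifecycle rules asynchronously, real transitions may lag the thresholds slightly.

```python
# (days since creation, target class), in ascending order, as in the rule above
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER")]


def storage_class_at(age_days: int, initial: str = "STANDARD") -> str:
    """Return the storage class an object is expected to be in at a given age."""
    current = initial
    for days, storage_class in TRANSITIONS:
        if age_days >= days:
            current = storage_class
    return current
```

For example, an object uploaded 45 days ago is expected to be in STANDARD_IA regardless of how often it has been read.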
Cost Estimation
Example: 1TB storage, 100GB monthly uploads
STANDARD:
Storage: 1000 GB × $0.023 = $23/month
PUT requests: 100,000 × $0.005/1000 = $0.50/month
Total: ~$23.50/month
STANDARD_IA (after 30 days):
Storage: 1000 GB × $0.0125 = $12.50/month
PUT requests: $0.50/month
Retrieval: Depends on access patterns
Total: ~$13-15/month
INTELLIGENT_TIERING:
Storage: Varies based on access
Monitoring: charged per object rather than per GB, at $0.0025 per 1,000 objects (for example, 1,000,000 objects × $0.0025/1000 = $2.50/month)
Total: ~$15-25/month
Use AWS Pricing Calculator for accurate estimates.
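The arithmetic above is easy to rerun for other volumes. The sketch below uses only the per-GB and per-request prices quoted in this section, which may be out of date, and deliberately leaves out retrieval and monitoring fees; `monthly_cost` is a hypothetical helper.

```python
# Prices as quoted in this section (USD, us-east-1); check current AWS pricing.
STORAGE_PRICE_PER_GB = {"STANDARD": 0.023, "STANDARD_IA": 0.0125}
PUT_PRICE_PER_1000 = 0.005


def monthly_cost(storage_gb: float, put_requests: int, storage_class: str = "STANDARD") -> float:
    """Storage plus PUT request cost per month, excluding retrieval and monitoring fees."""
    storage = storage_gb * STORAGE_PRICE_PER_GB[storage_class]
    requests = put_requests / 1000 * PUT_PRICE_PER_1000
    return round(storage + requests, 2)
```

With the example figures, `monthly_cost(1000, 100_000)` reproduces the STANDARD total, and passing `"STANDARD_IA"` gives the lower bound of the STANDARD_IA range before retrieval fees.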
Performance Considerations
Pros:
Highly scalable (unlimited storage)
Highly durable (99.999999999%)
No server maintenance
Multi-region replication available
Cost-effective for large datasets
Cons:
Network latency for uploads/downloads
Data transfer costs
API request costs
More complex setup
Optimization tips:
Use same region as PutPlace server
Enable S3 Transfer Acceleration for global access
Use a VPC gateway endpoint for S3 so traffic bypasses NAT gateways and their data processing charges
Monitor CloudWatch metrics
Use lifecycle policies for cost optimization
Monitoring
CloudWatch Metrics
# Monitor bucket size
aws cloudwatch get-metric-statistics \
--namespace AWS/S3 \
--metric-name BucketSizeBytes \
--dimensions Name=BucketName,Value=my-putplace-bucket Name=StorageType,Value=StandardStorage \
--start-time 2025-01-01T00:00:00Z \
--end-time 2025-01-15T00:00:00Z \
--period 86400 \
--statistics Average
# Monitor object count
aws cloudwatch get-metric-statistics \
--namespace AWS/S3 \
--metric-name NumberOfObjects \
--dimensions Name=BucketName,Value=my-putplace-bucket Name=StorageType,Value=AllStorageTypes \
--start-time 2025-01-01T00:00:00Z \
--end-time 2025-01-15T00:00:00Z \
--period 86400 \
--statistics Average
Cost Monitoring
# Enable bucket metrics
aws s3api put-bucket-metrics-configuration \
--bucket my-putplace-bucket \
--id PutPlaceMetrics \
--metrics-configuration '{
"Id": "PutPlaceMetrics",
"Filter": {
"Prefix": "files/"
}
}'
Switching Storage Backends
From Local to S3
1. Set up S3 (see above)
2. Migrate existing files
#!/bin/bash
# migrate-local-to-s3.sh
LOCAL_PATH="/var/putplace/files"
S3_BUCKET="my-putplace-bucket"
S3_PREFIX="files/"
# Sync files to S3
aws s3 sync "$LOCAL_PATH" "s3://$S3_BUCKET/$S3_PREFIX" \
--storage-class STANDARD \
--metadata sha256verified=true
# Verify file count
LOCAL_COUNT=$(find "$LOCAL_PATH" -type f | wc -l)
S3_COUNT=$(aws s3 ls "s3://$S3_BUCKET/$S3_PREFIX" --recursive | wc -l)
echo "Local files: $LOCAL_COUNT"
echo "S3 files: $S3_COUNT"
if [ "$LOCAL_COUNT" -eq "$S3_COUNT" ]; then
echo "✓ Migration complete!"
else
echo "✗ File count mismatch!"
exit 1
fi
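A matching file count does not prove the files are intact. Before migrating, it can be worth checking that every local file still hashes to its own name. This is a sketch under the assumption that each file name is the full SHA256 digest of its content; `sha256_of` and `find_corrupt_files` are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt_files(storage_path: str) -> list:
    """Return files whose name does not match the SHA256 of their content."""
    root = Path(storage_path)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and sha256_of(p) != p.name
    )
```

An empty result from `find_corrupt_files("/var/putplace/files")` means every stored file matches its name; any paths returned should be investigated before the sync.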
3. Update configuration
# In .env
# STORAGE_BACKEND=local # Comment out
STORAGE_BACKEND=s3
S3_BUCKET_NAME=my-putplace-bucket
S3_REGION_NAME=us-east-1
4. Restart server
sudo systemctl restart putplace
5. Verify
# Test upload
python ppclient.py /tmp/test.txt
# Check server logs
sudo journalctl -u putplace -f
6. Cleanup (optional)
After verifying S3 works:
# Backup local files first!
tar -czf /backup/putplace-files-$(date +%Y%m%d).tar.gz /var/putplace/files
# Remove local files
rm -rf /var/putplace/files/*
From S3 to Local
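Reverse the process above: sync the S3 prefix down to the local storage path, set STORAGE_BACKEND=local and STORAGE_PATH, then restart the server and verify.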
Hybrid Storage
Currently, PutPlace does not support hybrid storage (using both local and S3 simultaneously). You must choose one backend.
Workaround: Use multiple PutPlace instances with different storage backends.
Backup and Recovery
Local Filesystem Backup
#!/bin/bash
# backup-local-storage.sh
BACKUP_DIR="/backup/putplace"
STORAGE_PATH="/var/putplace/files"
DATE=$(date +%Y%m%d)
# Create incremental backup
rsync -av --link-dest="$BACKUP_DIR/latest" \
"$STORAGE_PATH/" \
"$BACKUP_DIR/$DATE/"
# Update latest symlink
ln -snf "$BACKUP_DIR/$DATE" "$BACKUP_DIR/latest"
# Remove backups older than 30 days (-mindepth 1 keeps find from matching $BACKUP_DIR itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} \;
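Directory modification times can change for reasons unrelated to the backup date, so a retention job may prefer to decide by the date in the directory name. The following sketch shows that variant; it is an alternative to the `find -mtime` approach, not what the script above does, and `expired_backups` is a hypothetical helper.

```python
from datetime import date, datetime, timedelta


def expired_backups(names, today: date, keep_days: int = 30) -> list:
    """Return YYYYMMDD directory names older than keep_days.

    Names that are not dates (such as the 'latest' symlink) are ignored.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        try:
            stamp = datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            continue  # not a dated backup directory
        if stamp < cutoff:
            expired.append(name)
    return sorted(expired)
```

The caller would list the backup directory, pass the entry names to this function, and remove only what it returns.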
S3 Backup
S3 is already highly durable. Additional backup options:
Cross-region replication:
# Enable replication to another region
aws s3api put-bucket-replication \
--bucket my-putplace-bucket \
--replication-configuration file://replication.json
Versioning:
# Enable versioning (keeps old versions)
aws s3api put-bucket-versioning \
--bucket my-putplace-bucket \
--versioning-configuration Status=Enabled
Backup to Glacier:
Use lifecycle policies to automatically archive to Glacier (see above).
Troubleshooting
Local Storage Issues
Permission denied:
# Fix ownership
sudo chown -R $USER:$USER /var/putplace/files
# Fix permissions
chmod -R 755 /var/putplace/files
Disk full:
# Check disk space
df -h /var/putplace/files
# Find large files
du -h --max-depth=1 /var/putplace/files | sort -hr | head -20
# Cleanup old files (carefully!)
find /var/putplace/files -type f -mtime +90 -delete
S3 Storage Issues
Connection timeout:
# Check network connectivity
curl -I https://s3.us-east-1.amazonaws.com
# Check AWS credentials
aws sts get-caller-identity --profile putplace
# Check S3 access
aws s3 ls s3://my-putplace-bucket --profile putplace
Access denied:
# Verify IAM policy
aws iam get-user-policy --user-name putplace-user --policy-name putplace-s3-access
# Test permissions
aws s3 cp /tmp/test.txt s3://my-putplace-bucket/test.txt --profile putplace
High costs:
# Check storage class
aws s3api head-object --bucket my-putplace-bucket --key files/00/00abc123...
# Check lifecycle policies
aws s3api get-bucket-lifecycle-configuration --bucket my-putplace-bucket
# Review CloudWatch metrics
# See monitoring section above
Next Steps
Configuration Reference - Storage configuration options
Security Guide - AWS credentials and security
Deployment Guide - Production deployment
Troubleshooting - Common issues