Configuration Guide
Configure OnglX Deploy projects for multi-cloud AI inference deployment using the global project management system.
Configuration Overview
OnglX Deploy uses global project management, with configurations stored in `~/.onglx-deploy/projects/`. Each project gets its own directory with a `deploy.yaml` configuration file. This enables:
- Multi-cloud deployment: Deploy to AWS Bedrock (stable) or Google Vertex AI (beta)
- Cost optimization: Automatic provider selection based on cost analysis
- Global project management: Use the `--project` flag from any directory
- AI-first architecture: Designed for OpenAI-compatible inference APIs
- Cross-platform support: Windows, Linux, and macOS compatible
Configuration File Location
Project configurations are stored globally:
```text
~/.onglx-deploy/
├── .gitignore                 # Global gitignore
└── projects/
    ├── my-ai-api/
    │   ├── deploy.yaml        # Main configuration
    │   ├── state.json         # Deployment state (auto-generated)
    │   ├── terraform.tfvars   # Infrastructure variables (auto-generated)
    │   └── build/             # Build artifacts
    └── customer-support-ai/
        ├── deploy.yaml
        ├── state.json
        └── build/
```
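Because every project lives under one global directory, you can enumerate them with ordinary filesystem tools. A minimal sketch, assuming only the layout shown above (the helper below is illustrative, not part of the CLI):

```python
from pathlib import Path

PROJECTS_DIR = Path.home() / ".onglx-deploy" / "projects"

def list_projects() -> list[str]:
    """Return the name of every project directory that has a deploy.yaml."""
    if not PROJECTS_DIR.is_dir():
        return []
    return sorted(
        p.name for p in PROJECTS_DIR.iterdir()
        if (p / "deploy.yaml").is_file()
    )

print(list_projects())  # e.g. ['customer-support-ai', 'my-ai-api']
```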
Basic Configuration Structure
Minimal AI Inference Configuration
When you run `onglx-deploy init`, you get a minimal multi-cloud AI configuration:
```yaml
version: "1"
name: "my-ai-api"
region: "us-east-1"
inference:
  enabled: true
  runtime: "python3.12"
  memory: 512
  timeout: 60
  models:
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
```
Complete Multi-Cloud Configuration
Here's a comprehensive example with all available options:
```yaml
version: "1"
name: "enterprise-ai-api"
region: "us-east-1"
profile: "my-aws-profile"

# Multi-cloud inference configuration
inference:
  enabled: true
  runtime: "python3.12"
  memory: 1024
  timeout: 120

  models:
    # AWS Bedrock models
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    - name: "claude-3-haiku"
      provider: "anthropic"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
    - name: "titan-express"
      provider: "amazon"
      model_id: "amazon.titan-text-express-v1"

    # Google Vertex AI models (when deployed to GCP)
    - name: "gemini-pro"
      provider: "google"
      model_id: "gemini-pro"

  rate_limiting:
    requests_per_minute: 200
    tokens_per_minute: 500000

  cors:
    allow_origins:
      - "https://myapp.com"
      - "http://localhost:3000"
    allow_methods: ["POST", "OPTIONS"]
    allow_headers:
      - "Content-Type"
      - "Authorization"
    allow_credentials: false

# Multi-cloud settings
cloud:
  primary_provider: "auto"        # or "aws", "gcp"
  enable_cost_optimization: true
  failover_enabled: false
```
Configuration Fields
Project Settings
```yaml
version: "1"                   # Configuration version (required)
name: "my-ai-api"              # Project name (required, lowercase, hyphens allowed)
region: "us-east-1"            # Primary deployment region (required)
profile: "my-aws-profile"      # AWS profile name (optional)
gcp_project: "my-gcp-project"  # GCP project ID (optional)
```
Naming Rules (a quick check is sketched after this list):
- Project name must be lowercase
- Hyphens allowed, no spaces or special characters
- Must be unique within your AWS account
- Used for naming AWS resources
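These rules reduce to a simple pattern check. A minimal sketch that mirrors only the rules listed above (the exact pattern the CLI enforces is an assumption):

```python
import re

# Lowercase letters and digits, hyphen-separated; no spaces or special characters.
# Assumed from the naming rules above -- not the CLI's actual validator.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_valid_project_name(name: str) -> bool:
    return bool(NAME_PATTERN.match(name))

assert is_valid_project_name("mycompany-customer-portal")
assert not is_valid_project_name("My App")   # uppercase and space
assert not is_valid_project_name("app_v2")   # underscore not allowed
```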
Inference Configuration
```yaml
inference:
  enabled: true          # Enable inference domain (required)
  runtime: "python3.12"  # Python runtime (python3.12 recommended)
  memory: 1024           # Memory in MB (128-3008 AWS, 128-8192 GCP)
  timeout: 120           # Timeout in seconds (1-900 AWS, 1-3600 GCP)
```
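The valid ranges differ per provider, so a value that deploys on GCP can be rejected on AWS. A minimal bounds check, assuming only the ranges quoted in the comments above:

```python
# Per-provider limits taken from the comments above (memory in MB, timeout in seconds).
LIMITS = {
    "aws": {"memory": (128, 3008), "timeout": (1, 900)},
    "gcp": {"memory": (128, 8192), "timeout": (1, 3600)},
}

def check_inference_limits(provider: str, memory: int, timeout: int) -> list[str]:
    """Return a list of human-readable range violations (empty if valid)."""
    errors = []
    mem_lo, mem_hi = LIMITS[provider]["memory"]
    t_lo, t_hi = LIMITS[provider]["timeout"]
    if not mem_lo <= memory <= mem_hi:
        errors.append(f"memory {memory} MB outside {mem_lo}-{mem_hi} for {provider}")
    if not t_lo <= timeout <= t_hi:
        errors.append(f"timeout {timeout}s outside {t_lo}-{t_hi} for {provider}")
    return errors

print(check_inference_limits("aws", 4096, 120))  # memory too high for AWS
```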
Model Configuration
```yaml
inference:
  models:
    # AWS Bedrock models
    - name: "claude-3.5-sonnet"  # API name
      provider: "anthropic"      # Model provider
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"  # Bedrock model ID
      default: true              # Default model
    - name: "titan-express"
      provider: "amazon"
      model_id: "amazon.titan-text-express-v1"

    # Google Vertex AI models (when using the GCP provider)
    - name: "gemini-pro"
      provider: "google"
      model_id: "gemini-pro"
    - name: "gemini-flash"
      provider: "google"
      model_id: "gemini-1.5-flash"
```
Rate Limiting Configuration
```yaml
inference:
  rate_limiting:
    requests_per_minute: 100   # Max requests per minute
    tokens_per_minute: 100000  # Max tokens per minute (optional)
    burst_limit: 10            # Allow burst requests
```
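Conceptually, `requests_per_minute` plus `burst_limit` behave like a token bucket: the bucket refills at the per-minute rate, and `burst_limit` caps how many requests can be spent at once. A rough illustration of those semantics (an explanatory model, not OnglX Deploy's actual implementation):

```python
import time

class TokenBucket:
    """Token-bucket model of requests_per_minute + burst_limit."""

    def __init__(self, requests_per_minute: int, burst_limit: int):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = burst_limit             # max tokens held at once
        self.tokens = float(burst_limit)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would respond with HTTP 429

bucket = TokenBucket(requests_per_minute=100, burst_limit=10)
print(bucket.allow())  # True until the burst is spent
```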
Inference Configuration Examples
✅ Now Available: Deploy OpenAI-compatible AI APIs using AWS Bedrock with 40-85% cost savings!
Deploy AI inference APIs with OpenAI compatibility:
```yaml
inference:
  enabled: true          # Enable inference domain (required)
  runtime: "python3.12"  # Python runtime (python3.12 recommended)
  memory: 512            # Lambda memory in MB (128-3008)
  timeout: 60            # Lambda timeout in seconds (1-900)

  models:                # Available models
    - name: "claude-3.5-sonnet"  # Model identifier
      provider: "anthropic"      # Model provider
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"  # Bedrock model ID
      default: true              # Default model for requests

    - name: "claude-3-haiku"
      provider: "anthropic"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
      default: false

  rate_limiting:               # Optional rate limiting
    requests_per_minute: 100   # Max requests per minute
    tokens_per_minute: 100000  # Max tokens per minute (optional)

  cors:                        # CORS configuration
    allow_origins:
      - "https://myapp.com"        # Allowed origins
      - "http://localhost:3000"    # Development origin
    allow_methods: ["POST", "OPTIONS"]  # Allowed HTTP methods
    allow_headers:
      - "Content-Type"
      - "Authorization"
    allow_credentials: false     # Allow credentials
```
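Because the deployed API is OpenAI-compatible, any OpenAI client should work once it is pointed at your endpoint. A sketch using the official `openai` Python package; the base URL and API key are placeholders, so substitute whatever your deployment reports:

```python
from openai import OpenAI

# Placeholders: use the endpoint URL and credentials from your deployment output.
client = OpenAI(
    base_url="https://YOUR-API-ENDPOINT/v1",
    api_key="YOUR-API-KEY",
)

response = client.chat.completions.create(
    model="claude-3.5-sonnet",  # the `name` field from your models list
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```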
Quick Example - Multi-Cloud AI API:
```yaml
version: "1"
name: "my-ai-api"
region: "us-east-1"
inference:
  enabled: true
  models:
    - name: "claude-3.5-sonnet"  # AWS Bedrock
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    - name: "gemini-pro"         # Google Vertex AI
      provider: "google"
      model_id: "gemini-pro"
cloud:
  primary_provider: "auto"       # Auto-select best provider
```
Advanced Example - Multi-Model with Rate Limiting:
```yaml
version: "1"
name: "ai-platform"
region: "us-east-1"
inference:
  enabled: true
  runtime: "python3.12"
  memory: 1024
  timeout: 120
  models:
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    - name: "claude-3-haiku"
      provider: "anthropic"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
    - name: "titan-express"
      provider: "amazon"
      model_id: "amazon.titan-text-express-v1"
  rate_limiting:
    requests_per_minute: 200
    tokens_per_minute: 500000
  cors:
    allow_origins: ["https://app.mycompany.com"]
    allow_methods: ["POST", "OPTIONS"]
    allow_headers: ["Content-Type", "Authorization", "User-Agent"]
```
Multi-Cloud Provider Configuration
Deploy to multiple cloud providers with failover:
```yaml
version: "1"
name: "enterprise-ai-api"
region: "us-east-1"

# Multi-cloud inference API
inference:
  enabled: true
  models:
    # Primary: AWS Bedrock
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    # Backup: Google Vertex AI
    - name: "gemini-pro"
      provider: "google"
      model_id: "gemini-pro"

# Cloud configuration
cloud:
  primary_provider: "aws"    # Primary deployment target
  backup_providers: ["gcp"]  # Failover providers
  failover_enabled: true     # Enable automatic failover
  cost_optimization: true    # Auto-switch for cost savings
```
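The failover semantics amount to: send the request to the primary provider, and on failure walk the backup list in order. A simplified sketch of that control flow, with stub functions standing in for the real Bedrock and Vertex AI calls (illustrative only; the deployed system also applies the health-check thresholds described under Multi-Cloud Settings):

```python
from typing import Callable

def bedrock_call(prompt: str) -> str:
    # Stand-in for a real AWS Bedrock request.
    return f"[aws] {prompt}"

def vertex_call(prompt: str) -> str:
    # Stand-in for a real Google Vertex AI request.
    return f"[gcp] {prompt}"

BACKENDS: dict[str, Callable[[str], str]] = {"aws": bedrock_call, "gcp": vertex_call}

def complete_with_failover(prompt: str, providers: list[str]) -> str:
    """Try each provider in order; raise only if every one fails."""
    errors: dict[str, Exception] = {}
    for provider in providers:  # e.g. ["aws", "gcp"]
        try:
            return BACKENDS[provider](prompt)
        except Exception as exc:
            errors[provider] = exc
    raise RuntimeError(f"all providers failed: {errors}")

print(complete_with_failover("Hello!", ["aws", "gcp"]))
```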
CORS Configuration
```yaml
inference:
  cors:
    allow_origins:               # Allowed origins
      - "https://myapp.com"
      - "https://www.myapp.com"
      - "http://localhost:3000"  # Development
    allow_methods:               # HTTP methods
      - "POST"
      - "OPTIONS"
    allow_headers:               # Headers
      - "Content-Type"
      - "Authorization"
      - "User-Agent"
    allow_credentials: false     # Credentials support
```
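At request time these settings reduce to a simple check: if the request's `Origin` header matches `allow_origins`, the CORS headers are echoed back; otherwise the browser blocks the cross-origin response. A stripped-down model of that logic (illustrative, not the deployed handler):

```python
ALLOW_ORIGINS = {"https://myapp.com", "https://www.myapp.com", "http://localhost:3000"}

def cors_headers(origin: str | None) -> dict[str, str]:
    """Return the CORS response headers for a request's Origin, if allowed."""
    if origin not in ALLOW_ORIGINS:
        return {}  # no CORS headers -> the browser rejects the response
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
    }

print(cors_headers("http://localhost:3000"))  # headers echoed back
print(cors_headers("https://evil.example"))   # {}
```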
CLI Configuration Management
Project Initialization
Create a new AI API project:
```bash
# Interactive initialization (recommended)
onglx-deploy init
# Choose: project name, enable inference, select models

# Initialize with a specific region
onglx-deploy --region eu-west-1 init

# List all configured projects
onglx-deploy projects list

# Compare cloud providers before deploying
onglx-deploy --project my-app cloud compare
```
Validation
Validate project configurations:
```bash
# Validate a specific project configuration
onglx-deploy --project my-app config validate

# Validate with verbose output
onglx-deploy --verbose --project my-app config validate
```
Common validation errors (a minimal pre-flight sketch follows this list):
- Invalid AWS region
- Missing required fields
- Invalid memory/timeout values
- Malformed YAML syntax
- Invalid project name format
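Most of these can be caught before running the CLI. A minimal pre-flight sketch using PyYAML; the required-field list mirrors the Project Settings section of this guide, and the CLI's own validator remains authoritative:

```python
import yaml  # pip install pyyaml

REQUIRED_FIELDS = ("version", "name", "region")  # per Project Settings above

def preflight(path: str) -> list[str]:
    """Return a list of obvious configuration errors (empty if none found)."""
    try:
        with open(path) as fh:
            config = yaml.safe_load(fh)
    except yaml.YAMLError as exc:
        return [f"malformed YAML: {exc}"]
    return [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in (config or {})
    ]

print(preflight("deploy.yaml") or "looks OK; confirm with `onglx-deploy config validate`")
```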
Environment Management
Multiple Environments
Manage multiple environments with different cloud providers:
```bash
# Development on GCP (free credits)
onglx-deploy --project my-app-dev cloud deploy --provider gcp

# Staging on AWS
onglx-deploy --project my-app-staging cloud deploy --provider aws

# Production with multi-cloud failover
onglx-deploy --project my-app-prod cloud deploy --provider aws,gcp --failover

# Compare costs across environments
onglx-deploy --project my-app-dev cloud compare
onglx-deploy --project my-app-staging cloud compare
```
Environment-Specific Configuration
Create environment-specific AI API projects:
```bash
# Create projects for different environments
onglx-deploy init --project customer-ai-dev
onglx-deploy init --project customer-ai-staging
onglx-deploy init --project customer-ai-prod

# Deploy with different models per environment
# Dev: use a fast, cheap model
onglx-deploy --project customer-ai-dev deploy   # Uses Gemini Flash/Haiku
# Prod: use the best model
onglx-deploy --project customer-ai-prod deploy  # Uses Claude 3.5/Gemini Pro
```
Environment Variables
Set environment variables for deployment:
```bash
# Set environment variables before deployment
export NODE_ENV=production
export API_URL=https://api.myapp.com
export ENVIRONMENT=prod

# Deploy with environment variables
onglx-deploy --profile production deploy
```
AWS Resource Configuration
Function Resource Settings
```yaml
inference:
  runtime: "python3.12"  # Python runtime for AI inference
  memory: 1024           # Memory allocation (128-3008 AWS, 128-8192 GCP)
  timeout: 120           # Timeout in seconds (1-900 AWS, 1-3600 GCP)
  architecture: "arm64"  # CPU architecture (arm64 recommended)
```
Memory Guidelines for AI Inference:
- 512 MB: Simple text completion, short responses
- 1024 MB: Standard conversations, moderate complexity
- 2048 MB: Long conversations, complex reasoning
- 4096 MB+: Large context windows, document processing
Multi-Cloud Settings
Configure behavior across cloud providers:
```yaml
cloud:
  primary_provider: "gcp"    # Primary deployment target
  backup_providers: ["aws"]  # Failover sequence

  cost_optimization:
    enabled: true            # Enable automatic cost optimization
    check_interval: "daily"  # How often to check costs
    switch_threshold: 20     # Switch if a provider is 20% cheaper

  failover:
    enabled: true                # Enable automatic failover
    health_check_interval: "5m"  # Health check frequency
    failure_threshold: 3         # Failures before failover
```
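The `switch_threshold` decision is plain arithmetic: switch only when the candidate provider undercuts the current one by at least the threshold percentage. A small sketch of that rule (illustrative; real cost data would come from the providers' billing APIs):

```python
def should_switch(current_cost: float, candidate_cost: float,
                  threshold_pct: float = 20.0) -> bool:
    """Switch providers only if the candidate is at least threshold_pct cheaper."""
    savings_pct = (current_cost - candidate_cost) / current_cost * 100
    return savings_pct >= threshold_pct

# With switch_threshold: 20, a $100 -> $75 month (25% cheaper) triggers a switch,
# while $100 -> $90 (10% cheaper) does not.
print(should_switch(100.0, 75.0))  # True
print(should_switch(100.0, 90.0))  # False
```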
Configuration Best Practices
1. Environment Separation
Use different AWS profiles for different environments:
```bash
# Configure profiles
aws configure --profile onglx-dev
aws configure --profile onglx-staging
aws configure --profile onglx-prod

# Deploy to the appropriate environment
onglx-deploy --profile onglx-prod deploy
```
2. Secure Environment Variables
Never commit secrets to your configuration file:
```yaml
# ❌ Bad - secrets in config
compute:
  nextjs:
    environment:
      DATABASE_URL: "postgresql://user:password123@host/db"

# ✅ Good - use environment variables
compute:
  nextjs:
    environment:
      DATABASE_URL: "${DATABASE_URL}"
```
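The `${DATABASE_URL}` placeholder is resolved from the deploying shell's environment. Python's standard library models the same substitution, which is handy for checking values before deploying (a sketch of the concept; how OnglX Deploy itself expands placeholders is not specified here):

```python
import os

os.environ["DATABASE_URL"] = "postgresql://user:***@host/db"

# ${VAR} placeholders expand from the environment:
print(os.path.expandvars("${DATABASE_URL}"))  # postgresql://user:***@host/db

# Unset variables are left as-is, which makes misconfigurations easy to spot:
print(os.path.expandvars("${MISSING_SECRET}"))  # ${MISSING_SECRET}
```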
3. Descriptive Naming
Use descriptive project names:
```yaml
# ✅ Good - clear and descriptive
name: "mycompany-customer-portal"

# ❌ Avoid - too generic
name: "app"
```
4. Resource Right-Sizing
Start with conservative resources and scale up:
```yaml
compute:
  memory: 1024  # Start with 1 GB
  timeout: 30   # 30 seconds is usually sufficient
```
5. Build Command Optimization
Use the most efficient build command:
```yaml
compute:
  nextjs:
    auto_build: true  # CLI automatically optimizes the build process
# The CLI auto-detects the best build method (npm, pnpm, yarn)
```
Advanced Configuration
Custom Domains with SSL
```yaml
domain:
  name: "myapp.com"
  certificate_arn: "arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012"
  aliases:
    - "www.myapp.com"
    - "app.myapp.com"
  redirect_www: true               # Redirect www to apex domain
  security_policy: "TLSv1.2_2021"  # Minimum TLS version
```
Multi-Region Deployment
Deploy to multiple regions:
```bash
# Deploy to US East
onglx-deploy --region us-east-1 deploy

# Deploy to Europe
onglx-deploy --region eu-west-1 deploy

# Deploy to Asia Pacific
onglx-deploy --region ap-southeast-1 deploy
```
Storage Configuration
```yaml
storage:
  buckets:
    - name: "static-assets"
      public: true
      cdn: true
      versioning: true
      lifecycle:
        - rule: "cleanup-old-versions"
          days: 30
    - name: "user-uploads"
      public: false
      cdn: false
      encryption: true
```
Configuration Validation
Automatic Validation
OnglX Deploy validates your configuration automatically:
```bash
# Validation happens during deployment
onglx-deploy deploy

# Explicit validation
onglx-deploy config validate
```
Manual Validation
Validate specific aspects:
```bash
# Check AWS permissions
onglx-deploy auth

# Test the build process
onglx-deploy deploy --dry-run

# Test AWS connectivity
aws sts get-caller-identity --profile your-profile
```
Troubleshooting Configuration
Common Configuration Issues
Invalid YAML Syntax
```bash
# Validate the configuration with the CLI
onglx-deploy config validate
```
AWS Region Issues
```yaml
# Valid AWS regions
region: "us-east-1"       # ✅ Virginia
region: "us-west-2"       # ✅ Oregon
region: "eu-west-1"       # ✅ Ireland
region: "ap-southeast-1"  # ✅ Singapore
region: "us-east"         # ❌ Invalid
```
Memory/Timeout Limits
```yaml
compute:
  memory: 128  # Minimum; maximum is 10240 (10 GB)
  timeout: 1   # Minimum in seconds; maximum is 900 (15 minutes)
```
Debug Configuration
```bash
# View a project's configuration
cat ~/.onglx-deploy/projects/my-app/deploy.yaml

# Validate with verbose output
onglx-deploy --verbose --project my-app config validate

# List all projects
onglx-deploy projects list

# Test a deployment without applying it
onglx-deploy --project my-app deploy --dry-run
```
Configuration Examples
Simple Blog
```yaml
version: "1"
name: "my-blog"
region: "us-east-1"
compute:
  type: "nextjs"
  memory: 512
  nextjs:
    auto_build: true  # CLI handles the build automatically
    environment:
      NEXT_PUBLIC_SITE_URL: "https://myblog.com"
domain:
  name: "myblog.com"
```
E-commerce Application
```yaml
version: "1"
name: "ecommerce-app"
region: "us-east-1"
compute:
  type: "nextjs"
  memory: 2048
  timeout: 60
  nextjs:
    auto_build: true  # CLI optimizes the build process
    edge: true
    environment:
      NODE_ENV: "production"
      NEXT_PUBLIC_STRIPE_KEY: "${STRIPE_PUBLIC_KEY}"
      NEXT_PUBLIC_API_URL: "https://api.mystore.com"
      DATABASE_URL: "${DATABASE_URL}"
      STRIPE_SECRET_KEY: "${STRIPE_SECRET_KEY}"
domain:
  name: "mystore.com"
  aliases:
    - "www.mystore.com"
```
Enterprise Application
```yaml
version: "1"
name: "enterprise-dashboard"
region: "us-east-1"
profile: "company-production"
compute:
  type: "nextjs"
  memory: 4096
  timeout: 120
  nextjs:
    auto_build: true  # CLI optimizes the build process
    edge: true
    environment:
      NODE_ENV: "production"
      NEXT_PUBLIC_API_URL: "https://api.company.com"
      NEXT_PUBLIC_APP_VERSION: "2.1.0"
      DATABASE_URL: "${DATABASE_URL}"
      REDIS_URL: "${REDIS_URL}"
      JWT_SECRET: "${JWT_SECRET}"
domain:
  name: "dashboard.company.com"
  certificate_arn: "${SSL_CERTIFICATE_ARN}"
storage:
  buckets:
    - name: "user-documents"
      public: false
      cdn: false
      encryption: true
```
Cloud Provider Configuration
AWS Configuration
```yaml
aws:
  region: "us-east-1"
  profile: "default"

  # Bedrock-specific settings
  bedrock:
    region: "us-east-1"  # Bedrock region (may differ from the deployment region)
    models_enabled:      # Models you've requested access for
      - "anthropic.claude-3-5-sonnet-20241022-v2:0"
      - "amazon.titan-text-express-v1"
```
GCP Configuration
```yaml
gcp:
  project_id: "my-gcp-project"
  region: "us-central1"

  # Vertex AI settings
  vertex_ai:
    location: "us-central1"  # Vertex AI location
    models_enabled:          # Available models
      - "gemini-pro"
      - "gemini-1.5-flash"
```
Cost Optimization Configuration
```yaml
cost_optimization:
  enabled: true
  strategy: "cheapest_first"  # Options: cheapest_first, balanced, performance_first
  budget_alerts:
    monthly_limit: 100              # Alert if monthly cost exceeds this (USD)
    alert_thresholds: [50, 80, 95]  # Alert at these percentages of the limit
```
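The alert thresholds are percentages of `monthly_limit`, so with a $100 limit the alerts fire at $50, $80, and $95 of spend. A quick sketch of that calculation (illustrative only):

```python
def triggered_alerts(spend: float, monthly_limit: float,
                     thresholds: list[int]) -> list[int]:
    """Return the alert thresholds (in %) that the current spend has crossed."""
    spend_pct = spend / monthly_limit * 100
    return [t for t in thresholds if spend_pct >= t]

# monthly_limit: 100, alert_thresholds: [50, 80, 95]
print(triggered_alerts(82.0, 100.0, [50, 80, 95]))  # [50, 80]
```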
Next Steps
- Inference Guide - Deploy your AI API
- AWS Setup Guide - Configure AWS for deployment
- CLI Reference - Command line interface reference
- Security Best Practices - Secure your deployment
Need help with configuration? Use `onglx-deploy --project my-app config validate` to check your settings or `onglx-deploy --help` for CLI assistance.