Configuration Guide
Configure OnglX Deploy projects for multi-cloud AI inference deployment using the global project management system.
Configuration Overview
OnglX Deploy uses global project management, with configurations stored in `~/.onglx-deploy/projects/`. Each project gets its own directory with a `deploy.yaml` configuration file. This enables:
- Multi-cloud deployment: Deploy to AWS Bedrock (stable) or Google Vertex AI (beta)
- Cost optimization: Automatic provider selection based on cost analysis
- Global project management: Use the `--project` flag from any directory
- AI-first architecture: Designed for OpenAI-compatible inference APIs
- Cross-platform support: Windows, Linux, and macOS compatible
Configuration File Location
Project configurations are stored globally:
```text
~/.onglx-deploy/
├── .gitignore                 # Global gitignore
└── projects/
    ├── my-ai-api/
    │   ├── deploy.yaml        # Main configuration
    │   ├── state.json         # Deployment state (auto-generated)
    │   ├── terraform.tfvars   # Infrastructure variables (auto-generated)
    │   └── build/             # Build artifacts
    └── customer-support-ai/
        ├── deploy.yaml
        ├── state.json
        └── build/
```
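Because every project lives under one global directory, you can enumerate them with ordinary filesystem tools. A minimal sketch, assuming only the layout shown above (the helper below is illustrative, not part of the CLI):

```python
from pathlib import Path

PROJECTS_DIR = Path.home() / ".onglx-deploy" / "projects"

def list_projects() -> list[str]:
    """Return the name of every project directory that has a deploy.yaml."""
    if not PROJECTS_DIR.is_dir():
        return []
    return sorted(
        p.name for p in PROJECTS_DIR.iterdir()
        if (p / "deploy.yaml").is_file()
    )

print(list_projects())  # e.g. ['customer-support-ai', 'my-ai-api']
```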
Basic Configuration Structure
Minimal AI Inference Configuration
When you run `onglx-deploy init`, you get a minimal multi-cloud AI configuration:
```yaml
version: "1"
name: "my-ai-api"
region: "us-east-1"
inference:
  enabled: true
  runtime: "python3.12"
  memory: 512
  timeout: 60
  models:
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
```
Complete Multi-Cloud Configuration
Here's a comprehensive example with all available options:
```yaml
version: "1"
name: "enterprise-ai-api"
region: "us-east-1"
profile: "my-aws-profile"

# Multi-cloud inference configuration
inference:
  enabled: true
  runtime: "python3.12"
  memory: 1024
  timeout: 120

  models:
    # AWS Bedrock models
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    - name: "claude-3-haiku"
      provider: "anthropic"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
    - name: "titan-express"
      provider: "amazon"
      model_id: "amazon.titan-text-express-v1"

    # Google Vertex AI models (when deployed to GCP)
    - name: "gemini-pro"
      provider: "google"
      model_id: "gemini-pro"

  rate_limiting:
    requests_per_minute: 200
    tokens_per_minute: 500000

  cors:
    allow_origins:
      - "https://myapp.com"
      - "http://localhost:3000"
    allow_methods: ["POST", "OPTIONS"]
    allow_headers:
      - "Content-Type"
      - "Authorization"
    allow_credentials: false

# Multi-cloud settings
cloud:
  primary_provider: "auto"        # or "aws", "gcp"
  enable_cost_optimization: true
  failover_enabled: false
```
Configuration Fields
Project Settings
```yaml
version: "1"                   # Configuration version (required)
name: "my-ai-api"              # Project name (required, lowercase, hyphens allowed)
region: "us-east-1"            # Primary deployment region (required)
profile: "my-aws-profile"      # AWS profile name (optional)
gcp_project: "my-gcp-project"  # GCP project ID (optional)
```
Naming Rules (a quick check is sketched after this list):
- Project name must be lowercase
- Hyphens allowed, no spaces or special characters
- Must be unique within your AWS account
- Used for naming AWS resources
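These rules reduce to a simple pattern check. A minimal sketch that mirrors only the rules listed above (the exact pattern the CLI enforces is an assumption):

```python
import re

# Lowercase letters and digits, hyphen-separated; no spaces or special characters.
# Assumed from the naming rules above -- not the CLI's actual validator.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_valid_project_name(name: str) -> bool:
    return bool(NAME_PATTERN.match(name))

assert is_valid_project_name("mycompany-customer-portal")
assert not is_valid_project_name("My App")   # uppercase and space
assert not is_valid_project_name("app_v2")   # underscore not allowed
```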
Inference Configuration
```yaml
inference:
  enabled: true          # Enable inference domain (required)
  runtime: "python3.12"  # Python runtime (python3.12 recommended)
  memory: 1024           # Memory in MB (128-3008 AWS, 128-8192 GCP)
  timeout: 120           # Timeout in seconds (1-900 AWS, 1-3600 GCP)
```
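The valid ranges differ per provider, so a value that deploys on GCP can be rejected on AWS. A minimal bounds check, assuming only the ranges quoted in the comments above:

```python
# Per-provider limits taken from the comments above (memory in MB, timeout in seconds).
LIMITS = {
    "aws": {"memory": (128, 3008), "timeout": (1, 900)},
    "gcp": {"memory": (128, 8192), "timeout": (1, 3600)},
}

def check_inference_limits(provider: str, memory: int, timeout: int) -> list[str]:
    """Return a list of human-readable range violations (empty if valid)."""
    errors = []
    mem_lo, mem_hi = LIMITS[provider]["memory"]
    t_lo, t_hi = LIMITS[provider]["timeout"]
    if not mem_lo <= memory <= mem_hi:
        errors.append(f"memory {memory} MB outside {mem_lo}-{mem_hi} for {provider}")
    if not t_lo <= timeout <= t_hi:
        errors.append(f"timeout {timeout}s outside {t_lo}-{t_hi} for {provider}")
    return errors

print(check_inference_limits("aws", 4096, 120))  # memory too high for AWS
```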
Model Configuration
```yaml
inference:
  models:
    # AWS Bedrock models
    - name: "claude-3.5-sonnet"  # API name
      provider: "anthropic"      # Model provider
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"  # Bedrock model ID
      default: true              # Default model
    - name: "titan-express"
      provider: "amazon"
      model_id: "amazon.titan-text-express-v1"

    # Google Vertex AI models (when using the GCP provider)
    - name: "gemini-pro"
      provider: "google"
      model_id: "gemini-pro"
    - name: "gemini-flash"
      provider: "google"
      model_id: "gemini-1.5-flash"
```
Rate Limiting Configuration
```yaml
inference:
  rate_limiting:
    requests_per_minute: 100   # Max requests per minute
    tokens_per_minute: 100000  # Max tokens per minute (optional)
    burst_limit: 10            # Allow burst requests
```
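Conceptually, `requests_per_minute` plus `burst_limit` behave like a token bucket: the bucket refills at the per-minute rate, and `burst_limit` caps how many requests can be spent at once. A rough illustration of those semantics (an explanatory model, not OnglX Deploy's actual implementation):

```python
import time

class TokenBucket:
    """Token-bucket model of requests_per_minute + burst_limit."""

    def __init__(self, requests_per_minute: int, burst_limit: int):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = burst_limit             # max tokens held at once
        self.tokens = float(burst_limit)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would respond with HTTP 429

bucket = TokenBucket(requests_per_minute=100, burst_limit=10)
print(bucket.allow())  # True until the burst is spent
```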
Inference Configuration Examples
✅ Now Available: Deploy OpenAI-compatible AI APIs using AWS Bedrock with 40-85% cost savings!
Deploy AI inference APIs with OpenAI compatibility:
```yaml
inference:
  enabled: true          # Enable inference domain (required)
  runtime: "python3.12"  # Python runtime (python3.12 recommended)
  memory: 512            # Lambda memory in MB (128-3008)
  timeout: 60            # Lambda timeout in seconds (1-900)

  models:                # Available models
    - name: "claude-3.5-sonnet"  # Model identifier
      provider: "anthropic"      # Model provider
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"  # Bedrock model ID
      default: true              # Default model for requests

    - name: "claude-3-haiku"
      provider: "anthropic"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
      default: false

  rate_limiting:               # Optional rate limiting
    requests_per_minute: 100   # Max requests per minute
    tokens_per_minute: 100000  # Max tokens per minute (optional)

  cors:                        # CORS configuration
    allow_origins:
      - "https://myapp.com"        # Allowed origins
      - "http://localhost:3000"    # Development origin
    allow_methods: ["POST", "OPTIONS"]  # Allowed HTTP methods
    allow_headers:
      - "Content-Type"
      - "Authorization"
    allow_credentials: false     # Allow credentials
```
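Because the deployed API is OpenAI-compatible, any OpenAI client should work once it is pointed at your endpoint. A sketch using the official `openai` Python package; the base URL and API key are placeholders, so substitute whatever your deployment reports:

```python
from openai import OpenAI

# Placeholders: use the endpoint URL and credentials from your deployment output.
client = OpenAI(
    base_url="https://YOUR-API-ENDPOINT/v1",
    api_key="YOUR-API-KEY",
)

response = client.chat.completions.create(
    model="claude-3.5-sonnet",  # the `name` field from your models list
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```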
Quick Example - Multi-Cloud AI API:
```yaml
version: "1"
name: "my-ai-api"
region: "us-east-1"
inference:
  enabled: true
  models:
    - name: "claude-3.5-sonnet"  # AWS Bedrock
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    - name: "gemini-pro"         # Google Vertex AI
      provider: "google"
      model_id: "gemini-pro"
cloud:
  primary_provider: "auto"       # Auto-select best provider
```
Advanced Example - Multi-Model with Rate Limiting:
```yaml
version: "1"
name: "ai-platform"
region: "us-east-1"
inference:
  enabled: true
  runtime: "python3.12"
  memory: 1024
  timeout: 120
  models:
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    - name: "claude-3-haiku"
      provider: "anthropic"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
    - name: "titan-express"
      provider: "amazon"
      model_id: "amazon.titan-text-express-v1"
  rate_limiting:
    requests_per_minute: 200
    tokens_per_minute: 500000
  cors:
    allow_origins: ["https://app.mycompany.com"]
    allow_methods: ["POST", "OPTIONS"]
    allow_headers: ["Content-Type", "Authorization", "User-Agent"]
```
Multi-Cloud Provider Configuration
Deploy to multiple cloud providers with failover:
```yaml
version: "1"
name: "enterprise-ai-api"
region: "us-east-1"

# Multi-cloud inference API
inference:
  enabled: true
  models:
    # Primary: AWS Bedrock
    - name: "claude-3.5-sonnet"
      provider: "anthropic"
      model_id: "anthropic.claude-3-5-sonnet-20241022-v2:0"
      default: true
    # Backup: Google Vertex AI
    - name: "gemini-pro"
      provider: "google"
      model_id: "gemini-pro"

# Cloud configuration
cloud:
  primary_provider: "aws"    # Primary deployment target
  backup_providers: ["gcp"]  # Failover providers
  failover_enabled: true     # Enable automatic failover
  cost_optimization: true    # Auto-switch for cost savings
```
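The failover semantics amount to: send the request to the primary provider, and on failure walk the backup list in order. A simplified sketch of that control flow, with stub functions standing in for the real Bedrock and Vertex AI calls (illustrative only; the deployed system also applies the health-check thresholds described under Multi-Cloud Settings):

```python
from typing import Callable

def bedrock_call(prompt: str) -> str:
    # Stand-in for a real AWS Bedrock request.
    return f"[aws] {prompt}"

def vertex_call(prompt: str) -> str:
    # Stand-in for a real Google Vertex AI request.
    return f"[gcp] {prompt}"

BACKENDS: dict[str, Callable[[str], str]] = {"aws": bedrock_call, "gcp": vertex_call}

def complete_with_failover(prompt: str, providers: list[str]) -> str:
    """Try each provider in order; raise only if every one fails."""
    errors: dict[str, Exception] = {}
    for provider in providers:  # e.g. ["aws", "gcp"]
        try:
            return BACKENDS[provider](prompt)
        except Exception as exc:
            errors[provider] = exc
    raise RuntimeError(f"all providers failed: {errors}")

print(complete_with_failover("Hello!", ["aws", "gcp"]))
```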
CORS Configuration
```yaml
inference:
  cors:
    allow_origins:               # Allowed origins
      - "https://myapp.com"
      - "https://www.myapp.com"
      - "http://localhost:3000"  # Development
    allow_methods:               # HTTP methods
      - "POST"
      - "OPTIONS"
    allow_headers:               # Headers
      - "Content-Type"
      - "Authorization"
      - "User-Agent"
    allow_credentials: false     # Credentials support
```
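At request time these settings reduce to a simple check: if the request's `Origin` header matches `allow_origins`, the CORS headers are echoed back; otherwise the browser blocks the cross-origin response. A stripped-down model of that logic (illustrative, not the deployed handler):

```python
ALLOW_ORIGINS = {"https://myapp.com", "https://www.myapp.com", "http://localhost:3000"}

def cors_headers(origin: str | None) -> dict[str, str]:
    """Return the CORS response headers for a request's Origin, if allowed."""
    if origin not in ALLOW_ORIGINS:
        return {}  # no CORS headers -> the browser rejects the response
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
    }

print(cors_headers("http://localhost:3000"))  # headers echoed back
print(cors_headers("https://evil.example"))   # {}
```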
CLI Configuration Management
Project Initialization
Create a new AI API project:
```bash
# Interactive initialization (recommended)
onglx-deploy init
# Choose: project name, enable inference, select models

# Initialize with a specific region
onglx-deploy --region eu-west-1 init

# List all configured projects
onglx-deploy projects list

# Compare cloud providers before deploying
onglx-deploy --project my-app cloud compare
```
Validation
Validate project configurations:
```bash
# Validate a specific project configuration
onglx-deploy --project my-app config validate

# Validate with verbose output
onglx-deploy --verbose --project my-app config validate
```
Common validation errors (a minimal pre-flight sketch follows this list):
- Invalid AWS region
- Missing required fields
- Invalid memory/timeout values
- Malformed YAML syntax
- Invalid project name format
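Most of these can be caught before running the CLI. A minimal pre-flight sketch using PyYAML; the required-field list mirrors the Project Settings section of this guide, and the CLI's own validator remains authoritative:

```python
import yaml  # pip install pyyaml

REQUIRED_FIELDS = ("version", "name", "region")  # per Project Settings above

def preflight(path: str) -> list[str]:
    """Return a list of obvious configuration errors (empty if none found)."""
    try:
        with open(path) as fh:
            config = yaml.safe_load(fh)
    except yaml.YAMLError as exc:
        return [f"malformed YAML: {exc}"]
    return [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in (config or {})
    ]

print(preflight("deploy.yaml") or "looks OK; confirm with `onglx-deploy config validate`")
```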
Environment Management
Multiple Environments
Manage multiple environments with different cloud providers:
```bash
# Development on GCP (free credits)
onglx-deploy --project my-app-dev cloud deploy --provider gcp

# Staging on AWS
onglx-deploy --project my-app-staging cloud deploy --provider aws

# Production with multi-cloud failover
onglx-deploy --project my-app-prod cloud deploy --provider aws,gcp --failover

# Compare costs across environments
onglx-deploy --project my-app-dev cloud compare
onglx-deploy --project my-app-staging cloud compare
```
Environment-Specific Configuration
Create environment-specific AI API projects:
```bash
# Create projects for different environments
onglx-deploy init --project customer-ai-dev
onglx-deploy init --project customer-ai-staging
onglx-deploy init --project customer-ai-prod

# Deploy with different models per environment
# Dev: use a fast, cheap model
onglx-deploy --project customer-ai-dev deploy   # Uses Gemini Flash/Haiku
# Prod: use the best model
onglx-deploy --project customer-ai-prod deploy  # Uses Claude 3.5/Gemini Pro
```
Environment Variables
Set environment variables for deployment:
```bash
# Set environment variables before deployment
export NODE_ENV=production
export API_URL=https://api.myapp.com
export ENVIRONMENT=prod

# Deploy with environment variables
onglx-deploy --profile production deploy
```
AWS Resource Configuration
Function Resource Settings
```yaml
inference:
  runtime: "python3.12"  # Python runtime for AI inference
  memory: 1024           # Memory allocation (128-3008 AWS, 128-8192 GCP)
  timeout: 120           # Timeout in seconds (1-900 AWS, 1-3600 GCP)
  architecture: "arm64"  # CPU architecture (arm64 recommended)
```
Memory Guidelines for AI Inference:
- 512 MB: Simple text completion, short responses
- 1024 MB: Standard conversations, moderate complexity
- 2048 MB: Long conversations, complex reasoning
- 4096 MB+: Large context windows, document processing
Multi-Cloud Settings
Configure behavior across cloud providers:
```yaml
cloud:
  primary_provider: "gcp"    # Primary deployment target
  backup_providers: ["aws"]  # Failover sequence

  cost_optimization:
    enabled: true            # Enable automatic cost optimization
    check_interval: "daily"  # How often to check costs
    switch_threshold: 20     # Switch if a provider is 20% cheaper

  failover:
    enabled: true                # Enable automatic failover
    health_check_interval: "5m"  # Health check frequency
    failure_threshold: 3         # Failures before failover
```
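The `switch_threshold` decision is plain arithmetic: switch only when the candidate provider undercuts the current one by at least the threshold percentage. A small sketch of that rule (illustrative; real cost data would come from the providers' billing APIs):

```python
def should_switch(current_cost: float, candidate_cost: float,
                  threshold_pct: float = 20.0) -> bool:
    """Switch providers only if the candidate is at least threshold_pct cheaper."""
    savings_pct = (current_cost - candidate_cost) / current_cost * 100
    return savings_pct >= threshold_pct

# With switch_threshold: 20, a $100 -> $75 month (25% cheaper) triggers a switch,
# while $100 -> $90 (10% cheaper) does not.
print(should_switch(100.0, 75.0))  # True
print(should_switch(100.0, 90.0))  # False
```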
Configuration Best Practices
1. Environment Separation
Use different AWS profiles for different environments:
```bash
# Configure profiles
aws configure --profile onglx-dev
aws configure --profile onglx-staging
aws configure --profile onglx-prod

# Deploy to the appropriate environment
onglx-deploy --profile onglx-prod deploy
```
2. Secure Environment Variables
Never commit secrets to your configuration file:
```yaml
# ❌ Bad - secrets in config
compute:
  nextjs:
    environment:
      DATABASE_URL: "postgresql://user:password123@host/db"

# ✅ Good - use environment variables
compute:
  nextjs:
    environment:
      DATABASE_URL: "${DATABASE_URL}"
```
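The `${DATABASE_URL}` placeholder is resolved from the deploying shell's environment. Python's standard library models the same substitution, which is handy for checking values before deploying (a sketch of the concept; how OnglX Deploy itself expands placeholders is not specified here):

```python
import os

os.environ["DATABASE_URL"] = "postgresql://user:***@host/db"

# ${VAR} placeholders expand from the environment:
print(os.path.expandvars("${DATABASE_URL}"))  # postgresql://user:***@host/db

# Unset variables are left as-is, which makes misconfigurations easy to spot:
print(os.path.expandvars("${MISSING_SECRET}"))  # ${MISSING_SECRET}
```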
3. Descriptive Naming
Use descriptive project names:
```yaml
# ✅ Good - clear and descriptive
name: "mycompany-customer-portal"

# ❌ Avoid - too generic
name: "app"
```
4. Resource Right-Sizing
Start with conservative resources and scale up:
```yaml
compute:
  memory: 1024  # Start with 1 GB
  timeout: 30   # 30 seconds is usually sufficient
```
5. Build Command Optimization
Use the most efficient build command:
```yaml
compute:
  nextjs:
    auto_build: true  # CLI automatically optimizes the build process
# The CLI auto-detects the best build method (npm, pnpm, yarn)
```
Advanced Configuration
Custom Domains with SSL
```yaml
domain:
  name: "myapp.com"
  certificate_arn: "arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012"
  aliases:
    - "www.myapp.com"
    - "app.myapp.com"
  redirect_www: true               # Redirect www to apex domain
  security_policy: "TLSv1.2_2021"  # Minimum TLS version
```
Multi-Region Deployment
Deploy to multiple regions:
```bash
# Deploy to US East
onglx-deploy --region us-east-1 deploy

# Deploy to Europe
onglx-deploy --region eu-west-1 deploy

# Deploy to Asia Pacific
onglx-deploy --region ap-southeast-1 deploy
```
Storage Configuration
```yaml
storage:
  buckets:
    - name: "static-assets"
      public: true
      cdn: true
      versioning: true
      lifecycle:
        - rule: "cleanup-old-versions"
          days: 30
    - name: "user-uploads"
      public: false
      cdn: false
      encryption: true
```
Configuration Validation
Automatic Validation
OnglX Deploy validates your configuration automatically:
```bash
# Validation happens during deployment
onglx-deploy deploy

# Explicit validation
onglx-deploy config validate
```
Manual Validation
Validate specific aspects:
```bash
# Check AWS permissions
onglx-deploy auth

# Test the build process
onglx-deploy deploy --dry-run

# Test AWS connectivity
aws sts get-caller-identity --profile your-profile
```
Troubleshooting Configuration
Common Configuration Issues
Invalid YAML Syntax
```bash
# Validate the configuration with the CLI
onglx-deploy config validate
```
AWS Region Issues
```yaml
# Valid AWS regions
region: "us-east-1"       # ✅ Virginia
region: "us-west-2"       # ✅ Oregon
region: "eu-west-1"       # ✅ Ireland
region: "ap-southeast-1"  # ✅ Singapore
region: "us-east"         # ❌ Invalid
```
Memory/Timeout Limits
```yaml
compute:
  memory: 128  # Minimum; maximum is 10240 (10 GB)
  timeout: 1   # Minimum in seconds; maximum is 900 (15 minutes)
```
Debug Configuration
```bash
# View a project's configuration
cat ~/.onglx-deploy/projects/my-app/deploy.yaml

# Validate with verbose output
onglx-deploy --verbose --project my-app config validate

# List all projects
onglx-deploy projects list

# Test a deployment without applying it
onglx-deploy --project my-app deploy --dry-run
```
Configuration Examples
Simple Blog
```yaml
version: "1"
name: "my-blog"
region: "us-east-1"
compute:
  type: "nextjs"
  memory: 512
  nextjs:
    auto_build: true  # CLI handles the build automatically
    environment:
      NEXT_PUBLIC_SITE_URL: "https://myblog.com"
domain:
  name: "myblog.com"
```
E-commerce Application
```yaml
version: "1"
name: "ecommerce-app"
region: "us-east-1"
compute:
  type: "nextjs"
  memory: 2048
  timeout: 60
  nextjs:
    auto_build: true  # CLI optimizes the build process
    edge: true
    environment:
      NODE_ENV: "production"
      NEXT_PUBLIC_STRIPE_KEY: "${STRIPE_PUBLIC_KEY}"
      NEXT_PUBLIC_API_URL: "https://api.mystore.com"
      DATABASE_URL: "${DATABASE_URL}"
      STRIPE_SECRET_KEY: "${STRIPE_SECRET_KEY}"
domain:
  name: "mystore.com"
  aliases:
    - "www.mystore.com"
```
Enterprise Application
```yaml
version: "1"
name: "enterprise-dashboard"
region: "us-east-1"
profile: "company-production"
compute:
  type: "nextjs"
  memory: 4096
  timeout: 120
  nextjs:
    auto_build: true  # CLI optimizes the build process
    edge: true
    environment:
      NODE_ENV: "production"
      NEXT_PUBLIC_API_URL: "https://api.company.com"
      NEXT_PUBLIC_APP_VERSION: "2.1.0"
      DATABASE_URL: "${DATABASE_URL}"
      REDIS_URL: "${REDIS_URL}"
      JWT_SECRET: "${JWT_SECRET}"
domain:
  name: "dashboard.company.com"
  certificate_arn: "${SSL_CERTIFICATE_ARN}"
storage:
  buckets:
    - name: "user-documents"
      public: false
      cdn: false
      encryption: true
```
Cloud Provider Configuration
AWS Configuration
```yaml
aws:
  region: "us-east-1"
  profile: "default"

  # Bedrock-specific settings
  bedrock:
    region: "us-east-1"  # Bedrock region (may differ from the deployment region)
    models_enabled:      # Models you've requested access for
      - "anthropic.claude-3-5-sonnet-20241022-v2:0"
      - "amazon.titan-text-express-v1"
```
GCP Configuration
```yaml
gcp:
  project_id: "my-gcp-project"
  region: "us-central1"

  # Vertex AI settings
  vertex_ai:
    location: "us-central1"  # Vertex AI location
    models_enabled:          # Available models
      - "gemini-pro"
      - "gemini-1.5-flash"
```
Cost Optimization Configuration
```yaml
cost_optimization:
  enabled: true
  strategy: "cheapest_first"  # Options: cheapest_first, balanced, performance_first
  budget_alerts:
    monthly_limit: 100              # Alert if monthly cost exceeds this (USD)
    alert_thresholds: [50, 80, 95]  # Alert at these percentages of the limit
```
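The alert thresholds are percentages of `monthly_limit`, so with a $100 limit the alerts fire at $50, $80, and $95 of spend. A quick sketch of that calculation (illustrative only):

```python
def triggered_alerts(spend: float, monthly_limit: float,
                     thresholds: list[int]) -> list[int]:
    """Return the alert thresholds (in %) that the current spend has crossed."""
    spend_pct = spend / monthly_limit * 100
    return [t for t in thresholds if spend_pct >= t]

# monthly_limit: 100, alert_thresholds: [50, 80, 95]
print(triggered_alerts(82.0, 100.0, [50, 80, 95]))  # [50, 80]
```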
Next Steps
- Inference Guide - Deploy your AI API
- AWS Setup Guide - Configure AWS for deployment
- CLI Reference - Command line interface reference
- Security Best Practices - Secure your deployment
Need help with configuration? Use `onglx-deploy --project my-app config validate` to check your settings or `onglx-deploy --help` for CLI assistance.