Chat Interfaces

Deploy AI-powered chat systems with both an API and a web interface, backed by Amazon Bedrock foundation models.

OnglX Deploy offers two ways to interact with AI models:

  • Chat API: OpenAI-compatible /v1/chat/completions endpoints for applications
  • Web Interface: OpenWebUI for direct browser-based chat interaction

Quick Start

Deploy Chat API

BASH
# Initialize project
onglx-deploy init --host aws

# Add chat API component
onglx-deploy add inference --component api --type openai

# Deploy to AWS
onglx-deploy deploy

Testing Your Chat Interface

Test Chat API

BASH
# Get your endpoint URL and API key
onglx-deploy status

# Test the chat completion endpoint
curl -X POST https://your-endpoint/v1/chat/completions \
  -H 'Authorization: Bearer sk-onglx-your-api-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-3.5-sonnet",
    "messages": [
      {"role": "user", "content": "Hello! Can you help me with AWS Bedrock?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7
  }'

Configuration Options

Supported Parameters

The Chat API supports all standard OpenAI parameters (a combined request example follows this list):

  • model - Available models: claude-3.5-sonnet, claude-3-haiku, amazon.titan-text-express-v1
  • messages - Array of conversation messages with role and content
  • max_tokens - Maximum tokens in the response (1-4096)
  • temperature - Controls randomness (0.0-2.0)
  • top_p - Controls nucleus sampling (0.0-1.0)
  • stream - Enable streaming responses (boolean)
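
To see how these fit together, here is a request sketch using the OpenAI Python SDK (the same client configured under SDK Integration below); the endpoint URL, API key, and parameter values are placeholders, not recommendations:

PYTHON
import openai

# Placeholders: substitute the endpoint and key reported by `onglx-deploy status`
client = openai.OpenAI(
    api_key="sk-onglx-your-key",
    base_url="https://your-endpoint/v1",
)

# One request exercising each documented parameter
response = client.chat.completions.create(
    model="claude-3.5-sonnet",   # any model from the list above
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
    max_tokens=256,              # 1-4096
    temperature=0.7,             # 0.0-2.0
    top_p=0.9,                   # 0.0-1.0
    stream=False,                # set True for streaming (see below)
)
print(response.choices[0].message.content)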

Response Format

Standard OpenAI-compatible response:

JSON
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "claude-3.5-sonnet",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'd be happy to help you with AWS Bedrock..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
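
Because the schema matches OpenAI's, clients can read the reply and token usage directly. A minimal sketch with the Python requests library (endpoint and key are placeholders):

PYTHON
import requests

# Placeholders for your deployed endpoint and API key
url = "https://your-endpoint/v1/chat/completions"
headers = {"Authorization": "Bearer sk-onglx-your-api-key"}
payload = {
    "model": "claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 150,
}

data = requests.post(url, json=payload, headers=headers, timeout=60).json()

# Pull the assistant reply and token accounting from the standard schema
reply = data["choices"][0]["message"]["content"]
usage = data["usage"]
print(reply)
print(f'{usage["prompt_tokens"]} prompt + {usage["completion_tokens"]} completion '
      f'= {usage["total_tokens"]} tokens')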

Streaming Responses

Enable streaming for real-time responses:

BASH
curl -X POST https://your-endpoint/v1/chat/completions \
  -H 'Authorization: Bearer sk-onglx-your-api-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Tell me about AI"}],
    "stream": true
  }'
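
With stream set to true the endpoint emits OpenAI-style chunks, so the OpenAI Python SDK can consume them incrementally. A sketch assuming the placeholder endpoint and key from earlier:

PYTHON
import openai

client = openai.OpenAI(
    api_key="sk-onglx-your-key",
    base_url="https://your-endpoint/v1",  # placeholder endpoint
)

# stream=True returns an iterator that yields chunks as tokens are generated
stream = client.chat.completions.create(
    model="claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Tell me about AI"}],
    stream=True,
)

for chunk in stream:
    # Some chunks (role header, final chunk) carry no text
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()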

SDK Integration

Python

PYTHON
import openai

client = openai.OpenAI(
    api_key="sk-onglx-your-key",
    base_url="https://your-endpoint/v1"  # include /v1 so the SDK resolves /v1/chat/completions
)

response = client.chat.completions.create(
    model="claude-3.5-sonnet",
    messages=[
        {"role": "user", "content": "What is AWS Bedrock?"}
    ],
    max_tokens=100
)

print(response.choices[0].message.content)

JavaScript

JAVASCRIPT
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.ONGLX_API_KEY,
  baseURL: process.env.ONGLX_API_BASE_URL // should include the /v1 prefix, e.g. https://your-endpoint/v1
});

const completion = await openai.chat.completions.create({
  model: 'claude-3.5-sonnet',
  messages: [
    { role: 'user', content: 'Explain machine learning briefly' }
  ],
  max_tokens: 100
});

console.log(completion.choices[0].message.content);

Error Handling

The API returns standard HTTP status codes and OpenAI-compatible error responses:

JSON
{
  "error": {
    "message": "Invalid model specified",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}

Common error codes (a handling sketch follows this list):

  • 400 - Bad Request (invalid parameters)
  • 401 - Unauthorized (invalid API key)
  • 429 - Rate limit exceeded
  • 500 - Internal server error
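
Since the error shapes are OpenAI-compatible, the OpenAI Python SDK raises its standard exception classes for them. A defensive-handling sketch (the retry policy is illustrative, not an OnglX recommendation):

PYTHON
import time

import openai

client = openai.OpenAI(
    api_key="sk-onglx-your-key",          # placeholders
    base_url="https://your-endpoint/v1",
)

def chat_with_retry(messages, retries=3):
    """Retry 429s with exponential backoff; surface other errors."""
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="claude-3.5-sonnet",
                messages=messages,
                max_tokens=150,
            )
        except openai.RateLimitError:        # 429: back off and retry
            time.sleep(2 ** attempt)
        except openai.AuthenticationError:   # 401: bad API key, retrying won't help
            raise
        except openai.BadRequestError as e:  # 400: invalid parameters
            raise ValueError(f"invalid request: {e}") from e
    raise RuntimeError("rate limited after all retries")

reply = chat_with_retry([{"role": "user", "content": "Hello"}])
print(reply.choices[0].message.content)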

Next Steps