# Advanced
Advanced configuration and customization options for Knull.
## Header Mutation
Knull can modify HTTP headers before sending requests to backends:
```yaml
models:
  - id: gpt-4o-mini
    provider: azure
    endpoint: ${AZURE_OPENAI_ENDPOINT_HOSTNAME}
    apiKey: ${AZURE_OPENAI_API_KEY}
    headerMutation:
      set:
        - name: x-custom-header
          value: "custom-value"
        - name: x-azure-api-version
          value: "2024-05-01-preview"
      remove:
        - authorization
        - x-api-key
```

### Header Mutation Fields
| Field | Description |
|---|---|
| `set` | Headers to add or overwrite |
| `remove` | Headers to remove from the request |
## Body Mutation
Modify request body fields before sending to backends:
```yaml
models:
  - id: gpt-4o-mini
    provider: azure
    endpoint: ${AZURE_OPENAI_ENDPOINT_HOSTNAME}
    bodyMutation:
      set:
        - path: max_tokens
          value: "1000"
        - path: temperature
          value: "0.7"
      remove:
        - user
```

### Body Mutation Fields
| Field | Description |
|---|---|
| `set` | Fields to add or overwrite |
| `remove` | Top-level fields to remove |
### Body Field Value Types
Values are parsed as JSON:
```yaml
bodyMutation:
  set:
    - path: max_tokens
      value: "1000"                   # Number
    - path: temperature
      value: "0.7"                    # Number
    - path: metadata
      value: '{"source": "knull"}'    # Object
    - path: stop
      value: '["\\n", "##"]'          # Array
```

## Custom API Schemas
Override the default API schema for a model:
```yaml
models:
  - id: custom-model
    provider: openai_compatible
    endpoint: http://custom-api:8080
    schema:
      name: OpenAI
      prefix: /api/v2
```

### Schema Options
| Schema | Description |
|---|---|
| `OpenAI` | Standard OpenAI API (`/v1/chat/completions`) |
| `AzureOpenAI` | Azure OpenAI API |
| `AWSBedrock` | AWS Bedrock API |
| `AWSAnthropic` | AWS Bedrock Anthropic API |
| `Anthropic` | Anthropic API (`/v1/messages`) |
| `GCPVertexAI` | Google Vertex AI API |
| `GCPAnthropic` | Google Vertex AI Anthropic API |
| `Cohere` | Cohere API |
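As an illustration of selecting a non-default schema (the model id, provider, and endpoint below are hypothetical), an Anthropic-style backend could be declared like this, assuming the same `schema.name` field shown above:

```yaml
models:
  - id: claude-backend            # hypothetical model id
    provider: openai_compatible   # hypothetical provider value
    endpoint: http://anthropic-proxy:8081
    schema:
      name: Anthropic             # speak the /v1/messages-style API to this backend
```

The `name` value selects which of the schemas in the table above Knull uses when talking to the backend.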
### Custom Prefix
```yaml
models:
  - id: custom-model
    provider: openai_compatible
    endpoint: http://custom-api:8080
    schema:
      name: OpenAI
      prefix: /custom/api/v1
```

Requests to `/v1/chat/completions` will be routed to `/custom/api/v1/chat/completions`.
## Backend Authentication
Configure how Knull authenticates to backend services:
### Azure API Key
```yaml
models:
  - id: azure-model
    provider: azure
    endpoint: ${AZURE_OPENAI_ENDPOINT}
    auth:
      azureAPIKey:
        key: ${AZURE_API_KEY}
```

### AWS Credentials
```yaml
models:
  - id: bedrock-model
    provider: aws_bedrock
    endpoint: ${BEDROCK_ENDPOINT}
    awsRegion: us-east-1
    auth:
      aws:
        credentialFileLiteral: |
          [default]
          aws_access_key_id = ${AWS_ACCESS_KEY}
          aws_secret_access_key = ${AWS_SECRET_KEY}
        region: us-east-1
```

### GCP Authentication
```yaml
models:
  - id: vertex-model
    provider: gcp_vertex
    endpoint: ${VERTEX_ENDPOINT}
    gcpProject: my-project
    gcpLocation: us-central1
    auth:
      gcp:
        accessToken: ${GCP_TOKEN}
        region: us-central1
        projectName: my-project
```

## Advanced Routing
### Model Aliases
Use aliases to simplify model names:
```yaml
models:
  - id: gpt-4o-mini
    provider: azure
    endpoint: ${AZURE_ENDPOINT}
    alias: gpt
```

Clients can now use either `gpt-4o-mini` or `gpt`:
```shell
curl http://localhost:1975/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt", "messages": [...]}'
```

### Header-based Routing
Requests can also be routed based on custom header values.
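As a rough illustration of what such routing can look like on the Envoy side (the header name and cluster names are hypothetical, and the exact shape depends on how your Envoy resources are managed), a route can match a header value and steer matching traffic to a dedicated cluster:

```yaml
route_config:
  virtual_hosts:
    - name: llm
      domains: ["*"]
      routes:
        # Send requests tagged as premium to a dedicated backend cluster
        - match:
            prefix: /v1/chat/completions
            headers:
              - name: x-model-tier
                string_match:
                  exact: premium
          route:
            cluster: premium_backend
        # Everything else falls through to the default cluster
        - match:
            prefix: /v1/chat/completions
          route:
            cluster: default_backend
```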
This requires custom Envoy routing configuration; it is not part of Knull's model configuration.

## Token Counting
Knull counts tokens for usage tracking:
### Token Types
| Type | Description |
|---|---|
| `InputToken` | Prompt tokens |
| `OutputToken` | Response (completion) tokens |
| `CachedInputToken` | Cached prompt tokens |
| `TotalToken` | Input + output tokens |
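Assuming these type names are accepted as values for the `type` field of the `llmRequestCosts` block described below (the metadata key names here are illustrative), individual token counts can be surfaced as separate usage metrics:

```yaml
llmRequestCosts:
  - metadataKey: "prompt_usage"       # illustrative key name
    type: InputToken
  - metadataKey: "completion_usage"   # illustrative key name
    type: OutputToken
```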
## CEL Cost Calculation
Knull allows you to calculate request costs using complex logic via Common Expression Language (CEL). This is useful when backends have tiered pricing or when you want to apply custom multipliers.
### Configuration
```yaml
# Inside your gateway or model configuration
llmRequestCosts:
  - metadataKey: "custom_cost"
    type: CEL
    cel: "model == 'gpt-4' ? input_tokens * 2 + output_tokens * 4 : total_tokens"
```

### Available Variables
| Variable | Type | Description |
|---|---|---|
| `model` | string | The model name in the request. |
| `backend` | string | The target backend name (`name.namespace`). |
| `input_tokens` | uint | Count of input (prompt) tokens. |
| `output_tokens` | uint | Count of output (completion) tokens. |
| `cached_input_tokens` | uint | Count of cached input tokens. |
| `total_tokens` | uint | Total tokens processed. |
### Example Expressions
- Tiered pricing: `backend == 'premium' ? total_tokens * 10 : total_tokens`
- Cache discount: `(input_tokens - cached_input_tokens) + (cached_input_tokens * 0.1)`
- Safety margin: `total_tokens * 1.1`
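For example, the cache-discount expression plugs into the configuration block shown earlier like this (the metadata key name is illustrative):

```yaml
llmRequestCosts:
  - metadataKey: "cache_adjusted_cost"
    type: CEL
    cel: "(input_tokens - cached_input_tokens) + (cached_input_tokens * 0.1)"
```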
## Metrics and Observability

### Prometheus Metrics
Knull exposes Prometheus metrics at /metrics:
```shell
curl http://localhost:1064/metrics
```

Available metrics:

- `knull_requests_total` - Total requests
- `knull_tokens_total` - Total tokens processed
- `knull_latency_seconds` - Request latency
- `knull_errors_total` - Total errors
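To collect these with Prometheus, a minimal scrape job can point at the same port (the job name is arbitrary; `/metrics` is Prometheus's default scrape path):

```yaml
scrape_configs:
  - job_name: knull
    static_configs:
      - targets: ["localhost:1064"]
```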
### OpenTelemetry
Enable OpenTelemetry tracing:
```shell
OTEL_AIGW_METRICS_REQUEST_HEADER_ATTRIBUTES="user_id,api_key" \
OTEL_AIGW_SPAN_REQUEST_HEADER_ATTRIBUTES="user_id,api_key" \
./bin/knull run config.yaml --debug
```

## Performance Tuning
### Worker Threads
Knull uses Go's runtime for concurrency. For high throughput:
```shell
GOMAXPROCS=4 ./bin/knull run config.yaml
```

### Connection Pooling
Backend connection pooling is handled by Envoy. Configure in Envoy resources.
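As a sketch of what that Envoy-side tuning can look like (the cluster name and limits below are illustrative), upstream connection limits are commonly set via circuit breaker thresholds on the cluster:

```yaml
clusters:
  - name: llm_backend
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1024       # cap on concurrent upstream connections
          max_pending_requests: 256   # queue depth before requests are rejected
```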
### Memory Usage
The default memory footprint is sufficient for moderate loads. For high-traffic deployments (for example, under Kubernetes), raise the resource limits:
```yaml
resources:
  limits:
    memory: 2Gi
    cpu: 2000m
```

## Debugging
### Enable Debug Mode

```shell
./bin/knull run config.yaml --debug
```

### View Logs
Knull follows XDG standards for log placement.
```shell
# Follow logs in the state directory
tail -f ~/.local/state/knull/runs/*/aigw.log
```

### Check Health
```shell
# Basic health
curl http://localhost:1064/health

# Readiness
curl http://localhost:1064/ready
```

### View Configuration
```shell
# View current active configuration
curl http://localhost:1064/config
```

Logs can also be followed through journald when Knull runs as a systemd service:

```shell
journalctl -u knull -f
```

### Test Configuration
```shell
# Validate YAML
./bin/knull validate config.yaml

# Dry run
./bin/knull run config.yaml --dry-run
```

## Troubleshooting
### Check Configuration

```shell
# View loaded configuration
curl http://localhost:1064/config
```

### Check Health
```shell
# Basic health
curl http://localhost:1064/health

# Readiness
curl http://localhost:1064/ready
```

### Common Issues
#### Connection Refused

```
Error: connection refused to backend
```

Check that:

- The backend endpoint is correct
- The backend is running
- There is network connectivity between Knull and the backend
#### API Key Invalid

```
Error: invalid API key
```

Check that:

- The API key format is correct (`sk-` prefix)
- The API key exists in the configuration
- The policy allows the requested model
#### Budget Exceeded

```
Error: budget exceeded
```

Check that:

- The policy has remaining budget
- Usage is tracked correctly

If usage is legitimate, consider increasing the budget limit.