Bastion

Observability

Bastion exposes three observability signals: Prometheus-compatible metrics, structured access logs, and OpenTelemetry trace context propagation. Together they cover gateway traffic, upstream health, and request flows.

Prometheus Metrics

Configuration

bastion.WithMetrics(bastion.MetricsConfig{
    Enabled: true,
    Prefix:  "gateway",
})

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| Enabled | bool | true | Enable or disable metrics collection |
| Prefix | string | "gateway" | Prefix for all metric names |

Exported Metrics

The gateway exports the following metrics through the Forge metrics system:

Request Metrics

| Metric | Type | Description |
| --- | --- | --- |
| gateway.requests_total | Counter | Total number of proxied requests |
| gateway.request_duration_seconds | Histogram | Request latency distribution |
| gateway.active_connections.<protocol> | Gauge | Active connections by protocol (http, websocket, sse) |

Resilience Metrics

| Metric | Type | Description |
| --- | --- | --- |
| gateway.retry_total | Counter | Total retry attempts |
| gateway.rate_limited_total | Counter | Total rate-limited requests |
| gateway.circuit_breaker_state.<targetID> | Gauge | Circuit breaker state (0=closed, 1=half-open, 2=open) |
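When reading the circuit breaker gauge from a dashboard or alerting script, the numeric value has to be mapped back to a state name. The sketch below shows that mapping using the encoding from the table above. The function `breakerStateName` is a hypothetical helper and is not part of Bastion.

```go
package main

import "fmt"

// breakerStateName translates a circuit_breaker_state gauge value into the
// state name documented for it (0=closed, 1=half-open, 2=open).
// Hypothetical helper for illustration; any other value maps to "unknown".
func breakerStateName(value float64) string {
	switch value {
	case 0:
		return "closed"
	case 1:
		return "half-open"
	case 2:
		return "open"
	default:
		return "unknown"
	}
}

func main() {
	for _, v := range []float64{0, 1, 2} {
		fmt.Printf("%.0f -> %s\n", v, breakerStateName(v))
	}
}
```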

Upstream Metrics

| Metric | Type | Description |
| --- | --- | --- |
| gateway.upstream_health.<targetID> | Gauge | Target health (1.0=healthy, 0.0=unhealthy) |
| gateway.discovery_routes.<source> | Gauge | Number of routes by source (manual, farp, discovery) |

Cache Metrics

| Metric | Type | Description |
| --- | --- | --- |
| gateway.cache_hits_total | Counter | Total cache hits |
| gateway.cache_misses_total | Counter | Total cache misses |

Custom Prefix

Change the metric prefix to avoid collisions when running multiple gateways:

bastion.WithMetrics(bastion.MetricsConfig{
    Enabled: true,
    Prefix:  "api_gateway",
})
// Metrics: api_gateway.requests_total, api_gateway.request_duration_seconds, etc.
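To make the naming scheme concrete, the following sketch builds metric names from a prefix, a base name, and an optional per-label suffix such as a protocol or target ID. The helper `metricName` is hypothetical and only mirrors the dotted names listed earlier; it is not how Bastion necessarily composes names internally.

```go
package main

import (
	"fmt"
	"strings"
)

// metricName joins a prefix, a base metric name, and optional suffixes with
// dots. Hypothetical helper for illustration; not part of Bastion's API.
func metricName(prefix, name string, suffixes ...string) string {
	parts := append([]string{prefix, name}, suffixes...)
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(metricName("api_gateway", "requests_total"))
	fmt.Println(metricName("api_gateway", "active_connections", "websocket"))
}
```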

Structured Access Logging

Configuration

bastion.WithAccessLog(bastion.AccessLogConfig{
    Enabled:        true,
    RedactHeaders:  []string{"Authorization", "Cookie", "Set-Cookie"},
    IncludeBody:    false,
    MaxBodyLogSize: 4096,
})

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| Enabled | bool | true | Enable or disable access logging |
| RedactHeaders | []string | ["Authorization", "Cookie", "Set-Cookie"] | Headers to redact in logs |
| IncludeBody | bool | false | Include request/response body in logs |
| MaxBodyLogSize | int | 4096 | Maximum body bytes to log |

Log Fields

Every proxied request produces a structured log entry with these fields:

Request Fields

| Field | Description |
| --- | --- |
| method | HTTP method (GET, POST, etc.) |
| path | Request path |
| status | HTTP response status code |
| latency_ms | Total request latency in milliseconds |
| client_ip | Client IP address (from X-Forwarded-For, X-Real-IP, or RemoteAddr) |
| user_agent | Client User-Agent header |
| request_id | Value of X-Request-ID header (if present) |
| query | Query string (truncated to 200 characters) |
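The client_ip and query fields both involve a small amount of derivation. The sketch below shows one plausible way to compute them. It assumes the sources are tried in the order listed (X-Forwarded-For, then X-Real-IP, then RemoteAddr), that the first X-Forwarded-For entry is used, and that truncation counts characters rather than bytes. None of these details are confirmed by this page, and `clientIP` and `truncateQuery` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// clientIP picks the client address from the forwarding headers, falling
// back to the connection's remote address. The precedence order is an
// assumption of this sketch.
func clientIP(forwardedFor, realIP, remoteAddr string) string {
	if forwardedFor != "" {
		// X-Forwarded-For may hold a comma-separated chain; take the first hop.
		first, _, _ := strings.Cut(forwardedFor, ",")
		if ip := strings.TrimSpace(first); ip != "" {
			return ip
		}
	}
	if ip := strings.TrimSpace(realIP); ip != "" {
		return ip
	}
	// RemoteAddr is normally "host:port"; strip the port when present.
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return host
}

// truncateQuery limits a query string to max characters (runes).
func truncateQuery(query string, max int) string {
	runes := []rune(query)
	if len(runes) <= max {
		return query
	}
	return string(runes[:max])
}

func main() {
	fmt.Println(clientIP("203.0.113.7, 10.0.0.1", "", "10.0.0.2:41234"))
	fmt.Println(clientIP("", "", "192.0.2.1:5555"))
	fmt.Println(truncateQuery("page=1&sort=asc", 200))
}
```

Forwarding headers are client-controlled unless a trusted proxy overwrites them, so a log consumer should treat client_ip accordingly.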

Route Fields

| Field | Description |
| --- | --- |
| route_id | Matched route ID |
| route_path | Route pattern |
| route_protocol | Route protocol (http, websocket, sse, grpc) |
| route_source | Route source (manual, farp, discovery) |

Upstream Fields

| Field | Description |
| --- | --- |
| upstream_url | Selected target URL |
| upstream_id | Selected target ID |

Safe Headers

These headers are always included when present:

  • Accept, Content-Type, Content-Length, Referer, Origin

Sensitive headers listed in RedactHeaders are logged as [REDACTED].
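The header rules above can be sketched as a small filter. This is an illustration under stated assumptions, not Bastion's implementation: `loggableHeaders` is a hypothetical name, and the choice to omit headers that are neither safe nor redacted is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// safeHeaders mirrors the "Safe Headers" list from this page.
var safeHeaders = []string{"Accept", "Content-Type", "Content-Length", "Referer", "Origin"}

// loggableHeaders returns the headers a log entry would carry: safe headers
// with their values, and headers named in redact with the value replaced by
// "[REDACTED]". Headers on neither list are omitted (an assumption).
func loggableHeaders(h http.Header, redact []string) map[string]string {
	out := make(map[string]string)
	for _, name := range safeHeaders {
		if values := h.Values(name); len(values) > 0 {
			out[name] = strings.Join(values, ", ")
		}
	}
	// Redaction runs last, so a header on both lists is still masked.
	for _, name := range redact {
		key := http.CanonicalHeaderKey(name)
		if len(h.Values(key)) > 0 {
			out[key] = "[REDACTED]"
		}
	}
	return out
}

func main() {
	h := http.Header{}
	h.Set("Authorization", "Bearer secret-token")
	h.Set("Content-Type", "application/json")
	h.Set("X-Internal", "omitted in this sketch")
	fmt.Println(loggableHeaders(h, []string{"Authorization", "Cookie", "Set-Cookie"}))
}
```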

Log Levels

The access logger uses different log levels based on the response status code:

| Status Range | Log Level |
| --- | --- |
| 200-399 | Info |
| 400-499 | Warn |
| 500+ | Error |
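The status-to-level mapping above amounts to a threshold check. A minimal sketch follows; `levelForStatus` is a hypothetical name, and logging statuses below 200 at Info is an assumption, since the table does not cover them.

```go
package main

import "fmt"

// levelForStatus returns the access log level for an HTTP status code,
// following the table: 500+ is error, 400-499 is warn, everything else is
// info. Treating 1xx as info is an assumption of this sketch.
func levelForStatus(status int) string {
	switch {
	case status >= 500:
		return "error"
	case status >= 400:
		return "warn"
	default:
		return "info"
	}
}

func main() {
	for _, status := range []int{200, 302, 404, 503} {
		fmt.Printf("%d -> %s\n", status, levelForStatus(status))
	}
}
```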

Admin Action Logging

Administrative actions (route creation, config changes, cache invalidation) are logged separately:

{
    "msg": "gateway admin action",
    "action": "route_created",
    "resource": "route-abc123",
    "result": "success",
    "client_ip": "10.0.0.1",
    "user_agent": "curl/7.88.0"
}

OpenTelemetry Trace Propagation

Configuration

bastion.WithTracing(bastion.TracingConfig{
    Enabled:         true,
    PropagateFormat: "w3c",
    SampleRate:      1.0,
})

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| Enabled | bool | true | Enable or disable trace propagation |
| PropagateFormat | string | "w3c" | Trace context format to propagate |
| SampleRate | float64 | 1.0 | Sampling rate (0.0 to 1.0) |

Trace Context Propagation

When tracing is enabled, the gateway propagates trace context headers between the client and upstream:

  • W3C Trace Context ("w3c"): Propagates traceparent and tracestate headers.

The gateway acts as a pass-through for trace context: it does not create new spans, but it forwards the incoming context so that distributed tracing works end to end across the gateway boundary.
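Pass-through propagation can be pictured as copying the two W3C headers from the inbound request to the upstream request. The sketch below uses only the standard library; `propagateTraceContext` is a hypothetical helper and does not reflect Bastion's internal code.

```go
package main

import (
	"fmt"
	"net/http"
)

// traceHeaders are the W3C Trace Context header names.
var traceHeaders = []string{"traceparent", "tracestate"}

// propagateTraceContext copies the W3C trace headers from the inbound
// request headers to the outbound (upstream) headers. It creates no spans
// and leaves the outbound headers untouched when the inbound ones are absent.
func propagateTraceContext(inbound, outbound http.Header) {
	for _, name := range traceHeaders {
		values := inbound.Values(name)
		if len(values) == 0 {
			continue
		}
		outbound.Del(name) // drop any stale value before copying
		for _, v := range values {
			outbound.Add(name, v)
		}
	}
}

func main() {
	in := http.Header{}
	in.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	in.Set("tracestate", "vendor=value")

	out := http.Header{}
	propagateTraceContext(in, out)
	fmt.Println(out.Get("traceparent"))
	fmt.Println(out.Get("tracestate"))
}
```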

Sample Rate

The SampleRate controls what fraction of requests participate in tracing:

| Value | Behavior |
| --- | --- |
| 1.0 | All requests are traced |
| 0.5 | 50% of requests are traced |
| 0.1 | 10% of requests are traced |
| 0.0 | Tracing is effectively disabled |
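A common way to implement a fractional sample rate is to compare a uniform random draw against the rate. The sketch below illustrates that idea; whether Bastion samples this way (rather than, say, deterministically by trace ID) is not stated on this page, and `shouldSample` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shouldSample reports whether a request participates in tracing, given a
// sample rate in [0.0, 1.0] and a uniform random draw in [0.0, 1.0).
// A rate of 1.0 always samples and a rate of 0.0 never does.
func shouldSample(rate, draw float64) bool {
	return draw < rate
}

func main() {
	const rate = 0.1
	const total = 10000
	sampled := 0
	for i := 0; i < total; i++ {
		if shouldSample(rate, rand.Float64()) {
			sampled++
		}
	}
	fmt.Printf("sampled %d of %d requests\n", sampled, total)
}
```

Passing the draw in as an argument keeps the decision deterministic and easy to test.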

Gateway Stats

The proxy engine maintains an aggregated stats structure available via the admin API:

type GatewayStats struct {
    TotalRequests    int64
    TotalErrors      int64
    ActiveConns      int64
    ActiveWSConns    int64
    ActiveSSEConns   int64
    AvgLatencyMs     float64
    P99LatencyMs     float64
    RequestsPerSec   float64
    CacheHits        int64
    CacheMisses      int64
    RateLimited      int64
    CircuitBreaks    int64
    RetryAttempts    int64
    TotalRoutes      int
    HealthyUpstreams int
    TotalUpstreams   int
    Uptime           int64
    StartedAt        time.Time
}

Access via the admin API:

GET /gateway/api/stats
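The raw counters can be combined into derived ratios such as the error rate or the cache hit ratio. The sketch below works on a trimmed local copy of the struct, so it runs without the library; `errorRate` and `cacheHitRatio` are hypothetical helpers, not Bastion methods.

```go
package main

import "fmt"

// gatewayStats is a trimmed local copy of the GatewayStats counters used
// below; the field names follow the struct shown on this page.
type gatewayStats struct {
	TotalRequests int64
	TotalErrors   int64
	CacheHits     int64
	CacheMisses   int64
}

// ratio returns part/total, or 0 when total is zero to avoid dividing by zero.
func ratio(part, total int64) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) / float64(total)
}

// errorRate is the fraction of proxied requests that ended in an error.
func (s gatewayStats) errorRate() float64 {
	return ratio(s.TotalErrors, s.TotalRequests)
}

// cacheHitRatio is the fraction of cache lookups that were hits.
func (s gatewayStats) cacheHitRatio() float64 {
	return ratio(s.CacheHits, s.CacheHits+s.CacheMisses)
}

func main() {
	s := gatewayStats{TotalRequests: 200, TotalErrors: 10, CacheHits: 30, CacheMisses: 10}
	fmt.Printf("error rate: %.2f%%\n", s.errorRate()*100)
	fmt.Printf("cache hit ratio: %.2f%%\n", s.cacheHitRatio()*100)
}
```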

Per-Route Stats

Each route tracks its own metrics:

type RouteStats struct {
    RouteID       string
    Path          string
    TotalRequests int64
    TotalErrors   int64
    AvgLatencyMs  float64
    P99LatencyMs  float64
    CacheHits     int64
    CacheMisses   int64
    RateLimited   int64
}

Per-route stats are included in GatewayStats.RouteStats keyed by route ID.

Dashboard Integration

When the dashboard is enabled, all observability data feeds the real-time admin UI:

bastion.WithDashboard(bastion.DashboardConfig{
    Enabled:  true,
    BasePath: "/gateway",
    Realtime: true,
})

The dashboard displays:

  • Live request throughput and error rates
  • Latency percentiles (p50, p95, p99)
  • Active connections by protocol
  • Upstream health status per target
  • Circuit breaker states
  • Cache hit/miss ratios
  • Route-level traffic breakdown

Audit Logging

Administrative actions are recorded as audit events:

type AuditEvent struct {
    Action    AuditAction
    Resource  string
    Actor     string
    Timestamp time.Time
    Details   map[string]any
}

Supported audit actions:

| Action | Description |
| --- | --- |
| AuditRouteCreated | A route was created |
| AuditRouteUpdated | A route was updated |
| AuditRouteDeleted | A route was deleted |
| AuditRouteToggled | A route was enabled/disabled |
| AuditConfigChanged | Gateway configuration was changed |
| AuditDiscoveryForced | Discovery refresh was manually triggered |
| AuditCircuitReset | A circuit breaker was manually reset |
| AuditCacheCleared | Cache was cleared |

Audit events can be consumed via the AuditSink interface for forwarding to external audit systems.
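This page does not show the method set of AuditSink, so the following is only a sketch of what a sink could look like. The interface shape (`Record`), the local type names, and the use of a plain string for the action are all assumptions made for illustration; check the package source for the real definitions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// auditEvent is a local stand-in for AuditEvent. Action is a plain string
// here; the real type uses AuditAction.
type auditEvent struct {
	Action    string
	Resource  string
	Actor     string
	Timestamp time.Time
	Details   map[string]any
}

// auditSink is an assumed interface shape; the real AuditSink may differ.
type auditSink interface {
	Record(e auditEvent)
}

// memorySink buffers events in memory, e.g. for batching to an external
// audit system. It is safe for concurrent use.
type memorySink struct {
	mu     sync.Mutex
	events []auditEvent
}

func (s *memorySink) Record(e auditEvent) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.events = append(s.events, e)
}

// Len returns the number of buffered events.
func (s *memorySink) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.events)
}

func main() {
	sink := &memorySink{}
	var _ auditSink = sink // compile-time check that memorySink satisfies the interface

	sink.Record(auditEvent{
		Action:    "route_created",
		Resource:  "route-abc123",
		Actor:     "admin",
		Timestamp: time.Now(),
	})
	fmt.Println("buffered events:", sink.Len())
}
```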
