Monitoring & Logging

This guide covers configuring monitoring and centralized logging for your Rulebricks deployment.

Monitoring Overview

Rulebricks supports flexible monitoring configurations:

Local Mode: Full Prometheus and Grafana stack in your cluster
Remote Mode: Minimal Prometheus that forwards to external monitoring
Disabled: No monitoring infrastructure (use alternative solutions)

Monitoring Modes

Local Mode (Default)

Deploys a complete monitoring stack in your cluster:

Prometheus: Metrics collection and storage
Grafana: Visualization and dashboards
Retention: 30 days (configurable)
Storage: 50Gi (configurable)

Access:

Grafana: https://grafana.your-domain.com
Default credentials: Check your deployment notes or secrets

Best for:

Development environments
Isolated deployments
Full control over monitoring

Configuration:

monitoring:
  enabled: true
  mode: local
  local:
    prometheus_enabled: true
    grafana_enabled: true
    retention: "30d"
    storage_size: "50Gi"

Remote Mode

Deploys a minimal Prometheus instance that forwards metrics to an external system:

Prometheus: Lightweight deployment (7-day retention, 10Gi storage)
Remote Write: Forwards all metrics to external endpoint
No Grafana: Use your existing monitoring dashboards

Best for:

Production environments
Existing monitoring infrastructure
Cost optimization (minimal local storage)

Supported Providers:

Grafana Cloud: Full Prometheus remote write support
New Relic: Native Prometheus integration
Generic Prometheus: Any Prometheus-compatible endpoint
Custom: Your own remote write endpoint

Configuration Example (Grafana Cloud):

monitoring:
  enabled: true
  mode: remote
  remote:
    provider: grafana-cloud
    prometheus_write:
      url: https://prometheus-us-central1.grafana.net/api/prom/push
      username: "123456"
      password_from: env:MONITORING_PASSWORD

Configuration Example (New Relic):

monitoring:
  enabled: true
  mode: remote
  remote:
    provider: newrelic
    newrelic:
      license_key_from: env:NEWRELIC_LICENSE_KEY
      region: "US"  # or "EU"

Configuration Example (Generic Prometheus):

monitoring:
  enabled: true
  mode: remote
  remote:
    provider: prometheus
    prometheus_write:
      url: https://prometheus.example.com/api/v1/write
      bearer_token_from: env:PROMETHEUS_TOKEN
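
The remote-mode examples above reference credentials as env:NAME. Assuming this means the value is read from an environment variable of that name at deploy time, a small pre-deploy check can catch an unset variable early. The require_env helper below is a hypothetical sketch, not part of the Rulebricks CLI:

```shell
# Hypothetical pre-deploy check: report which credential variables are unset.
# Assumes "env:NAME" in the config means "read NAME from the environment".
require_env() {
  missing=0
  for name in "$@"; do
    # printenv prints the value of an exported variable, or nothing if unset
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Variable name taken from the Grafana Cloud example; pass the ones your provider uses.
require_env MONITORING_PASSWORD || echo "export MONITORING_PASSWORD before deploying"
```

The function returns a non-zero status if any listed variable is unset or empty, so it can gate a deploy script.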

Disabled

No monitoring infrastructure deployed:

monitoring:
  enabled: false

Use this if you have alternative monitoring solutions or don't need monitoring.

Metrics Configuration

Note: The monitoring.metrics configuration section exists in the schema but is not fully implemented. In local mode, retention is configured via monitoring.local.retention; remote mode uses its built-in defaults.

For local mode, configure retention:

monitoring:
  enabled: true
  mode: local
  local:
    retention: "30d"    # Metrics retention period
    storage_size: "50Gi"

The monitoring.metrics.interval field (scrape interval) is defined in the schema but is not currently used; Prometheus uses its default scrape interval.
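
Retention and storage size need to be chosen together. As a rough sketch, the Prometheus capacity-planning rule of thumb (disk bytes = retention seconds * samples per second * bytes per sample, with 1 to 2 bytes per sample) can be turned around to estimate the ingestion rate a volume can sustain. The function name and the conservative 2 bytes per sample are assumptions, and the estimate ignores write-ahead log and compaction overhead:

```shell
# Rough sizing sketch based on the Prometheus rule of thumb:
#   disk_bytes = retention_seconds * samples_per_second * bytes_per_sample
# bytes_per_sample defaults to 2, the upper end of the documented 1-2 range.
max_samples_per_second() {
  storage_gib=$1
  retention_days=$2
  bytes_per_sample=${3:-2}
  storage_bytes=$((storage_gib * 1024 * 1024 * 1024))
  retention_seconds=$((retention_days * 24 * 60 * 60))
  echo $((storage_bytes / (retention_seconds * bytes_per_sample)))
}

# Local-mode defaults: 50Gi over 30 days
max_samples_per_second 50 30   # prints 10356
# Remote-mode defaults: 10Gi over 7 days
max_samples_per_second 10 7    # prints 8876
```

If your actual ingestion rate is well above the estimate, either raise storage_size or shorten retention.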

Filtering Metrics (Remote Mode)

Reduce costs by filtering metrics sent to remote endpoints:

monitoring:
  enabled: true
  mode: remote
  remote:
    provider: grafana-cloud
    prometheus_write:
      url: https://prometheus-us-central1.grafana.net/api/prom/push
      username: "123456"
      password_from: env:MONITORING_PASSWORD
      write_relabel_configs:
        - source_labels: [__name__]
          regex: "kubernetes_.*|node_.*|up|traefik_.*"
          action: keep

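This example forwards only metrics whose names start with kubernetes_, node_, or traefik_, plus the up metric. All other metrics are dropped before they are sent.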
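
Before deploying a filter, it can help to preview which metric names it keeps. The sketch below approximates the keep rule with grep. The -x flag anchors the pattern to the whole name, which mirrors how Prometheus fully anchors relabel regexes. Prometheus uses RE2 rather than POSIX extended regex, but the two behave the same for this pattern. The sample names are illustrative, not a list of metrics Rulebricks emits:

```shell
# Preview which metric names the keep rule above would forward.
# grep -E: extended regex; -x: the pattern must match the whole line.
keep_filter() {
  grep -Ex 'kubernetes_.*|node_.*|up|traefik_.*'
}

# Illustrative metric names, one per line.
printf '%s\n' \
  node_cpu_seconds_total \
  up \
  uptime_seconds \
  traefik_entrypoint_requests_total \
  kube_pod_info |
  keep_filter
```

Only node_cpu_seconds_total, up, and traefik_entrypoint_requests_total are printed. Note that kube-state-metrics series are named kube_*, so a pattern of kubernetes_.* does not keep them. Extend the regex if you need those series.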

Logging Overview

Rulebricks uses Vector for centralized log collection and forwarding. Logs are collected from all components and can be forwarded to various destinations.

Enabling Logging

Enable centralized logging:

logging:
  enabled: true
  vector:
    sink:
      type: console  # Default: output to stdout

Log Sink Types

API Key/Token Based (No IAM Required)

These sinks don't require cloud provider IAM setup:

Elasticsearch

logging:
  enabled: true
  vector:
    sink:
      type: elasticsearch
      endpoint: "https://elastic.example.com:9200"
      api_key: env:ELASTIC_API_KEY
      config:
        index: "rulebricks-logs"
        auth_user: "elastic"  # Optional

Datadog

logging:
  enabled: true
  vector:
    sink:
      type: datadog_logs
      api_key: env:DATADOG_API_KEY
      config:
        site: "datadoghq.com"  # or "datadoghq.eu"

Splunk HEC

logging:
  enabled: true
  vector:
    sink:
      type: splunk_hec
      endpoint: "https://splunk.example.com:8088"
      api_key: env:SPLUNK_HEC_TOKEN
      config:
        index: "main"

New Relic Logs

logging:
  enabled: true
  vector:
    sink:
      type: new_relic_logs
      api_key: env:NEW_RELIC_LICENSE_KEY
      config:
        region: "US"  # or "EU"
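
Each of the sinks above takes its key from an env:NAME reference. To see which variables a configuration expects, the references can be extracted from the file. This is a minimal sketch. The env_refs helper is hypothetical, and the config file name in the comment is an assumption:

```shell
# List every variable name referenced as env:NAME in a config read from stdin.
env_refs() {
  grep -o 'env:[A-Za-z_][A-Za-z0-9_]*' | sed 's/^env://' | sort -u
}

# Example with an inline fragment; in practice pipe your config file in,
# e.g. env_refs < rulebricks.yaml (the file name is an assumption).
printf '%s\n' \
  'api_key: env:DATADOG_API_KEY' \
  'password_from: env:MONITORING_PASSWORD' |
  env_refs
```

The output is one variable name per line, deduplicated and sorted.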

Cloud Storage (IAM Required)

These sinks require cloud provider IAM setup. See Vector Logging Setup for IAM configuration.

AWS S3

logging:
  enabled: true
  vector:
    sink:
      type: aws_s3
      config:
        bucket: "my-logs-bucket"
        region: "us-east-1"
        setup_iam: true  # Enable automatic IAM setup prompt

After deployment, run:

rulebricks vector setup-s3

Google Cloud Storage

logging:
  enabled: true
  vector:
    sink:
      type: gcp_cloud_storage
      config:
        bucket: "my-gcs-bucket"
        use_workload_identity: true
        setup_iam: true

After deployment, run:

rulebricks vector setup-gcs

Azure Blob Storage

logging:
  enabled: true
  vector:
    sink:
      type: azure_blob
      config:
        container_name: "logs"
        storage_account: "mylogs"
        use_managed_identity: true
        setup_iam: true

After deployment, run:

rulebricks vector setup-azure

Other Sinks

Loki

logging:
  enabled: true
  vector:
    sink:
      type: loki
      endpoint: "http://loki.example.com:3100"

HTTP Endpoint

logging:
  enabled: true
  vector:
    sink:
      type: http
      endpoint: "https://logs.example.com/ingest"
      config:
        auth_header: "Bearer <token>"

Console (Default)

logging:
  enabled: true
  vector:
    sink:
      type: console

Outputs logs to stdout (viewable via rulebricks logs).

Log Configuration

Note: The monitoring.logs configuration section exists in the schema but is not currently implemented. Log levels and retention are managed by Vector and the logging sink configuration.

Viewing Logs

Using the CLI

View logs from any component:

# View app logs
rulebricks logs app
 
# Follow logs in real-time
rulebricks logs app -f
 
# View last 500 lines
rulebricks logs app --tail 500
 
# View all components
rulebricks logs all -f

Available Components

app - Main Rulebricks application
hps - HPS service (rule processing)
workers - Worker pods
database - PostgreSQL database
supabase - All Supabase services
traefik - Ingress controller
prometheus - Metrics collection
grafana - Monitoring dashboards
all - Combined logs from all components

Using kubectl

You can also use kubectl directly:

# List pods
kubectl get pods --all-namespaces
 
# View logs
kubectl logs <pod-name> -n <namespace> -f
 
# View logs from all pods matching a label selector
kubectl logs -l app=rulebricks-app -n <namespace> -f
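
When following logs from several pods, it is often easier to look only at problem lines. The helper below is a small sketch that filters a log stream for common severity keywords. The keywords are an assumption about the log format, so adjust them to match what your components emit:

```shell
# Keep only lines that mention a warning or error, case-insensitively.
# The level keywords are an assumption; adjust them to the log format you see.
problems_only() {
  grep -iE 'error|warn|fatal'
}

# Usage (requires cluster access, so shown as a comment):
#   kubectl logs -l app=rulebricks-app -n <namespace> --since=1h | problems_only
```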

Monitoring Best Practices

Production: Use remote mode with external monitoring
Development: Use local mode for full visibility
Cost Optimization: Filter metrics in remote mode
Retention: In local mode, adjust monitoring.local.retention and storage_size together so the volume can hold the retained data
Alerts: Set up alerts in your monitoring system

Logging Best Practices

Use cloud storage for production (S3, GCS, Azure Blob)
Set up IAM properly for cloud storage sinks
Use API key sinks for simplicity (Elasticsearch, Datadog)
Monitor log volume to control costs
Retain logs appropriately based on compliance needs

Troubleshooting

Prometheus Not Collecting Metrics

Check Prometheus pod status: kubectl get pods -n <monitoring-namespace>
Review Prometheus logs: rulebricks logs prometheus
Verify service discovery: Check that scrape targets are healthy, for example by querying the up metric in Grafana or on the Prometheus targets page

Remote Write Failing

Verify endpoint URL is correct
Check authentication credentials
Review network connectivity
Check Prometheus logs for errors
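
The first three checks can be narrowed down with a probe request from a machine that has the same network access as the cluster. The sketch below sends an empty POST with curl and maps the HTTP status to the checklist item to revisit. Both function names are hypothetical. An empty body is not a valid remote-write payload, so even a healthy endpoint may reject it. The goal is only to tell network, URL, and credential problems apart:

```shell
# Map the HTTP status from a probe request to the checklist item to revisit.
classify_status() {
  case "$1" in
    000)     echo "no response: check network connectivity, DNS, and TLS" ;;
    401|403) echo "rejected: check authentication credentials" ;;
    404)     echo "not found: check the endpoint URL" ;;
    *)       echo "endpoint responded with HTTP $1: check Prometheus logs for details" ;;
  esac
}

# Send an empty POST with basic auth and print only the status code.
# curl reports 000 when no HTTP response was received.
probe_remote_write() {
  url=$1
  user=$2
  password=$3
  curl -sS -o /dev/null -m 10 -w '%{http_code}' -u "$user:$password" -X POST "$url"
}

# Usage (values from the Grafana Cloud example, shown as a comment):
#   classify_status "$(probe_remote_write "$URL" 123456 "$MONITORING_PASSWORD")"
```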

Logs Not Appearing in Sink

Verify Vector pod is running: kubectl get pods -n <logging-namespace>
Check Vector logs: rulebricks logs vector
Verify sink configuration
For cloud storage: Ensure IAM is configured correctly

High Log Volume

Review log levels (reduce from debug to info)
Filter logs at the Vector level
Consider log sampling for high-volume components
Review retention policies
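
To illustrate what 1-in-N log sampling does, the sketch below keeps the first of every N lines from a stream. Vector provides a sample transform for this purpose. How to enable Vector transforms through the Rulebricks configuration is not covered in this guide, so treat this as a conceptual illustration only:

```shell
# Keep the first of every N lines read from stdin (N=1 keeps everything).
sample_every() {
  awk -v n="$1" '(NR - 1) % n == 0'
}

# Ten input lines sampled at 1-in-4 leave lines 1, 5, and 9.
seq 1 10 | sample_every 4
```

Sampling trades completeness for volume, so it is usually applied only to high-volume, low-severity logs.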

Next Steps

Set up cloud storage logging: See Vector Logging Setup
Configure alerts: Set up alerts in your monitoring system
Customize dashboards: Create custom Grafana dashboards (local mode)
Review metrics: Understand what metrics are available
