Monitoring & Logging
This guide covers configuring monitoring and centralized logging for your Rulebricks deployment.
Monitoring Overview
Rulebricks supports flexible monitoring configurations:
- Local Mode: Full Prometheus and Grafana stack in your cluster
- Remote Mode: Minimal Prometheus that forwards to external monitoring
- Disabled: No monitoring infrastructure (use alternative solutions)
Monitoring Modes
Local Mode (Default)
Deploys a complete monitoring stack in your cluster:
- Prometheus: Metrics collection and storage
- Grafana: Visualization and dashboards
- Retention: 30 days (configurable)
- Storage: 50Gi (configurable)
Access:
- Grafana: https://grafana.your-domain.com
- Default credentials: Check your deployment notes or secrets
Best for:
- Development environments
- Isolated deployments
- Full control over monitoring
Configuration:
monitoring:
  enabled: true
  mode: local
  local:
    prometheus_enabled: true
    grafana_enabled: true
    retention: "30d"
    storage_size: "50Gi"
Remote Mode
Deploys minimal Prometheus that forwards metrics to external systems:
- Prometheus: Lightweight deployment (7-day retention, 10Gi storage)
- Remote Write: Forwards all metrics to external endpoint
- No Grafana: Use your existing monitoring dashboards
Best for:
- Production environments
- Existing monitoring infrastructure
- Cost optimization (minimal local storage)
Supported Providers:
- Grafana Cloud: Full Prometheus remote write support
- New Relic: Native Prometheus integration
- Generic Prometheus: Any Prometheus-compatible endpoint
- Custom: Your own remote write endpoint
Configuration Example (Grafana Cloud):
monitoring:
  enabled: true
  mode: remote
  remote:
    provider: grafana-cloud
    prometheus_write:
      url: https://prometheus-us-central1.grafana.net/api/prom/push
      username: "123456"
      password_from: env:MONITORING_PASSWORD
Configuration Example (New Relic):
monitoring:
  enabled: true
  mode: remote
  remote:
    provider: newrelic
    newrelic:
      license_key_from: env:NEWRELIC_LICENSE_KEY
      region: "US" # or "EU"
Configuration Example (Custom Prometheus):
monitoring:
  enabled: true
  mode: remote
  remote:
    provider: prometheus
    prometheus_write:
      url: https://prometheus.example.com/api/v1/write
      bearer_token_from: env:PROMETHEUS_TOKEN
Disabled
No monitoring infrastructure deployed:
monitoring:
  enabled: false
Use this if you have an alternative monitoring solution or don't need monitoring.
Metrics Configuration
The monitoring.metrics configuration section exists in the schema but is not fully implemented. Retention is configured via monitoring.local.retention for local mode, and defaults are used for remote mode.
For local mode, configure retention:
monitoring:
  enabled: true
  mode: local
  local:
    retention: "30d" # Metrics retention period
    storage_size: "50Gi"
The monitoring.metrics.interval field (scrape interval) is defined in the schema but not currently used; Prometheus uses its default scrape interval.
Filtering Metrics (Remote Mode)
Reduce costs by filtering metrics sent to remote endpoints:
monitoring:
  enabled: true
  mode: remote
  remote:
    provider: grafana-cloud
    prometheus_write:
      url: https://prometheus-us-central1.grafana.net/api/prom/push
      username: "123456"
      password_from: env:MONITORING_PASSWORD
      write_relabel_configs:
        - source_labels: [__name__]
          regex: "kubernetes_.*|node_.*|up|traefik_.*"
          action: keep
This example sends only metrics whose names start with kubernetes_, node_, or traefik_, plus the up metric. All other metrics are dropped before they leave the cluster.
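Where an allow-list is too restrictive, the inverse approach may fit better: forward everything except a few noisy metric families. The sketch below reuses the keys and nesting from the example above and assumes that write_relabel_configs entries are handed to Prometheus unchanged, so the standard drop action is available. The go_.* and process_.* patterns are illustrative choices, not Rulebricks defaults.

```yaml
monitoring:
  enabled: true
  mode: remote
  remote:
    provider: grafana-cloud
    prometheus_write:
      url: https://prometheus-us-central1.grafana.net/api/prom/push
      username: "123456"
      password_from: env:MONITORING_PASSWORD
      write_relabel_configs:
        # Assumption: these entries are passed through to Prometheus as-is.
        - source_labels: [__name__]
          regex: "go_.*|process_.*" # illustrative runtime metrics to drop
          action: drop
```

Prometheus anchors relabel regexes at both ends, so go_.* matches only metric names that begin with go_, and a bare up matches only the metric named up.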
Logging Overview
Rulebricks uses Vector for centralized log collection and forwarding. Logs are collected from all components and can be forwarded to various destinations.
Enabling Logging
Enable centralized logging:
logging:
  enabled: true
  vector:
    sink:
      type: console # Default: output to stdout
Log Sink Types
API Key/Token Based (No IAM Required)
These sinks don't require cloud provider IAM setup:
Elasticsearch
logging:
  enabled: true
  vector:
    sink:
      type: elasticsearch
      endpoint: "https://elastic.example.com:9200"
      api_key: env:ELASTIC_API_KEY
      config:
        index: "rulebricks-logs"
        auth_user: "elastic" # Optional
Datadog
logging:
  enabled: true
  vector:
    sink:
      type: datadog_logs
      api_key: env:DATADOG_API_KEY
      config:
        site: "datadoghq.com" # or "datadoghq.eu"
Splunk HEC
logging:
  enabled: true
  vector:
    sink:
      type: splunk_hec
      endpoint: "https://splunk.example.com:8088"
      api_key: env:SPLUNK_HEC_TOKEN
      config:
        index: "main"
New Relic Logs
logging:
  enabled: true
  vector:
    sink:
      type: new_relic_logs
      api_key: env:NEW_RELIC_LICENSE_KEY
      config:
        region: "US" # or "EU"
Cloud Storage (IAM Required)
These sinks require cloud provider IAM setup. See Vector Logging Setup for IAM configuration.
AWS S3
logging:
  enabled: true
  vector:
    sink:
      type: aws_s3
      config:
        bucket: "my-logs-bucket"
        region: "us-east-1"
        setup_iam: true # Enable automatic IAM setup prompt
After deployment, run:
rulebricks vector setup-s3
Google Cloud Storage
logging:
  enabled: true
  vector:
    sink:
      type: gcp_cloud_storage
      config:
        bucket: "my-gcs-bucket"
        use_workload_identity: true
        setup_iam: true
After deployment, run:
rulebricks vector setup-gcs
Azure Blob Storage
logging:
  enabled: true
  vector:
    sink:
      type: azure_blob
      config:
        container_name: "logs"
        storage_account: "mylogs"
        use_managed_identity: true
        setup_iam: true
After deployment, run:
rulebricks vector setup-azure
Other Sinks
Loki
logging:
  enabled: true
  vector:
    sink:
      type: loki
      endpoint: "http://loki.example.com:3100"
HTTP Endpoint
logging:
  enabled: true
  vector:
    sink:
      type: http
      endpoint: "https://logs.example.com/ingest"
      config:
        auth_header: "Bearer <token>"
Console (Default)
logging:
  enabled: true
  vector:
    sink:
      type: console
Outputs logs to stdout (viewable via rulebricks logs).
Log Configuration
The monitoring.logs configuration section exists in the schema but is not currently implemented. Log levels and retention are managed by Vector and the logging sink configuration.
Viewing Logs
Using the CLI
View logs from any component:
# View app logs
rulebricks logs app
# Follow logs in real-time
rulebricks logs app -f
# View last 500 lines
rulebricks logs app --tail 500
# View all components
rulebricks logs all -f
Available Components
- app: Main Rulebricks application
- hps: HPS service (rule processing)
- workers: Worker pods
- database: PostgreSQL database
- supabase: All Supabase services
- traefik: Ingress controller
- prometheus: Metrics collection
- grafana: Monitoring dashboards
- all: Combined logs from all components
Using kubectl
You can also use kubectl directly:
# List pods
kubectl get pods --all-namespaces
# View logs
kubectl logs <pod-name> -n <namespace> -f
# View logs from all pods in a deployment
kubectl logs -l app=rulebricks-app -n <namespace> -f
Monitoring Best Practices
- Production: Use remote mode with external monitoring
- Development: Use local mode for full visibility
- Cost Optimization: Filter metrics in remote mode
- Retention: Adjust retention based on your needs
- Alerts: Set up alerts in your monitoring system
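As one possible starting point for the alerting item, here is a minimal sketch of a standard Prometheus alerting rule file that fires when a scrape target stops responding. It uses upstream Prometheus rule syntax rather than a Rulebricks setting. The group name, alert name, 5m window, and severity label are illustrative, and how rule files are loaded into the bundled Prometheus or your remote system depends on your deployment.

```yaml
groups:
  - name: rulebricks-availability # hypothetical group name
    rules:
      - alert: TargetDown # hypothetical alert name
        expr: up == 0 # `up` is 0 when Prometheus fails to scrape a target
        for: 5m # fire only if the condition holds for 5 minutes
        labels:
          severity: critical
        annotations:
          summary: "Scrape target {{ $labels.job }}/{{ $labels.instance }} is down"
```

In remote mode, a rule like this only works if the up metric is among those forwarded, so keep it in any write_relabel_configs allow-list.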
Logging Best Practices
- Use cloud storage for production (S3, GCS, Azure Blob)
- Set up IAM properly for cloud storage sinks
- Use API key sinks for simplicity (Elasticsearch, Datadog)
- Monitor log volume to control costs
- Retain logs appropriately based on compliance needs
Troubleshooting
Prometheus Not Collecting Metrics
- Check Prometheus pod status: kubectl get pods -n <monitoring-namespace>
- Review Prometheus logs: rulebricks logs prometheus
- Verify service discovery: Check Prometheus targets in Grafana
Remote Write Failing
- Verify endpoint URL is correct
- Check authentication credentials
- Review network connectivity
- Check Prometheus logs for errors
Logs Not Appearing in Sink
- Verify Vector pod is running: kubectl get pods -n <logging-namespace>
- Check Vector logs: rulebricks logs vector
- Verify sink configuration
- For cloud storage: Ensure IAM is configured correctly
High Log Volume
- Review log levels (reduce from debug to info)
- Filter logs at the Vector level
- Consider log sampling for high-volume components
- Review retention policies
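To illustrate what filtering and sampling at the Vector level can look like, the sketch below uses upstream Vector's filter and sample transforms. This guide does not say whether Rulebricks exposes raw Vector transforms, so treat this as a fragment of a plain Vector configuration, not a Rulebricks setting. The source ID app_logs, the transform names, and the presence of a level field on each event are assumptions.

```yaml
transforms:
  drop_debug:
    type: filter
    inputs: ["app_logs"] # hypothetical source ID
    condition:
      type: vrl
      source: '.level != "debug"' # assumes events carry a `level` field
  sample_remaining:
    type: sample
    inputs: ["drop_debug"]
    rate: 10 # forward roughly 1 in every 10 events
```

Events without a level field pass the filter, since the comparison is only false when the field equals "debug". A sink would then read from sample_remaining instead of the original source.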
Next Steps
- Set up cloud storage logging: See Vector Logging Setup
- Configure alerts: Set up alerts in your monitoring system
- Customize dashboards: Create custom Grafana dashboards (local mode)
- Review metrics: Understand what metrics are available