Troubleshooting

Common diagnostics for private deployments, roughly in the order problems tend to appear.

Migration Jobs

The chart runs a migration job on every install and upgrade. Failures here are usually caused by incorrect values or, for managed Supabase, a project that doesn't exist yet.

# Check job status
kubectl get jobs -n rulebricks
 
# Get detailed error
kubectl describe job rulebricks-db-migrate-1 -n rulebricks
 
# Check pod logs (self-hosted Supabase)
kubectl logs job/rulebricks-db-migrate-1 -n rulebricks --all-containers
 
# Check pod logs (managed Supabase)
kubectl logs job/rulebricks-managed-supabase-setup-1 -n rulebricks

Common causes:

  • Database not ready (increase readiness wait)
  • Invalid credentials
  • Network policy blocking access
  • Supabase Cloud project not created yet, or wrong access token
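To separate a job that is merely slow from one that is failing, it can help to wait on it with a timeout and print its logs only if it does not complete. A minimal sketch, assuming the self-hosted job name shown above; the helper name `check_migration_job` and the 120s default timeout are illustrative choices, not part of the chart:

```shell
# Sketch: wait for a migration job, print recent logs if it does not complete.
# Helper name and default timeout are assumptions, not chart features.
check_migration_job() {
  job="${1:-rulebricks-db-migrate-1}"
  ns="${2:-rulebricks}"
  timeout="${3:-120s}"
  if kubectl wait --for=condition=complete "job/${job}" -n "${ns}" --timeout="${timeout}"; then
    echo "OK: ${job} completed"
  else
    echo "FAILED: ${job} did not complete within ${timeout}" >&2
    # Show the last lines from every container in the job's pod
    kubectl logs "job/${job}" -n "${ns}" --all-containers --tail=50
    return 1
  fi
}
```

Call it as `check_migration_job` for the self-hosted job, or pass a job name such as `rulebricks-managed-supabase-setup-1` as the first argument for managed Supabase.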

To inspect which migrations have applied on a self-hosted database:

kubectl exec -it deploy/rulebricks-supabase-db -n rulebricks -- \
  psql -U postgres -c "SELECT * FROM schema_migrations ORDER BY applied_at;"

TLS Certificates

# Check cert-manager logs
kubectl logs -n cert-manager -l app=cert-manager
 
# Check certificate status
kubectl get certificates -n rulebricks
kubectl describe certificate rulebricks-tls -n rulebricks
 
# Check ClusterIssuer
kubectl describe clusterissuer rulebricks-letsencrypt

The most common cause of certificate failures is enabling global.tlsEnabled before DNS records resolve to the cluster's load balancer. Let's Encrypt must be able to reach your domain to validate it before it issues a certificate.
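One way to confirm the DNS side is to print what DNS returns for your domain next to the address assigned to the load balancer Service. This is a rough sketch: the helper name `compare_dns_to_lb` is hypothetical, the Service name and namespace are passed as arguments because they depend on your ingress setup, and it assumes `dig` is installed locally.

```shell
# Sketch: show the DNS answer for a domain beside the load balancer address.
# The Service name and namespace are assumptions you must supply yourself.
compare_dns_to_lb() {
  domain="$1"
  svc="$2"
  ns="$3"
  # Cloud providers publish either an IP or a hostname; print whichever is set
  lb=$(kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')
  dns=$(dig +short "$domain" | tr '\n' ' ')
  echo "load balancer: ${lb:-<none assigned>}"
  echo "dns answer:    ${dns:-<no record>}"
}
```

If the two lines do not agree (or the DNS answer is empty), wait for the record to propagate before enabling TLS. When the load balancer is published as a hostname, the DNS answer will typically show a CNAME target followed by addresses rather than an exact match.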

Workers Not Scaling

# Check KEDA
kubectl get scaledobject -n rulebricks
kubectl describe scaledobject rulebricks-hps-workers -n rulebricks
 
# Check Kafka consumer lag
kubectl exec -it rulebricks-kafka-0 -n rulebricks -- \
  kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group generic-workers

Common causes:

  • Kafka consumer group not found
  • KEDA unable to reach Kafka
  • Incorrect threshold configuration
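The ScaledObject's status conditions can help narrow down which of these applies. The sketch below reads the `Ready` and `Active` conditions that KEDA records on a ScaledObject; the helper name `scaledobject_status` is illustrative, and the interpretation that follows is a rule of thumb rather than a definitive diagnosis.

```shell
# Sketch: print the Ready and Active conditions of a KEDA ScaledObject.
# Defaults match the ScaledObject name and namespace used above.
scaledobject_status() {
  so="${1:-rulebricks-hps-workers}"
  ns="${2:-rulebricks}"
  for cond in Ready Active; do
    status=$(kubectl get scaledobject "$so" -n "$ns" \
      -o jsonpath="{.status.conditions[?(@.type==\"$cond\")].status}")
    echo "$cond=${status:-unknown}"
  done
}
```

As a rough guide, `Ready=False` usually points at a trigger KEDA cannot set up (for example, Kafka being unreachable), while `Ready=True` with `Active=False` suggests the scaler sees no lag above its threshold. KEDA also manages an HPA for each ScaledObject, by default named `keda-hpa-<scaledobject-name>`, which you can inspect with `kubectl get hpa -n rulebricks`.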

For the sizing model behind worker scaling (and why partition counts matter), see Performance & Scaling.

External Kafka or Redis Problems

If you've pointed the chart at your own Kafka or Redis and solves are failing or timing out, work through the verification checklist, which covers the HPS health endpoints, an end-to-end smoke test, and a table of common failure modes.

Log Collection

# All pods in namespace
kubectl logs -n rulebricks -l app.kubernetes.io/instance=rulebricks --all-containers
 
# Specific component with follow
kubectl logs -n rulebricks -l app.kubernetes.io/component=hps-worker -f
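Note that when a label selector is used, `kubectl logs` prints only the last 10 lines per container by default and follows at most 5 log streams at once. To capture a fuller snapshot in a file, something along these lines may help; the helper name `collect_logs`, the one-hour window, and the limit of 20 concurrent requests are assumptions to adjust for your deployment:

```shell
# Sketch: save recent logs from all Rulebricks pods to a single file.
# Window and request limit are arbitrary defaults, not chart settings.
collect_logs() {
  out="${1:-rulebricks-logs.txt}"
  since="${2:-1h}"
  # --prefix tags each line with its pod and container; --tail=-1 lifts the 10-line default
  kubectl logs -n rulebricks -l app.kubernetes.io/instance=rulebricks \
    --all-containers --prefix --timestamps \
    --since="$since" --tail=-1 --max-log-requests=20 > "$out"
  echo "wrote $(wc -l < "$out" | tr -d ' ') lines to $out"
}
```

For example, `collect_logs hps-debug.txt 30m` writes the last 30 minutes of logs to `hps-debug.txt`.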

Clean Reinstall

As a last resort, fully remove the deployment and start over:

# Full cleanup including data
helm uninstall rulebricks -n rulebricks
kubectl delete pvc --all -n rulebricks
kubectl delete namespace rulebricks
 
# Reinstall
helm install rulebricks oci://ghcr.io/rulebricks/helm/stack \
  --namespace rulebricks \
  --create-namespace \
  -f your-values.yaml

⚠️ Warning: Deleting PVCs destroys the self-hosted database and all application state. Make sure you have a backup before a clean reinstall on anything but a fresh evaluation.
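For a self-hosted database, one possible way to take a logical backup before the cleanup is to run `pg_dumpall` inside the database pod. This is a sketch under assumptions: it reuses the `rulebricks-supabase-db` deployment and `postgres` user from the migration query above, assumes `pg_dumpall` is available in that container, and covers only the database, not any other persistent volumes. The helper name `backup_database` is illustrative.

```shell
# Sketch: dump the self-hosted database to a local SQL file before cleanup.
# Assumes pg_dumpall exists in the DB container and the postgres user can connect.
backup_database() {
  out="${1:-rulebricks-backup.sql}"
  # No -it here: a TTY would interfere with redirecting the dump to a file
  if kubectl exec deploy/rulebricks-supabase-db -n rulebricks -- \
      pg_dumpall -U postgres > "$out" && [ -s "$out" ]; then
    echo "backup written to $out"
  else
    echo "backup failed; do not delete PVCs yet" >&2
    return 1
  fi
}
```

Run `backup_database` and check that the resulting file looks complete before running the uninstall commands.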

Still Stuck?

Email support@rulebricks.com with the failing component's logs, or open an issue on the Helm or CLI repository.