Berserk Docs

Self-Hosting Berserk

The configuration surface of a self-hosted Berserk cluster — secrets, user integration, and network ingress — with links to the detailed guides.

Berserk is self-hosted by deploying its Helm chart into your own Kubernetes cluster. You bring two external dependencies — a PostgreSQL database and an S3-compatible object store — and Berserk runs everything else in-cluster. This page is the map of what you configure and where: it orients you across the three areas every self-hosted install has to get right, and links to the detailed guide for each.

For a linear, copy-paste walk-through that takes a fresh cluster to its first query, start with Creating a Berserk Cluster. Come back here when you want to understand the choices behind each step.

Architecture at a glance

Exactly two services take traffic from outside the cluster: the gateway (users, CLI, UI) and ingest (telemetry from your collectors). Everything else is reachable only in-cluster.

You bring (your infrastructure):

  • A Kubernetes cluster, with an ingress controller plus DNS/TLS for the two public edges.
  • PostgreSQL 17 or 18 — metadata storage.
  • S3-compatible object storage (AWS S3, GCS, Cloudflare R2, MinIO, …) — segment data and the source of truth for all telemetry.
  • Optional: an OIDC identity provider or an authenticating reverse proxy if you want SSO rather than local accounts.

Berserk runs (in your cluster, via Helm): the gateway, meta, query, ingest, nursery, janitor, ui (data-plane + workspace admin), and permissions services. See Cluster Admin for the full service and port map.

Secrets

Berserk reads every credential from a Kubernetes Secret. For each one you decide whether Helm creates and manages it (managed: true) or you create it out-of-band (managed: false) and the chart only references it by name.

Use Helm-managed secrets (managed: true) when you run helm install/upgrade directly. Under Argo CD or other GitOps tooling, create the Secrets externally (managed: false) — the managed path needs to read whether a Secret already exists, which Argo's apply model doesn't support. External secrets are also the integration point for Vault, the External Secrets Operator, or SOPS.

The secrets the chart uses:

  • postgres-credentials and s3-credentials — your database connection string and object-store keys. Always required (S3 keys can be skipped on EKS, where IRSA supplies them). External by default.
  • gateway-secrets — the cookie signing key and internal service tokens. Helm-managed and auto-generated by default, and preserved across upgrades, so most installs never touch it.
  • ingest-token — the default ingest token. Managed by default: an init container mints it via Meta before ingest starts.
  • auth-postgres-credentials — only if you run the auth stack against a separate database (otherwise it shares postgres-credentials).
  • gateway-oidc-secrets — only with OIDC. Always external — the client_secret comes from your IdP, so the chart never generates it.
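
As a rough sketch of the external (managed: false) path, Secrets created outside Helm could look like the manifests below. The Secret names come from the list above. The berserk namespace, the key names (connection-string, access-key-id, secret-access-key), and all values are placeholders assumed for illustration, so check Dependencies for the keys the chart actually reads.

```yaml
# Illustrative only: namespace and key names are assumptions,
# not the chart's documented schema.
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials      # name taken from the list above
  namespace: berserk              # hypothetical namespace
type: Opaque
stringData:
  # Hypothetical key name holding the database connection string.
  connection-string: "postgres://USER:PASSWORD@HOST:5432/DBNAME"
---
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials            # name taken from the list above
  namespace: berserk              # hypothetical namespace
type: Opaque
stringData:
  access-key-id: "REPLACE_ME"     # hypothetical key name
  secret-access-key: "REPLACE_ME" # hypothetical key name
```

The same shape applies whether the manifest is applied by hand or produced by Vault, the External Secrets Operator, or SOPS: the chart only needs the Secret to exist under the expected name.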

See Dependencies for the full table, the Helm-managed vs. external example for each secret, and a single full-install command.

User integration

Berserk ships with no default credentials. The gateway owns the auth edge — it provisions the first admin, runs sign-in, and injects the authenticated identity into the downstream UI and services. You choose how identities enter the system; all four paths are config on the gateway service.

  • One-time setup link (default) — on first boot with an empty user table, the gateway prints a single-use /setup?token=… URL to its container logs. Open it, create the admin in the browser. Best for getting started and single-tenant installs.
  • Pre-provisioned admin — seed the first admin from a value or secret (bootstrap_admin_email + bootstrap_admin_password) for fully unattended deploys.
  • OIDC SSO (standalone) — point the gateway at your identity provider (Okta, Entra ID, Google Workspace, Keycloak, …). Admins are promoted from an email allowlist; set bootstrap_disabled: true to make the deployment OIDC-only.
  • Trusted reverse proxy — when an authenticating proxy already fronts the gateway, it asserts the user via either a direct email header (e.g. X-USER-HEADER) or SPIFFE on-behalf-of (on-behalf-of: spiffe://<trust-domain>/user/<name> — the gateway strips the configured prefix and appends an email domain). No secret material is involved; identity is trusted on network grounds, so the gateway must be reachable only through the proxy.

By default the assertion-based methods (OIDC and trusted-proxy) auto-provision users on first sign-in. Set auto_provision_users: false for directory mode, where only users that already exist may sign in and you manage the list yourself.
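
A minimal sketch of how these options might be combined for an unattended install in directory mode. The option names are the ones quoted on this page. Their placement in the values file (shown flat here) and the example values are assumptions, and UI First-Boot Setup remains the reference for the real layout.

```yaml
# Illustrative only: option names are quoted from this page,
# but where they nest in the Helm values is assumed.

# Pre-provisioned admin for a fully unattended deploy.
bootstrap_admin_email: "admin@example.com"   # placeholder address
bootstrap_admin_password: "REPLACE_ME"       # placeholder; can come from a secret instead

# Directory mode: OIDC and trusted-proxy sign-ins succeed only
# for users that already exist.
auto_provision_users: false
```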

See UI First-Boot Setup for the full walkthrough of all four paths. The supporting secrets and Helm values live in Dependencies → OIDC Client Secret and Dependencies → Trusted-Proxy Authentication.

Networking and ingress

The model is built around two — and only two — public edges. Treat them differently: the gateway is an HTTP edge, ingest is a gRPC/HTTP edge.

Gateway: the HTTP edge

The gateway is the single public entry point for people, the CLI, and the web UI. Front it with a normal HTTP/HTTPS ingress (or a LoadBalancer Service).

It terminates user traffic — serving the API over HTTP+JSON, with SSE for streaming responses — and translates that into the protocol each backend speaks: gRPC to meta/query/ingest, plain HTTP to ui. Because that translation happens inside the gateway, the edge in front of it stays a plain HTTP ingress.

Do not set a gRPC backend-protocol on the gateway's Ingress. The gateway serves clients over HTTP+JSON and SSE — it is a plain HTTP service, not gRPC — so configuring the ingress to treat it as a gRPC backend will break it. gRPC backend-protocol belongs on the ingest ingress (below), not the gateway's.

Streaming responses use SSE, so the gateway's ingress must not buffer responses — if yours does (e.g. nginx proxy-buffering), turn it off so events reach the client as they're produced rather than only when the stream closes.
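
For ingress-nginx, the points above might translate into an Ingress like the following sketch. The annotation is a standard ingress-nginx one. The Ingress name, host, Service name, and port name are hypothetical placeholders, not values taken from the chart.

```yaml
# Illustrative only: names, host, and port are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: berserk-gateway                 # hypothetical
  annotations:
    # Pass SSE events through as they are produced.
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    # Deliberately no backend-protocol annotation:
    # the gateway is a plain HTTP backend.
spec:
  ingressClassName: nginx
  rules:
    - host: berserk.example.com         # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gateway           # hypothetical Service name
                port:
                  name: http            # hypothetical port name
```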

Set global.publicBaseUrl to the public hostname — it's used to mint OIDC redirect URIs and the one-time setup link. Terminate TLS at your ingress (or via tls: on the gateway Ingress) and keep cookie_secure: true in production. The required gateway runtime config is summarized in UI First-Boot Setup → Common runtime config.
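
In a values file, the setting above would look roughly like this. The hostname is a placeholder, and writing the value with an https:// scheme is an assumption based on its use for redirect URIs.

```yaml
# Illustrative only: hostname is a placeholder, scheme-inclusive form is assumed.
global:
  publicBaseUrl: "https://berserk.example.com"
# cookie_secure belongs to the gateway runtime config; its location in the
# values file is not shown on this page, so it is left out of this sketch.
```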

Ingest: the OTLP edge

Your OpenTelemetry collectors send to ingest over OTLP on two ports:

Port   Protocol                Ingress requirement
4317   OTLP gRPC (preferred)   HTTP/2 end-to-end to the backend
4318   OTLP HTTP               Plain HTTP

OTLP gRPC on 4317 needs an ingress that speaks HTTP/2 to the backend:

  • nginx — annotate the Ingress nginx.ingress.kubernetes.io/backend-protocol: "GRPC".
  • Traefik — set traefik.ingress.kubernetes.io/service.serversscheme: h2c on the Service (not the Ingress — Traefik only honors it on the Service; placed on the Ingress it is ignored and every external request fails the HTTP/2 preface check with a 500). The bundled chart ships this on the ingest Service by default, so you normally don't add it yourself.
  • A single Ingress can fan out to both ports using per-path portName: otlp-grpc and portName: otlp-http.
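
Expressed as a raw Kubernetes Ingress, the fan-out in the last bullet could be sketched as below. The port names otlp-grpc and otlp-http come from this page. The Ingress name, host, and Service name are hypothetical, and the controller-specific HTTP/2 settings described above are omitted. The path split relies on the standard OTLP/HTTP paths (/v1/traces, /v1/metrics, /v1/logs); everything else falls through to the gRPC port.

```yaml
# Illustrative only: names and host are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: berserk-ingest                  # hypothetical
spec:
  rules:
    - host: ingest.example.com          # placeholder hostname
      http:
        paths:
          # OTLP/HTTP requests use the /v1/... paths.
          - path: /v1
            pathType: Prefix
            backend:
              service:
                name: ingest            # hypothetical Service name
                port:
                  name: otlp-http
          # Remaining traffic (OTLP/gRPC method paths) goes to 4317.
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ingest            # hypothetical Service name
                port:
                  name: otlp-grpc
```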

Also raise your ingress request body limit to the 16 MiB wire cap: nginx defaults to 1 MiB, so set nginx.ingress.kubernetes.io/proxy-body-size: "16m". The ingest endpoint is a separate host from the gateway; send telemetry there, not to the gateway/query endpoint. Full detail is in Ingestion → Protocols, Ingestion → Collector configuration, and Ingestion → Request Size Limits.
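
Putting the nginx settings for the gRPC port together, an ingest Ingress might look like this sketch. Both annotations are the ones named in this section. The Ingress name, host, and Service name are hypothetical placeholders.

```yaml
# Illustrative only: names and host are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: berserk-ingest-grpc             # hypothetical
  annotations:
    # Speak HTTP/2 (gRPC) to the ingest backend.
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    # Raise the 1 MiB default to the 16 MiB wire cap.
    nginx.ingress.kubernetes.io/proxy-body-size: "16m"
spec:
  ingressClassName: nginx
  rules:
    - host: ingest.example.com          # placeholder, separate from the gateway host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ingest            # hypothetical Service name
                port:
                  name: otlp-grpc       # port 4317
```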

Network policies

Optional Kubernetes NetworkPolicies lock every service down to its known peers, leaving only the gateway and ingest reachable from outside. Enable with global.networkPolicy.enabled: true (requires a NetworkPolicy-capable CNI), then configure external egress (S3 on 443, PostgreSQL on 5432) and, if your collector lives elsewhere, OTLP egress. See Network Policies.
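
In a values file, the switch named above would look roughly like this. Only the enabled flag is taken from this page; the egress settings for S3, PostgreSQL, and OTLP are documented in Network Policies and are intentionally not guessed at here.

```yaml
# Illustrative only: enables the chart's NetworkPolicies.
# Requires a NetworkPolicy-capable CNI.
global:
  networkPolicy:
    enabled: true
# External egress (S3 on 443, PostgreSQL on 5432, optional OTLP) still has to
# be configured; see Network Policies for the actual keys.
```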

Other configuration

  • Storage — services use local disk for the query segment cache and scratch space. S3 is the source of truth, so local disk is ephemeral: losing it means a cold cache, not data loss. Size the query cache to roughly a week of compressed data. → Storage.
  • Observability — export Berserk's own logs, metrics, and traces over OTLP to any collector, including back into Berserk itself. → Cluster Admin → Observability and Service Metrics.
  • Query library Git sync — mirror saved queries to a Git repo. This is configured at runtime in the admin UI, not via Helm. → Git Sync.

