Keycloak on Kubernetes: Production Deployment Guide

Guilliano Molaire · Updated May 30, 2026 · 9 min read

Last updated: March 2026

Running Keycloak in production on Kubernetes requires more than just deploying a container. You need to handle database connectivity, TLS termination, clustering, session replication, autoscaling, health checks, and graceful shutdowns. Getting any of these wrong leads to authentication outages, which affect every application that depends on Keycloak.

This guide provides a production-ready Kubernetes deployment for Keycloak, covering every component with complete YAML manifests. If you have already deployed Keycloak with ArgoCD, this guide builds on those concepts with production-hardening steps. See our ArgoCD deployment guide for GitOps-based delivery.

Architecture Overview

A production Keycloak deployment on Kubernetes consists of:

  • Keycloak pods: Multiple replicas running the Keycloak server
  • PostgreSQL: The backing database (managed or self-hosted)
  • Ingress controller: TLS termination and routing
  • Infinispan: Distributed cache for session replication between pods
  • Monitoring: Health checks, readiness probes, and metrics

The goal is a deployment that handles failover gracefully, scales with demand, and survives node failures without authentication downtime.

Namespace and Prerequisites

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: keycloak
  labels:
    app.kubernetes.io/part-of: identity

Create the namespace and required secrets:

kubectl apply -f namespace.yaml

# Create database credentials
kubectl create secret generic keycloak-db-credentials \
  --namespace keycloak \
  --from-literal=username=keycloak \
  --from-literal=password='your-strong-password'

# Create admin credentials
kubectl create secret generic keycloak-admin-credentials \
  --namespace keycloak \
  --from-literal=username=admin \
  --from-literal=password='your-admin-password'

# Create TLS certificate (or use cert-manager)
kubectl create secret tls keycloak-tls \
  --namespace keycloak \
  --cert=tls.crt \
  --key=tls.key

PostgreSQL with CloudNativePG

For production, use a PostgreSQL operator to manage your database. CloudNativePG is a widely adopted option:

# postgres-cluster.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-db
  namespace: keycloak
spec:
  instances: 3
  primaryUpdateStrategy: unsupervised

  storage:
    size: 20Gi
    storageClass: gp3

  postgresql:
    parameters:
      shared_buffers: "1GB"
      effective_cache_size: "3GB"
      work_mem: "16MB"
      maintenance_work_mem: "256MB"
      max_connections: "200"
      checkpoint_completion_target: "0.9"
      wal_buffers: "32MB"
      max_wal_size: "2GB"

  bootstrap:
    initdb:
      database: keycloak
      owner: keycloak
      secret:
        name: keycloak-db-credentials

  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "4Gi"
      cpu: "2"

  backup:
    barmanObjectStore:
      destinationPath: "s3://keycloak-backups/postgres/"
      s3Credentials:
        accessKeyId:
          name: aws-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-credentials
          key: SECRET_ACCESS_KEY
    retentionPolicy: "30d"

  monitoring:
    enablePodMonitor: true

This creates a 3-node PostgreSQL cluster with automated failover, backups, and monitoring. For detailed PostgreSQL tuning recommendations, see our Keycloak database tuning guide.
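
The backup section above tells CloudNativePG where to store backups, but base backups still need something to trigger them. One way to schedule them is a ScheduledBackup resource. This is a minimal sketch: the resource name keycloak-db-daily and the 02:00 schedule are assumptions, not part of the original setup.

```yaml
# scheduled-backup.yaml (hypothetical file name)
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: keycloak-db-daily        # assumed name
  namespace: keycloak
spec:
  # CloudNativePG uses a six-field cron expression, seconds first:
  # second minute hour day-of-month month day-of-week
  schedule: "0 0 2 * * *"        # assumed: every day at 02:00
  backupOwnerReference: self
  cluster:
    name: keycloak-db
```

Backups taken this way land in the same S3 path and fall under the same 30-day retention policy defined on the Cluster.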

Keycloak Deployment

Deployment vs. StatefulSet

Use a Deployment for Keycloak, not a StatefulSet. Keycloak pods are stateless — sessions are replicated through Infinispan, and persistent data lives in PostgreSQL. A Deployment gives you rolling updates and simpler scaling.

# keycloak-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak
  namespace: keycloak
  labels:
    app.kubernetes.io/name: keycloak
    app.kubernetes.io/component: server
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app.kubernetes.io/name: keycloak
  template:
    metadata:
      labels:
        app.kubernetes.io/name: keycloak
        app.kubernetes.io/component: server
    spec:
      serviceAccountName: keycloak
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app.kubernetes.io/name
                      operator: In
                      values:
                        - keycloak
                topologyKey: kubernetes.io/hostname

      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: keycloak

      terminationGracePeriodSeconds: 60

      containers:
        - name: keycloak
          image: quay.io/keycloak/keycloak:26.0
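          args:
            # Use "start --optimized" only with a custom image pre-built with
            # "kc.sh build". The stock image must build at startup because KC_DB,
            # KC_HEALTH_ENABLED, and KC_METRICS_ENABLED are build-time options.
            - start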

          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: management
              containerPort: 9000
              protocol: TCP

          env:
            # Database
            - name: KC_DB
              value: postgres
            - name: KC_DB_URL
              value: jdbc:postgresql://keycloak-db-rw.keycloak.svc:5432/keycloak
            - name: KC_DB_USERNAME
              valueFrom:
                secretKeyRef:
                  name: keycloak-db-credentials
                  key: username
            - name: KC_DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: keycloak-db-credentials
                  key: password
            - name: KC_DB_POOL_MIN_SIZE
              value: "10"
            - name: KC_DB_POOL_MAX_SIZE
              value: "50"

            # HTTP / Proxy
            - name: KC_HOSTNAME
              value: auth.example.com
            - name: KC_PROXY_HEADERS
              value: xforwarded
            - name: KC_HTTP_ENABLED
              value: "true"

            # Clustering
            - name: KC_CACHE
              value: ispn
            - name: KC_CACHE_STACK
              value: kubernetes
            - name: JAVA_OPTS_KC_HEAP
              value: "-XX:InitialRAMPercentage=50.0 -XX:MaxRAMPercentage=70.0"

            # Health and metrics
            - name: KC_HEALTH_ENABLED
              value: "true"
            - name: KC_METRICS_ENABLED
              value: "true"

            # Admin credentials (initial setup only)
            - name: KC_BOOTSTRAP_ADMIN_USERNAME
              valueFrom:
                secretKeyRef:
                  name: keycloak-admin-credentials
                  key: username
            - name: KC_BOOTSTRAP_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: keycloak-admin-credentials
                  key: password

            # JGroups DNS discovery for Infinispan clustering, passed to the JVM
            # as a system property through JAVA_OPTS_APPEND
            - name: JAVA_OPTS_APPEND
              value: "-Djgroups.dns.query=keycloak-headless.keycloak.svc.cluster.local"

          readinessProbe:
            httpGet:
              path: /health/ready
              port: management
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3

          livenessProbe:
            httpGet:
              path: /health/live
              port: management
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 5

          startupProbe:
            httpGet:
              path: /health/started
              port: management
            initialDelaySeconds: 15
            periodSeconds: 5
            failureThreshold: 30

          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2"

          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"]

Key configuration decisions explained:

  • maxUnavailable: 0: Keeps the full desired replica count available during rolling updates. Combined with maxSurge: 1, this means Kubernetes starts one new pod and waits for it to become ready before terminating an old one.
  • Pod anti-affinity: Spreads Keycloak pods across different nodes to survive node failures.
  • Topology spread constraints: Distributes pods across availability zones for zone-level resilience.
  • preStop lifecycle hook: The 10-second sleep allows the pod to be removed from the service endpoints before it starts shutting down, preventing dropped connections.
  • Separate management port (9000): Health and metrics endpoints run on a different port from application traffic, following Keycloak’s best practice.

Service and Headless Service

# keycloak-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: keycloak
  namespace: keycloak
  labels:
    app.kubernetes.io/name: keycloak
spec:
  type: ClusterIP
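  ports:
    - name: http
      port: 8080
      targetPort: http
      protocol: TCP
    # Exposed so the ServiceMonitor can scrape /metrics by port name
    - name: management
      port: 9000
      targetPort: management
      protocol: TCP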
  selector:
    app.kubernetes.io/name: keycloak

---

# Headless service for Infinispan DNS discovery
apiVersion: v1
kind: Service
metadata:
  name: keycloak-headless
  namespace: keycloak
  labels:
    app.kubernetes.io/name: keycloak
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: jgroups
      port: 7800
      targetPort: 7800
      protocol: TCP
  selector:
    app.kubernetes.io/name: keycloak
  publishNotReadyAddresses: true

The headless service is essential for Infinispan clustering. JGroups uses DNS discovery to find other Keycloak pods, and the headless service returns one DNS record per pod. Setting publishNotReadyAddresses: true publishes those records before a pod passes its readiness probe, so pods can discover each other during startup.

Ingress with TLS

NGINX Ingress

# keycloak-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: keycloak
  namespace: keycloak
  annotations:
    nginx.ingress.kubernetes.io/proxy-buffer-size: "128k"
    nginx.ingress.kubernetes.io/proxy-buffers-number: "4"
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "KC_INGRESS"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
    nginx.ingress.kubernetes.io/configuration-snippet: |
      more_set_headers "X-Frame-Options: SAMEORIGIN";
      more_set_headers "X-Content-Type-Options: nosniff";
      more_set_headers "X-XSS-Protection: 1; mode=block";
      more_set_headers "Strict-Transport-Security: max-age=31536000; includeSubDomains";
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - auth.example.com
      secretName: keycloak-tls
  rules:
    - host: auth.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: keycloak
                port:
                  number: 8080

Key annotations:

  • proxy-buffer-size: 128k: Keycloak responses can include large headers (especially with many roles in tokens). Increasing the buffer prevents 502 errors.
  • Session affinity: While not strictly required (Infinispan handles session replication), session affinity reduces cross-pod lookups and improves latency.

Automated TLS with cert-manager

Instead of managing certificates manually, use cert-manager with Let’s Encrypt:

# certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: keycloak-tls
  namespace: keycloak
spec:
  secretName: keycloak-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - auth.example.com
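
The Certificate above references a ClusterIssuer named letsencrypt-prod, which must already exist in the cluster. A minimal sketch of one follows, assuming HTTP-01 validation through the NGINX ingress class. The email address and the account key secret name are placeholders.

```yaml
# cluster-issuer.yaml (hypothetical file name)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                 # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # assumed secret name for the ACME account key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```

Older cert-manager releases may expect class: nginx in place of ingressClassName, so check the solver reference for the version you run.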

Horizontal Pod Autoscaler

Scale Keycloak pods based on CPU and memory utilization:

# keycloak-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: keycloak
  namespace: keycloak
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: keycloak
  minReplicas: 3
  maxReplicas: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

The stabilization windows prevent thrashing:

  • Scale up: Act on the lowest recommendation from the last 60 seconds, and add at most 2 pods per minute
  • Scale down: Act on the highest recommendation from the last 5 minutes, and remove at most 1 pod every 2 minutes

This ensures the cluster does not oscillate rapidly during variable traffic patterns.
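
To see how the 70% CPU target translates into replica counts, note that the HPA scales by the ratio of observed to target utilization, and that utilization is measured against each pod's CPU request (500m in this guide). A worked example, assuming 3 replicas that average 105% of their request:

```latex
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
    \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil
  = \left\lceil 3 \times \frac{105}{70} \right\rceil
  = \lceil 4.5 \rceil
  = 5
```

Going from 3 to 5 pods is a step of 2, which fits within the scaleUp policy of 2 pods per 60 seconds. With both CPU and memory configured, the HPA evaluates each metric separately and uses the larger result.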

Pod Disruption Budget

Protect against voluntary disruptions (node drains, cluster upgrades):

# keycloak-pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: keycloak
  namespace: keycloak
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: keycloak

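This keeps at least 2 Keycloak pods running during voluntary disruptions such as node drains. It does not protect against involuntary failures such as node crashes; pod anti-affinity and topology spread constraints cover those.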
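
Because the HPA can raise the replica count to 10, a fixed minAvailable: 2 would let a drain evict most pods at once at higher scale. One alternative, sketched here as an assumption and not part of the original setup, is to cap disruptions with maxUnavailable. It replaces minAvailable, since a PodDisruptionBudget accepts only one of the two fields.

```yaml
# keycloak-pdb.yaml (alternative sketch)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: keycloak
  namespace: keycloak
spec:
  # At most one Keycloak pod may be evicted at a time,
  # however many replicas the HPA is currently running
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: keycloak
```

The trade-off is slower node drains, because evictions proceed one pod at a time.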

Infinispan Clustering

Keycloak uses Infinispan for distributed caching and session replication. In Kubernetes, DNS-based discovery is the simplest approach.

The configuration is handled through Keycloak’s environment variables:

- name: KC_CACHE
  value: ispn
- name: KC_CACHE_STACK
  value: kubernetes
- name: JAVA_OPTS_APPEND
  value: "-Djgroups.dns.query=keycloak-headless.keycloak.svc.cluster.local"

Verifying Cluster Formation

After deploying, verify that all pods have joined the Infinispan cluster:

# Check Keycloak logs for cluster join events
kubectl logs -n keycloak deployment/keycloak | grep -i "cluster"

# You should see messages like:
# Received new cluster view: [keycloak-xxx|2] (3) [keycloak-xxx, keycloak-yyy, keycloak-zzz]

If pods are not clustering, check:

  1. The headless service has publishNotReadyAddresses: true
  2. The DNS query matches the headless service name
  3. Port 7800 (JGroups) is not blocked by network policies

Custom Infinispan Cache Configuration

For high-traffic deployments, you can tune Infinispan’s cache settings by mounting a custom cache-ispn.xml:

# configmap with custom cache config
apiVersion: v1
kind: ConfigMap
metadata:
  name: keycloak-cache-config
  namespace: keycloak
data:
  cache-ispn.xml: |
    <infinispan>
      <cache-container name="keycloak">
        <transport lock-timeout="60000"/>
        <distributed-cache name="sessions" owners="2">
          <expiration lifespan="-1"/>
        </distributed-cache>
        <distributed-cache name="authenticationSessions" owners="2">
          <expiration lifespan="-1"/>
        </distributed-cache>
        <distributed-cache name="offlineSessions" owners="1">
          <expiration lifespan="-1"/>
        </distributed-cache>
      </cache-container>
    </infinispan>

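The owners parameter controls how many nodes store a copy of each cache entry. Setting owners="2" keeps each session on 2 nodes, so the cluster can lose a single node without losing sessions. With owners="1", as set for offlineSessions above, the cache holds a single copy, which is acceptable there because offline sessions are also persisted in the database. Treat the XML above as an excerpt: start from the cache-ispn.xml shipped with your Keycloak version and change only the settings you need. For more on session management strategies, see our session management feature.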
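
The ConfigMap has no effect until the Deployment mounts it and points Keycloak at the file. Below is a sketch of the fragment to merge into keycloak-deployment.yaml. It assumes the file is mounted under a new name (cache-ispn-custom.xml, a hypothetical choice) so that the default cache-ispn.xml stays in place. KC_CACHE_CONFIG_FILE is the environment form of the cache-config-file option, which resolves the name relative to Keycloak's conf/ directory.

```yaml
# Fragment to merge into keycloak-deployment.yaml (not a complete manifest)
spec:
  template:
    spec:
      volumes:
        - name: cache-config
          configMap:
            name: keycloak-cache-config
      containers:
        - name: keycloak
          env:
            - name: KC_CACHE_CONFIG_FILE
              value: cache-ispn-custom.xml   # assumed file name, relative to conf/
          volumeMounts:
            - name: cache-config
              # Mount only the one key as a single file next to the defaults
              mountPath: /opt/keycloak/conf/cache-ispn-custom.xml
              subPath: cache-ispn.xml
              readOnly: true
```

Files mounted with subPath do not pick up ConfigMap changes automatically, so restart the pods after editing the cache configuration.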

RBAC and ServiceAccount

# keycloak-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keycloak
  namespace: keycloak

---

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: keycloak
  namespace: keycloak
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: keycloak
  namespace: keycloak
subjects:
  - kind: ServiceAccount
    name: keycloak
    namespace: keycloak
roleRef:
  kind: Role
  name: keycloak
  apiGroup: rbac.authorization.k8s.io

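The DNS_PING discovery used in this guide finds peers through the headless service and does not call the Kubernetes API, so Keycloak does not need this Role for clustering. Pod get and list permissions are required only if you switch JGroups discovery to KUBE_PING. The dedicated ServiceAccount is still needed, because the Deployment references it by name.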

Network Policies

Restrict network access to only what is needed:

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: keycloak
  namespace: keycloak
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: keycloak
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow traffic from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 8080
          protocol: TCP
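    # Allow JGroups traffic between Keycloak pods
    # (7800 for data, 57800 for failure detection)
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: keycloak
      ports:
        - port: 7800
          protocol: TCP
        - port: 57800
          protocol: TCP
  egress:
    # Allow JGroups traffic to other Keycloak pods; without this rule
    # the Egress policy type blocks cluster formation
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: keycloak
      ports:
        - port: 7800
          protocol: TCP
        - port: 57800
          protocol: TCP
    # Allow DNS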
    - to:
        - namespaceSelector: {}
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    # Allow PostgreSQL
    - to:
        - podSelector:
            matchLabels:
              cnpg.io/cluster: keycloak-db
      ports:
        - port: 5432
          protocol: TCP
    # Allow HTTPS for identity provider communication
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - port: 443
          protocol: TCP
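
The policy above admits only the ingress controller and peer Keycloak pods, so a Prometheus instance in another namespace cannot reach the management port. NetworkPolicies are additive, so a second policy can open that port. This sketch assumes Prometheus runs in a namespace named monitoring; adjust the selector to your installation.

```yaml
# network-policy-metrics.yaml (hypothetical file name)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: keycloak-allow-metrics
  namespace: keycloak
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: keycloak
  policyTypes:
    - Ingress
  ingress:
    # Allow Prometheus to scrape /metrics on the management port
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring   # assumed namespace
      ports:
        - port: 9000
          protocol: TCP
```

Depending on the CNI plugin, kubelet health probes to port 9000 may also need an explicit allowance, so confirm that readiness checks still pass after applying network policies.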

Monitoring with Prometheus

Keycloak exposes Prometheus metrics on the management port:

# service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keycloak
  namespace: keycloak
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: keycloak
  endpoints:
    - port: management
      path: /metrics
      interval: 30s

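Key metrics to alert on (metric names vary with the Keycloak version and any metrics extensions installed, so confirm them against your own /metrics output):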

  • keycloak_login_attempts_total — authentication volume
  • keycloak_failed_login_attempts_total — failed logins (brute force detection)
  • keycloak_request_duration_seconds — API latency
  • JVM metrics (jvm_memory_used_bytes, jvm_gc_pause_seconds) — resource pressure

For built-in monitoring and alerting, Skycloak provides insights dashboards that track these metrics without additional setup.
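
As an illustration of alerting on resource pressure, here is a sketch of a PrometheusRule for heap usage. It assumes the Prometheus Operator is installed, that scraped series carry namespace and pod labels, and that jvm_memory_max_bytes is exposed alongside jvm_memory_used_bytes. The alert name, the 90% threshold, and the 10-minute window are arbitrary starting points.

```yaml
# keycloak-alerts.yaml (hypothetical file name)
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: keycloak
  namespace: keycloak
  labels:
    release: prometheus   # same label the ServiceMonitor uses
spec:
  groups:
    - name: keycloak.rules
      rules:
        - alert: KeycloakHeapHigh   # assumed alert name
          # Heap in use as a fraction of maximum heap, per pod
          expr: |
            sum by (pod) (jvm_memory_used_bytes{namespace="keycloak", area="heap"})
              /
            sum by (pod) (jvm_memory_max_bytes{namespace="keycloak", area="heap"})
              > 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Keycloak pod {{ $labels.pod }} heap usage above 90%"
```

Check the expression in the Prometheus UI against live data before relying on it, since label names depend on how your Prometheus installation relabels targets.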

Deploying Everything

Apply all manifests in order:

kubectl apply -f namespace.yaml
kubectl apply -f keycloak-rbac.yaml
kubectl apply -f postgres-cluster.yaml

# Wait for PostgreSQL to be ready
kubectl wait --for=condition=ready cluster/keycloak-db -n keycloak --timeout=300s

kubectl apply -f keycloak-deployment.yaml
kubectl apply -f keycloak-service.yaml
kubectl apply -f keycloak-ingress.yaml
kubectl apply -f keycloak-hpa.yaml
kubectl apply -f keycloak-pdb.yaml
kubectl apply -f network-policy.yaml
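
The same files can also be grouped with Kustomize, which ships with kubectl. This is a sketch and not part of the original workflow: it applies everything in one pass and does not wait for PostgreSQL, so Keycloak pods may restart until the database accepts connections. The secrets created earlier with kubectl must still exist.

```yaml
# kustomization.yaml (hypothetical; file names are the ones used in this guide)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - keycloak-rbac.yaml
  - postgres-cluster.yaml
  - keycloak-deployment.yaml
  - keycloak-service.yaml
  - keycloak-ingress.yaml
  - keycloak-hpa.yaml
  - keycloak-pdb.yaml
  - network-policy.yaml
```

Apply it with kubectl apply -k . from the directory that holds the files.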

Verifying the Deployment

# Check pod status
kubectl get pods -n keycloak

# Verify cluster formation
kubectl logs -n keycloak -l app.kubernetes.io/name=keycloak --tail=50 | grep "cluster view"

# Test the health endpoint (served on the management port, 9000)
kubectl port-forward -n keycloak deployment/keycloak 9000:9000
# In a second terminal:
curl http://localhost:9000/health/ready

Operational Considerations

Rolling Updates

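For configuration changes, the rolling update strategy with maxUnavailable: 0 keeps full capacity available while pods are replaced. Version upgrades need more care: Keycloak may run database migrations during major version upgrades, and these run automatically on the first pod of the new version that starts, while old pods are still serving traffic. Check the upgrade guide for your target version and rehearse the upgrade in a staging environment before relying on a rolling update.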

Backup and Restore

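With CloudNativePG, WAL archiving to the object store runs continuously once barmanObjectStore is configured, and base backups can be scheduled with a ScheduledBackup resource. For disaster recovery, document and test your restore procedure: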

# Restore from backup
kubectl apply -f - <<EOF
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-db-restored
  namespace: keycloak
spec:
  instances: 3
  # The restored cluster needs its own storage definition
  storage:
    size: 20Gi
    storageClass: gp3
  bootstrap:
    recovery:
      source: keycloak-db
  externalClusters:
    - name: keycloak-db
      barmanObjectStore:
        destinationPath: "s3://keycloak-backups/postgres/"
        s3Credentials:
          accessKeyId:
            name: aws-credentials
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: aws-credentials
            key: SECRET_ACCESS_KEY
EOF

Managing Keycloak Configuration as Code

Once your cluster is running, manage realm configuration with Terraform or Pulumi for version-controlled, repeatable configuration.

Conclusion

A production Keycloak deployment on Kubernetes requires careful attention to database management, clustering, TLS, scaling, and resilience. The manifests in this guide provide a foundation that handles these concerns, but you should adapt them to your specific requirements, traffic patterns, and compliance needs. The Keycloak guides on running in containers cover additional configuration options for containerized deployments.

For organizations that want production-grade Keycloak without managing Kubernetes infrastructure, Skycloak’s managed hosting handles all of this automatically, including database tuning, high availability, TLS, and 24/7 monitoring. Check our pricing to see what fits your scale.

Written by Guilliano Molaire Founder

Guilliano is the founder of Skycloak and a cloud infrastructure specialist with deep expertise in product development and scaling SaaS products. He discovered Keycloak while consulting on enterprise IAM and built Skycloak to make managed Keycloak accessible to teams of every size.
