Cased CD Enterprise - Troubleshooting Guide

This guide helps diagnose and fix common installation issues.

Quick Diagnostic Script

Run this first to get a health report:

curl -sL https://raw.githubusercontent.com/cased/cased-cd/main/scripts/diagnose.sh | bash

Or manually:

kubectl get pods -n argocd -l app.kubernetes.io/name=cased-cd
kubectl describe pod -n argocd -l app.kubernetes.io/name=cased-cd
kubectl logs -n argocd -l app.kubernetes.io/name=cased-cd --tail=50

Common Issues

1. Pod Stuck in `ImagePullBackOff`

Symptom:

NAME                        READY   STATUS             RESTARTS   AGE
cased-cd-enterprise-xxx     0/1     ImagePullBackOff   0          2m

Check the error:

kubectl describe pod -n argocd -l app.kubernetes.io/component=enterprise | grep -A 5 "Events:"

Cause A: imagePullSecret Not Created

Error:

Failed to pull image "registry.cased.com/cased/cased-cd-enterprise:0.2.8":
rpc error: code = Unknown desc = failed to pull and unpack image:
failed to resolve reference "registry.cased.com/cased/cased-cd-enterprise:0.2.8":
pull access denied, repository does not exist or may require authentication

Fix:

# Create the secret with your customer token (provided by Cased support)
kubectl create secret docker-registry cased-cd-registry \
  --docker-server=registry.cased.com \
  --docker-username="YOUR_CUSTOMER_NAME" \
  --docker-password="YOUR_CUSTOMER_TOKEN" \
  -n argocd

# Verify it exists
kubectl get secret cased-cd-registry -n argocd

Cause B: Wrong Secret Name in Helm Values

Fix:

# Check what secret name you used
kubectl get secrets -n argocd | grep docker-registry

# Reinstall with correct secret name
helm upgrade cased-cd cased-cd/cased-cd \
  --namespace argocd \
  --set enterprise.image.repository=registry.cased.com/cased/cased-cd-enterprise \
  --set imagePullSecrets[0].name=cased-cd-registry  # ← Must match secret name

Cause C: Token Expired or Invalid

Check token validity:

# Try pulling the image manually
echo "YOUR_CUSTOMER_TOKEN" | docker login registry.cased.com -u "YOUR_CUSTOMER_NAME" --password-stdin
docker pull registry.cased.com/cased/cased-cd-enterprise:0.2.8

If this fails, contact support@cased.com for a new token.

2. Pod Stuck in `CrashLoopBackOff`

Symptom:

NAME                        READY   STATUS             RESTARTS   AGE
cased-cd-enterprise-xxx     0/1     CrashLoopBackOff   5          5m

Check logs:

kubectl logs -n argocd -l app.kubernetes.io/component=enterprise --tail=100

Cause A: Cannot Connect to ArgoCD Server

Error in logs:

Failed to connect to ArgoCD server: dial tcp 10.96.0.1:443: connect: connection refused

Check ArgoCD is running:

kubectl get pods -n argocd -l app.kubernetes.io/name=argocd-server

Fix:

# If ArgoCD server isn't running, install/restart it first
kubectl rollout status deployment/argocd-server -n argocd

# Then restart Cased CD
kubectl rollout restart deployment/cased-cd-enterprise -n argocd

Cause B: Missing Environment Variables

Check env vars are set:

kubectl get deployment cased-cd-enterprise -n argocd -o yaml | grep -A 10 "env:"

Should include:

env:
  - name: ARGOCD_SERVER
    value: "https://argocd-server.argocd.svc.cluster.local"
  - name: PORT
    value: "8081"

3. PersistentVolumeClaim Stuck in `Pending`

Symptom:

$ kubectl get pvc -n argocd
NAME                          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS
cased-cd-enterprise-audit     Pending

Check the error:

kubectl describe pvc cased-cd-enterprise-audit -n argocd

Cause A: No Default StorageClass

Error:

no persistent volumes available for this claim and no storage class is set

Check storage classes:

kubectl get storageclass

Fix - Option 1: Use existing storage class:

helm upgrade cased-cd cased-cd/cased-cd \
  --namespace argocd \
  --set enterprise.auditTrail.storageClass=gp2  # ← Use your storage class name

Fix - Option 2: Disable PVC (audit logs go to pod logs):

helm upgrade cased-cd cased-cd/cased-cd \
  --namespace argocd \
  --set enterprise.auditTrail.enabled=false  # ← Disable persistent audit logs

Cause B: Missing EBS CSI Driver (AWS EKS)

Error:

waiting for first consumer to be created before binding
Waiting for a volume to be created either by the external provisioner 'ebs.csi.aws.com'
or manually by the system administrator

On AWS EKS, the EBS CSI driver is required for persistent volumes but is not installed by default.

Check if EBS CSI driver is installed:

kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver
aws eks list-addons --cluster-name YOUR_CLUSTER_NAME

Fix - Install EBS CSI driver:

Create IAM role for the driver:

# Get your cluster's OIDC provider
OIDC_ID=$(aws eks describe-cluster --name YOUR_CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | cut -d '/' -f 5)

# Create trust policy
cat > ebs-csi-trust-policy.json << EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/$OIDC_ID"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "oidc.eks.REGION.amazonaws.com/id/$OIDC_ID:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa",
        "oidc.eks.REGION.amazonaws.com/id/$OIDC_ID:aud": "sts.amazonaws.com"
      }
    }
  }]
}
EOF

# Create IAM role
aws iam create-role \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --assume-role-policy-document file://ebs-csi-trust-policy.json

# Attach policy
aws iam attach-role-policy \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy

Install the addon:

aws eks create-addon \
  --cluster-name YOUR_CLUSTER_NAME \
  --addon-name aws-ebs-csi-driver \
  --service-account-role-arn arn:aws:iam::YOUR_ACCOUNT_ID:role/AmazonEKS_EBS_CSI_DriverRole

Wait for it to become active (takes ~60 seconds):

aws eks describe-addon \
  --cluster-name YOUR_CLUSTER_NAME \
  --addon-name aws-ebs-csi-driver \
  --query 'addon.status'

Once the driver is active, the PVC will automatically bind.

4. Cannot Access UI (Connection Refused)

Symptom:

$ kubectl port-forward svc/cased-cd 8080:80 -n argocd
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080

$ curl http://localhost:8080
curl: (7) Failed to connect to localhost port 8080: Connection refused

Check pod is running:

kubectl get pods -n argocd -l app.kubernetes.io/name=cased-cd

Check service:

kubectl get svc cased-cd -n argocd
kubectl describe svc cased-cd -n argocd

Check endpoints:

kubectl get endpoints cased-cd -n argocd

If endpoints are empty, the pod isn't ready. Check pod health:

kubectl describe pod -n argocd -l app.kubernetes.io/name=cased-cd

5. RBAC Errors (Enterprise Features Not Working)

Symptom:

Enterprise features show "Access Denied" or don't work

Check RBAC role exists:

kubectl get role cased-cd-enterprise -n argocd
kubectl get rolebinding cased-cd-enterprise -n argocd

Verify permissions:

kubectl describe role cased-cd-enterprise -n argocd

Should allow:

get, update, patch on ConfigMaps: argocd-rbac-cm, argocd-notifications-cm

get, update, patch on Secrets: argocd-secret

Fix:

6. Health Checks Failing

Check liveness/readiness probes:

kubectl describe pod -n argocd -l app.kubernetes.io/component=enterprise | grep -A 10 "Liveness\|Readiness"

Test health endpoint manually:

# Port-forward to the pod directly
kubectl port-forward -n argocd pod/cased-cd-enterprise-xxx 8081:8081

# Test in another terminal
curl http://localhost:8081/health

Expected response:

{"status":"ok","version":"0.2.8"}

7. Wrong Image Version

Check what image is running:

kubectl get deployment cased-cd-enterprise -n argocd -o jsonpath='{.spec.template.spec.containers[0].image}'

Update to latest:

helm upgrade cased-cd cased-cd/cased-cd \
  --namespace argocd \
  --set enterprise.image.tag=0.2.8  # ← Specify version

Debug Mode

Enable verbose logging:

helm upgrade cased-cd cased-cd/cased-cd \
  --namespace argocd \
  --set enterprise.debug=true

Then check logs:

kubectl logs -n argocd -l app.kubernetes.io/component=enterprise -f

Getting Help

Collect Debug Information

Run this and send output to support:

#!/bin/bash
echo "=== Cased CD Debug Report ==="
echo "Date: $(date)"
echo ""

echo "=== Pods ==="
kubectl get pods -n argocd -l app.kubernetes.io/name=cased-cd

echo ""
echo "=== Deployments ==="
kubectl get deployments -n argocd -l app.kubernetes.io/name=cased-cd

echo ""
echo "=== Services ==="
kubectl get svc -n argocd -l app.kubernetes.io/name=cased-cd

echo ""
echo "=== PVCs ==="
kubectl get pvc -n argocd -l app.kubernetes.io/name=cased-cd

echo ""
echo "=== Secrets ==="
kubectl get secrets -n argocd | grep cased

echo ""
echo "=== Recent Pod Events ==="
kubectl get events -n argocd --sort-by='.lastTimestamp' | grep cased-cd | tail -20

echo ""
echo "=== Pod Logs (Last 50 lines) ==="
kubectl logs -n argocd -l app.kubernetes.io/name=cased-cd --tail=50

echo ""
echo "=== Enterprise Pod Logs (Last 50 lines) ==="
kubectl logs -n argocd -l app.kubernetes.io/component=enterprise --tail=50 2>/dev/null || echo "No enterprise pods found"

Save as debug-report.sh, run it:

chmod +x debug-report.sh
./debug-report.sh > debug-report.txt

Send debug-report.txt to support@cased.com

Contact Support

Email: support@cased.com

Documentation: https://cased.github.io/cased-cd

GitHub Issues: https://github.com/cased/cased-cd/issues

Include:

Kubernetes version: kubectl version --short
ArgoCD version: kubectl get pods -n argocd -l app.kubernetes.io/name=argocd-server -o jsonpath='{.items[0].spec.containers[0].image}'

Debug report from above

Steps to reproduce the issue