Kubernetes Security: Essential Steps to Protect Your Cluster

Security tutorial - IT technology blog

When your Kubernetes cluster becomes a target

I learned a hard lesson last year. A server got brute-forced over SSH in the middle of the night, and I had to scramble to deal with it half-asleep. Since then I’ve made it a rule: whatever infrastructure I deploy, security gets set up on day one — not after something goes wrong.

Kubernetes is no exception. A lot of teams spin up a cluster, get their app running, and ship straight to production — security is something they’ll deal with later. And “later” usually never comes until there’s an incident.

This isn’t theory. In 2018, the RedLock team discovered that Tesla’s Kubernetes administrative console was exposed to the internet with no password — attackers were using the cluster’s compute to mine cryptocurrency. Containers ran as root, pods communicated freely, secrets were stored as unencrypted base64, and the API had no meaningful access restrictions. This is the reality of many production clusters I’ve reviewed, not just Tesla’s.

Why Kubernetes is insecure by default

K8s was designed to be flexible and easy to use first. Security is a layer you add yourself — it doesn’t come out of the box. Specifically:

  • RBAC has only been enabled by default since Kubernetes 1.6 — older clusters, and some hastily provisioned ones, still run with overly permissive bindings.
  • Pods run as root unless you explicitly define a SecurityContext.
  • Network Policy doesn’t exist until you write one — every pod in the cluster can freely talk to every other pod.
  • Secrets are just base64: anyone with read access to etcd or the Secret resource can read everything in plaintext.
  • The default Service Account is mounted into every pod and can call the Kubernetes API.

Put these together: a single compromised container is enough for an attacker to escalate privileges across the entire cluster. I’ve tested this in a lab — it took less than 10 minutes to go from a compromised pod to cluster-admin access.
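The base64 point is easy to demonstrate for yourself — encoding is not encryption, and decoding needs no key (the secret value and secret name below are made up):

```shell
# What Kubernetes stores in a Secret's data field is plain base64:
encoded=$(printf 'hunter2' | base64)
echo "$encoded"                      # aHVudGVyMg==
printf '%s' "$encoded" | base64 -d   # hunter2 — recovered in one step, no key
# Same idea against a live cluster (secret name is hypothetical):
#   kubectl get secret db-creds -o jsonpath='{.data.password}' | base64 -d
```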

Security layers you need to configure

1. RBAC — Access Control

The foundation of everything is least privilege: give each identity only the permissions it actually needs, nothing more.

# Create a Role that only allows reading Pods in the "app" namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: app
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: app
subjects:
- kind: ServiceAccount
  name: my-service-account
  namespace: app
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Avoid ClusterRole with verbs: ["*"] or resources: ["*"] at all costs — this is a mistake I see in nearly every cluster that was set up in a hurry. Audit periodically to catch excessive permissions:

# Check what a service account is allowed to do
kubectl auth can-i --list --as=system:serviceaccount:app:my-service-account

# Find ClusterRoleBindings assigned to anonymous users
kubectl get clusterrolebindings -o json | jq '.items[] | select(.subjects[]?.name=="system:anonymous")'
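To catch the wildcard-verb mistake mentioned above, a jq filter along these lines lists every ClusterRole that grants all verbs in at least one rule (built-ins like cluster-admin will show up too — that’s expected):

```shell
# List ClusterRoles that grant every verb ("*") in at least one rule
kubectl get clusterroles -o json \
  | jq -r '.items[]
           | select(.rules[]? | (.verbs | index("*")))
           | .metadata.name'
```

Anything in that list that isn’t a Kubernetes built-in deserves a closer look.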

2. Pod Security — Restricting what containers can do

Since Kubernetes 1.25, Pod Security Admission replaces PodSecurityPolicy (deprecated since 1.21). Enforce policies at the namespace level:

# Apply the "restricted" policy to the production namespace
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted

In your Pod/Deployment spec, always declare a SecurityContext explicitly. The fields below belong at the container level (runAsNonRoot and runAsUser may also be set pod-wide); note that seccompProfile is required for the "restricted" profile to admit the pod:

securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  seccompProfile:
    type: RuntimeDefault
  capabilities:
    drop:
    - ALL

The container runs as UID 1000, with a read-only filesystem, the runtime’s default seccomp profile, and all Linux capabilities dropped. Even if the container is compromised, the attacker can’t do much — they’re stuck in a very tight sandbox.
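Putting it together, a minimal Pod that passes the restricted profile might look like this (the image and names are illustrative; apps that write to disk will additionally need an emptyDir mounted at their writable paths):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: production
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0   # illustrative image
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      seccompProfile:
        type: RuntimeDefault
      capabilities:
        drop:
        - ALL
```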

3. Network Policy — Firewalls between Pods

By default, pod A can call pod B directly even if there’s no reason for that to happen. Network Policy lets you define exactly which traffic is allowed:

# Only allow the frontend to call the backend, nothing else
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: app
spec:
  podSelector:
    matchLabels:
      role: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          role: database
    ports:
    - protocol: TCP
      port: 5432

One important caveat: Network Policy requires a CNI plugin that supports it — Calico, Cilium, or Weave all work. If you’re using Flannel, be careful: policies get created but have absolutely no effect, with no warning whatsoever.
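A good pattern is to start every namespace with a deny-all policy, then add narrow allow rules like the one above:

```yaml
# Deny all ingress and egress in the namespace until explicitly allowed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: app
spec:
  podSelector: {}        # empty selector = every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
```

Because policies are additive, any allow rule you create afterwards punches a specific hole in this baseline rather than the other way around.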

4. Secret Management — Don’t trust Kubernetes Secrets by default

Base64 is not encryption. Anyone with read access to etcd or who can run kubectl get secret gets the plaintext immediately. Two approaches:

Option 1 — Enable encryption at rest for etcd (requires control plane access):

# Create EncryptionConfiguration
cat > /etc/kubernetes/encryption-config.yaml <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
    - secrets
    providers:
    - aescbc:
        keys:
        - name: key1
          secret: $(head -c 32 /dev/urandom | base64)
    - identity: {}
EOF

# Add to kube-apiserver: --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
# Note: existing Secrets stay plaintext in etcd until rewritten:
#   kubectl get secrets --all-namespaces -o json | kubectl replace -f -

Option 2 (recommended) — External Secrets Operator integrated with HashiCorp Vault or AWS Secrets Manager. Secrets are never stored in etcd and are only injected into pods at runtime:

# Install External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets -n external-secrets --create-namespace
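With the operator installed, an ExternalSecret resource keeps a Kubernetes Secret in sync with the external store. A minimal sketch — the SecretStore name and Vault path here are assumptions for illustration:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: app
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend        # assumes a SecretStore named vault-backend exists
    kind: SecretStore
  target:
    name: db-credentials       # the Kubernetes Secret the operator maintains
  data:
  - secretKey: password
    remoteRef:
      key: secret/data/app/db  # path in Vault (assumption)
      property: password
```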

5. Audit Logging and Runtime Security

Detecting anomalous behavior early is just as important as prevention. Enable audit logging on the API server:

# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
- level: RequestResponse
  verbs: ["delete", "create"]
  resources:
  - group: ""
    resources: ["pods"]
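The policy file does nothing until the API server is told to use it. On a self-managed control plane that means flags on kube-apiserver (paths are examples); managed offerings like EKS, GKE, and AKS expose audit logs through their own logging configuration instead:

```shell
# In the kube-apiserver static pod manifest (self-managed clusters):
#   --audit-policy-file=/etc/kubernetes/audit-policy.yaml
#   --audit-log-path=/var/log/kubernetes/audit.log
#   --audit-log-maxage=30
#   --audit-log-maxbackup=10
#   --audit-log-maxsize=100
```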

Falco adds a runtime detection layer — detecting shell spawning inside containers, unusual file access, suspicious network connections, all in real time. On production this is non-negotiable:

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm install falco falcosecurity/falco --namespace falco --create-namespace \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true
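Falco’s rule language is worth learning early. As a sketch of the shape a custom rule takes (Falco’s bundled "Terminal shell in container" rule already covers this case out of the box):

```yaml
# Custom rule sketch: flag shells spawned inside containers
- rule: Shell Spawned In Container
  desc: A shell process started inside a container
  condition: >
    spawned_process and container and proc.name in (bash, sh, zsh, ash)
  output: >
    Shell in container (user=%user.name container=%container.name
    command=%proc.cmdline)
  priority: WARNING
```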

Prioritized security checklist

The NSA and CISA’s joint Kubernetes Hardening Guide reaches the same conclusion: there’s no silver bullet — only multiple overlapping layers of defense are effective. If one layer gets bypassed, the next one stops the attacker. Practical priority order:

  1. Don’t expose the API server to the internet — use a VPN or bastion host. This is the most severe mistake and the easiest to fix.
  2. Keep Kubernetes up to date — K8s patches CVEs on a release cycle; older versions have publicly known vulnerabilities.
  3. RBAC with least privilege — review regularly, remove permissions that are no longer needed.
  4. SecurityContext on every Pod — set it in your Helm chart or default template so nothing slips through.
  5. Network Policy — start with deny-all, then open up traffic as needed.
  6. Scan images before deploying — integrate Trivy or Grype directly into your CI/CD pipeline.
  7. Proper secret management — at minimum encryption at rest, ideally an external vault.
  8. Audit logs + runtime monitoring — Falco for production, no exceptions.
A few commands that put the scanning and auditing items into practice:

# Scan an image with Trivy before pushing
trivy image --severity HIGH,CRITICAL your-image:tag

# Check for misconfigurations in YAML manifests
trivy config ./k8s-manifests/

# Audit the cluster against the CIS Kubernetes Benchmark
docker run --pid=host --userns=host --rm -v /etc:/etc:ro \
  -v /var:/var:ro -v /usr/lib/systemd:/usr/lib/systemd:ro \
  aquasec/kube-bench:latest

What I always emphasize with my team: security has to be part of the process, not a separate task. Integrate Trivy into your CI pipeline, enforce Pod Security at the admission controller, put RBAC reviews into sprint retrospectives — once it becomes routine, it stops feeling like a burden.

A Kubernetes cluster will never reach a state of “perfect security.” But with the steps above, you’ve pushed the risk down to an acceptable level — and more importantly, you’ll know when something unusual happens, instead of finding out after the damage is done.
