Kubernetes Manifest Validation
The Problem: Kubernetes Accepts Almost Anything
Kubernetes is remarkably permissive in what it accepts. A Deployment with no resource limits, no health checks, a container running as root, and a :latest image tag will deploy successfully. Kubernetes does not warn you that it will behave badly -- it does exactly what you asked, and you discover the consequences in production.
Manifest validation tools catch these misconfigurations before they reach the cluster. They enforce best practices that Kubernetes itself does not require but that production operations demand.
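For example, the following Deployment violates nearly every rule in this document, yet `kubectl apply` accepts it without complaint. This is a deliberately bad sketch for illustration, not a template:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1                  # Single replica: no availability
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:latest  # Mutable tag, runs as root by default,
                               # no resources, no probes -- and it deploys fine
```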
Validation Tool Comparison
| Tool | Focus | Approach | CRD Support | Speed |
|---|---|---|---|---|
| kubeval | Schema validation | Validates against K8s API schemas | Limited | Fast |
| kubeconform | Schema validation | Faster kubeval replacement | Yes (via schema locations) | Very fast |
| kube-score | Best practices | Opinionated checks (security, reliability) | No | Fast |
| Polaris | Policy enforcement | Fairwinds' policy engine | Yes | Fast |
| Datree | Policy enforcement | Built-in + custom rules | Yes | Fast |
| OPA/Gatekeeper | Admission control | Custom Rego policies in-cluster | Yes | Fast |
kubeval and kubeconform
# kubeval (no longer maintained): validates manifests against K8s API schemas
kubeval deployment.yaml --kubernetes-version 1.29.0
# kubeconform: faster kubeval replacement with CRD support
kubeconform -strict -kubernetes-version 1.29.0 deployment.yaml
# Validate all manifests in a directory
kubeconform -strict -summary k8s/
# Validate Helm-rendered templates
helm template myapp ./charts/myapp/ | kubeconform -strict
# Use with custom resource definitions
kubeconform -strict \
-schema-location default \
-schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
k8s/
kube-score
kube-score checks for best practices rather than schema validity:
# Score a deployment manifest
kube-score score deployment.yaml
# Output:
# apps/v1/Deployment myapp: (CRITICAL) Container has no readiness probe
# apps/v1/Deployment myapp: (CRITICAL) No network policy matching pod
# apps/v1/Deployment myapp: (WARNING) CPU limit is not set
# Score all manifests in a directory
kube-score score k8s/*.yaml
# Output as JSON for CI parsing
kube-score score deployment.yaml --output-format json
# Ignore specific checks
kube-score score deployment.yaml \
--ignore-test container-cpu-limit \
--ignore-test pod-networkpolicy
Polaris
Polaris provides both CLI scanning and in-cluster admission control:
# Audit local manifests
polaris audit --audit-path ./k8s/ --format=pretty
# Checks: resource limits, health probes, security context,
# image pull policy, host network access, privilege escalation
# Audit with a minimum score threshold
polaris audit --audit-path ./k8s/ --set-exit-code-below-score 80
# Run Polaris as a Kubernetes admission controller
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install polaris fairwinds-stable/polaris --namespace polaris
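Polaris's severity levels are driven by a config file. A sketch that tightens a few of its built-in checks -- the check names below follow Polaris's default check set, so verify them against the version you run:

```yaml
# polaris-config.yaml (sketch; confirm check names for your Polaris version)
checks:
  cpuRequestsMissing: danger       # Promote from the default warning
  memoryLimitsMissing: danger
  tagNotSpecified: danger          # Catches :latest and untagged images
  runAsRootAllowed: danger
  readinessProbeMissing: warning
```

Pass it to the audit with `polaris audit --config polaris-config.yaml --audit-path ./k8s/`.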
What to Validate in Kubernetes Manifests
A production-ready deployment should include ALL of the following. Each annotation explains why the field matters:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
labels:
app: myapp
version: "1.2.3" # Pinned version, not "latest"
spec:
replicas: 3 # Multiple replicas for availability
selector:
matchLabels:
app: myapp
  template:
    metadata:
      labels:
        app: myapp    # Must match spec.selector.matchLabels, or the apply is rejected
    spec:
securityContext:
runAsNonRoot: true # Never run as root
runAsUser: 1000
fsGroup: 2000
seccompProfile:
type: RuntimeDefault # Use default seccomp profile
containers:
- name: myapp
image: myapp:1.2.3 # Pinned tag, never :latest
resources:
requests: # Scheduler needs these
cpu: 100m
memory: 128Mi
limits: # Prevent noisy-neighbor issues
cpu: 500m
memory: 512Mi
readinessProbe: # When to send traffic
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
failureThreshold: 3
livenessProbe: # When to restart
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 15
periodSeconds: 20
failureThreshold: 3
startupProbe: # Grace period for slow starts
httpGet:
path: /healthz
port: 8080
failureThreshold: 30
periodSeconds: 10
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
ports:
- containerPort: 8080
protocol: TCP
env:
- name: DB_PASSWORD
valueFrom:
secretKeyRef: # Never hardcode secrets
name: myapp-secrets
key: db-password
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: myapp
Checklist for Manifest Review
| Check | Why | Severity |
|---|---|---|
| Image tag is not `:latest` | Prevents unpredictable deployments | Critical |
| Resource requests and limits set | Prevents resource starvation | Critical |
| Readiness probe configured | Prevents traffic to unready pods | Critical |
| Liveness probe configured | Enables automatic recovery | High |
| `runAsNonRoot: true` | Prevents container breakout | Critical |
| `readOnlyRootFilesystem: true` | Limits attacker file writes | High |
| `allowPrivilegeEscalation: false` | Prevents privilege elevation | Critical |
| All capabilities dropped | Minimizes kernel exposure | High |
| No `hostNetwork: true` | Prevents host network access | Critical |
| No `hostPID: true` | Prevents process visibility | Critical |
| Secrets via `secretKeyRef` | No hardcoded credentials | Critical |
| Multiple replicas | Availability during node failure | High |
| PodDisruptionBudget exists | Prevents all-at-once upgrades | Medium |
| NetworkPolicy exists | Zero-trust network segmentation | High |
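The last two checklist items live outside the Deployment itself. A minimal sketch of both for the myapp example above -- the `frontend` caller label is a hypothetical stand-in for whatever actually talks to the service:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 2              # Keep 2 of 3 replicas up during voluntary disruptions
  selector:
    matchLabels:
      app: myapp
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-netpol
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend    # Hypothetical upstream caller
      ports:
        - protocol: TCP
          port: 8080
```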
Custom Validation with OPA Gatekeeper
For organization-specific rules that go beyond standard tools, deploy OPA Gatekeeper as a Kubernetes admission controller:
# constraint-template.yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
name: k8snolatestimage
spec:
crd:
spec:
names:
kind: K8sNoLatestImage
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8snolatestimage
violation[{"msg": msg}] {
container := input.review.object.spec.template.spec.containers[_]
endswith(container.image, ":latest")
msg := sprintf("Container '%s' uses :latest tag. Pin a specific version.", [container.name])
}
violation[{"msg": msg}] {
container := input.review.object.spec.template.spec.containers[_]
not contains(container.image, ":")
msg := sprintf("Container '%s' has no image tag. Pin a specific version.", [container.name])
}
---
# constraint.yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sNoLatestImage
metadata:
name: no-latest-image
spec:
match:
kinds:
- apiGroups: ["apps"]
kinds: ["Deployment", "StatefulSet", "DaemonSet"]
namespaces:
- production
- staging
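The two Rego rules reduce to a simple string check, which makes them easy to sanity-test anywhere. A hypothetical shell helper mirroring the same logic -- `check_image` is not part of any tool, it exists only to illustrate the rule (and shares the Rego's gap: digest-pinned `image@sha256:...` references pass the tag check):

```shell
# Mirrors the Gatekeeper rules: reject ":latest" and untagged images.
check_image() {
  case "$1" in
    *:latest) echo "violation: '$1' uses :latest tag" ; return 1 ;;
    *:*)      echo "ok: '$1'" ; return 0 ;;
    *)        echo "violation: '$1' has no image tag" ; return 1 ;;
  esac
}

check_image "myapp:1.2.3"         # prints: ok: 'myapp:1.2.3'
check_image "myapp:latest" || true # prints: violation: 'myapp:latest' uses :latest tag
check_image "myapp" || true        # prints: violation: 'myapp' has no image tag
```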
Integrating Validation into CI
# .github/workflows/k8s-validation.yml
name: Kubernetes Manifest Validation
on:
pull_request:
paths:
- 'k8s/**'
- 'charts/**'
jobs:
validate:
runs-on: ubuntu-latest
steps:
      - uses: actions/checkout@v4
      # ubuntu-latest does not ship these tools; install kubeconform,
      # kube-score, and polaris (e.g., from their GitHub release binaries)
      # in a setup step here before the checks below run.
- name: Schema validation
run: kubeconform -strict -summary k8s/
      - name: Best practices check
        # Steps run under `bash -e`, so a non-zero kube-score exit fails the
        # job on its own; a manual $? check after the command is dead code.
        run: kube-score score k8s/*.yaml
- name: Policy audit
run: |
polaris audit --audit-path k8s/ \
--set-exit-code-below-score 80 \
--format=pretty
Manifest validation is one of the highest-value, lowest-cost practices in Kubernetes security. It runs in seconds, catches real issues, and prevents misconfigurations that take hours to debug in production.