Multi-tenancy is hard. When multiple customers share the same infrastructure, every component becomes a potential attack vector. At KubeBid, security isn't an afterthought—it's foundational to our architecture. This post details our approach to zero-trust security in a multi-tenant Kubernetes environment.
The Multi-Tenancy Challenge
In a traditional single-tenant model, you trust everything inside your network perimeter. Multi-tenancy breaks this model completely. Your neighbors on the same cluster could be:
- Competitors running similar workloads
- Malicious actors who signed up specifically to attack other tenants
- Legitimate users whose compromised workloads become attack vectors
Our security model assumes every tenant is potentially hostile. We call this "zero-trust multi-tenancy."
Layer 1: Compute Isolation
Kubernetes namespaces provide logical isolation, but they share the same kernel. Container escapes, though rare, can give attackers access to other tenants' workloads. We needed something stronger.
Hardware-Backed Isolation
Each tenant's workloads run in dedicated virtual machines, not just containers. We use a layered approach:
┌─────────────────────────────────────────────────────────────┐
│                        Physical Host                        │
├─────────────────────────────────────────────────────────────┤
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │   Tenant A   │  │   Tenant B   │  │   Tenant C   │       │
│  │  ┌────────┐  │  │  ┌────────┐  │  │  ┌────────┐  │       │
│  │  │  Pod   │  │  │  │  Pod   │  │  │  │  Pod   │  │       │
│  │  └────────┘  │  │  └────────┘  │  │  └────────┘  │       │
│  │  containerd  │  │  containerd  │  │  containerd  │       │
│  ├──────────────┤  ├──────────────┤  ├──────────────┤       │
│  │   microVM    │  │   microVM    │  │   microVM    │       │
│  │ (Firecracker)│  │ (Firecracker)│  │ (Firecracker)│       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
├─────────────────────────────────────────────────────────────┤
│                      Hypervisor (KVM)                       │
├─────────────────────────────────────────────────────────────┤
│                         Host Kernel                         │
└─────────────────────────────────────────────────────────────┘
We use Firecracker microVMs—the same technology that powers AWS Lambda. Each microVM boots in under 125ms and provides full hardware virtualization with minimal overhead.
CPU and Memory Isolation
Beyond VMs, we implement additional hardware-level protections:
# Kata Containers configuration for additional isolation
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kubebid-secure
handler: kata-fc
scheduling:
  nodeSelector:
    kubebid.io/secure-runtime: "true"
overhead:
  podFixed:
    memory: "160Mi"
    cpu: "250m"
- Memory encryption: AMD SEV or Intel TDX encrypts memory at the hardware level
- CPU pinning: Tenants get dedicated CPU cores with no time-sharing (see the pod sketch after this list)
- Cache isolation: L3 cache partitioning prevents side-channel attacks
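To make the runtime class and CPU pinning concrete, here is a minimal sketch of a tenant pod under those settings. It assumes the kubelet runs with the static CPU manager policy, which grants exclusive cores to Guaranteed QoS pods that request integer CPU counts; the pod name, namespace, and image are illustrative.
# Hypothetical tenant pod using the kubebid-secure RuntimeClass.
# Integer CPU counts with requests == limits put it in the Guaranteed
# QoS class, so the static CPU manager assigns it exclusive cores.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-workload
  namespace: tenant-acme
spec:
  runtimeClassName: kubebid-secure
  containers:
  - name: app
    image: registry.example.com/acme/app:1.0  # placeholder image
    resources:
      requests:
        cpu: "2"
        memory: "4Gi"
      limits:
        cpu: "2"
        memory: "4Gi"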
Layer 2: Network Isolation
Network isolation prevents tenants from snooping on each other's traffic or accessing unauthorized services.
Default Deny Network Policies
Every tenant namespace starts with a default-deny policy. Traffic must be explicitly allowed:
# Applied automatically to every tenant namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
# No ingress/egress rules = deny all
Tenants can then create policies to allow specific traffic:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-ingress
  namespace: tenant-acme
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubebid.io/ingress-controller: "true"
    ports:
    - protocol: TCP
      port: 8080
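One practical consequence of default-deny egress: pods can't even reach cluster DNS until a carve-out exists. Here's a sketch of the kind of companion policy that restores DNS, assuming the standard kube-dns/CoreDNS labels in kube-system (adjust the selectors for your DNS deployment):
# Hypothetical companion policy: allow egress to cluster DNS only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: tenant-acme
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53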
Encrypted Overlay Network
All pod-to-pod traffic that crosses the network between nodes is encrypted using WireGuard:
# Cilium guardrail policy: deny egress to any endpoint
# explicitly labeled as unencrypted
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: encrypt-all
spec:
  endpointSelector: {}
  egressDeny:
  - toEndpoints:
    - matchLabels:
        io.cilium.k8s.policy.unencrypted: "true"
This means even if an attacker gains access to the underlying network infrastructure, they can't read tenant traffic.
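Note that the policy above is a guardrail rather than the encryption switch itself; in Cilium, transparent WireGuard encryption is enabled in the agent configuration. A minimal sketch using the upstream Helm chart's values, applied with helm upgrade cilium cilium/cilium -n kube-system -f values.yaml:
# values.yaml fragment: enable transparent WireGuard encryption
# (standard options in the upstream Cilium Helm chart)
encryption:
  enabled: true
  type: wireguard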
DNS Isolation
Tenants can only resolve DNS names within their namespace and explicitly shared services:
# CoreDNS policy plugin configuration
.:53 {
    kubebid_policy {
        # Tenants can only resolve:
        # 1. Services in their own namespace
        # 2. Services in kube-system (if allowed)
        # 3. External DNS
        allow_same_namespace
        allow_external
        deny_cross_namespace
    }
    forward . /etc/resolv.conf
}
Layer 3: API and Control Plane Security
The Kubernetes API server is a critical attack surface. We implement multiple layers of protection.
RBAC with Namespace Scoping
Tenants get RoleBindings scoped to their namespace. They cannot access cluster-level resources:
# Tenant admin role - scoped to namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-admin
  namespace: tenant-acme
rules:
- apiGroups: ["", "apps", "batch"]
  resources: ["*"]
  verbs: ["*"]
- apiGroups: ["networking.k8s.io"]
  resources: ["networkpolicies"]
  verbs: ["get", "list", "create", "update", "delete"]
# Explicitly NOT granting:
# - clusterroles, clusterrolebindings
# - nodes, persistentvolumes
# - namespaces
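The Role only takes effect once it's bound to a tenant identity. Here's a sketch of the namespace-scoped RoleBinding we'd pair with it, assuming tenants authenticate with a per-tenant group claim from the identity provider (the group name is illustrative):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-admin-binding
  namespace: tenant-acme
subjects:
- kind: Group
  name: kubebid:tenant-acme:admins  # illustrative IdP group
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-admin
  apiGroup: rbac.authorization.k8s.io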
Admission Control
We run a series of admission webhooks that validate and mutate all API requests:
# OPA Gatekeeper constraints
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockPrivilegedContainers
metadata:
  name: block-privileged
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
    excludedNamespaces: ["kube-system"]
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockHostNamespace
metadata:
  name: block-host-namespace
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
    excludedNamespaces: ["kube-system"]
Our admission controller blocks: privileged containers, host networking/PID/IPC, hostPath mounts, NodePort services, and dozens of other risky configurations.
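For readers who haven't used Gatekeeper: constraint kinds like K8sBlockPrivilegedContainers come from ConstraintTemplates, which carry the actual Rego policy. A minimal sketch of what the template behind the first constraint might look like (our production template covers more fields, such as initContainers):
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sblockprivilegedcontainers
spec:
  crd:
    spec:
      names:
        kind: K8sBlockPrivilegedContainers
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8sblockprivilegedcontainers

      # Reject any container that sets securityContext.privileged
      violation[{"msg": msg}] {
        c := input.review.object.spec.containers[_]
        c.securityContext.privileged
        msg := sprintf("privileged container not allowed: %v", [c.name])
      }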
Layer 4: Data Encryption
All tenant data is encrypted, both in transit and at rest.
Secrets Management
Kubernetes Secrets are encrypted at rest in etcd using a per-tenant KMS key:
# API server encryption configuration
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - kms:
      name: kubebid-kms
      endpoint: unix:///var/run/kmsplugin/socket.sock
      cachesize: 1000
      timeout: 3s
  - identity: {}
Each tenant has their own encryption key, stored in our HSM-backed KMS. Even if an attacker gained access to etcd, they couldn't decrypt another tenant's secrets.
Storage Encryption
Persistent volumes use dm-crypt with per-volume keys:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypted-ssd
provisioner: kubebid.io/csi-driver
parameters:
  type: ssd
  encryption: "true"
  encryptionKeySource: tenant-kms
  fsType: ext4
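Tenants consume this through an ordinary PersistentVolumeClaim; the CSI driver handles key generation, wrapping with the tenant's KMS key, and the dm-crypt setup. A sketch (claim name and size are illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: acme-data
  namespace: tenant-acme
spec:
  storageClassName: encrypted-ssd
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi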
Audit and Monitoring
Security isn't just about prevention—it's about detection and response.
Comprehensive Audit Logging
Every API request is logged; for sensitive operations such as pod exec and attach, we also capture the request body:
# Audit policy
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log all access to secrets at metadata level
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# Log all pod exec/attach with request bodies
- level: Request
  resources:
  - group: ""
    resources: ["pods/exec", "pods/attach"]
# Log everything else at metadata level
- level: Metadata
  omitStages:
  - RequestReceived
Runtime Security
We use Falco to detect suspicious runtime behavior:
# Falco rules for multi-tenant security
- rule: Detect Privilege Escalation
  desc: Detect attempts to escalate privileges
  condition: >
    spawned_process and proc.name in (sudo, su, doas) and
    user.name != root
  output: >
    Privilege escalation attempt
    (user=%user.name command=%proc.cmdline tenant=%k8s.ns.name)
  priority: CRITICAL

- rule: Detect Container Escape
  desc: Detect potential container escape attempts
  condition: >
    container and (proc.name in (nsenter, unshare) or
    (proc.name = mount and proc.args contains "proc"))
  output: >
    Potential container escape
    (command=%proc.cmdline tenant=%k8s.ns.name)
  priority: CRITICAL
Compliance
Our security model has been validated by third-party audits:
- SOC 2 Type II: Certified for security, availability, and confidentiality
- HIPAA: BAA available for healthcare workloads
- GDPR: Full compliance with data processing agreements
- PCI DSS: Level 1 service provider certification
We also run a bug bounty program. If you find a way to access another tenant's data, we'll pay you for it.
Conclusion
Zero-trust multi-tenancy requires security at every layer: compute, network, API, and data. It's not enough to implement one or two of these controls—you need all of them working together.
If you're building a multi-tenant platform, I hope this gives you a roadmap for what's possible. And if you want to run your workloads on infrastructure with these protections built in, give KubeBid a try.
Alex Patel leads Security Engineering at KubeBid. Previously, he built security infrastructure at Stripe and was a founding member of Google's Kubernetes security team.