No VPNs, No Bastions: Zero Trust Kubernetes

Introduction

As Kubernetes platforms evolve, many organisations reach a point where their original architecture, optimised for speed and accessibility, no longer meets modern security expectations.

A common pattern in earlier-stage environments is to rely on:

  • Public Kubernetes API endpoints
  • Broad network access controls
  • VPNs or bastion hosts to secure internal systems

While these approaches are widely used, they introduce long-term challenges. VPNs extend the network perimeter and add operational overhead. Bastion hosts centralise access but still expose public IP addresses and require ongoing maintenance. In both cases, access is still fundamentally based on network location rather than identity.

This case study explores how we designed and implemented a fully private, zero-trust Kubernetes access model that eliminates the need for public endpoints, VPNs, and bastion hosts.

 

Challenge

At first glance, the platform appeared to follow good practices.

Clusters were provisioned using infrastructure-as-code, worker nodes were private, and deployments were handled through automated pipelines. From an operational standpoint, everything worked as expected.

However, a deeper review revealed a fundamental issue.

The Gap Between Perception and Reality

Although the worker nodes were private, the Kubernetes control plane remained publicly accessible. This meant:

  • The Kubernetes API could still be reached from outside the environment
  • Access controls relied on network restrictions rather than true isolation
  • In some cases, authorised networks were overly permissive

This created a misleading situation where the clusters appeared secure, but the primary entry point remained exposed.

 

Why This Became Critical

The Kubernetes API is the control surface for:

  • Deploying workloads
  • Modifying infrastructure
  • Accessing cluster state

Even when protected by IAM and network rules, a public endpoint introduces unnecessary risk. The team needed to eliminate this exposure entirely by moving to private endpoint-only clusters.

 

Security Broke the Delivery Model

Enabling private endpoints immediately exposed a deeper dependency.

The existing deployment workflow relied on:

  • External CI/CD pipelines
  • Direct communication with the Kubernetes API
  • Applying configuration (such as RootSync manifests) after cluster creation

Once the public endpoint was disabled:

  • Standard access methods stopped working
  • Pipelines could no longer complete deployments
  • The system lost its ability to bootstrap and update clusters

 

This revealed a key architectural flaw:

The platform’s delivery model depended on the very exposure it now needed to remove.

The Constraint: No VPNs or Bastion Hosts

The team also made a deliberate decision to avoid traditional solutions:

  • VPNs would introduce cost, complexity, and extend the network boundary
  • Bastion hosts would still expose public endpoints and require ongoing management

Instead, the solution needed to:

  • Keep clusters fully private
  • Avoid expanding the network perimeter
  • Provide secure, controlled access based on identity, not location

 

A Broader Pattern

A similar challenge had already been encountered in an AWS environment, where clusters were fully private, but our engineers still needed to:

  • Debug services
  • Access internal workloads
  • Connect to databases inside the cluster

This reinforced a key insight:

Removing public access is only half the problem. The real challenge is enabling secure, controlled access back into private systems.

 

Solution

To address these challenges, the team implemented a multi-layered redesign, aligning the platform with zero-trust principles and removing reliance on network-based access.

 

Step 1: Making Clusters Truly Private

The first step was to eliminate all public exposure.

Clusters were reconfigured to:

  • Enable private nodes and private endpoints
  • Disable public API access entirely
  • Restrict access to internal network ranges only

This ensured that:

  • The Kubernetes API could not be reached from the internet
  • All communication with the cluster originated inside the trusted environment
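A check like the one below can catch the gap described earlier, where nodes are private but the control plane is not. This is a minimal sketch, not the team's tooling: it assumes a cluster description shaped like the GKE API's `privateClusterConfig` and `masterAuthorizedNetworksConfig` fields, and it treats only RFC 1918 ranges as internal. The function names are hypothetical.

```python
import ipaddress
from typing import Dict, List

# Assumption: "internal" means RFC 1918 address space only.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_internal(cidr: str) -> bool:
    """Return True when the whole CIDR block sits inside an RFC 1918 range."""
    net = ipaddress.ip_network(cidr, strict=False)
    return net.version == 4 and any(net.subnet_of(r) for r in RFC1918)


def audit_cluster(cluster: Dict) -> List[str]:
    """List the ways a cluster description still exposes its control plane."""
    findings = []
    private = cluster.get("privateClusterConfig", {})
    if not private.get("enablePrivateNodes", False):
        findings.append("nodes have public IP addresses")
    if not private.get("enablePrivateEndpoint", False):
        findings.append("control plane has a public endpoint")
    authorised = cluster.get("masterAuthorizedNetworksConfig", {})
    for block in authorised.get("cidrBlocks", []):
        if not is_internal(block["cidrBlock"]):
            findings.append(
                f"authorised network {block['cidrBlock']} is not internal"
            )
    return findings
```

A cluster with private nodes, a public endpoint, and `0.0.0.0/0` as an authorised network would produce two findings, which is exactly the "appears secure" situation from the initial review.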

 

Step 2: Moving Deployment Inside the Network

With external access removed, the deployment model had to change.

Traditionally, services such as GitHub Actions and Bitbucket Pipelines run on external infrastructure and connect into the target environment to perform deployments.

The team reversed this model.

The source code remained in GitHub and Bitbucket, but the execution environment was moved inside the private network. A self-hosted CI/CD runner securely pulls the pipeline jobs and source code, then performs all deployment steps from within the VPC.

This means code stays external, but execution happens internally.

As a result:

  • No public Kubernetes API endpoint is required
  • No inbound access to the cluster is needed
  • Existing CI/CD workflows continue with minimal changes
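The reversed direction of travel can be illustrated with a small simulation. This is a sketch only: `HostedQueue` stands in for the hosted CI service, `run_once` for a single polling cycle of the self-hosted runner, and both names are hypothetical. A real runner would poll over outbound HTTPS rather than call an in-memory queue.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Optional


@dataclass
class Job:
    job_id: str
    manifest: str


class HostedQueue:
    """Stand-in for the external CI service. It never connects to the runner."""

    def __init__(self) -> None:
        self._pending: Deque[Job] = deque()

    def enqueue(self, job: Job) -> None:
        self._pending.append(job)

    def claim_next(self) -> Optional[Job]:
        # Called BY the runner, so the connection is always outbound.
        return self._pending.popleft() if self._pending else None


def run_once(queue: HostedQueue, deploy: Callable[[str], None]) -> Optional[str]:
    """One polling cycle: claim a job, deploy from inside the network."""
    job = queue.claim_next()
    if job is None:
        return None
    deploy(job.manifest)  # reaches the private Kubernetes API internally
    return job.job_id
```

The point of the design is that the only component able to reach the cluster is the one that initiates every connection, so nothing outside the VPC needs a route in.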

 

Internal CI/CD Execution

A private CI/CD runner was deployed inside the same VPC as the cluster.

This runner:

  • Executes pipeline jobs internally
  • Has direct access to the private Kubernetes API
  • Handles infrastructure provisioning and workload deployment

This allowed the team to:

  • Maintain existing pipelines
  • Avoid exposing the cluster
  • Restore full deployment functionality

From a developer perspective, workflows remained largely unchanged, while security improved significantly behind the scenes.

 

Step 3: Reducing Dependency on Direct Access (GitOps)

To further strengthen the model, the team introduced GitOps principles.

Instead of relying on direct API calls:

  • Desired state is defined in version-controlled manifests
  • The cluster continuously synchronises itself with that state

Using Config Sync:

  • Workloads are applied automatically inside the cluster
  • The need for inbound connectivity is reduced
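The reconcile idea behind this pattern can be sketched in a few lines. This is not how Config Sync is implemented; it is an illustrative comparison of desired and live state, with resources reduced to a name mapped to a version string and a hypothetical `plan_sync` function.

```python
from typing import Dict, List, Tuple


def plan_sync(desired: Dict[str, str], live: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the actions needed to make the live state match the desired state."""
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(live):
        if name not in desired:
            # Present in the cluster but no longer declared in Git.
            actions.append(("delete", name))
    return actions
```

Because this loop runs inside the cluster and pulls from the repository, a change to the manifests is enough to trigger a deployment, with no caller needing to reach the Kubernetes API from outside.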

This creates a more resilient and secure deployment pattern.

 

Step 4: Replacing Network Access with Identity-Based Access

A key part of the solution was removing reliance on network-level trust.

Instead of VPNs or bastion hosts, the team adopted identity-driven access patterns.

 

GCP Approach: Connect Gateway

Access to clusters is handled through a managed API layer:

  • Users authenticate via IAM
  • Requests are routed through Google’s control plane
  • Traffic is securely proxied to the private cluster

This means:

  • No direct endpoint exposure
  • No VPN required
  • No bastion infrastructure

Access becomes:

  • Centralised
  • Auditable
  • Governed by identity rather than network location
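In practice, an engineer would typically obtain a kubeconfig entry that points at the gateway rather than at the cluster endpoint. The helper below is a sketch that only builds the `gcloud container fleet memberships get-credentials` invocation; the function name, membership name, and project ID are hypothetical, and it assumes the cluster is already registered to a fleet.

```python
from typing import List


def gateway_credentials_command(membership: str, project: str) -> List[str]:
    """Build the gcloud call that writes a Connect Gateway kubeconfig entry."""
    return [
        "gcloud", "container", "fleet", "memberships", "get-credentials",
        membership,
        "--project", project,
    ]


# Example (hypothetical names). To execute it, pass the list to
# subprocess.run(command, check=True) on a machine with gcloud installed.
command = gateway_credentials_command("prod-cluster", "example-project")
```

After this, `kubectl` requests travel through the gateway and are only accepted if the caller holds the appropriate IAM roles and Kubernetes RBAC bindings.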

 

AWS Comparison: SSM Tunnel-Based Access

In AWS, the same zero-trust objectives are achieved using a different model.

 

Deployments via CI/CD

For automated deployments to private EKS clusters:

  • A self-hosted Bitbucket runner operates inside the AWS environment
  • The runner securely pulls pipeline jobs from Bitbucket
  • Deployment traffic reaches the cluster over internal network paths only
  • No public Kubernetes API endpoint is required

In this model, the pipeline execution environment runs inside the private network and initiates outbound connections to Bitbucket.

 

Human Access via SSM Tunnel

For engineers who need temporary access to services inside the cluster:

  • A sidecar container runs an AWS Systems Manager (SSM) Agent alongside the application
  • The agent registers with AWS Systems Manager
  • Engineers create an authenticated, short-lived tunnel from their local machine
  • Traffic is routed through AWS infrastructure directly to the target service

For example, a developer debugging a PostgreSQL instance can create a local tunnel that connects securely to the database inside the cluster without exposing it publicly.
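The PostgreSQL example could look roughly like this from the engineer's side. It is a sketch under assumptions: it uses the AWS CLI's `aws ssm start-session` command with the `AWS-StartPortForwardingSession` document, requires the Session Manager plugin locally, and the target ID and port numbers are hypothetical.

```python
import json
from typing import List


def port_forward_command(target: str, remote_port: int, local_port: int) -> List[str]:
    """Build an SSM port-forwarding session command for a registered agent."""
    parameters = {
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", target,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(parameters),
    ]


# Hypothetical managed-instance ID for the sidecar agent; PostgreSQL on 5432
# becomes reachable on localhost:15432 while the session is open.
command = port_forward_command("mi-0123456789abcdef0", 5432, 15432)
```

The session is authorised by IAM and ends when the command exits, so no standing network path to the database is left behind.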

 

Key Characteristics

This AWS model:

  • Requires no inbound network access
  • Uses short-lived, authenticated connections
  • Separates CI/CD access from human access
  • Supports tightly controlled debugging workflows
  • Keeps all cluster endpoints private

 

Google Cloud Comparison

In Google Cloud, both CI/CD systems and human users access private Google Kubernetes Engine clusters through GKE Connect Gateway.

Where the execution environment remains external, all access is brokered through Google's control plane and governed by Google Cloud IAM.

 

Shared Principle

Although implemented differently, both AWS and Google Cloud follow the same architectural principles:

  • No public endpoints
  • No persistent network access
  • Identity-based authentication
  • Controlled, temporary connectivity

This is the foundation of a zero-trust architecture.

 

 

In Google Cloud, the control plane acts as the secure intermediary.

In AWS, trusted components run inside the private environment and establish outbound connections to external systems.

Both approaches eliminate public exposure while maintaining secure, auditable access for automation and engineers alike.

 

 

Step 5: Locking Down Network Egress

To complete the security model, outbound traffic was restricted.

A deny-by-default approach was implemented:

  • Only internal network ranges are allowed
  • Required cloud service endpoints are permitted
  • All other traffic is blocked

This ensures that workloads:

  • Cannot freely access the internet
  • Only communicate with approved services
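The deny-by-default rule can be expressed as a simple allowlist check. This is an illustrative sketch, not the firewall configuration itself: the allowlist values are assumptions, with `10.0.0.0/8` standing in for internal ranges and `199.36.153.8/30` used as an example of a cloud service range (it is the range documented for `private.googleapis.com`, as far as we are aware).

```python
import ipaddress
from typing import Iterable

# Assumed allowlist: internal ranges plus one approved cloud service range.
ALLOWED_EGRESS = ("10.0.0.0/8", "199.36.153.8/30")


def egress_allowed(destination: str, allowed: Iterable[str] = ALLOWED_EGRESS) -> bool:
    """Deny by default: permit only destinations inside an allowlisted range."""
    address = ipaddress.ip_address(destination)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed)
```

Anything that does not match an entry is rejected, so adding a new external dependency becomes an explicit, reviewable change to the allowlist.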

 

Step 6: Enforcing Least Privilege IAM

IAM was restructured to align with the new model:

  • CI/CD systems have narrowly scoped permissions
  • Users require explicit access roles
  • Responsibilities are clearly separated

This reduces risk and ensures access is granted only where necessary.
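One way to keep this from drifting is to scan IAM policies for broad grants. The sketch below assumes a policy document in the usual `bindings` shape (a role plus a list of members) and flags only the basic Owner and Editor roles; the function name and the example principals are hypothetical.

```python
from typing import Dict, List

# Assumption: these basic roles are considered too broad for any principal.
BROAD_ROLES = {"roles/owner", "roles/editor"}


def broad_grants(policy: Dict) -> List[str]:
    """Return one finding per member that holds an overly broad role."""
    findings = []
    for binding in policy.get("bindings", []):
        if binding["role"] in BROAD_ROLES:
            for member in binding.get("members", []):
                findings.append(f"{member} holds {binding['role']}")
    return findings
```

Run against each project's policy in a pipeline, a check of this kind turns "narrowly scoped permissions" from a guideline into something that fails a build when it is violated.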

 

Results

The platform was transformed from a traditionally secured Kubernetes environment into a fully private, identity-driven architecture that is secure, scalable, and aligned with modern zero-trust principles.

 

What Changed

  • All Kubernetes clusters now operate with private endpoints only
  • The Kubernetes API is no longer accessible from the internet
  • CI/CD pipelines continue to deploy without relying on public access
  • Engineers can securely access and debug services when needed
  • VPNs and bastion hosts have been completely eliminated
  • Outbound traffic is tightly controlled through restricted egress rules
  • IAM permissions follow least privilege principles

 

Security Improvements

The new architecture significantly reduced the attack surface by removing public endpoints and replacing network-based trust with identity-based access controls.

All access is now:

  • Authenticated and authorised
  • Fully auditable
  • Temporary where required
  • Independent of network location

 

 

Operational Benefits

These security improvements were achieved without disrupting day-to-day development.

  • Existing pipelines required minimal changes
  • Developer workflows remained familiar
  • Platform complexity is handled internally rather than by end users
  • Secure access is available when needed for troubleshooting and support

 

Future Readiness

The platform is now positioned to support further enhancements, including:

  • Full GitOps adoption
  • Advanced controls such as VPC Service Controls
  • Scalable multi-environment deployments
  • Secure break-glass access patterns

 

Outcome

The organisation now has a secure and scalable Kubernetes platform that delivers:

  • Fully private clusters
  • Zero-trust access
  • Seamless CI/CD
  • Controlled operational access
  • Reduced operational overhead
  • Improved security posture

 

Key Takeaway

This case study highlights a fundamental shift in platform security:

Secure access is no longer about protecting a network perimeter; it is about removing it entirely.

By replacing VPNs and bastion hosts with:

  • Private endpoints
  • Internal execution
  • Identity-based access
  • Controlled connectivity patterns

the team achieved a secure, zero-trust Kubernetes architecture without sacrificing usability, performance, or developer productivity.

 

