No VPNs, No Bastions: Zero Trust Kubernetes

Introduction

As Kubernetes platforms evolve, many organisations reach a point where their original architecture, optimised for speed and accessibility, no longer meets modern security expectations.

A common pattern in earlier-stage environments is to rely on:

Public Kubernetes API endpoints
Broad network access controls
VPNs or bastion hosts to secure internal systems

While these approaches are widely used, they introduce long-term challenges. VPNs extend the network perimeter and add operational overhead. Bastion hosts centralise access but still expose public IP addresses and require ongoing maintenance. In both cases, access is still fundamentally based on network location rather than identity.

This case study explores how we designed and implemented a fully private, zero-trust Kubernetes access model that eliminates the need for public endpoints, VPNs, and bastion hosts.

Challenge

At first glance, the platform appeared to follow good practices.

Clusters were provisioned using infrastructure-as-code, worker nodes were private, and deployments were handled through automated pipelines. From an operational standpoint, everything worked as expected.

However, a deeper review revealed a fundamental issue.

The Gap Between Perception and Reality

Although the worker nodes were private, the Kubernetes control plane remained publicly accessible. This meant:

The Kubernetes API could still be reached from outside the environment
Access controls relied on network restrictions rather than true isolation
In some cases, authorised networks were overly permissive

This created a misleading situation where the clusters appeared secure, but the primary entry point remained exposed.

Why This Became Critical

The Kubernetes API is the control surface for:

Deploying workloads
Modifying infrastructure
Accessing cluster state

Even when protected by IAM and network rules, a public endpoint introduces unnecessary risk. The team needed to eliminate this exposure entirely by moving to private endpoint-only clusters.

Security Broke the Delivery Model

Enabling private endpoints immediately exposed a deeper dependency.

The existing deployment workflow relied on:

External CI/CD pipelines
Direct communication with the Kubernetes API
Applying configuration (such as RootSync manifests) after cluster creation

Once the public endpoint was disabled:

Standard access methods stopped working
Pipelines could no longer complete deployments
The system lost its ability to bootstrap and update clusters

This revealed a key architectural flaw:

The platform’s delivery model depended on the very exposure it now needed to remove.

The Constraint: No VPNs or Bastion Hosts

The team also made a deliberate decision to avoid traditional solutions:

VPNs would add cost and complexity, and would extend the network boundary
Bastion hosts would still expose public endpoints and require ongoing management

Instead, the solution needed to:

Keep clusters fully private
Avoid expanding the network perimeter
Provide secure, controlled access based on identity, not location

A Broader Pattern

A similar challenge had already been encountered in an AWS environment, where clusters were fully private, but our engineers still needed to:

Debug services
Access internal workloads
Connect to databases inside the cluster

This reinforced a key insight:

Removing public access is only half the problem. The real challenge is enabling secure, controlled access back into private systems.

Solution

To address these challenges, the team implemented a multi-layered redesign, aligning the platform with zero-trust principles and removing reliance on network-based access.

Step 1: Making Clusters Truly Private

The first step was to eliminate all public exposure.

Clusters were reconfigured to:

Enable private nodes and private endpoints
Disable public API access entirely
Restrict access to internal network ranges only

This ensured that:

The Kubernetes API could not be reached from the internet
All communication with the cluster originated inside the trusted environment
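
As a rough illustration of what "truly private" can mean in practice, the sketch below audits a cluster description for the three settings listed above. It assumes a GKE-style cluster document of the kind returned by a describe call, with `privateClusterConfig` and `masterAuthorizedNetworksConfig` fields; newer API versions may expose these settings under different names, and treating only RFC 1918 ranges as "internal" is also an assumption. The function names are hypothetical.

```python
from ipaddress import ip_network

# RFC 1918 private IPv4 ranges, used here as the definition of "internal".
RFC1918 = [ip_network(c) for c in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_internal(cidr: str) -> bool:
    """Return True if the CIDR sits entirely inside an RFC 1918 range."""
    net = ip_network(cidr, strict=False)
    return net.version == 4 and any(net.subnet_of(r) for r in RFC1918)


def audit_cluster(cluster: dict) -> list[str]:
    """Return a list of findings; an empty list means no exposure was found."""
    findings = []
    private = cluster.get("privateClusterConfig", {})
    if not private.get("enablePrivateNodes"):
        findings.append("nodes are not private")
    if not private.get("enablePrivateEndpoint"):
        findings.append("public API endpoint is still enabled")
    authorised = cluster.get("masterAuthorizedNetworksConfig", {})
    for block in authorised.get("cidrBlocks", []):
        if not is_internal(block["cidrBlock"]):
            findings.append(f"authorised network {block['cidrBlock']} is not internal")
    return findings
```

A check like this could run in a pipeline after provisioning, failing the job whenever the returned list is not empty.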

Step 2: Moving Deployment Inside the Network

With external access removed, the deployment model had to change.

Traditionally, services such as GitHub Actions and Bitbucket Pipelines run on external infrastructure and connect into the target environment to perform deployments.

The team reversed this model.

The source code remained in GitHub and Bitbucket, but the execution environment was moved inside the private network. A self-hosted CI/CD runner securely pulls the pipeline jobs and source code, then performs all deployment steps from within the VPC.

This means code stays external, but execution happens internally.

As a result:

No public Kubernetes API endpoint is required
No inbound access to the cluster is needed
Existing CI/CD workflows continue with minimal changes
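
The reversed direction of connection can be shown with a toy, in-memory model. `JobQueue` and `Runner` below are hypothetical stand-ins for the hosted CI service and the self-hosted runner; this is not the GitHub Actions or Bitbucket runner protocol, only a sketch of the pull pattern in which the runner always initiates the request.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class JobQueue:
    """Stands in for the hosted CI service, which only ever answers requests."""
    pending: deque = field(default_factory=deque)

    def claim_next(self):
        # Called by the runner; the queue never opens a connection itself.
        return self.pending.popleft() if self.pending else None


@dataclass
class Runner:
    """Stands in for the self-hosted runner inside the private network."""
    queue: JobQueue
    deployed: list = field(default_factory=list)

    def poll_once(self) -> bool:
        job = self.queue.claim_next()  # outbound request to the CI service
        if job is None:
            return False
        self.deployed.append(job)  # the deploy step would run against the private API
        return True
```

Because every interaction starts from `poll_once`, nothing in the model requires the CI service to reach into the network where the runner lives.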

Internal CI/CD Execution

A private CI/CD runner was deployed inside the same VPC as the cluster.

This runner:

Executes pipeline jobs internally
Has direct access to the private Kubernetes API
Handles infrastructure provisioning and workload deployment

This allowed the team to:

Maintain existing pipelines
Avoid exposing the cluster
Restore full deployment functionality

From a developer perspective, workflows remained largely unchanged, while security improved significantly behind the scenes.

Step 3: Reducing Dependency on Direct Access (GitOps)

To further strengthen the model, the team introduced GitOps principles.

Instead of relying on direct API calls:

Desired state is defined in version-controlled manifests
The cluster continuously synchronises itself with that state

Using Config Sync:

Workloads are applied automatically inside the cluster
The need for inbound connectivity is reduced

This creates a more resilient and secure deployment pattern.
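
The idea of a cluster synchronising itself with a declared state can be sketched as a small reconcile function. This is a simplified model, not how Config Sync is implemented: resources are plain dictionaries keyed by a hypothetical name, and the assumption that objects missing from Git are deleted is made here only for illustration.

```python
def plan_sync(desired: dict, actual: dict) -> dict:
    """Compare desired manifests (from Git) with live objects and plan changes."""
    return {
        "create": sorted(k for k in desired if k not in actual),
        "update": sorted(k for k in desired if k in actual and desired[k] != actual[k]),
        "delete": sorted(k for k in actual if k not in desired),
    }


def reconcile(desired: dict, actual: dict) -> dict:
    """Apply the plan to a copy of the live state and return the converged state."""
    plan = plan_sync(desired, actual)
    state = dict(actual)
    for key in plan["create"] + plan["update"]:
        state[key] = desired[key]
    for key in plan["delete"]:
        del state[key]
    return state
```

Running `plan_sync` again on the reconciled state yields an empty plan, which is the property that lets the loop run continuously from inside the cluster without any inbound trigger.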

Step 4: Replacing Network Access with Identity-Based Access

A key part of the solution was removing reliance on network-level trust.

Instead of VPNs or bastion hosts, the team adopted identity-driven access patterns.

GCP Approach: Connect Gateway

Access to clusters is handled through a managed API layer:

Users authenticate via IAM
Requests are routed through Google’s control plane
Traffic is securely proxied to the private cluster

This means:

No direct endpoint exposure
No VPN required
No bastion infrastructure

Access becomes:

Centralised
Auditable
Governed by identity-based controls
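
For a sense of what this looks like from an engineer's terminal, the sketch below builds the `gcloud` call that fetches a Connect Gateway kubeconfig entry for a fleet membership. The membership name and project ID are hypothetical placeholders, and the helper function is illustrative only; it assumes the cluster is already registered to a fleet and that the caller holds the required IAM roles.

```python
import shlex


def connect_gateway_credentials_cmd(membership: str, project: str) -> list[str]:
    """Build the gcloud call that writes a Connect Gateway kubeconfig entry."""
    return [
        "gcloud", "container", "fleet", "memberships", "get-credentials",
        membership,
        f"--project={project}",
    ]


# Hypothetical membership and project names, for illustration only.
cmd = connect_gateway_credentials_cmd("prod-cluster", "example-project")
print(shlex.join(cmd))
```

Once the kubeconfig entry exists, ordinary `kubectl` commands travel through the gateway, so no route to the private endpoint is needed on the engineer's machine.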

AWS Comparison: SSM Tunnel-Based Access

In AWS, the same zero-trust objectives are achieved using a different model.

Deployments via CI/CD

For automated deployments to private EKS clusters:

A self-hosted Bitbucket runner operates inside the AWS environment
The runner securely pulls pipeline jobs from Bitbucket
Deployment traffic reaches the cluster over internal network paths only
No public Kubernetes API endpoint is required

In this model, the pipeline execution environment runs inside the private network and initiates outbound connections to Bitbucket.

Human Access via SSM Tunnel

For engineers who need temporary access to services inside the cluster:

A sidecar container runs an AWS Systems Manager (SSM) Agent alongside the application
The agent registers with AWS Systems Manager
Engineers create an authenticated, short-lived tunnel from their local machine
Traffic is routed through AWS infrastructure directly to the target service

For example, a developer debugging a PostgreSQL instance can create a local tunnel that connects securely to the database inside the cluster without exposing it publicly.
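
A minimal sketch of how such a tunnel might be opened is shown below. It builds an `aws ssm start-session` command using the `AWS-StartPortForwardingSession` document, which forwards a local port to a port on the managed node itself; this assumes the database listens in the same pod as the sidecar, so both share a network namespace. If the database runs elsewhere, the `AWS-StartPortForwardingSessionToRemoteHost` document with an additional `host` parameter would be the closer fit. The target ID is a hypothetical placeholder, and the Session Manager plugin is assumed to be installed locally.

```python
import json


def ssm_port_forward_cmd(target: str, remote_port: int, local_port: int) -> list[str]:
    """Build an AWS CLI call that opens a port-forwarding session over SSM."""
    params = {
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", target,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(params),
    ]


# Hypothetical managed-instance ID for the sidecar's registered agent.
tunnel = ssm_port_forward_cmd("mi-0123456789abcdef0", 5432, 15432)
```

With the session running, a local PostgreSQL client would connect to `localhost:15432`, and the connection ends when the session is closed.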

Key Characteristics

This AWS model:

Requires no inbound network access
Uses short-lived, authenticated connections
Separates CI/CD access from human access
Supports tightly controlled debugging workflows
Keeps all cluster endpoints private

Google Cloud Comparison

In Google Cloud, both CI/CD systems and human users access private Google Kubernetes Engine clusters through GKE Connect Gateway.

Where the execution environment remains external, all access is still brokered through Google's control plane and governed by Google Cloud IAM.

Shared Principle

Although implemented differently, both AWS and Google Cloud follow the same architectural principles:

No public endpoints
No persistent network access
Identity-based authentication
Controlled, temporary connectivity

This is the foundation of a zero-trust architecture.

In Google Cloud, the control plane acts as the secure intermediary.

In AWS, trusted components run inside the private environment and establish outbound connections to external systems.

Both approaches eliminate public exposure while maintaining secure, auditable access for automation and engineers alike.

Step 5: Locking Down Network Egress

To complete the security model, outbound traffic was restricted.

A deny-by-default approach was implemented:

Only internal network ranges are allowed
Required cloud service endpoints are permitted
All other traffic is blocked

This ensures that workloads:

Cannot freely access the internet
Only communicate with approved services
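
The deny-by-default logic can be sketched as an allowlist check. The ranges below are assumptions chosen for illustration: `10.0.0.0/8` stands in for the internal network, and `199.36.153.8/30` is the range Google publishes for `private.googleapis.com`. A real deployment would express this as firewall rules rather than application code.

```python
from ipaddress import ip_address, ip_network

# Assumed allowlist: everything not matched here is denied.
ALLOWED_EGRESS = [
    ip_network("10.0.0.0/8"),       # internal ranges (assumed)
    ip_network("199.36.153.8/30"),  # private.googleapis.com
]


def egress_allowed(destination: str) -> bool:
    """Return True only if the destination falls inside an allowed range."""
    addr = ip_address(destination)
    return any(addr in net for net in ALLOWED_EGRESS)
```

The important property is the default: a destination that matches no rule is blocked, so adding a new external dependency requires an explicit allowlist change.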

Step 6: Enforcing Least Privilege IAM

IAM was restructured to align with the new model:

CI/CD systems have narrowly scoped permissions
Users require explicit access roles
Responsibilities are clearly separated

This reduces risk and ensures access is granted only where necessary.
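
One way to keep permissions narrow over time is to scan IAM policies for broad grants. The sketch below assumes a Google Cloud style policy document with a `bindings` list of `role` and `members`, and flags the basic `roles/owner` and `roles/editor` roles; the choice of which roles count as too broad, and the member names in the example, are assumptions for illustration.

```python
# Basic roles treated as too broad for this check (an assumption).
BASIC_ROLES = {"roles/owner", "roles/editor"}


def broad_grants(policy: dict) -> list[tuple[str, str]]:
    """Return sorted (member, role) pairs that hold a basic role."""
    findings = []
    for binding in policy.get("bindings", []):
        if binding["role"] in BASIC_ROLES:
            findings.extend((member, binding["role"]) for member in binding.get("members", []))
    return sorted(findings)


# Hypothetical policy, for illustration only.
example_policy = {
    "bindings": [
        {"role": "roles/editor", "members": ["serviceAccount:ci@example.iam.gserviceaccount.com"]},
        {"role": "roles/container.developer", "members": ["user:dev@example.com"]},
    ]
}
flagged = broad_grants(example_policy)
```

Here only the CI service account is flagged, which is the kind of finding that would prompt replacing a basic role with a narrowly scoped one.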

Results

The platform was transformed from a traditionally secured Kubernetes environment into a fully private, identity-driven architecture that is secure, scalable, and aligned with zero-trust principles.

What Changed

All Kubernetes clusters now operate with private endpoints only
The Kubernetes API is no longer accessible from the internet
CI/CD pipelines continue to deploy without relying on public access
Engineers can securely access and debug services when needed
VPNs and bastion hosts have been completely eliminated
Outbound traffic is tightly controlled through restricted egress rules
IAM permissions follow least privilege principles

Security Improvements

The new architecture significantly reduced the attack surface by removing public endpoints and replacing network-based trust with identity-based access controls.

All access is now:

Authenticated and authorised
Fully auditable
Temporary where required
Independent of network location

Operational Benefits

These security improvements were achieved without disrupting day-to-day development.

Existing pipelines required minimal changes
Developer workflows remained familiar
Platform complexity is handled internally rather than by end users
Secure access is available when needed for troubleshooting and support

Future Readiness

The platform is now positioned to support further enhancements, including:

Full GitOps adoption
Advanced controls such as VPC Service Controls
Scalable multi-environment deployments
Secure break-glass access patterns

Outcome

The organisation now has a secure and scalable Kubernetes platform that delivers:

Fully private clusters
Zero-trust access
Seamless CI/CD
Controlled operational access
Reduced operational overhead
Improved security posture

Key Takeaway

This case study highlights a fundamental shift in platform security:

Secure access is no longer about protecting a network perimeter; it is about removing it entirely.

The team replaced VPNs and bastion hosts with:

Private endpoints
Internal execution
Identity-based access
Controlled connectivity patterns

The result is a secure, zero-trust Kubernetes architecture that does not sacrifice usability, performance, or developer productivity.

Ready to move to a fully private, identity-driven model?

We can help you redesign your platform around private endpoints, internal execution, and zero-trust access, without compromising operability.
