

Implementing Private EKS with Transit Gateway and Hybrid Connectivity

Deploying a fully private EKS cluster with no public endpoints, Transit Gateway for multi-VPC and on-premises routing, PrivateLink for AWS services, and hybrid DNS resolution - achieving enterprise-grade network isolation.


Milan Dangol

Sr DevOps & DevSecOps Engineer

Jul 20, 2025
12 min read

Introduction

Running EKS with public endpoints is a non-starter for regulated industries. When I was tasked with building a Kubernetes platform for financial services, the requirements were clear:

  • No public IP addresses - anywhere
  • All traffic stays on private networks
  • Connectivity to on-premises data centers
  • Centralized egress for compliance logging

This post covers how I architected a fully private EKS deployment using Transit Gateway as the backbone for multi-VPC and hybrid connectivity.

Architecture Overview

flowchart TB
  subgraph OnPrem["On-Premises Data Center"]
    DC[Corporate Network]
    DNS_ONPREM[On-Prem DNS]
    APPS[Legacy Applications]
  end
  subgraph AWSCloud["AWS Cloud"]
    subgraph TransitGateway["Transit Gateway Hub"]
      TGW[Transit Gateway]
      TGW_RT_PROD[Prod Route Table]
      TGW_RT_SHARED[Shared Route Table]
      TGW_RT_EGRESS[Egress Route Table]
    end
    subgraph ProdVPC["Production VPC - 10.100.0.0/16"]
      subgraph PrivateSubnets["Private Subnets"]
        EKS_CP[EKS Control Plane ENIs]
        EKS_NODES[EKS Worker Nodes]
        PODS[Application Pods]
      end
    end
    subgraph SharedVPC["Shared Services VPC - 10.200.0.0/16"]
      subgraph SharedSubnets["Private Subnets"]
        ECR[ECR VPC Endpoint]
        S3[S3 VPC Endpoint]
        STS[STS VPC Endpoint]
        SECRETS[Secrets Manager Endpoint]
      end
      R53_RESOLVER[Route 53 Resolver]
    end
    subgraph EgressVPC["Egress VPC - 10.250.0.0/16"]
      NAT[NAT Gateway]
      PROXY[Squid Proxy]
      FIREWALL[Network Firewall]
    end
    subgraph SecurityVPC["Security VPC - 10.240.0.0/16"]
      SIEM[SIEM/Logging]
      VAULT[HashiCorp Vault]
    end
  end
  DC <-->|Direct Connect| TGW
  DNS_ONPREM <-->|DNS Forwarding| R53_RESOLVER
  TGW --> ProdVPC
  TGW --> SharedVPC
  TGW --> EgressVPC
  TGW --> SecurityVPC
  EKS_NODES --> ECR
  EKS_NODES --> S3
  PODS --> SECRETS
  PODS -->|Outbound Internet| NAT
  NAT --> FIREWALL
  style OnPrem fill:#6c757d,stroke:#fff,stroke-width:2px,color:#fff
  style TransitGateway fill:#ff6b6b,stroke:#fff,stroke-width:2px,color:#fff
  style ProdVPC fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff
  style SharedVPC fill:#264653,stroke:#3a86ff,stroke-width:2px,color:#fff
  style EgressVPC fill:#264653,stroke:#f77f00,stroke-width:2px,color:#fff
  style SecurityVPC fill:#264653,stroke:#e63946,stroke-width:2px,color:#fff

Network CIDR Design

flowchart LR
  subgraph CIDRPlan["IP Address Plan"]
    direction TB
    subgraph AWS["AWS VPCs"]
      PROD["Production<br/>10.100.0.0/16"]
      SHARED["Shared Services<br/>10.200.0.0/16"]
      SECURITY["Security<br/>10.240.0.0/16"]
      EGRESS["Egress<br/>10.250.0.0/16"]
    end
    subgraph OnPremises["On-Premises"]
      DC1["Data Center 1<br/>172.16.0.0/12"]
      DC2["Data Center 2<br/>192.168.0.0/16"]
    end
    subgraph Reserved["Reserved for Growth"]
      FUTURE["Future VPCs<br/>10.0.0.0/8 range"]
    end
  end
  style CIDRPlan fill:#1a1a2e,stroke:#00d9ff,stroke-width:2px,color:#fff
  style AWS fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff
  style OnPremises fill:#6c757d,stroke:#fff,stroke-width:2px,color:#fff
  style Reserved fill:#3a3a5c,stroke:#fff,stroke-width:1px,color:#fff

Transit Gateway Configuration

# transit-gateway/main.tf

resource "aws_ec2_transit_gateway" "main" {
  description                     = "Central Transit Gateway"
  default_route_table_association = "disable"
  default_route_table_propagation = "disable"
  dns_support                     = "enable"
  vpn_ecmp_support               = "enable"

  tags = {
    Name = "central-tgw"
  }
}

# Route Tables for different traffic patterns
resource "aws_ec2_transit_gateway_route_table" "production" {
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  
  tags = {
    Name = "tgw-rt-production"
  }
}

resource "aws_ec2_transit_gateway_route_table" "shared_services" {
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  
  tags = {
    Name = "tgw-rt-shared-services"
  }
}

resource "aws_ec2_transit_gateway_route_table" "egress" {
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  
  tags = {
    Name = "tgw-rt-egress"
  }
}

# VPC Attachments
resource "aws_ec2_transit_gateway_vpc_attachment" "production" {
  subnet_ids         = var.production_tgw_subnet_ids
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  vpc_id             = var.production_vpc_id
  
  dns_support        = "enable"
  
  tags = {
    Name = "tgw-attach-production"
  }
}

resource "aws_ec2_transit_gateway_vpc_attachment" "shared_services" {
  subnet_ids         = var.shared_services_tgw_subnet_ids
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  vpc_id             = var.shared_services_vpc_id
  
  dns_support        = "enable"
  
  tags = {
    Name = "tgw-attach-shared-services"
  }
}

resource "aws_ec2_transit_gateway_vpc_attachment" "egress" {
  subnet_ids         = var.egress_tgw_subnet_ids
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  vpc_id             = var.egress_vpc_id
  
  appliance_mode_support = "enable"  # For firewall inspection
  dns_support            = "enable"
  
  tags = {
    Name = "tgw-attach-egress"
  }
}

# Direct Connect Gateway attachment
resource "aws_dx_gateway_association" "main" {
  dx_gateway_id         = var.dx_gateway_id
  associated_gateway_id = aws_ec2_transit_gateway.main.id
  
  allowed_prefixes = [
    "10.0.0.0/8",
  ]
}
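
In a multi-account setup, the Transit Gateway usually lives in a central network account and gets shared to workload accounts through AWS RAM. That sharing isn't shown above; here's a minimal sketch, assuming an organization_arn variable holding your AWS Organization ARN:

# Share the Transit Gateway with the rest of the organization (sketch)
resource "aws_ram_resource_share" "tgw" {
  name                      = "tgw-share"
  allow_external_principals = false
}

resource "aws_ram_resource_association" "tgw" {
  resource_arn       = aws_ec2_transit_gateway.main.arn
  resource_share_arn = aws_ram_resource_share.tgw.arn
}

resource "aws_ram_principal_association" "tgw" {
  principal          = var.organization_arn  # assumed variable
  resource_share_arn = aws_ram_resource_share.tgw.arn
}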

Transit Gateway Route Tables

flowchart TB
  subgraph TGWRouting["Transit Gateway Route Tables"]
    subgraph ProdRT["Production Route Table"]
      PR1["10.200.0.0/16 -> Shared Services"]
      PR2["10.250.0.0/16 -> Egress VPC"]
      PR3["172.16.0.0/12 -> Direct Connect"]
      PR4["0.0.0.0/0 -> Egress VPC"]
    end
    subgraph SharedRT["Shared Services Route Table"]
      SR1["10.100.0.0/16 -> Production"]
      SR2["10.240.0.0/16 -> Security"]
      SR3["172.16.0.0/12 -> Direct Connect"]
    end
    subgraph EgressRT["Egress Route Table"]
      ER1["10.0.0.0/8 -> Blackhole"]
      ER2["Return traffic only"]
    end
  end
  style TGWRouting fill:#1a1a2e,stroke:#00d9ff,stroke-width:2px,color:#fff
  style ProdRT fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff
  style SharedRT fill:#264653,stroke:#3a86ff,stroke-width:2px,color:#fff
  style EgressRT fill:#264653,stroke:#f77f00,stroke-width:2px,color:#fff

# Transit Gateway Routes

# Production can reach Shared Services
resource "aws_ec2_transit_gateway_route" "prod_to_shared" {
  destination_cidr_block         = "10.200.0.0/16"
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.shared_services.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.production.id
}

# Production default route to Egress for internet
resource "aws_ec2_transit_gateway_route" "prod_to_egress" {
  destination_cidr_block         = "0.0.0.0/0"
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.egress.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.production.id
}

# Production to on-premises via Direct Connect.
# The DX gateway association above creates the TGW attachment, but Terraform
# exposes its ID through a data source rather than a managed resource.
data "aws_ec2_transit_gateway_dx_gateway_attachment" "main" {
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  dx_gateway_id      = var.dx_gateway_id

  depends_on = [aws_dx_gateway_association.main]
}

resource "aws_ec2_transit_gateway_route" "prod_to_onprem" {
  destination_cidr_block         = "172.16.0.0/12"
  transit_gateway_attachment_id  = data.aws_ec2_transit_gateway_dx_gateway_attachment.main.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.production.id
}

# Route table associations
resource "aws_ec2_transit_gateway_route_table_association" "production" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.production.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.production.id
}
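
The association above only wires up the production attachment; the shared services attachment needs its own association, plus a return route so replies can reach production (matching the route table diagram). A sketch:

resource "aws_ec2_transit_gateway_route_table_association" "shared_services" {
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.shared_services.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.shared_services.id
}

# Return path: Shared Services route table can reach Production
resource "aws_ec2_transit_gateway_route" "shared_to_prod" {
  destination_cidr_block         = "10.100.0.0/16"
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.production.id
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.shared_services.id
}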

Private EKS Cluster

# eks/main.tf

resource "aws_eks_cluster" "private" {
  name     = var.cluster_name
  role_arn = aws_iam_role.cluster.arn
  version  = var.kubernetes_version

  vpc_config {
    subnet_ids              = var.private_subnet_ids
    endpoint_private_access = true
    endpoint_public_access  = false
    security_group_ids      = [aws_security_group.cluster.id]
  }

  # Enable secrets encryption
  encryption_config {
    provider {
      key_arn = var.kms_key_arn
    }
    resources = ["secrets"]
  }

  # Control plane logging
  enabled_cluster_log_types = [
    "api",
    "audit",
    "authenticator",
    "controllerManager",
    "scheduler"
  ]

  depends_on = [
    aws_iam_role_policy_attachment.cluster_policy,
    aws_cloudwatch_log_group.cluster,
  ]
}
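
# The depends_on above references a log group that isn't defined yet. EKS
# writes control plane logs to a group with this exact naming convention,
# so it must exist before the cluster is created (retention is an assumption):
resource "aws_cloudwatch_log_group" "cluster" {
  name              = "/aws/eks/${var.cluster_name}/cluster"
  retention_in_days = 90
}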

# Security group for cluster
resource "aws_security_group" "cluster" {
  name_prefix = "${var.cluster_name}-cluster-"
  vpc_id      = var.vpc_id

  # Allow nodes to communicate with control plane
  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.nodes.id]
  }

  # Allow control plane to communicate with nodes
  egress {
    from_port       = 1025
    to_port         = 65535
    protocol        = "tcp"
    security_groups = [aws_security_group.nodes.id]
  }

  tags = {
    Name = "${var.cluster_name}-cluster-sg"
  }
}

VPC Endpoints for Private Access

flowchart LR
  subgraph EKSNodes["EKS Worker Nodes"]
    NODE1[Node 1]
    NODE2[Node 2]
    NODE3[Node 3]
  end
  subgraph VPCEndpoints["VPC Endpoints - Shared Services VPC"]
    ECR_API[ecr.api]
    ECR_DKR[ecr.dkr]
    S3_EP[s3]
    STS_EP[sts]
    EC2_EP[ec2]
    LOGS_EP[logs]
    SSM_EP[ssm]
    SECRETS_EP[secretsmanager]
    KMS_EP[kms]
  end
  subgraph AWSServices["AWS Services"]
    ECR_SVC[ECR]
    S3_SVC[S3]
    STS_SVC[STS]
    CW_SVC[CloudWatch]
  end
  NODE1 & NODE2 & NODE3 --> ECR_API & ECR_DKR
  NODE1 & NODE2 & NODE3 --> S3_EP
  NODE1 & NODE2 & NODE3 --> STS_EP
  NODE1 & NODE2 & NODE3 --> LOGS_EP
  ECR_API & ECR_DKR --> ECR_SVC
  S3_EP --> S3_SVC
  STS_EP --> STS_SVC
  LOGS_EP --> CW_SVC
  style EKSNodes fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff
  style VPCEndpoints fill:#1a1a2e,stroke:#3a86ff,stroke-width:2px,color:#fff
  style AWSServices fill:#0d1b2a,stroke:#f77f00,stroke-width:2px,color:#fff

# vpc-endpoints/main.tf

locals {
  interface_endpoints = [
    "ec2",
    "ecr.api",
    "ecr.dkr",
    "sts",
    "logs",
    "ssm",
    "ssmmessages",
    "ec2messages",
    "secretsmanager",
    "kms",
    "elasticloadbalancing",
    "autoscaling",
  ]
}

# Interface endpoints
resource "aws_vpc_endpoint" "interface" {
  for_each = toset(local.interface_endpoints)

  vpc_id              = var.vpc_id
  service_name        = "com.amazonaws.${var.region}.${each.value}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.endpoint_subnet_ids
  security_group_ids  = [aws_security_group.endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "vpce-${each.value}"
  }
}

# S3 Gateway endpoint (free, no PrivateLink charges)
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids

  tags = {
    Name = "vpce-s3"
  }
}

# Security group for VPC endpoints
resource "aws_security_group" "endpoints" {
  name_prefix = "vpc-endpoints-"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr, "10.100.0.0/16"]  # Include EKS VPC
  }

  tags = {
    Name = "vpc-endpoints-sg"
  }
}
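
One caveat with centralizing endpoints: private_dns_enabled only resolves inside the endpoint's own VPC. For other VPCs (like production) to resolve a service name to the endpoint's private IPs, the usual pattern is to disable private DNS on the endpoint and publish a private hosted zone instead. A sketch for the ECR API endpoint, assuming a production_vpc_id variable:

# Private hosted zone mirroring the service's public name
resource "aws_route53_zone" "ecr_api" {
  name = "api.ecr.${var.region}.amazonaws.com"

  vpc {
    vpc_id = var.vpc_id  # shared services VPC
  }

  # Required when further VPCs are attached via aws_route53_zone_association
  lifecycle {
    ignore_changes = [vpc]
  }
}

# Alias the zone apex to the interface endpoint's regional DNS name
resource "aws_route53_record" "ecr_api" {
  zone_id = aws_route53_zone.ecr_api.zone_id
  name    = "api.ecr.${var.region}.amazonaws.com"
  type    = "A"

  alias {
    name                   = aws_vpc_endpoint.interface["ecr.api"].dns_entry[0].dns_name
    zone_id                = aws_vpc_endpoint.interface["ecr.api"].dns_entry[0].hosted_zone_id
    evaluate_target_health = false
  }
}

# Make the zone resolvable from the production VPC
resource "aws_route53_zone_association" "ecr_api_prod" {
  zone_id = aws_route53_zone.ecr_api.zone_id
  vpc_id  = var.production_vpc_id  # assumed variable
}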

Hybrid DNS Resolution

flowchart TB
  subgraph OnPrem["On-Premises"]
    CORP_DNS[Corporate DNS<br/>172.16.1.10]
    CORP_ZONES["Zones:<br/>corp.internal<br/>legacy.local"]
  end
  subgraph SharedVPC["Shared Services VPC"]
    subgraph R53Resolver["Route 53 Resolver"]
      INBOUND[Inbound Endpoint<br/>10.200.10.10, 10.200.11.10]
      OUTBOUND[Outbound Endpoint<br/>10.200.10.20, 10.200.11.20]
    end
    R53_RULES["Resolver Rules"]
  end
  subgraph ProdVPC["Production VPC"]
    EKS_PODS[EKS Pods]
    VPC_DNS[VPC DNS<br/>10.100.0.2]
  end
  subgraph Route53["Route 53"]
    PRIVATE_ZONES["Private Hosted Zones:<br/>eks.internal<br/>prod.aws.internal"]
  end
  EKS_PODS -->|"*.corp.internal"| VPC_DNS
  VPC_DNS -->|Forward| OUTBOUND
  OUTBOUND -->|Forward| CORP_DNS
  CORP_DNS -->|Response| OUTBOUND
  OUTBOUND -->|Response| EKS_PODS
  CORP_DNS -->|"*.eks.internal"| INBOUND
  INBOUND -->|Resolve| PRIVATE_ZONES
  PRIVATE_ZONES -->|Response| INBOUND
  INBOUND -->|Response| CORP_DNS
  style OnPrem fill:#6c757d,stroke:#fff,stroke-width:2px,color:#fff
  style SharedVPC fill:#264653,stroke:#3a86ff,stroke-width:2px,color:#fff
  style ProdVPC fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff
  style Route53 fill:#1a1a2e,stroke:#f77f00,stroke-width:2px,color:#fff

# dns/route53-resolver.tf

# Inbound endpoint - allows on-prem to resolve AWS private zones
resource "aws_route53_resolver_endpoint" "inbound" {
  name      = "inbound-resolver"
  direction = "INBOUND"

  security_group_ids = [aws_security_group.resolver.id]

  dynamic "ip_address" {
    for_each = var.resolver_subnet_ids
    content {
      subnet_id = ip_address.value
    }
  }

  tags = {
    Name = "r53-resolver-inbound"
  }
}

# Outbound endpoint - allows AWS to resolve on-prem domains
resource "aws_route53_resolver_endpoint" "outbound" {
  name      = "outbound-resolver"
  direction = "OUTBOUND"

  security_group_ids = [aws_security_group.resolver.id]

  dynamic "ip_address" {
    for_each = var.resolver_subnet_ids
    content {
      subnet_id = ip_address.value
    }
  }

  tags = {
    Name = "r53-resolver-outbound"
  }
}

# Forward rule for on-prem domain
resource "aws_route53_resolver_rule" "forward_corp" {
  domain_name          = "corp.internal"
  name                 = "forward-to-corp-dns"
  rule_type            = "FORWARD"
  resolver_endpoint_id = aws_route53_resolver_endpoint.outbound.id

  target_ip {
    ip   = "172.16.1.10"
    port = 53
  }

  target_ip {
    ip   = "172.16.1.11"
    port = 53
  }

  tags = {
    Name = "forward-corp-internal"
  }
}

# Associate rule with Production VPC
resource "aws_route53_resolver_rule_association" "production" {
  resolver_rule_id = aws_route53_resolver_rule.forward_corp.id
  vpc_id           = var.production_vpc_id
}

# Share resolver rules via RAM
resource "aws_ram_resource_share" "resolver_rules" {
  name                      = "resolver-rules-share"
  allow_external_principals = false

  tags = {
    Name = "resolver-rules-share"
  }
}

resource "aws_ram_resource_association" "resolver_rule" {
  resource_arn       = aws_route53_resolver_rule.forward_corp.arn
  resource_share_arn = aws_ram_resource_share.resolver_rules.arn
}
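
The share above still needs principals attached before other accounts can see the rule; assuming the same organization_arn variable as earlier:

resource "aws_ram_principal_association" "resolver_rules" {
  principal          = var.organization_arn  # assumed variable
  resource_share_arn = aws_ram_resource_share.resolver_rules.arn
}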

Centralized Egress with Inspection

# egress-vpc/main.tf

# NAT Gateway for outbound internet
resource "aws_nat_gateway" "egress" {
  count         = length(var.public_subnet_ids)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = var.public_subnet_ids[count.index]

  tags = {
    Name = "nat-egress-${count.index + 1}"
  }
}

# Network Firewall for traffic inspection
resource "aws_networkfirewall_firewall" "main" {
  name                = "egress-firewall"
  firewall_policy_arn = aws_networkfirewall_firewall_policy.main.arn
  vpc_id              = var.vpc_id

  dynamic "subnet_mapping" {
    for_each = var.firewall_subnet_ids
    content {
      subnet_id = subnet_mapping.value
    }
  }

  tags = {
    Name = "egress-network-firewall"
  }
}

# Firewall policy
resource "aws_networkfirewall_firewall_policy" "main" {
  name = "egress-policy"

  firewall_policy {
    stateless_default_actions          = ["aws:forward_to_sfe"]
    stateless_fragment_default_actions = ["aws:forward_to_sfe"]

    # With default (action-order) evaluation, pass rules run before drop
    # rules, so the allowlist is evaluated before the catch-all deny. The
    # priority argument is only valid for STRICT_ORDER policies, so it is
    # omitted here.
    stateful_rule_group_reference {
      resource_arn = aws_networkfirewall_rule_group.allow_domains.arn
    }

    stateful_rule_group_reference {
      resource_arn = aws_networkfirewall_rule_group.deny_all.arn
    }
  }
}

# Allow specific domains
resource "aws_networkfirewall_rule_group" "allow_domains" {
  capacity = 100
  name     = "allow-domains"
  type     = "STATEFUL"

  rule_group {
    rules_source {
      rules_source_list {
        generated_rules_type = "ALLOWLIST"
        target_types         = ["HTTP_HOST", "TLS_SNI"]
        targets = [
          ".amazonaws.com",
          ".docker.io",
          ".docker.com",
          "ghcr.io",
          ".github.com",
          ".githubusercontent.com",
          "registry.k8s.io",
        ]
      }
    }
  }
}

# Deny everything else
resource "aws_networkfirewall_rule_group" "deny_all" {
  capacity = 10
  name     = "deny-all"
  type     = "STATEFUL"

  rule_group {
    rules_source {
      stateful_rule {
        action = "DROP"
        header {
          destination      = "ANY"
          destination_port = "ANY"
          direction        = "ANY"
          protocol         = "IP"
          source           = "ANY"
          source_port      = "ANY"
        }
        rule_option {
          keyword  = "sid"
          settings = ["100"]
        }
      }
    }
  }
}
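
The "centralized egress for compliance logging" requirement from the intro maps to a logging configuration on the firewall. A minimal sketch sending alert and flow logs to CloudWatch, assuming the two log groups already exist:

resource "aws_networkfirewall_logging_configuration" "main" {
  firewall_arn = aws_networkfirewall_firewall.main.arn

  logging_configuration {
    # Alert logs record traffic that matched drop/alert rules
    log_destination_config {
      log_destination = {
        logGroup = "/firewall/egress/alert"  # assumed log group name
      }
      log_destination_type = "CloudWatchLogs"
      log_type             = "ALERT"
    }

    # Flow logs capture all stateful traffic for audit trails
    log_destination_config {
      log_destination = {
        logGroup = "/firewall/egress/flow"  # assumed log group name
      }
      log_destination_type = "CloudWatchLogs"
      log_type             = "FLOW"
    }
  }
}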

Traffic Flow Diagram

sequenceDiagram
  participant Pod as EKS Pod
  participant Node as EKS Node
  participant TGW as Transit Gateway
  participant FW as Network Firewall
  participant NAT as NAT Gateway
  participant IGW as Internet Gateway
  participant Ext as External Service
  Pod->>Node: Outbound request
  Node->>TGW: Route to 0.0.0.0/0
  TGW->>FW: Inspect traffic
  alt Allowed domain
    FW->>NAT: Forward traffic
    NAT->>IGW: SNAT to public IP
    IGW->>Ext: Request
    Ext-->>IGW: Response
    IGW-->>NAT: Response
    NAT-->>FW: Response
    FW-->>TGW: Response
    TGW-->>Node: Response
    Node-->>Pod: Response
  else Blocked domain
    FW-->>TGW: DROP
    Note over Pod: Connection timeout
  end

EKS Node Security Group

# eks/security-groups.tf

resource "aws_security_group" "nodes" {
  name_prefix = "${var.cluster_name}-nodes-"
  vpc_id      = var.vpc_id

  # Node to node communication
  ingress {
    from_port = 0
    to_port   = 0
    protocol  = "-1"
    self      = true
  }

  # Control plane to nodes (kubelet)
  ingress {
    from_port       = 10250
    to_port         = 10250
    protocol        = "tcp"
    security_groups = [aws_security_group.cluster.id]
  }

  # Control plane to nodes (extensions)
  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.cluster.id]
  }

  # CoreDNS
  ingress {
    from_port = 53
    to_port   = 53
    protocol  = "tcp"
    self      = true
  }

  ingress {
    from_port = 53
    to_port   = 53
    protocol  = "udp"
    self      = true
  }

  # Allow traffic from on-premises (for hybrid workloads)
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["172.16.0.0/12"]
    description = "On-premises access"
  }

  # Egress to VPC endpoints
  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.200.0.0/16"]
    description = "VPC endpoints in shared services"
  }

  # Egress to control plane
  egress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.cluster.id]
  }

  # Egress to Transit Gateway (for all other traffic)
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.cluster_name}-nodes-sg"
    "kubernetes.io/cluster/${var.cluster_name}" = "owned"
  }
}

Accessing Private Cluster

flowchart LR
  subgraph Access["Cluster Access Methods"]
    direction TB
    subgraph VPN["VPN Access"]
      ENGINEER[Engineer Laptop]
      CLIENT_VPN[AWS Client VPN]
    end
    subgraph Bastion["Bastion Host"]
      SSM[SSM Session Manager]
      BASTION_HOST[Private Bastion]
    end
    subgraph CICD["CI/CD Access"]
      RUNNER[GitHub Runner]
      RUNNER_VPC[Runner in VPC]
    end
  end
  subgraph EKS["Private EKS"]
    API[Kubernetes API]
  end
  ENGINEER --> CLIENT_VPN
  CLIENT_VPN --> API
  SSM --> BASTION_HOST
  BASTION_HOST --> API
  RUNNER --> RUNNER_VPC
  RUNNER_VPC --> API
  style Access fill:#1a1a2e,stroke:#00d9ff,stroke-width:2px,color:#fff
  style EKS fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff

# access/client-vpn.tf

resource "aws_ec2_client_vpn_endpoint" "main" {
  description            = "EKS cluster access VPN"
  server_certificate_arn = var.vpn_server_cert_arn
  client_cidr_block      = "10.254.0.0/16"
  split_tunnel           = true

  authentication_options {
    type                       = "certificate-authentication"
    root_certificate_chain_arn = var.vpn_client_cert_arn
  }

  connection_log_options {
    enabled               = true
    cloudwatch_log_group  = aws_cloudwatch_log_group.vpn.name
    cloudwatch_log_stream = aws_cloudwatch_log_stream.vpn.name
  }

  dns_servers = [
    "10.200.10.10",  # Route 53 Resolver inbound
    "10.200.11.10",
  ]

  tags = {
    Name = "eks-access-vpn"
  }
}

resource "aws_ec2_client_vpn_network_association" "main" {
  count                  = length(var.vpn_subnet_ids)
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.main.id
  subnet_id              = var.vpn_subnet_ids[count.index]
}

# Authorization rule for EKS VPC
resource "aws_ec2_client_vpn_authorization_rule" "eks" {
  client_vpn_endpoint_id = aws_ec2_client_vpn_endpoint.main.id
  target_network_cidr    = "10.100.0.0/16"
  authorize_all_groups   = true
}
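
With the VPN connected, the cluster's API endpoint resolves to private IPs (via the resolver inbound endpoints configured as the VPN's DNS servers) and kubectl works as usual:

# Generate kubeconfig for the private cluster
aws eks update-kubeconfig --name <cluster-name> --region <region>

# Verify the private endpoint is reachable through the tunnel
kubectl get nodes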

Best Practices

  • Use Gateway endpoints for S3: free, no PrivateLink charges
  • Centralize VPC endpoints: share across VPCs via Transit Gateway
  • Enable DNS support on TGW: required for cross-VPC DNS resolution
  • Use appliance mode for firewalls: ensures symmetric routing
  • Log all firewall decisions: compliance and troubleshooting
  • Separate egress VPC: isolate internet-facing infrastructure

Troubleshooting

"EKS nodes can't pull images"

# Check VPC endpoint connectivity
aws ec2 describe-vpc-endpoints --filters "Name=service-name,Values=*ecr*"

# Test from node (via SSM)
curl -v https://api.ecr.<region>.amazonaws.com/

# Check security groups allow 443 to endpoint
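# For example (substitute your endpoint security group ID):
aws ec2 describe-security-groups --group-ids <endpoint-sg-id> \
  --query 'SecurityGroups[0].IpPermissions'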

"Pods can't resolve on-prem DNS"

# Verify resolver rules
aws route53resolver list-resolver-rules

# Check rule associations
aws route53resolver list-resolver-rule-associations

# Test DNS from pod
kubectl run -it --rm debug --image=busybox -- nslookup app.corp.internal

"Traffic not reaching firewall"

  • Verify Transit Gateway route tables
  • Check appliance mode is enabled on egress attachment
  • Verify return routes in egress VPC

Conclusion

Building a fully private EKS cluster requires careful network architecture, but the security benefits are substantial. The combination of:

  • Transit Gateway for scalable multi-VPC connectivity
  • VPC Endpoints for private AWS service access
  • Route 53 Resolver for hybrid DNS
  • Network Firewall for egress inspection

creates an enterprise-grade platform that satisfies even the most stringent compliance requirements: no public IPs, all traffic inspected, and full connectivity to on-premises systems.

Tags

#eks #transit-gateway #privatelink #networking #hybrid-cloud #security
