Skip to content

Network Architecture

Security | Back to index


This document covers the network topology of a deployed gateway: which components initiate which connections, how NetworkPolicy rules implement the isolation boundary, and how to validate that isolation is correctly enforced.


Component Connection Map

  System namespace (gmc-system)
  ═════════════════════════════
    GMC ──(1)──▶ K8s API Server (in-cluster) ─────────────┐
  Tenant namespace                                        │
  ════════════════                                        │
    AGC ──(2)──▶ K8s API Server (via service CIDR) ───────┘
     └─(3)──▶ Proxy ClusterIP Service ──(4)──▶ GitHub (external)
    Worker Pod ──(5)─────────┘

All GitHub-bound traffic — from both the AGC and worker pods — is routed through the per-tenant egress proxy pool. Kubernetes API traffic from the AGC travels directly in-cluster and bypasses the proxy.


Connection Inventory

# Initiator Destination Protocol In-cluster? Via proxy?
1 GMC K8s API server HTTPS (443 / 6443) Yes No
2 AGC K8s API server HTTPS (443 / 6443) Yes No
3 AGC Proxy ClusterIP Service HTTPS CONNECT (8080) Yes
4 Proxy pod GitHub API endpoints (see below) HTTPS (443) No (egress)
5 Worker pod Proxy ClusterIP Service HTTPS CONNECT (8080) Yes

Connections (3) and (5) to the proxy are HTTPS, not plain HTTP. The GMC generates a per-tenant self-signed cert for the proxy at provisioning time and pins it into the AGC's trust store (W7 / M-5). This protects the AGC↔proxy hop from in-cluster eavesdropping or impersonation by any tenant whose pods can reach the Service ClusterIP.

The GMC also makes one additional outbound call: GET https://api.github.com/meta every 24 hours to refresh the GitHub IP ranges used in tenant NetworkPolicy egress rules. This call originates from the GMC's own egress path in the system namespace, not through any tenant's proxy pool.

GitHub Endpoints Reached via Proxy

Endpoint Used by Purpose
api.github.com AGC GitHub App token exchange, rerun API
*.actions.githubusercontent.com AGC Broker API (GetMessage, AcquireJob, RenewJob)
pipelines.actions.githubusercontent.com Worker pod Twirp Results Service (live log streaming)
objects.githubusercontent.com Worker pod Action source downloads

GitHub publishes its current IP ranges at https://api.github.com/meta under the actions key. The GMC uses this list to populate proxy pod NetworkPolicy egress rules and refreshes them every 24 hours. If spec.proxy.managedNetworkPolicy is false, operators are responsible for keeping egress rules current.


NetworkPolicy Rules

The GMC creates three NetworkPolicy objects per tenant in the tenant namespace. The split (over a single combined policy) closes M-12 — worker pods inherit egress to the proxy and DNS only, not the Kubernetes API server. Only the AGC Deployment has API-server egress.

Policy 1: actions-gateway-workload — AGC and worker pods → proxy + DNS

Selects all "workload" pods (AGC and worker) by the actions-gateway/component: workload label. Allows egress to the proxy pods (port 8080) and DNS only. Denies all ingress.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: actions-gateway-workload
  namespace: <tenant>
spec:
  podSelector:
    matchLabels:
      actions-gateway/component: workload
  policyTypes:
    - Ingress
    - Egress
  ingress: []  # no ingress permitted
  egress:
    # DNS — needed for resolving the proxy Service name. Confined to cluster DNS,
    # not "any resolver": an open port-53 rule is an unattributed exfiltration
    # side-channel (Q105). Two OR'd peers cover both delivery paths: the kube-dns
    # / CoreDNS pods in kube-system (direct path), and the link-local block
    # 169.254.0.0/16 for NodeLocal DNSCache clusters where pods send DNS to a
    # per-node hostNetwork cache (Q136). Link-local is non-routable, so it does
    # not widen the exfil surface.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
        - ipBlock:
            cidr: 169.254.0.0/16
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Proxy pods — selected by PodSelector, NOT the Service ClusterIP. kube-proxy
    # DNATs ClusterIP → PodIP before NetworkPolicy enforcement, so an
    # `ipBlock: <ClusterIP>/32` rule never matches actual packets and silently
    # drops all proxy-bound traffic (the PR #59 trap). Selecting the proxy pods
    # directly matches post-DNAT destinations and survives proxy pod churn from
    # rolling updates and HPA scaling.
    - to:
        - podSelector:
            matchLabels:
              app: actions-gateway-proxy
      ports:
        - port: 8080
          protocol: TCP

Policy 2: actions-gateway-controller — AGC → Kubernetes API server

Selects the AGC Deployment pods by app: actions-gateway-controller. Adds (additively) egress to the Kubernetes API server on ports 443 and 6443. Worker pods do not match this selector and so have no API-server egress.

Both apiserver ports are listed deliberately. NetworkPolicy port matches are evaluated against the post-DNAT destination port. Most production clusters expose the apiserver via the kubernetes Service at 443 → backends on 443, so a 443-only rule works. Kind (and any cluster where the apiserver Endpoints listen on 6443) translates ClusterIP 10.96.0.1:443<node-ip>:6443, and the policy evaluator sees 6443 — a 443-only rule silently drops every k8s API call. Allowing both ports keeps the rule precise (only apiserver-style ports) while working in both topologies. See docs/development/networkpolicy-port-matching.md for the diagnosis and a worked repro.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: actions-gateway-controller
  namespace: <tenant>
spec:
  podSelector:
    matchLabels:
      app: actions-gateway-controller
  policyTypes:
    - Egress
  egress:
    # DNS — confined to cluster DNS (kube-dns / CoreDNS in kube-system) plus the
    # link-local block for NodeLocal DNSCache; see Q105/Q136.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
        - ipBlock:
            cidr: 169.254.0.0/16
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Kubernetes API server — ports 443 and 6443 to any destination.
    # Both ports are needed because NetworkPolicy enforcement evaluates
    # post-DNAT: production clusters typically expose the apiserver on 443,
    # kind translates Service:443 → node:6443. Allowing both works in both.
    - ports:
        - port: 443
          protocol: TCP
        - port: 6443
          protocol: TCP

Policy 3: actions-gateway-proxy — Proxy pods → GitHub

Selects proxy pods by app: actions-gateway-proxy. Allows ingress only from "workload" pods on port 8080, and egress only to GitHub IP ranges (port 443) and DNS.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: actions-gateway-proxy
  namespace: <tenant>
spec:
  podSelector:
    matchLabels:
      app: actions-gateway-proxy
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only workload pods (AGC and workers) may CONNECT to the proxy
    - from:
        - podSelector:
            matchLabels:
              actions-gateway/component: workload
      ports:
        - port: 8080
          protocol: TCP
  egress:
    # DNS — proxy resolves GitHub hostnames on behalf of clients. Confined to
    # cluster DNS (kube-dns / CoreDNS in kube-system) plus the link-local block
    # for NodeLocal DNSCache; kube-dns recurses upstream so external names still
    # resolve, but the proxy cannot reach an arbitrary resolver — closing the
    # open-DNS exfiltration side-channel (Q105/Q136).
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
        - ipBlock:
            cidr: 169.254.0.0/16
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # GitHub IP ranges — populated from api.github.com/meta, refreshed every 24h
    - to:
        - ipBlock:
            cidr: 192.30.252.0/22
        - ipBlock:
            cidr: 185.199.108.0/22
        - ipBlock:
            cidr: 140.82.112.0/20
        # ... additional ranges from api.github.com/meta .actions
      ports:
        - port: 443
          protocol: TCP

The actual IP ranges are fetched at provisioning time and refreshed every 24 hours. The example CIDRs above are illustrative; the authoritative list is at https://api.github.com/meta.

If spec.proxy.managedNetworkPolicy: false is set, the GMC omits the GitHub-CIDR egress rule from Policy 3 — operators using FQDN-based egress policies (Cilium, Calico) provide their own equivalent rule and the GMC stops fighting them on every IP range refresh.

DNS Resolution

All in-cluster service discovery uses Kubernetes DNS (kube-dns / CoreDNS). The proxy pool is reachable from the AGC and worker pods via the ClusterIP Service name: actions-gateway-proxy.<namespace>.svc.cluster.local. The NO_PROXY env var includes kubernetes.default.svc.cluster.local and the cluster service CIDR so that Kubernetes API calls are never routed through the egress proxy.

External DNS resolution (for GitHub hostnames) is performed by the proxy pods themselves, not by the AGC or worker pods — the AGC and workers connect to the proxy using CONNECT <hostname>:<port> and the proxy resolves the hostname on their behalf. This means the proxy pods must have egress access to the cluster's DNS resolver in addition to GitHub's IP ranges.

DNS egress on all three policies is confined to cluster DNS rather than left open to any resolver (Q105). An unrestricted port-53 rule (to: []) would let any pod smuggle data to an attacker-controlled resolver — an unattributed side-channel that bypasses the per-tenant egress-IP attribution every other egress path enforces. Confining DNS to the in-cluster resolver keeps resolution on the attributable path: kube-dns recurses upstream on the pod's behalf, so external GitHub names still resolve while no pod can reach an arbitrary DNS server directly.

Each DNS rule allows two OR'd peers, covering the two ways a pod reaches cluster DNS:

  • Direct path — the kube-dns / CoreDNS Service in kube-system, matched by namespaceSelector on the well-known kubernetes.io/metadata.name: kube-system label plus a podSelector on the conventional k8s-app: kube-dns label.
  • NodeLocal DNSCache path — the IPv4 link-local block 169.254.0.0/16, matched by an ipBlock (Q136). On clusters running NodeLocal DNSCache (node-local-dns), pods send DNS to a link-local address (169.254.20.10 by the kube-standard __PILLAR__LOCAL__DNS__ convention) served by a per-node hostNetwork DNSCache pod, which no pod/namespace selector can match. Allowing the whole link-local block is the simplest correct rule and preserves Q105's attribution property: 169.254.0.0/16 is non-routable and node-scoped, so it cannot reach an external resolver — the DNS-exfiltration channel Q105 closed stays closed.

Operators running a DNS service under a non-standard namespace or pod label must adjust the selector accordingly (or supply their own equivalent rule under spec.proxy.managedNetworkPolicy: false).


How to Validate Network Isolation

The AGC and proxy container images are distroless (no shell, no curl), so kubectl exec against the running pods can only inspect process state, not run probes. Instead, schedule a short-lived curlimages/curl pod and apply the same labels as the workload you want to simulate — Kubernetes selects NetworkPolicies by label, so a curl pod with actions-gateway/component: workload is governed by the same rules as the AGC and worker pods.

The negative checks below only hold on a CNI that enforces egress NetworkPolicy (Calico, Cilium, …). NetworkPolicy objects are inert without a CNI enforcer, and kind's default kindnet demonstrably does not drop egress for these cases — a "blocked" expectation will spuriously succeed there. Production clusters must run an egress-enforcing CNI for the workload isolation described in this document to exist at runtime. The workload-pod negatives below are automated as the Tier-A specs E2E_GMC_TenantProvisioning_WorkloadEgressBlockedToNonProxyPod and E2E_GMC_TenantProvisioning_WorkerCannotReachK8sAPI, observed enforcing on a Calico kind cluster (make e2e-cluster KIND_CNI=calico) on 2026-06-11 — see the worker-egress-proxy plan.

Confirm a workload pod can reach GitHub via the proxy

kubectl run nettest-workload -n <namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --labels='actions-gateway/component=workload' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl -x https://actions-gateway-proxy:8080 -sI https://api.github.com
# Expected: HTTP/2 200

Confirm a workload pod cannot reach GitHub directly (bypassing proxy)

kubectl run nettest-workload -n <namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --labels='actions-gateway/component=workload' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl --noproxy '*' -sI --connect-timeout 5 https://api.github.com
# Expected: connection timeout (actions-gateway-workload NetworkPolicy blocks direct egress)

Confirm a worker-like pod cannot reach the Kubernetes API server

The actions-gateway-controller NetworkPolicy only matches pods labelled app=actions-gateway-controller, so worker pods (labelled actions-gateway/component=workload but not the AGC app label) have no API-server egress.

kubectl run nettest-worker -n <namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --labels='actions-gateway/component=workload' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl --noproxy '*' -sI --connect-timeout 5 https://kubernetes.default.svc
# Expected: connection timeout

Confirm nothing can open a connection to a worker pod (ingress default-deny)

Worker pods run untrusted job code and accept no inbound by design — the workload NP declares policyTypes: [Ingress, Egress] with an empty ingress rule set, so all ingress is denied (Q128). Start a workload-labelled listener, then probe it from an unrelated pod: the connection must fail.

# Listener: a workload-labelled pod serving on 8000 (simulates a worker pod).
kubectl run nettest-listener -n <namespace> --restart=Never \
  --image=python:3-alpine \
  --labels='actions-gateway/component=workload' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- python3 -m http.server 8000
kubectl wait -n <namespace> --for=condition=Ready pod/nettest-listener

# Probe from an unlabelled pod in the same namespace.
LISTENER_IP=$(kubectl get pod nettest-listener -n <namespace> -o jsonpath='{.status.podIP}')
kubectl run nettest-prober -n <namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl --noproxy '*' -sI --connect-timeout 5 "http://${LISTENER_IP}:8000"
# Expected: connection timeout (workload NP denies all ingress to worker pods)
kubectl delete pod nettest-listener -n <namespace>

Confirm a proxy-labelled pod can reach GitHub

kubectl run nettest-proxy -n <namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --labels='app=actions-gateway-proxy' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl --noproxy '*' -sI --connect-timeout 5 https://api.github.com
# Expected: HTTP/2 200

Confirm a proxy-labelled pod cannot reach the K8s API server

kubectl run nettest-proxy -n <namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --labels='app=actions-gateway-proxy' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl --noproxy '*' -sI --connect-timeout 5 https://kubernetes.default.svc
# Expected: connection timeout (proxy pods have no K8s API egress rule)

Confirm cross-tenant isolation

From tenant A's namespace, confirm a workload-labelled pod cannot reach tenant B's proxy:

kubectl run nettest-xtenant -n <tenant-a-namespace> --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --labels='actions-gateway/component=workload' \
  --overrides='{"spec":{"automountServiceAccountToken":false}}' \
  -- curl --noproxy '*' -sI --connect-timeout 5 \
       https://actions-gateway-proxy.<tenant-b-namespace>.svc.cluster.local:8080
# Expected: connection timeout (tenant A's workload NP only allows egress to
# tenant A's own proxy ClusterIP, not arbitrary in-cluster services)

Security | Back to index