
High-Scale Virtualized GitHub Actions Gateway — Design Documentation

This folder contains the full system design for the GitHub Actions Gateway, organized into focused documents with cross-references. All documents are intended to render correctly on GitHub.


Table of Contents

  1. Executive Summary & Problem Statement
     - For Executive Leadership: GPU Utilization & Cost Justification
     - For Tenant Teams: Self-Service & Cost Ownership
     - For Platform Engineering: Operational Leverage & Shift Left
     - Overview for Architects & Engineers
  2. Core Architectural Components
     - 2.1 Tier 1 — Gateway Manager Controller (GMC)
     - 2.2 Tier 2 — Actions Gateway Controller (AGC)
     - 2.3 Tier 3 — Egress Proxy Pool
     - 2.4 Tier 4 — Ephemeral Worker Pod
     - 2.5 Observability
     - 2.6 Upgrade Strategy
  3. API & Data Contract Specifications
     - 3.1 Kubernetes CRD Schemas
     - 3.2 GitHub App Credentials Secret Schema
     - 3.3 Re-implemented Broker API Endpoints
     - 3.4 Broker Payload Blueprints (Go Structs)
     - 3.5 GitHub API Rate Limit Budget
  4. Operational Lifecycle Execution Flows
     - 4.1 Tenant Provisioning Flow (GMC)
     - 4.2 Job Execution Flow (AGC)
  5. Security & Threat Risk Assessment
     - 5.1 GMC-Level Threats (Cluster-Scoped)
     - 5.2 AGC & Proxy-Level Threats (Namespace-Scoped)
     - 5.3 Security Profiles and the Privileged Opt-In
  6. Implementation Phasing & Delivery Milestones
     - Milestone 1: Wire Protocol Probe (Days 1–4)
     - Milestone 2: AGC Controller & Reconciler (Days 5–10)
     - Milestone 3: Worker Pod & Pipe Handoff (Days 11–16)
     - Milestone 4: Gateway Manager Controller + Proxy (Days 17–22)
     - Milestone 5: Hardening & Load Testing (Days 23–26)
  7. Test Plan
     - 7.1 Unit Tests
     - 7.2 Integration Tests
     - 7.3 End-to-End Tests
  8. Glossary
  9. Appendix A — Capacity Targets & SLOs
  10. Appendix B — Worker Isolation Runtime (Optional)
  11. Appendix C — AI-Assisted Implementation Notes (Optional)
  12. Appendix D — Alternatives Considered
  13. Appendix E — Capacity Planning & RunnerGroup Design
  14. Appendix F — Cost Model
  15. Appendix G — Optional Future Enhancements
  16. Network Architecture

Operations


Reading Paths by Role

Architect — reviewing the overall design: start with 01-executive-summary.md, then 02-architecture.md, then 03-api-contracts.md. Read 04-operational-flows.md and 05-security.md for depth.

Platform engineer — deploying or operating the system: read Getting Started first, then 02-architecture.md §2.1 (GMC), Appendix A (SLOs), Observability, Runbook, and Upgrade & Rollback.

Security engineer — reviewing trust boundaries and threats: read 05-security.md, 02-architecture.md §2.4 (worker isolation), 03-api-contracts.md §3.2 (credentials), and Appendix B (runtime hardening).

Tenant team — authoring RunnerGroup configs: read Getting Started, 03-api-contracts.md §3.1 (CRD schemas), and Appendix E (sizing guidance).


System Overview

The gateway is a four-tier system for running GitHub Actions self-hosted runners at scale on Kubernetes:

| Tier | Component | Scope | Role |
|------|-----------|-------|------|
| 1 | Gateway Manager Controller (GMC) | Cluster | Watches ActionsGateway CRs, provisions per-tenant resources |
| 2 | Actions Gateway Controller (AGC) | Namespace | Multiplexes GitHub broker sessions, acquires jobs, spawns worker pods |
| 3 | Egress Proxy Pool | Namespace | Stateless HTTPS CONNECT proxy pool; isolated egress IPs per tenant |
| 4 | Ephemeral Worker Pod | Namespace | Single-use pod that executes exactly one workflow job |
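
As a rough orientation aid, the tier table can be encoded as plain Go data to make the scope split explicit: one cluster-scoped controller and three namespace-scoped, per-tenant components. This is an illustrative sketch only. The names `Tier`, `Scope`, `Tiers`, and `ByScope` are hypothetical and are not types from the gateway codebase or its CRD schemas.

```go
package main

import "fmt"

// Scope describes where a component runs (hypothetical type for illustration).
type Scope string

const (
	ClusterScope   Scope = "Cluster"
	NamespaceScope Scope = "Namespace"
)

// Tier is one row of the system overview table.
type Tier struct {
	Number    int
	Component string
	Scope     Scope
	Role      string
}

// Tiers mirrors the overview table; it is not generated from any real schema.
var Tiers = []Tier{
	{1, "Gateway Manager Controller (GMC)", ClusterScope, "Watches ActionsGateway CRs, provisions per-tenant resources"},
	{2, "Actions Gateway Controller (AGC)", NamespaceScope, "Multiplexes GitHub broker sessions, acquires jobs, spawns worker pods"},
	{3, "Egress Proxy Pool", NamespaceScope, "Stateless HTTPS CONNECT proxy pool; isolated egress IPs per tenant"},
	{4, "Ephemeral Worker Pod", NamespaceScope, "Single-use pod that executes exactly one workflow job"},
}

// ByScope returns the tiers that run at the given scope, in tier order.
func ByScope(scope Scope) []Tier {
	var out []Tier
	for _, t := range Tiers {
		if t.Scope == scope {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	for _, s := range []Scope{ClusterScope, NamespaceScope} {
		fmt.Printf("%s-scoped:\n", s)
		for _, t := range ByScope(s) {
			fmt.Printf("  Tier %d: %s\n", t.Number, t.Component)
		}
	}
}
```

Grouping by scope this way shows that only Tier 1 operates cluster-wide, while Tiers 2 through 4 live inside a tenant namespace.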

For a quick orientation, start with 01-executive-summary.md, then follow links from there.