Multi-tenant runner platform for Kubernetes
Self-hosted GitHub Actions runners with zero idle compute
An Actions Runner Controller (ARC) alternative for multi-tenant Kubernetes. Free up GPU nodes the moment a job finishes, keep critical jobs scheduling even on a full cluster, and let tenants self-manage runners under safe per-tenant quotas.
Drop-in for your existing setup — jobs target the same runner labels, so nothing in your .github/workflows changes.
helm install gag oci://ghcr.io/actions-gateway/charts/actions-gateway \
--version 1.0.0 \
--namespace gmc-system --create-namespace \
--set gmc.image.digest=sha256:<gmc> \
--set agc.image.digest=sha256:<agc> \
--set proxy.image.digest=sha256:<proxy>
What GAG gives you
Most of these ladder up to one outcome — lower cost: no idle GPUs, fewer always-on resources, and guaranteed throughput instead of blocked critical jobs.
- Tenant self-service under quotas
  When a worker is evicted (preempted, OOM-killed, or blocked by a full ResourceQuota), GAG fast-cancels the GitHub job lock and reruns the job automatically. That's what makes per-tenant quotas safe to enforce: the platform team caps each tenant, and tenants self-manage their runners with no manual reruns.

- No blocked critical jobs
  Reserve at least N slots for each runner type, so a flood of small fast tests can't starve the big expensive ones. Every PR's full test battery finishes, even on a full cluster, instead of the GPU and e2e jobs sitting pending.

- No idle GPUs
  Worker pods exist only while a job runs and are deleted on completion, so GPU nodes return to the scheduler the moment a job finishes, with no idle runners pinned to mask cold starts. (ARC can scale to zero too; GAG makes it the default.)

- Isolated egress IPs
  Each tenant's GitHub traffic exits through its own proxy pool, so you can allow-list just your runners on GitHub EMU, with no cluster-wide allow-list or NAT gateway needed. A tenant that gets throttled or flagged doesn't take the others down with it.

- Lower listener overhead
  Every runner group's listener is a ~60 KiB goroutine in one shared pod, not a ~256 MiB pod per scale set: roughly 600 KiB versus 2.5 GiB across ten groups. It adds up when memory is expensive.

- Per-tenant utilization metrics
  Prometheus metrics are scoped per tenant and runner group, so teams can see their own GPU utilization and make the case for quota changes without needing cluster-wide visibility.
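
The listener memory comparison can be checked with a few lines of shell arithmetic. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, the per-listener figures are the approximate ones quoted in the list, and the baseline memory of the shared pod itself is not counted.

```shell
#!/bin/sh
# Approximate per-listener figures quoted in the feature list.
groups=10          # number of runner groups being compared
goroutine_kib=60   # ~60 KiB per goroutine listener in the shared pod
pod_mib=256        # ~256 MiB per listener pod (one per scale set)

# Total listener memory under each model (hypothetical variable names).
shared_kib=$((groups * goroutine_kib))
per_pod_mib=$((groups * pod_mib))

echo "goroutine listeners: ${shared_kib} KiB"
echo "per-scale-set pods:  ${per_pod_mib} MiB"
```

For ten groups this prints 600 KiB and 2560 MiB (2.5 GiB). Change `groups` to match your own runner-group count to see how the gap scales.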
How it fits together
A four-tier system: one cluster-scoped manager provisions a fully isolated gateway per tenant from each ActionsGateway resource.
Read the architecture overview for the full breakdown, or jump to why GAG over ARC.