Multi-tenant runner platform for Kubernetes

Self-hosted GitHub Actions runners with zero idle compute

An Actions Runner Controller (ARC) alternative for multi-tenant Kubernetes. Free up GPU nodes the moment a job finishes, keep critical jobs scheduling even on a full cluster, and let tenants self-manage runners under safe per-tenant quotas.

Drop-in for your existing setup — jobs target the same runner labels, so nothing in your .github/workflows changes.

helm install gag oci://ghcr.io/actions-gateway/charts/actions-gateway \
  --version 1.0.0 \
  --namespace gmc-system --create-namespace \
  --set gmc.image.digest=sha256:<gmc> \
  --set agc.image.digest=sha256:<agc> \
  --set proxy.image.digest=sha256:<proxy>

What GAG gives you

Most of these serve one outcome, lower cost: no idle GPUs, fewer always-on resources, and guaranteed throughput for critical jobs instead of blocked ones.

  • Tenant self-service under quotas


    When a worker is evicted (preempted, OOM-killed, or blocked by a full ResourceQuota), GAG fast-cancels the GitHub job lock and reruns the job automatically. That is what makes per-tenant quotas safe to enforce: the platform team caps each tenant, and tenants manage their own runners with no manual reruns.

  • No blocked critical jobs


    Reserve at least N slots for each runner type, so a flood of small fast tests can't starve the big expensive ones. Every PR's full test battery finishes — even on a full cluster — instead of the GPU and e2e jobs sitting pending.

  • No idle GPUs


    Worker pods exist only while a job runs and are deleted on completion, so GPU nodes return to the scheduler the moment a job finishes — no idle runners pinned to mask cold starts. (ARC can scale to zero too; GAG makes it the default.)

  • Isolated egress IPs


    Each tenant's GitHub traffic exits through its own proxy pool, so you can allow-list just your runners on GitHub EMU — no cluster-wide allow-list or NAT gateway needed. A tenant that gets throttled or flagged doesn't take the others down with it.

  • Lower listener overhead


    Every runner group's listener is a ~60 KiB goroutine in one shared pod, not a ~256 MiB pod per scale set — roughly 600 KiB versus 2.5 GiB across ten groups. It adds up when memory is expensive.

  • Per-tenant utilization metrics


    Prometheus metrics scoped per tenant and runner group, so teams can see their own GPU utilization and make the case for quota changes — without needing cluster-wide visibility.
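
The slot reservation described under "No blocked critical jobs" can be sketched as a small admission rule. This is a simplified model and not GAG's actual scheduler: the `SlotPool` class, its fields, and the runner type names are hypothetical, and it assumes a fixed slot count with no preemption. A job is admitted only if taking a slot still leaves enough free slots to honor every other runner type's unused reservation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SlotPool:
    """Hypothetical model: a fixed number of slots shared by several runner types."""

    capacity: int                     # total slots available (assumed fixed)
    reserved: Dict[str, int]          # minimum slots guaranteed per runner type
    running: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reservations can only be guaranteed if they fit in the pool.
        if sum(self.reserved.values()) > self.capacity:
            raise ValueError("reservations exceed capacity")

    def _unused_reservation(self, runner_type: str) -> int:
        # Slots this type is still owed but is not currently using.
        owed = self.reserved.get(runner_type, 0) - self.running.get(runner_type, 0)
        return max(0, owed)

    def try_admit(self, runner_type: str) -> bool:
        """Admit one job unless it would eat into another type's reservation."""
        free = self.capacity - sum(self.running.values())
        held_for_others = sum(
            self._unused_reservation(t) for t in self.reserved if t != runner_type
        )
        if free - held_for_others < 1:
            return False
        self.running[runner_type] = self.running.get(runner_type, 0) + 1
        return True


pool = SlotPool(capacity=10, reserved={"gpu": 2, "e2e": 1})
admitted_unit = sum(pool.try_admit("unit") for _ in range(50))  # flood of small jobs
gpu_admitted = pool.try_admit("gpu")                            # still gets a slot
print(admitted_unit, gpu_admitted)  # prints: 7 True
```

With 10 slots and 3 of them reserved, the unreserved `unit` jobs stop at 7 no matter how many are queued, so a `gpu` job is still admitted afterwards.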

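The memory figures under "Lower listener overhead" follow from simple multiplication. Here is a minimal sketch that uses only the approximate per-listener sizes quoted in that item. `listener_memory` is an illustrative helper, and the shared pod's own baseline memory is left out because the text gives no figure for it.

```python
from typing import Tuple

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

GOROUTINE_LISTENER_BYTES = 60 * KIB   # ~60 KiB per listener goroutine (quoted figure)
POD_LISTENER_BYTES = 256 * MIB        # ~256 MiB per listener pod (quoted figure)


def listener_memory(groups: int) -> Tuple[int, int]:
    """Return (goroutine-based, pod-per-scale-set) listener memory in bytes."""
    return groups * GOROUTINE_LISTENER_BYTES, groups * POD_LISTENER_BYTES


goroutines, pods = listener_memory(10)
print(f"{goroutines / KIB:.0f} KiB vs {pods / GIB:.1f} GiB")  # prints: 600 KiB vs 2.5 GiB
```

Both totals grow linearly with the number of runner groups, so the gap widens by roughly 256 MiB for each group added.
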
How it fits together

A four-tier system: one cluster-scoped manager provisions a fully isolated gateway per tenant from each ActionsGateway resource.

Tenant input: ActionsGateway resource (one per tenant · namespace-scoped)
Tier 1: Gateway Manager Controller (cluster-scoped · installed once)
Tier 2: Actions Gateway Controller (goroutine multiplexer)
Tier 3: Egress proxy pool (per-tenant egress IPs)
Tier 4: Ephemeral worker pods (one per job · GC'd on completion)

Read the architecture overview for the full breakdown, or jump to why GAG over ARC.