Redesign cli-proxy: connect to external DIFC proxy started by compiler #1804

@lpcox

Summary

Redesign the CLI proxy feature so that AWF connects to an external DIFC proxy (mcpg) started by the gh-aw compiler on the host, instead of managing the mcpg container internally. This eliminates the mcpg container crash issues and aligns with how gh-aw already runs mcpg successfully.

Companion compiler PR: github/gh-aw#25366

Current Approach (broken)

When --enable-cli-proxy is passed, AWF starts two containers in its Docker Compose:

  1. awf-cli-proxy-mcpg (172.30.0.51) — the mcpg image in proxy mode, holding GH_TOKEN
  2. awf-cli-proxy — Node.js HTTP server + gh CLI, sharing mcpg's network namespace via network_mode: service:cli-proxy-mcpg

The agent reaches cli-proxy at http://172.30.0.51:11000. The cli-proxy forwards gh commands to mcpg at localhost:18443 (reachable via shared network namespace). TLS certs are shared via a Docker named volume cli-proxy-tls.

Problems

The mcpg container consistently crashes with exit code 1 in AWF's hardened environment:

  • cap_drop: ALL — mcpg may need capabilities AWF strips
  • pids_limit: 50 — may be too restrictive for mcpg's goroutine-heavy Go runtime
  • run.sh → run_containerized.sh redirection — the mcpg image detects /.dockerenv and redirects to a different entrypoint script with different requirements
  • HTTP_PROXY through Squid — mcpg's startup traffic is routed through Squid, which may interfere with initialization
  • Isolated Docker network — AWF runs mcpg on 172.30.0.51 while gh-aw runs it with --network host
  • Healthcheck incompatibility — mcpg image has no curl/wget, so healthchecks fail

In contrast, gh-aw runs mcpg via start_difc_proxy.sh with --network host and no restrictions — and it works reliably.

Multiple CI iterations (PR #1801, 5 commits) failed to resolve these issues.

New Approach

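Split responsibilities: the gh-aw compiler starts and manages the difc-proxy (mcpg in proxy mode) on the host, and AWF runs only the cli-proxy container, which connects to that external proxy.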

New CLI Flags

| Old Flag | New Flag | Notes |
| --- | --- | --- |
| --enable-cli-proxy | --difc-proxy-host <host:port> | Presence enables cli-proxy, e.g. host.docker.internal:18443 |
| --cli-proxy-policy <json> | (removed) | Compiler handles guard policy |
| --cli-proxy-mcpg-image <img> | (removed) | Compiler handles mcpg image |
| --cli-proxy-writable | (removed) | Write control handled by DIFC guard policy |
| (new) | --difc-proxy-ca-cert <path> | Path to TLS CA cert written by the host difc-proxy |

New Architecture

Host (managed by gh-aw compiler):
  difc-proxy (mcpg in proxy mode) on 0.0.0.0:18443, --network host

AWF docker-compose:
  squid-proxy (172.30.0.10)
  cli-proxy (172.30.0.50) → host difc-proxy via host.docker.internal:18443
  agent (172.30.0.20) → cli-proxy at http://172.30.0.50:11000

TLS Hostname Matching

The difc-proxy's self-signed TLS cert has SANs for localhost and 127.0.0.1, but not host.docker.internal. To solve this, the cli-proxy container runs a Node.js TCP tunnel (tcp-tunnel.js):

localhost:18443 (inside cli-proxy) → TCP tunnel → host.docker.internal:18443 (host difc-proxy)

The gh CLI uses GH_HOST=localhost:18443, which matches the cert's SAN. The TCP tunnel transparently forwards the TLS connection to the real difc-proxy on the host.
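
The hostname mismatch can be illustrated with Node's built-in tls.checkServerIdentity, which applies SAN matching of the kind a TLS client performs. This is only a sketch: the certificate object is a hand-written stand-in that models just the SANs described above, not the real difc-proxy certificate, and gh itself is a Go binary that does its own verification.

```javascript
const tls = require('tls');

// Hypothetical stand-in for the difc-proxy certificate: only the SANs
// named in this issue (localhost and 127.0.0.1) are modeled.
const standInCert = {
  subject: { CN: 'localhost' },
  subjectaltname: 'DNS:localhost, IP Address:127.0.0.1',
};

// checkServerIdentity returns undefined on a match and an Error otherwise.
function matchesCert(hostname) {
  return tls.checkServerIdentity(hostname, standInCert) === undefined;
}

const sanResults = {
  localhost: matchesCert('localhost'),
  loopbackIp: matchesCert('127.0.0.1'),
  dockerHost: matchesCert('host.docker.internal'),
};
console.log(sanResults);
```

Connecting to localhost through the tunnel keeps the name the client verifies inside the cert's SAN list, so the difc-proxy certificate does not need to be reissued.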

Code Changes Required

1. src/cli.ts — Replace CLI flags

Remove:

  • --enable-cli-proxy (line ~1439)
  • --cli-proxy-writable (line ~1446)
  • --cli-proxy-policy (line ~1451)
  • --cli-proxy-mcpg-image (line ~1456)

Add:

  • --difc-proxy-host <host:port> — presence enables cli-proxy
  • --difc-proxy-ca-cert <path> — TLS CA cert from difc-proxy

Update config passthrough (line ~1832):

  • difcProxyHost: options.difcProxyHost
  • difcProxyCaCert: options.difcProxyCaCert
  • Remove: enableCliProxy, cliProxyWritable, cliProxyPolicy, cliProxyMcpgImage
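
Since --difc-proxy-host carries both a host and a port, the value probably needs validating before it is passed through. The helper below is a minimal sketch in plain JavaScript; parseDifcProxyHost is a hypothetical name and is not taken from the AWF codebase.

```javascript
// Hypothetical helper: split a "<host>:<port>" flag value into its parts.
// The split happens at the last colon, so a bracketed IPv6 literal such as
// "[::1]:18443" keeps its brackets in the returned host.
function parseDifcProxyHost(value) {
  const idx = value.lastIndexOf(':');
  if (idx <= 0 || idx === value.length - 1) {
    throw new Error(`--difc-proxy-host must be <host:port>, got "${value}"`);
  }
  const host = value.slice(0, idx);
  const portText = value.slice(idx + 1);
  if (!/^\d+$/.test(portText)) {
    throw new Error(`--difc-proxy-host port must be numeric, got "${portText}"`);
  }
  const port = Number(portText);
  if (port < 1 || port > 65535) {
    throw new Error(`--difc-proxy-host port out of range: ${port}`);
  }
  return { host, port };
}

console.log(parseDifcProxyHost('host.docker.internal:18443'));
```

Failing fast in the CLI layer would surface a malformed value before any containers are started.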

2. src/types.ts — Replace config types

Remove:

  • enableCliProxy?: boolean (line 805)
  • cliProxyWritable?: boolean (line 816)
  • cliProxyPolicy?: string (line 827)
  • cliProxyMcpgImage?: string (line 855)

Add:

  • difcProxyHost?: string — e.g. "host.docker.internal:18443"
  • difcProxyCaCert?: string — e.g. "/tmp/gh-aw/difc-proxy-tls/ca.crt"

Keep:

  • githubToken?: string (line 840) — still needed for token exclusion from agent env
  • CLI_PROXY_PORT = 11000 (line 68)

3. src/docker-manager.ts — Major refactor

Remove (~170 lines):

  • CLI_PROXY_MCPG_CONTAINER_NAME constant (line 25)
  • cliProxyMcpgIp from network config interface (line 381) and DEFAULT_NETWORK_CONFIG (line 2004)
  • Entire mcpg service block (lines ~1667-1731): guard policy generation, mcpg command args, mcpg environment, mcpg healthcheck, mcpg volumes
  • cli-proxy-tls named volume (line ~1827)
  • GH_TOKEN validation for mcpg (if (!config.githubToken) throw) (line ~1649)
  • mcpg from container cleanup (line ~2208)

Modify:

  • Enablement check: config.enableCliProxy → config.difcProxyHost (lines ~557, 632, 1382, 1640, 1826)
  • Token exclusion (line ~557): keep EXCLUDED_ENV_VARS.add('GITHUB_TOKEN') etc. when config.difcProxyHost is set
  • Agent NO_PROXY (line ~632): add cli-proxy IP 172.30.0.50 (not mcpg IP 172.30.0.51)
  • AWF_CLI_PROXY_IP (line ~1383): set to 172.30.0.50 (cli-proxy's own IP)
  • AWF_CLI_PROXY_URL (line ~1803): http://172.30.0.50:11000
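
The agent-side modifications above could be sketched as one function. It is written in plain JavaScript for illustration; applyCliProxyAgentSettings and its parameters are hypothetical names, and the real code in src/docker-manager.ts is spread across several locations. Only GITHUB_TOKEN is excluded here because the other variable names are not listed in this issue.

```javascript
// Values taken from this issue; constant names are illustrative.
const CLI_PROXY_IP = '172.30.0.50';
const CLI_PROXY_PORT = 11000;

// Hypothetical helper: returns the agent environment with the cli-proxy
// settings applied, and records token variables to exclude.
function applyCliProxyAgentSettings(config, agentEnv, excludedEnvVars) {
  if (!config.difcProxyHost) {
    return { ...agentEnv }; // cli-proxy is disabled when the flag is absent
  }
  // Keep the token out of the agent environment.
  excludedEnvVars.add('GITHUB_TOKEN');

  // Let the agent reach cli-proxy directly instead of through Squid.
  const noProxy = agentEnv.NO_PROXY ? agentEnv.NO_PROXY.split(',') : [];
  if (!noProxy.includes(CLI_PROXY_IP)) {
    noProxy.push(CLI_PROXY_IP);
  }

  return {
    ...agentEnv,
    NO_PROXY: noProxy.join(','),
    AWF_CLI_PROXY_IP: CLI_PROXY_IP,
    AWF_CLI_PROXY_URL: `http://${CLI_PROXY_IP}:${CLI_PROXY_PORT}`,
  };
}
```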

Refactor cli-proxy service (lines ~1737-1794):

  • Give cli-proxy its own IP on awf-net: 172.30.0.50 (remove network_mode: service:cli-proxy-mcpg)
  • Add extra_hosts: ['host.docker.internal:host-gateway']
  • Mount host CA cert: ${config.difcProxyCaCert}:/tmp/proxy-tls/ca.crt:ro
  • Set new env vars:
    • AWF_DIFC_PROXY_HOST=host.docker.internal (or parsed from config.difcProxyHost)
    • AWF_DIFC_PROXY_PORT=18443 (or parsed from config.difcProxyHost)
  • NO_PROXY: localhost,127.0.0.1,::1,host.docker.internal
  • Remove AWF_MCPG_PORT env var
  • Remove AWF_CLI_PROXY_WRITABLE env var
  • Keep healthcheck: curl -f http://localhost:11000/health
  • depends_on: only squid-proxy (not cli-proxy-mcpg)
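
Taken together, the refactored service definition might look roughly like the object below. This is a sketch under assumptions: buildCliProxyService is a hypothetical name, the container name and network key are taken from the names used elsewhere in this issue, and the exact shapes of healthcheck and depends_on in the real compose generator may differ.

```javascript
// Hypothetical sketch of the cli-proxy compose service described above.
function buildCliProxyService(config) {
  // Split "<host>:<port>" at the last colon.
  const sep = config.difcProxyHost.lastIndexOf(':');
  const difcHost = config.difcProxyHost.slice(0, sep);
  const difcPort = config.difcProxyHost.slice(sep + 1);

  return {
    container_name: 'awf-cli-proxy',
    // Own IP on awf-net instead of sharing mcpg's network namespace.
    networks: { 'awf-net': { ipv4_address: '172.30.0.50' } },
    // Makes host.docker.internal resolve to the Docker host gateway.
    extra_hosts: ['host.docker.internal:host-gateway'],
    // CA cert written by the host difc-proxy, mounted read-only.
    volumes: [`${config.difcProxyCaCert}:/tmp/proxy-tls/ca.crt:ro`],
    environment: {
      AWF_DIFC_PROXY_HOST: difcHost,
      AWF_DIFC_PROXY_PORT: difcPort,
      NO_PROXY: 'localhost,127.0.0.1,::1,host.docker.internal',
    },
    healthcheck: {
      test: ['CMD', 'curl', '-f', 'http://localhost:11000/health'],
    },
    depends_on: ['squid-proxy'],
  };
}

const exampleService = buildCliProxyService({
  difcProxyHost: 'host.docker.internal:18443',
  difcProxyCaCert: '/tmp/gh-aw/difc-proxy-tls/ca.crt',
});
console.log(JSON.stringify(exampleService, null, 2));
```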

4. src/docker-manager.test.ts — Update tests

Remove mcpg-specific tests (~10 tests in the "CLI proxy sidecar" describe block, lines ~2709-2957):

  • Tests for guard policy format, mcpg image, mcpg healthcheck, mcpg environment, mcpg network namespace, cli-proxy-tls volume

Update remaining tests:

  • Change enableCliProxy: true → difcProxyHost: 'host.docker.internal:18443'
  • Add difcProxyCaCert: '/tmp/difc-proxy-tls/ca.crt' to test configs
  • Remove cliProxyMcpgIp from mock network configs
  • Update assertions for cli-proxy having its own IP (not network_mode)

Add new tests:

  • CA cert mounted as read-only volume
  • extra_hosts includes host.docker.internal:host-gateway
  • New env vars (AWF_DIFC_PROXY_HOST, AWF_DIFC_PROXY_PORT)
  • NO_PROXY includes host.docker.internal

5. containers/cli-proxy/entrypoint.sh — Connect to external difc-proxy

Replace current logic (assumes shared network namespace with mcpg) with:

  • Read AWF_DIFC_PROXY_HOST and AWF_DIFC_PROXY_PORT env vars
  • Start Node.js TCP tunnel: node /app/tcp-tunnel.js $AWF_DIFC_PROXY_PORT $AWF_DIFC_PROXY_HOST $AWF_DIFC_PROXY_PORT &
  • Wait for CA cert at /tmp/proxy-tls/ca.crt (mounted from host, not from shared volume)
  • Set GH_HOST=localhost:${AWF_DIFC_PROXY_PORT} (tunnel makes difc-proxy appear on localhost)
  • Keep NODE_EXTRA_CA_CERTS=/tmp/proxy-tls/ca.crt

6. containers/cli-proxy/tcp-tunnel.js — New file

~15-line Node.js TCP forwarder using net.createServer:

// Usage: node tcp-tunnel.js <localPort> <remoteHost> <remotePort>
const net = require('net');
const [localPort, remoteHost, remotePort] = process.argv.slice(2);
net.createServer(client => {
  // Open one upstream connection per client and relay bytes both ways.
  const upstream = net.connect(+remotePort, remoteHost);
  client.pipe(upstream);
  upstream.pipe(client);
  // Tear down the peer socket if either side fails.
  client.on('error', () => upstream.destroy());
  upstream.on('error', () => client.destroy());
}).listen(+localPort, '127.0.0.1', () => {
  console.log(`[tcp-tunnel] Forwarding localhost:${localPort} → ${remoteHost}:${remotePort}`);
});

7. containers/cli-proxy/server.js — Minor changes

  • Remove AWF_CLI_PROXY_WRITABLE env var handling
  • Remove writable mode from validateArgs() — DIFC guard policy handles write control
  • Remove WRITABLE_MODE constant and all references
  • Simplify: always allow all subcommands (guard policy is the enforcement layer)

8. containers/agent/setup-iptables.sh — No changes needed

AWF_CLI_PROXY_IP env var and iptables RETURN rule (line ~178) work unchanged — the IP just changes from 172.30.0.51 to 172.30.0.50.

9. containers/agent/entrypoint.sh — No changes needed

AWF_CLI_PROXY_URL env var check and gh wrapper installation (lines ~481-497, 828-840) work unchanged.

10. containers/agent/gh-cli-proxy-wrapper.sh — No changes needed

Uses AWF_CLI_PROXY_URL to POST to /exec endpoint — unchanged.

11. Integration tests — Simplify

tests/integration/cli-proxy.test.ts: Remove tests that depend on mcpg being started by AWF. Integration coverage is provided by the smoke-copilot workflow in CI. Unit tests in docker-manager.test.ts cover compose generation.

12. Documentation updates

  • docs/gh-cli-proxy-design.md — Update architecture diagrams and flow descriptions
  • AGENTS.md — Update CLI proxy section to describe external DIFC proxy
  • Update any README references to --enable-cli-proxy

What Gets Removed

| Component | Location | Lines |
| --- | --- | --- |
| mcpg service generation | src/docker-manager.ts | ~1667-1731 |
| mcpg constants/config | src/docker-manager.ts | 25, 381, 2004 |
| Guard policy logic | src/docker-manager.ts | ~1656-1665 |
| GH_TOKEN validation | src/docker-manager.ts | ~1646-1654 |
| cli-proxy-tls volume | src/docker-manager.ts | ~1827 |
| 4 CLI flags | src/cli.ts | ~1439-1460 |
| 4 type fields | src/types.ts | 805, 816, 827, 855 |
| ~10 mcpg tests | src/docker-manager.test.ts | ~2709-2957 |
| Write-mode logic | containers/cli-proxy/server.js | throughout |

What Gets Added

| Component | Location | Description |
| --- | --- | --- |
| --difc-proxy-host flag | src/cli.ts | Connection to external DIFC proxy |
| --difc-proxy-ca-cert flag | src/cli.ts | TLS CA cert path |
| difcProxyHost type | src/types.ts | Config field |
| difcProxyCaCert type | src/types.ts | Config field |
| tcp-tunnel.js | containers/cli-proxy/ | Node.js TCP forwarder for TLS |
| New unit tests | src/docker-manager.test.ts | CA cert, host access, env vars |
