Antigravity Rust Proxy
OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview.
Core Philosophy
Stealth Goal
The primary objective is to make Google's upstream API unable to distinguish proxy requests from real Antigravity webview traffic. Unlike cliProxyApi or other known proxy patterns, this proxy:
- Produces byte-exact protobuf matching real webview format
- Uses BoringSSL TLS fingerprinting with Chrome JA3/JA4 + H2 signatures (version auto-detected)
- Performs warmup and heartbeat RPCs mimicking real webview lifecycle
- Applies jitter to all intervals to avoid automation fingerprints
- Reuses cascades for multi-turn just like the real webview
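The jitter point above can be illustrated with a minimal sketch. It assumes a symmetric, uniformly distributed jitter around a base interval; the names `XorShift64` and `jittered`, the 30-second base, and the 20% spread are hypothetical and not taken from the proxy's source. A tiny PRNG is inlined only so the example needs no external crates.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Tiny xorshift64 PRNG so the sketch is std-only.
/// Hypothetical helper: a real build would more likely use the `rand` crate.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must never be seeded with zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Seed from the wall clock (good enough for timing jitter, not for crypto).
    pub fn from_clock() -> Self {
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos() as u64)
            .unwrap_or(1);
        Self::new(nanos)
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns `base` scaled by a random factor in [1 - spread, 1 + spread).
/// `spread` is clamped to [0, 1] so the result is never negative.
pub fn jittered(base: Duration, spread: f64, rng: &mut XorShift64) -> Duration {
    let spread = spread.clamp(0.0, 1.0);
    // Map the top 53 bits to a float in [0, 1).
    let unit = (rng.next_u64() >> 11) as f64 / (1u64 << 53) as f64;
    let factor = 1.0 - spread + 2.0 * spread * unit;
    base.mul_f64(factor)
}
```

A heartbeat loop could then sleep for `jittered(Duration::from_secs(30), 0.2, &mut rng)` on each iteration instead of a fixed 30 seconds, so consecutive intervals never repeat exactly.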
Stability Approach
The Language Server (LS) binary is a closed-source Go program with many unknown mechanics. To avoid instability:
- Send dummy prompts to the LS: the proxy sends `"."` as the cascade message. The LS receives minimal input to reduce the chance of panics or unexpected behavior.
- All real content goes through MITM: the MITM proxy intercepts the LS's outgoing request and replaces the dummy prompt with the real user input, injects tools, images, generation params, etc.
- Never send results back to the LS: tool results, function responses, and follow-ups are injected into the next MITM-intercepted request. The LS is used as a dumb relay that triggers API calls and nothing more.
- Pass as little as possible: the LS only needs a cascade ID and a dummy message. Everything else is handled by the MITM layer.
This "LS as dumb relay" pattern keeps the LS interactions minimal and predictable, avoiding the many unknown edge cases in its internal state machine.
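The substitution step can be sketched roughly as follows. This is an illustration under stated assumptions, not the proxy's actual code: the real payload is protobuf, whereas `OutgoingRequest` and `PendingCall` are hypothetical, simplified structs, and passing non-dummy requests through unchanged is an assumed behavior.

```rust
/// Hypothetical, simplified view of the request the LS sends upstream.
/// The real payload is protobuf; plain fields are used for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct OutgoingRequest {
    pub cascade_id: String,
    pub user_text: String,
    pub tools: Vec<String>,
}

/// What the proxy stashed for a cascade before triggering the LS
/// (hypothetical shape).
#[derive(Debug, Clone)]
pub struct PendingCall {
    pub real_prompt: String,
    pub tools: Vec<String>,
}

/// The placeholder message sent to the LS, as described above.
pub const DUMMY_PROMPT: &str = ".";

/// Swap the dummy prompt for the real content and inject the tools.
/// Assumption: requests that do not carry the dummy prompt pass through
/// unchanged.
pub fn rewrite(mut req: OutgoingRequest, pending: &PendingCall) -> OutgoingRequest {
    if req.user_text == DUMMY_PROMPT {
        req.user_text = pending.real_prompt.clone();
        req.tools = pending.tools.clone();
    }
    req
}
```

The LS only ever sees the cascade ID and `"."`; the upstream API only ever sees the rewritten request.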
Agent Quick Reference
proxyctl — Daemon Manager
proxyctl commands exit immediately (not foreground) — safe for agent use via fast-bash MCP.
```bash
# Rebuild and restart after code changes
proxyctl restart

# Quick test
proxyctl test "say hi in 3 words"

# Check status
proxyctl status

# Check health
proxyctl health
```
| Command | Description |
|---|---|
| `proxyctl start` | Start the proxy daemon |
| `proxyctl stop` | Stop the proxy daemon |
| `proxyctl restart` | Rebuild + restart |
| `proxyctl rebuild` | Build release binary only |
| `proxyctl status` | Service status + quota + usage |
| `proxyctl logs [N]` | Tail last N lines + follow |
| `proxyctl logs-all` | Full log dump (no follow) |
| `proxyctl test [msg]` | Quick test request (gemini-3-flash) |
| `proxyctl health` | Health check |
Testing After Changes
```bash
# 1. Rebuild + restart
proxyctl restart

# 2. Test an endpoint
curl -s http://localhost:8741/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Say hi"}]}' | jq .

# 3. Inspect latest trace
TRACE_DIR=~/.config/antigravity-proxy/traces/$(date +%Y-%m-%d)
cat "$TRACE_DIR/$(ls -t "$TRACE_DIR" | head -1)/summary.md"
```
Dev vs Production Models
- `gemini-3-flash`: use for all development and testing
- `opus-4.6`: production only, has quota limits
Endpoints
| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/responses` | Responses API (sync + streaming) |
| `POST` | `/v1/chat/completions` | Chat Completions API (OpenAI compat) |
| `POST` | `/v1/gemini` | Native Gemini API |
| `POST` | `/v1beta/models/{model}:{action}` | Official Gemini v1beta routes |
| `GET/POST` | `/v1/search` | Web Search via Google grounding |
| `GET` | `/v1/models` | List available models |
| `GET` | `/v1/sessions` | List active sessions |
| `DELETE` | `/v1/sessions/{id}` | Delete a session |
| `POST` | `/v1/token` | Set OAuth token at runtime |
| `GET` | `/v1/usage` | MITM-intercepted token usage |
| `GET` | `/v1/quota` | LS quota and rate limits |
| `GET` | `/health` | Health check |
Authentication
The proxy needs an OAuth token:
- Env var: `ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx`
- Token file: `~/.config/antigravity-proxy-token`
- Runtime: `curl -X POST http://localhost:8741/v1/token -d '{"token":"ya29.xxx"}'`
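As a rough sketch of how the first two sources might be combined at startup: the lookup order (environment variable first, then token file) is an assumption, since the precedence is not documented here, and `resolve_token` and `resolve_from_defaults` are hypothetical names.

```rust
use std::{env, fs, path::Path};

/// Returns the first non-empty token, checking the environment value first
/// and the token file second. This precedence is an ASSUMPTION.
pub fn resolve_token(env_value: Option<String>, token_file: &Path) -> Option<String> {
    let from_env = env_value
        .map(|t| t.trim().to_string())
        .filter(|t| !t.is_empty());

    from_env.or_else(|| {
        // A missing or unreadable file simply yields no token.
        fs::read_to_string(token_file)
            .ok()
            .map(|t| t.trim().to_string())
            .filter(|t| !t.is_empty())
    })
}

/// Convenience wrapper using the variable name and file path listed above.
/// `~` is expanded by hand via $HOME because Rust does not do shell expansion.
pub fn resolve_from_defaults() -> Option<String> {
    let home = env::var("HOME").unwrap_or_default();
    let file = Path::new(&home).join(".config/antigravity-proxy-token");
    resolve_token(env::var("ANTIGRAVITY_OAUTH_TOKEN").ok(), &file)
}
```

If neither source yields a token, the runtime `POST /v1/token` route remains available for setting one after startup.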
CLI Flags
| Flag | Default | Description |
|---|---|---|
| `--headless` | `true` | Fully standalone; no running Antigravity app needed |
| `--classic` | `false` | Attach to running Antigravity (alias for `--no-headless`) |
| `--port <PORT>` | `8741` | Proxy listen port |
| `--no-mitm` | `false` | Disable MITM proxy |
| `--mitm-port <PORT>` | `8742` | MITM proxy port |
| `--no-standalone` | `false` | Attach to real LS instead of spawning standalone |
| `--no-trace` | `false` | Disable per-call debug traces |
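To make the defaults and the `--classic` alias concrete, here is a minimal std-only sketch of how these flags could map onto a configuration struct. The `Config` struct, its field names, and `parse_args` are hypothetical; the actual binary may well use an argument-parsing crate and a different layout.

```rust
/// Hypothetical configuration mirroring the flag table above.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub headless: bool,
    pub port: u16,
    pub mitm: bool,
    pub mitm_port: u16,
    pub standalone: bool,
    pub trace: bool,
}

impl Default for Config {
    fn default() -> Self {
        // Defaults taken from the table: headless on, MITM on, traces on.
        Config {
            headless: true,
            port: 8741,
            mitm: true,
            mitm_port: 8742,
            standalone: true,
            trace: true,
        }
    }
}

fn parse_port(flag: &str, value: Option<String>) -> Result<u16, String> {
    let raw = value.ok_or_else(|| format!("{flag} requires a value"))?;
    raw.parse::<u16>()
        .map_err(|e| format!("{flag}: invalid port {raw:?}: {e}"))
}

/// Parse flags (without the program name) into a `Config`.
pub fn parse_args<I>(args: I) -> Result<Config, String>
where
    I: IntoIterator<Item = String>,
{
    let mut cfg = Config::default();
    let mut it = args.into_iter();
    while let Some(arg) = it.next() {
        match arg.as_str() {
            "--headless" => cfg.headless = true,
            // `--classic` is documented as an alias for `--no-headless`.
            "--classic" | "--no-headless" => cfg.headless = false,
            "--port" => cfg.port = parse_port("--port", it.next())?,
            "--no-mitm" => cfg.mitm = false,
            "--mitm-port" => cfg.mitm_port = parse_port("--mitm-port", it.next())?,
            "--no-standalone" => cfg.standalone = false,
            "--no-trace" => cfg.trace = false,
            other => return Err(format!("unknown flag: {other}")),
        }
    }
    Ok(cfg)
}
```

In a binary this would be called as `parse_args(std::env::args().skip(1))`.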
Documentation
See docs/ for detailed documentation:
- `architecture.md`: system overview, module map, request lifecycle (mermaid diagrams)
- `mitm.md`: MITM proxy internals, event flow, request modification
- `traces.md`: per-call debug trace system
- `extension-server-analysis.md`: extension server protocol reverse engineering
- `ls-binary-analysis.md`: LS binary reverse engineering, model catalog, gRPC services