Nikketryhard 3d87c04d20 docs: overhaul docs, add architecture and traces, update README/GEMINI

- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation,
  endpoint-gap-analysis, request-comparison

2026-02-18 01:31:18 -06:00

6.1 KiB

Antigravity Rust Proxy

OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview.

Core Philosophy

Stealth Goal

The primary objective is to make Google's upstream API unable to distinguish proxy requests from real Antigravity webview traffic. Unlike cliProxyApi or other known proxy patterns, this proxy:

Produces byte-exact protobuf matching real webview format
Uses BoringSSL TLS fingerprinting with Chrome JA3/JA4 + H2 signatures (version auto-detected)
Performs warmup and heartbeat RPCs mimicking real webview lifecycle
Applies jitter to all intervals to avoid automation fingerprints
Reuses cascades for multi-turn conversations, just like the real webview

Stability Approach

The Language Server (LS) binary is a closed-source Go program with many unknown mechanics. To avoid instability:

Send dummy prompts to the LS: the proxy sends "." as the cascade message, so the LS receives minimal input and is less likely to panic or behave unexpectedly.
All real content goes through MITM: the MITM proxy intercepts the LS's outgoing request, replaces the dummy prompt with the real user input, and injects tools, images, generation params, etc.
Never send results back to the LS: tool results, function responses, and follow-ups are injected into the next MITM-intercepted request. The LS is used only as a dumb relay that triggers API calls.
Pass as little as possible: the LS only needs a cascade ID and a dummy message. Everything else is handled by the MITM layer.

This "LS as dumb relay" pattern keeps the LS interactions minimal and predictable, avoiding the many unknown edge cases in its internal state machine.

Agent Quick Reference

`proxyctl` — Daemon Manager

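proxyctl commands return immediately instead of running in the foreground, so they are safe for agent use via fast-bash MCP. The exception is `proxyctl logs`, which follows the log after printing; use `proxyctl logs-all` for a one-shot dump.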

# Rebuild and restart after code changes
proxyctl restart

# Quick test
proxyctl test "say hi in 3 words"

# Check status
proxyctl status

# Check health
proxyctl health

Command	Description
`proxyctl start`	Start the proxy daemon
`proxyctl stop`	Stop the proxy daemon
`proxyctl restart`	Rebuild + restart
`proxyctl rebuild`	Build release binary only
`proxyctl status`	Service status + quota + usage
`proxyctl logs [N]`	Tail last N lines + follow
`proxyctl logs-all`	Full log dump (no follow)
`proxyctl test [msg]`	Quick test request (gemini-3-flash)
`proxyctl health`	Health check
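
Because `proxyctl restart` returns before the daemon is necessarily ready, a script may want to wait for the proxy before sending requests. The following is a minimal sketch, not part of `proxyctl`: `wait_for_health` is a hypothetical helper, and it assumes that a healthy proxy answers `GET /health` on the default port with a 2xx status.

```shell
# Poll the proxy's /health endpoint until it answers or we give up.
# Assumption: a healthy proxy returns a 2xx status on GET /health.
wait_for_health() {
  local base_url="${1:-http://localhost:8741}"
  local attempts="${2:-10}"
  local delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output.
    if curl -sf -o /dev/null "$base_url/health"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

A typical call would be `proxyctl restart && wait_for_health`, optionally passing a base URL, attempt count, and delay in seconds.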

Testing After Changes

# 1. Rebuild + restart
proxyctl restart

# 2. Test an endpoint
curl -s http://localhost:8741/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Say hi"}]}' | jq .

# 3. Inspect latest trace
TRACE_DIR=~/.config/antigravity-proxy/traces/$(date +%Y-%m-%d)
cat "$TRACE_DIR/$(ls -t "$TRACE_DIR" | head -1)/summary.md"
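
Step 3 fails with a confusing error when no trace exists yet for the current day. The sketch below wraps the same lookup in a function that reports that case explicitly. `latest_trace_summary` and the `TRACE_ROOT` override are hypothetical names introduced for illustration; the default path and the `summary.md` file name are taken from the commands above.

```shell
# Print the path of the newest trace's summary.md for a given day.
# TRACE_ROOT is a hypothetical override; the default matches the path used above.
latest_trace_summary() {
  local root="${TRACE_ROOT:-$HOME/.config/antigravity-proxy/traces}"
  local day="${1:-$(date +%Y-%m-%d)}"
  local dir="$root/$day"
  local newest
  if [ ! -d "$dir" ]; then
    echo "no traces recorded for $day" >&2
    return 1
  fi
  # ls -t sorts newest first, so the first entry is the latest trace directory.
  newest=$(ls -t "$dir" | head -n 1)
  if [ -z "$newest" ] || [ ! -f "$dir/$newest/summary.md" ]; then
    echo "no summary.md found under $dir" >&2
    return 1
  fi
  printf '%s\n' "$dir/$newest/summary.md"
}
```

With this defined, `cat "$(latest_trace_summary)"` shows today's newest summary, and `latest_trace_summary 2026-02-18` looks up a specific day.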

Dev vs Production Models

gemini-3-flash: use for all development and testing
opus-4.6: production only; subject to quota limits

Endpoints

Method	Path	Description
`POST`	`/v1/responses`	Responses API (sync + streaming)
`POST`	`/v1/chat/completions`	Chat Completions API (OpenAI compat)
`POST`	`/v1/gemini`	Native Gemini API
`POST`	`/v1beta/models/{model}:{action}`	Official Gemini v1beta routes
`GET/POST`	`/v1/search`	Web Search via Google grounding
`GET`	`/v1/models`	List available models
`GET`	`/v1/sessions`	List active sessions
`DELETE`	`/v1/sessions/{id}`	Delete a session
`POST`	`/v1/token`	Set OAuth token at runtime
`GET`	`/v1/usage`	MITM-intercepted token usage
`GET`	`/v1/quota`	LS quota and rate limits
`GET`	`/health`	Health check
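
As a quick sanity check after a restart, the read-only `GET` routes from the table can be probed in one pass. This is a minimal sketch: `smoke_check` is a hypothetical helper, the default base URL assumes the default `--port`, and no particular status code is assumed for any route.

```shell
# Print the HTTP status of each read-only GET endpoint listed in the table.
smoke_check() {
  local base_url="${1:-http://localhost:8741}"
  local endpoint status
  for endpoint in /health /v1/models /v1/sessions /v1/usage /v1/quota; do
    # -w '%{http_code}' prints only the status code; the body is discarded.
    status=$(curl -s -o /dev/null -w '%{http_code}' "$base_url$endpoint")
    printf '%s %s\n' "$status" "$endpoint"
  done
}
```

Running `smoke_check` prints one `status path` line per endpoint; curl reports `000` when the proxy cannot be reached at all.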

Authentication

The proxy needs an OAuth token, which can be supplied in any of these ways:

Env var: ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx
Token file: ~/.config/antigravity-proxy-token
Runtime: curl -X POST http://localhost:8741/v1/token -H "Content-Type: application/json" -d '{"token":"ya29.xxx"}'
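
For scripts that push a token at runtime, it can help to resolve the token from the same two static sources first. The sketch below is illustrative only: `resolve_token` and the `TOKEN_FILE` override are hypothetical, and the precedence it uses (environment variable first, then the token file) is an assumption rather than documented proxy behavior.

```shell
# Echo the OAuth token from the environment, falling back to the token file.
# Assumption: the env var wins over the file; the proxy's own precedence may differ.
# TOKEN_FILE is a hypothetical override used only by this sketch.
resolve_token() {
  local file="${TOKEN_FILE:-$HOME/.config/antigravity-proxy-token}"
  if [ -n "${ANTIGRAVITY_OAUTH_TOKEN:-}" ]; then
    printf '%s\n' "$ANTIGRAVITY_OAUTH_TOKEN"
  elif [ -r "$file" ]; then
    # Remove all whitespace, including the trailing newline, from the file contents.
    tr -d '[:space:]' < "$file"
    printf '\n'
  else
    echo "no OAuth token found" >&2
    return 1
  fi
}
```

The result can then be interpolated into the `/v1/token` request body, for example `-d "{\"token\":\"$(resolve_token)\"}"`.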

CLI Flags

Flag	Default	Description
`--headless`	`true`	Fully standalone — no running Antigravity app needed
`--classic`	`false`	Attach to running Antigravity (alias for `--no-headless`)
`--port <PORT>`	`8741`	Proxy listen port
`--no-mitm`	`false`	Disable MITM proxy
`--mitm-port <PORT>`	`8742`	MITM proxy port
`--no-standalone`	`false`	Attach to real LS instead of spawning standalone
`--no-trace`	`false`	Disable per-call debug traces

Documentation

See docs/ for detailed documentation:

architecture.md — system overview, module map, request lifecycle (mermaid diagrams)
mitm.md — MITM proxy internals, event flow, request modification
traces.md — per-call debug trace system
extension-server-analysis.md — extension server protocol reverse engineering
ls-binary-analysis.md — LS binary reverse engineering, model catalog, gRPC services
