docs: overhaul docs, add architecture and traces, update README/GEMINI

- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation,
  endpoint-gap-analysis, request-comparison
This commit is contained in:
Nikketryhard
2026-02-18 01:31:18 -06:00
parent 28d3296c87
commit 3d87c04d20
11 changed files with 679 additions and 1305 deletions

361
GEMINI.md

@@ -2,288 +2,125 @@
OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview.
## Quick Start
## Core Philosophy
```bash
# Headless mode (no running Antigravity app needed)
RUST_LOG=info ./target/release/antigravity-proxy --headless
### Stealth Goal
# Classic mode (requires running Antigravity + sudo setup for MITM)
sudo ./scripts/mitm-redirect.sh install
proxyctl start
The primary objective is to make Google's upstream API unable to distinguish proxy requests from real Antigravity webview traffic. Unlike `cliProxyApi` or other known proxy patterns, this proxy:
# Or run directly
RUST_LOG=info ./target/release/antigravity-proxy
```
- Produces **byte-exact protobuf** matching real webview format
- Uses **BoringSSL TLS fingerprinting** with Chrome JA3/JA4 + H2 signatures (version auto-detected)
- Performs **warmup and heartbeat RPCs** mimicking real webview lifecycle
- Applies **jitter** to all intervals to avoid automation fingerprints
- **Reuses cascades** for multi-turn just like the real webview
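As an illustration of the jitter point in the list above, here is a minimal shell sketch that spreads a base interval by a random offset. The function name `jittered_ms` and the 20% spread in the example are assumptions made for illustration; the proxy's own jitter logic is not shown in this document and may differ.

```bash
# Hypothetical helper (not part of the proxy): print base_ms shifted by a
# random offset of at most +/- pct percent.
jittered_ms() {
  local base_ms=$1 pct=$2
  local span=$(( base_ms * pct / 100 ))
  # RANDOM yields 0..32767; reduce it to the range [-span, +span].
  # Spans above roughly 16000 would need a wider random source.
  local offset=$(( RANDOM % (2 * span + 1) - span ))
  echo $(( base_ms + offset ))
}

# Example: a nominal 10000 ms interval that lands between 8000 and 12000 ms.
# jittered_ms 10000 20
```

The point of the spread is that no two intervals are exactly alike, so the traffic does not show a fixed period.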
Default port: **8741**
### Stability Approach
## CLI Tools
The Language Server (LS) binary is a closed-source Go program with many unknown mechanics. To avoid instability:
1. **Send dummy prompts to the LS** — the proxy sends `"."` as the cascade message. The LS receives minimal input to reduce the chance of panics or unexpected behavior.
2. **All real content goes through MITM** — the MITM proxy intercepts the LS's outgoing request and replaces the dummy prompt with the real user input, injects tools, images, generation params, etc.
3. **Never send results back to the LS** — tool results, function responses, and follow-ups are injected into the _next_ MITM-intercepted request. The LS is used as a dumb relay that triggers API calls — nothing more.
4. **Pass as little as possible** — the LS only needs a cascade ID and a dummy message. Everything else is handled by the MITM layer.
This "LS as dumb relay" pattern keeps the LS interactions minimal and predictable, avoiding the many unknown edge cases in its internal state machine.
## Agent Quick Reference
### `proxyctl` — Daemon Manager
Symlinked to `~/.local/bin/proxyctl` for global access. Manages the proxy as a systemd user service.
| Command | Description |
| --------------------- | --------------------------------------- |
| `proxyctl start` | Start the proxy daemon |
| `proxyctl stop` | Stop the proxy daemon |
| `proxyctl restart` | Rebuild + restart |
| `proxyctl rebuild` | Build release binary only |
| `proxyctl status` | Service status + quota + usage |
| `proxyctl logs [N]` | Tail last N lines (default 30) + follow |
| `proxyctl logs-all` | Full log dump (no follow) |
| `proxyctl test [msg]` | Quick test request (gemini-3-flash) |
| `proxyctl health` | Health check |
### `mitm-redirect.sh` — MITM Setup
One-time setup script for UID-scoped iptables traffic redirection.
`proxyctl` commands exit immediately (not foreground) — safe for agent use via fast-bash MCP.
```bash
sudo ./scripts/mitm-redirect.sh install # create user + iptables rule
sudo ./scripts/mitm-redirect.sh uninstall # remove user + iptables rule
sudo ./scripts/mitm-redirect.sh status # check current state
# Rebuild and restart after code changes
proxyctl restart
# Quick test
proxyctl test "say hi in 3 words"
# Check status
proxyctl status
# Check health
proxyctl health
```
| Command | Description |
| --------------------- | ----------------------------------- |
| `proxyctl start` | Start the proxy daemon |
| `proxyctl stop` | Stop the proxy daemon |
| `proxyctl restart` | Rebuild + restart |
| `proxyctl rebuild` | Build release binary only |
| `proxyctl status` | Service status + quota + usage |
| `proxyctl logs [N]` | Tail last N lines + follow |
| `proxyctl logs-all` | Full log dump (no follow) |
| `proxyctl test [msg]` | Quick test request (gemini-3-flash) |
| `proxyctl health` | Health check |
### Testing After Changes
```bash
# 1. Rebuild + restart
proxyctl restart
# 2. Test an endpoint
curl -s http://localhost:8741/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Say hi"}]}' | jq .
# 3. Inspect latest trace
TRACE_DIR=~/.config/antigravity-proxy/traces/$(date +%Y-%m-%d)
cat "$TRACE_DIR/$(ls -t "$TRACE_DIR" | head -1)/summary.md"
```
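The trace lookup in step 3 fails with a confusing `cat` error when no trace exists yet for the current day. A slightly more defensive sketch is shown below; the function name `latest_trace_dir` is hypothetical, while the directory layout and `summary.md` file name are taken from the commands above.

```bash
# Hypothetical helper: print the most recently modified entry inside a
# per-day trace directory, or fail if the directory is missing or empty.
latest_trace_dir() {
  local day_dir=$1
  [ -d "$day_dir" ] || return 1
  local newest
  newest=$(ls -t "$day_dir" | head -n 1)
  [ -n "$newest" ] || return 1
  printf '%s\n' "$day_dir/$newest"
}

# Usage, assuming the trace layout described above:
# dir=$(latest_trace_dir ~/.config/antigravity-proxy/traces/"$(date +%Y-%m-%d)") \
#   && cat "$dir/summary.md" \
#   || echo "no traces recorded today"
```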
### Dev vs Production Models
- **`gemini-3-flash`** — use for all development and testing
- **`opus-4.6`** — production only, has quota limits
## Endpoints
| Method | Path | Description |
| ---------- | ---------------------- | ----------------------------------------------------------- |
| `POST` | `/v1/responses` | **Responses API** (primary) — supports `stream: true/false` |
| `POST` | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim) |
| `GET/POST` | `/v1/search` | **Web Search** — Google Search grounding, returns results |
| `GET` | `/v1/models` | List available models |
| `GET` | `/v1/sessions` | List active sessions |
| `DELETE` | `/v1/sessions/:id` | Delete a session |
| `POST` | `/v1/token` | Set OAuth token at runtime |
| `GET` | `/v1/usage` | MITM-intercepted token usage stats |
| `GET` | `/v1/quota` | LS quota — credits, per-model rate limits, reset timers |
| `GET` | `/health` | Health check |
## Available Models
| Name | Label |
| ------------------- | ---------------------------------------- |
| `opus-4.6` | Claude Opus 4.6 (Thinking) — **default** |
| `opus-4.5` | Claude Opus 4.5 (Thinking) |
| `gemini-3-pro-high` | Gemini 3 Pro (High) |
| `gemini-3-pro` | Gemini 3 Pro (Low) |
| `gemini-3-flash` | Gemini 3 Flash |
## Development & Testing
- **Dev/testing model**: `gemini-3-flash` — use this for all development, debugging, and iterative testing
- **Production model**: `opus-4.6` — use sparingly for real-world validation only (has quota limit)
- See `docs/ls-binary-analysis.md` for full reverse-engineered model catalog and proto enum mappings
## Example: Responses API
### Sync
```bash
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "Say hello in exactly 3 words",
"stream": false,
"timeout": 60
}' | jq .
```
### Streaming
```bash
curl -N http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "Say hello in exactly 3 words",
"stream": true,
"timeout": 60
}'
```
### Multi-turn (session reuse)
```bash
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "What is 2+2?",
"conversation": "my-session-1",
"stream": false
}' | jq .
# Follow-up in same cascade:
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "Now multiply that by 10",
"conversation": "my-session-1",
"stream": false
}' | jq .
```
## Web Search
The proxy supports Google Search grounding in two ways:
### 1. Dedicated Search Endpoint (`/v1/search`)
Returns structured search results with citations:
```bash
# Quick GET search
curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | jq .
# Full POST search with options
curl -s http://localhost:8741/v1/search \
-H "Content-Type: application/json" \
-d '{
"query": "latest Rust programming news",
"model": "gemini-3-flash",
"timeout": 30
}' | jq .
```
Response includes `summary`, `results[]` (title + URL), `citations[]`, and raw `grounding_metadata`.
### 2. Inline Grounding (on any endpoint)
Enable Google Search grounding on regular requests:
```bash
# Completions API
curl -s http://localhost:8741/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"messages": [{"role": "user", "content": "What happened in tech today?"}],
"web_search": true
}' | jq .
# Responses API (OpenAI-style tool)
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "What happened in tech today?",
"tools": [{"type": "web_search_preview"}],
"stream": false
}' | jq .
# Gemini API
curl -s http://localhost:8741/v1/gemini \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"message": "What happened in tech today?",
"google_search": true
}' | jq .
```
| Method | Path | Description |
| ---------- | --------------------------------- | ------------------------------------ |
| `POST` | `/v1/responses` | Responses API (sync + streaming) |
| `POST` | `/v1/chat/completions` | Chat Completions API (OpenAI compat) |
| `POST` | `/v1/gemini` | Native Gemini API |
| `POST` | `/v1beta/models/{model}:{action}` | Official Gemini v1beta routes |
| `GET/POST` | `/v1/search` | Web Search via Google grounding |
| `GET` | `/v1/models` | List available models |
| `GET` | `/v1/sessions` | List active sessions |
| `DELETE` | `/v1/sessions/{id}` | Delete a session |
| `POST` | `/v1/token` | Set OAuth token at runtime |
| `GET` | `/v1/usage` | MITM-intercepted token usage |
| `GET` | `/v1/quota` | LS quota and rate limits |
| `GET` | `/health` | Health check |
## Authentication
The proxy needs an OAuth token. Three ways to provide it:
The proxy needs an OAuth token:
1. **Environment variable**: `export ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx`
2. **Token file**: `echo 'ya29.xxx' > ~/.config/antigravity-proxy-token`
3. **Runtime API**: `curl -X POST http://localhost:8741/v1/token -d '{"token":"ya29.xxx"}'`
1. **Env var**: `ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx`
2. **Token file**: `~/.config/antigravity-proxy-token`
3. **Runtime**: `curl -X POST http://localhost:8741/v1/token -d '{"token":"ya29.xxx"}'`
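A script that pushes a token through the runtime endpoint could reuse the first two sources listed above. The sketch below is one way to do that; the function name `resolve_token` is hypothetical, and the order (environment variable first, then token file) is an assumption, since this document does not state which source the proxy itself prefers.

```bash
# Hypothetical helper: print the OAuth token from the environment variable
# if set, otherwise from the token file. Fails if neither is available.
resolve_token() {
  local token_file=${1:-"$HOME/.config/antigravity-proxy-token"}
  if [ -n "${ANTIGRAVITY_OAUTH_TOKEN:-}" ]; then
    printf '%s\n' "$ANTIGRAVITY_OAUTH_TOKEN"
  elif [ -r "$token_file" ]; then
    # Strip the trailing newline and any stray whitespace from the file.
    tr -d '[:space:]' < "$token_file"
    echo
  else
    return 1
  fi
}

# Usage with the runtime endpoint listed above:
# curl -X POST http://localhost:8741/v1/token -d "{\"token\":\"$(resolve_token)\"}"
```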
## Version Detection
## CLI Flags
Version strings (Antigravity, Chrome, Electron, Client) are **auto-detected** at startup from the installed Antigravity app:
| Flag | Default | Description |
| -------------------- | ------- | --------------------------------------------------------- |
| `--headless` | `true` | Fully standalone — no running Antigravity app needed |
| `--classic` | `false` | Attach to running Antigravity (alias for `--no-headless`) |
| `--port <PORT>` | `8741` | Proxy listen port |
| `--no-mitm` | `false` | Disable MITM proxy |
| `--mitm-port <PORT>` | `8742` | MITM proxy port |
| `--no-standalone` | `false` | Attach to real LS instead of spawning standalone |
| `--no-trace` | `false` | Disable per-call debug traces |
- `product.json` → app version + client/IDE version
- Binary → Chrome + Electron versions via `strings`
## Documentation
Falls back to hardcoded values if the app isn't installed. No manual updates needed when Antigravity updates.
See `docs/` for detailed documentation:
## Standalone LS
By default, the proxy spawns its own Language Server instance for full isolation.
### Headless Mode (`--headless`)
Fully independent — no running Antigravity app, no sudo, no iptables:
1. Generates its own CSRF token (random UUID)
2. Passes `-standalone=true` and `-extension_server_port=0` to the LS binary
3. Uses `HTTPS_PROXY` for MITM (no iptables required)
4. Only needs the LS binary installed at the standard path
### Classic Mode (default)
1. Discovers the main LS config (`extension_server_port`, `csrf_token`) from the running Antigravity app
2. Spawns a standalone LS binary on a random port
3. Builds init metadata protobuf (model config, `detect_and_use_proxy=ENABLED`)
4. If MITM is active, spawns as `antigravity-ls` user for UID-scoped traffic interception
5. Kills the child on proxy shutdown
Disable with `--no-standalone` to attach to the real LS instead.
**Module:** `src/standalone.rs`
## Stealth Features
- **TLS fingerprint**: BoringSSL with Chrome JA3/JA4 + H2 fingerprint via `wreq` (version auto-detected)
- **Protobuf**: Hand-rolled encoder producing byte-exact match to real webview traffic
- **Warmup**: Mimics real webview startup RPC calls
- **Heartbeat**: Periodic keep-alive matching real webview lifecycle
- **Reactive streaming**: `StreamCascadeReactiveUpdates` for real-time state diffs (polling fallback)
- **Jitter**: Randomized intervals to avoid automation fingerprint
- **Session reuse**: Cascades reused for multi-turn, matching real webview behavior
- **MITM proxy**: TLS-intercepting proxy for real token usage capture
## MITM Proxy
Built-in MITM proxy intercepts LS ↔ Google API traffic to capture **real** token usage (input, output, thinking tokens). Enabled by default with the standalone LS. Disable with `--no-mitm`.
### How It Works
```
Client → Proxy (8741) → Standalone LS (as antigravity-ls user)
↓ (port 443 traffic)
iptables REDIRECT (UID-scoped)
MITM Proxy (8742)
↓ (TLS decrypt + parse SSE)
Google API (daily-cloudcode-pa.googleapis.com)
```
### Setup
```bash
# One-time setup (creates user + iptables rule)
sudo ./scripts/mitm-redirect.sh install
# Run proxy (standalone LS + MITM are both on by default)
RUST_LOG=info ./target/release/antigravity-proxy
# Check intercepted usage
curl -s http://localhost:8741/v1/usage | jq .
# Cleanup
sudo ./scripts/mitm-redirect.sh uninstall
```
### Details
- **UID-scoped iptables**: Only the standalone LS's traffic is intercepted (no side effects)
- **Combined CA bundle**: System CAs + MITM CA → `/tmp/antigravity-mitm-combined-ca.pem`
- **Google SSE parsing**: Extracts `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`
- **Init metadata**: Protobuf field 34 `detect_and_use_proxy` set to ENABLED (1)
- See `docs/mitm-interception-status.md` for full technical details
- See `docs/ls-binary-analysis.md` for proto enum mappings and model IDs
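To make the SSE parsing bullet above concrete, here is a rough shell sketch that pulls one of the named counters out of a captured event stream. The helper name `last_count` and the sample payload (including its `usageMetadata` wrapper) are assumptions for illustration. The sketch also assumes that the last event carries the final totals. The proxy does this in its own MITM code, and a real implementation would use a proper JSON parser rather than a regular expression.

```bash
# Hypothetical helper: read an SSE stream on stdin and print the last value
# seen for the given numeric JSON field (empty output if it never appears).
last_count() {
  local field=$1
  grep -oE "\"${field}\"[[:space:]]*:[[:space:]]*[0-9]+" \
    | tail -n 1 \
    | grep -oE '[0-9]+$'
}

# Made-up sample stream with two events; only the field names come from
# the bullet list above.
sample='data: {"usageMetadata":{"promptTokenCount":12,"candidatesTokenCount":2}}

data: {"usageMetadata":{"promptTokenCount":12,"candidatesTokenCount":7,"thoughtsTokenCount":3}}'

printf '%s\n' "$sample" | last_count candidatesTokenCount   # prints 7
```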
### CLI Flags
- `--headless`: Fully standalone — no running Antigravity app required
- `--no-mitm`: Disable MITM proxy entirely
- `--no-standalone`: Attach to existing LS instead of spawning standalone
- `--mitm-port <PORT>`: Override MITM proxy port (default: auto-assign)
- `--port <PORT>`: Override proxy listen port (default: 8741)
- `architecture.md` — system overview, module map, request lifecycle (mermaid diagrams)
- `mitm.md` — MITM proxy internals, event flow, request modification
- `traces.md` — per-call debug trace system
- `extension-server-analysis.md` — extension server protocol reverse engineering
- `ls-binary-analysis.md` — LS binary reverse engineering, model catalog, gRPC services