docs: overhaul docs, add architecture and traces, update README/GEMINI

- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation, endpoint-gap-analysis, request-comparison
2026-02-18 01:31:18 -06:00
parent 28d3296c87
commit 3d87c04d20
11 changed files with 679 additions and 1305 deletions
--- a/README.md
+++ b/README.md
@@ -1,396 +1,81 @@
 # Antigravity Proxy

-OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview. Supports the Responses API, Chat Completions API, and a native Gemini endpoint with full streaming, multi-turn conversations, tool calling, image uploads, web search grounding, and real token usage capture via MITM interception.
-
-## Architecture
+OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview.

 ```mermaid
 %%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'primaryBorderColor': '#7c3aed', 'lineColor': '#7c3aed', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460', 'edgeLabelBackground': '#1a1a2e', 'nodeTextColor': '#e0e0e0'}}}%%
-graph TB
-    subgraph client["Client Layer"]
-        style client fill:#1a1a2e,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        APP["OpenAI SDK / curl / Any HTTP Client"]
-    end
+graph LR
+    Client["Client"] -->|"OpenAI / Gemini API"| Proxy["Proxy :8741"]
+    Proxy -->|"gRPC (dummy prompt)"| LS["Standalone LS"]
+    LS -->|"HTTPS :443"| MITM["MITM :8742"]
+    MITM -->|"Modified request\n(real prompt + tools)"| Google["Google API"]
+    Google -->|"SSE response"| MITM
+    MITM -->|"Usage, errors,\nfunction calls"| Proxy
+    LS -.->|"iptables redirect\n(UID-scoped)"| MITM

-    subgraph proxy["Proxy Layer :8741"]
-        style proxy fill:#16213e,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        API["API Router<br/>responses | completions | gemini | search"]
-        STORE["MitmStore<br/>tools | images | errors | usage"]
-        PROTO["Protobuf Encoder<br/>byte-exact webview match"]
-    end
-
-    subgraph ls["Language Server"]
-        style ls fill:#0f3460,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        STANDALONE["Standalone LS<br/>isolated process, UID: antigravity-ls"]
-    end
-
-    subgraph mitm["MITM Layer :8742"]
-        style mitm fill:#1a1a2e,stroke:#e94560,stroke-width:2px,color:#e0e0e0
-        INTERCEPT["TLS Intercept<br/>decrypt + modify + re-encrypt"]
-        MODIFY["Request Modifier<br/>inject tools, images, params"]
-        PARSE["Response Parser<br/>usage, errors, function calls"]
-    end
-
-    subgraph google["Google API"]
-        style google fill:#0f3460,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        GAPI["daily-cloudcode-pa.googleapis.com<br/>v1internal:streamGenerateContent"]
-    end
-
-    APP -->|"HTTP POST"| API
-    API --> STORE
-    API --> PROTO
-    PROTO -->|"gRPC"| STANDALONE
-    STANDALONE -->|"HTTPS :443"| INTERCEPT
-    INTERCEPT --> MODIFY
-    MODIFY -->|"inject tools, images,<br/>generation params"| GAPI
-    GAPI -->|"SSE response"| PARSE
-    PARSE -->|"usage, errors,<br/>function calls"| STORE
-    INTERCEPT -.->|"iptables REDIRECT<br/>UID-scoped"| STANDALONE
-
-    classDef highlight fill:#7c3aed,stroke:#e94560,stroke-width:2px,color:#fff
+    style Proxy fill:#7c3aed,color:#fff
+    style MITM fill:#e94560,color:#fff
+    style LS fill:#2563eb,color:#fff
+    style Google fill:#059669,color:#fff
 ```

-### Request Flow
-
-1. Client sends an OpenAI-compatible request to the proxy
-2. Proxy encodes the message as a protobuf matching the real webview format
-3. Proxy sends it to the standalone Language Server via gRPC
-4. LS makes an HTTPS request to Google's API
-5. iptables redirects the LS's traffic (UID-scoped) to the MITM proxy
-6. MITM decrypts TLS, modifies the request (injects tools, images, params), re-encrypts and forwards to Google
-7. Google's SSE response flows back through MITM, which captures usage, errors, and function calls
-8. Proxy polls the LS for cascade state, supplementing with MITM-captured data
-9. Client receives the response in OpenAI-compatible format
-
 ## Quick Start

 ```bash
-# First-time setup (creates user + iptables for MITM)
-sudo ./scripts/mitm-redirect.sh install
+# Headless mode (no running Antigravity app needed)
+RUST_LOG=info ./target/release/antigravity-proxy --headless

-# Start as daemon (builds if needed)
+# Or use the daemon manager
 proxyctl start
-
-# Or run directly
-RUST_LOG=info ./target/release/antigravity-proxy
 ```

-Default port: **8741**
-
 ## Endpoints

-| Method     | Path                   | Description                                                  |
-| ---------- | ---------------------- | ------------------------------------------------------------ |
-| `POST`     | `/v1/responses`        | **Responses API** (primary) -- supports `stream: true/false` |
-| `POST`     | `/v1/chat/completions` | Chat Completions API (OpenAI compat)                         |
-| `POST`     | `/v1/gemini`           | Native Gemini API                                            |
-| `GET/POST` | `/v1/search`           | Web Search via Google Search grounding                       |
-| `GET`      | `/v1/models`           | List available models                                        |
-| `GET`      | `/v1/sessions`         | List active sessions                                         |
-| `DELETE`   | `/v1/sessions/:id`     | Delete a session                                             |
-| `POST`     | `/v1/token`            | Set OAuth token at runtime                                   |
-| `GET`      | `/v1/usage`            | MITM-intercepted token usage stats                           |
-| `GET`      | `/v1/quota`            | LS quota -- credits, per-model rate limits, reset timers     |
-| `GET`      | `/health`              | Health check                                                 |
-
-## Available Models
-
-| Name                | Label                                     |
-| ------------------- | ----------------------------------------- |
-| `opus-4.6`          | Claude Opus 4.6 (Thinking) -- **default** |
-| `opus-4.5`          | Claude Opus 4.5 (Thinking)                |
-| `gemini-3-pro-high` | Gemini 3 Pro (High)                       |
-| `gemini-3-pro`      | Gemini 3 Pro (Low)                        |
-| `gemini-3-flash`    | Gemini 3 Flash                            |
-
-## Features
-
-### Core
-
- **Sync and streaming** on all endpoints
- **Multi-turn conversations** via `conversation` session ID (cascade reuse)
- **Full message history** forwarded for Chat Completions
- **Thinking/reasoning** exposed in both sync and streaming modes
- **Thinking signatures** preserved for multi-turn thinking model chains
-
-### Tool Calling
-
- **OpenAI-format tools** auto-converted to Gemini format via MITM injection
- **`tool_choice`** support (`auto`, `none`, `required`, named function)
- **`max_tool_calls`** limit on tool calls per response
- **Function call results** (`function_call_output`) routed back correctly
- **Native Gemini tools** passed through on the `/v1/gemini` endpoint
-
-### Image Uploads
-
-Images are injected directly into Google's API request via MITM (the LS does not forward images natively).
-
-Supported input formats:
-
- Responses API: `{type: "input_image", image_url: "data:image/png;base64,..."}`
- Chat Completions: `{type: "image_url", image_url: {url: "data:image/png;base64,..."}}`
- Gemini API: `{type: "input_image", image_url: "data:image/png;base64,..."}`
-
-### Web Search
-
-Google Search grounding can be enabled on any endpoint:
-
- Completions: `"web_search": true`
- Responses: `"tools": [{"type": "web_search_preview"}]`
- Gemini: `"google_search": true`
- Dedicated: `GET/POST /v1/search` returns structured results with citations
-
-### Generation Parameters
-
-All parameters are forwarded to Google via MITM injection:
-
-| Parameter                | Endpoints                                             |
-| ------------------------ | ----------------------------------------------------- |
-| `temperature`            | All                                                   |
-| `top_p` / `topP`         | All                                                   |
-| `top_k` / `topK`         | Gemini                                                |
-| `max_output_tokens`      | All                                                   |
-| `stop` / `stopSequences` | All                                                   |
-| `frequency_penalty`      | Completions                                           |
-| `presence_penalty`       | Completions                                           |
-| `reasoning_effort`       | All (mapped to `thinkingLevel`)                       |
-| `response_format`        | Completions, Responses (`json_object`, `json_schema`) |
-
-### Error Propagation
-
-When Google's API returns an error (400, 429, 500, etc.), the MITM proxy captures it and the API handler returns it immediately to the client instead of hanging until timeout.
-
-Error status mapping:
-
-| Google Status        | HTTP Code | OpenAI Error Type       |
-| -------------------- | --------- | ----------------------- |
-| `INVALID_ARGUMENT`   | 400       | `invalid_request_error` |
-| `RESOURCE_EXHAUSTED` | 429       | `rate_limit_error`      |
-| `PERMISSION_DENIED`  | 403       | `authentication_error`  |
-| `INTERNAL`           | 500       | `server_error`          |
-| `UNAVAILABLE`        | 503       | `server_error`          |
-
-## Usage Examples
-
-### Responses API (sync)
-
-```bash
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "input": "Say hello in exactly 3 words",
-    "stream": false,
-    "timeout": 60
-  }' | jq .
-```
-
-### Responses API (streaming)
-
-```bash
-curl -N http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "input": "Say hello in exactly 3 words",
-    "stream": true,
-    "timeout": 60
-  }'
-```
-
-### Multi-turn Conversation
-
-```bash
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "input": "What is 2+2?",
-    "conversation": "my-session-1",
-    "stream": false
-  }' | jq .
-
-# Follow-up in same cascade
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "input": "Now multiply that by 10",
-    "conversation": "my-session-1",
-    "stream": false
-  }' | jq .
-```
-
-### Image Upload
-
-```bash
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "input": [
-      {"type": "input_image", "image_url": "data:image/png;base64,iVBORw0KGgo..."},
-      {"type": "input_text", "text": "What is in this image?"}
-    ],
-    "stream": false
-  }' | jq .
-```
-
-### Web Search
-
-```bash
-# Dedicated search endpoint
-curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | jq .
-
-# Inline grounding on any endpoint
-curl -s http://localhost:8741/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "messages": [{"role": "user", "content": "What happened in tech today?"}],
-    "web_search": true
-  }' | jq .
-```
-
-### Tool Calling
-
-```bash
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gemini-3-flash",
-    "input": "What is the weather in Tokyo?",
-    "tools": [{
-      "type": "function",
-      "function": {
-        "name": "get_weather",
-        "description": "Get weather for a location",
-        "parameters": {
-          "type": "object",
-          "properties": {"location": {"type": "string"}},
-          "required": ["location"]
-        }
-      }
-    }],
-    "stream": false
-  }' | jq .
-```
+| Method     | Path                              | Description                          |
+| ---------- | --------------------------------- | ------------------------------------ |
+| `POST`     | `/v1/responses`                   | Responses API (sync + streaming)     |
+| `POST`     | `/v1/chat/completions`            | Chat Completions API (OpenAI compat) |
+| `POST`     | `/v1/gemini`                      | Native Gemini API                    |
+| `POST`     | `/v1beta/models/{model}:{action}` | Official Gemini v1beta routes        |
+| `GET/POST` | `/v1/search`                      | Web Search via Google grounding      |
+| `GET`      | `/v1/models`                      | List available models                |
+| `GET`      | `/v1/sessions`                    | List active sessions                 |
+| `DELETE`   | `/v1/sessions/{id}`               | Delete a session                     |
+| `POST`     | `/v1/token`                       | Set OAuth token at runtime           |
+| `GET`      | `/v1/usage`                       | MITM-intercepted token usage         |
+| `GET`      | `/v1/quota`                       | LS quota and rate limits             |
+| `GET`      | `/health`                         | Health check                         |

 ## Authentication

-The proxy needs an OAuth token. Three ways to provide it:
+The proxy needs an OAuth token:

-1. **Environment variable**: `export ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx`
-2. **Token file**: `echo 'ya29.xxx' > ~/.config/antigravity-proxy-token`
-3. **Runtime API**: `curl -X POST http://localhost:8741/v1/token -d '{"token":"ya29.xxx"}'`
+1. **Env var**: `ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx`
+2. **Token file**: `~/.config/antigravity-proxy-token`
+3. **Runtime**: `curl -X POST http://localhost:8741/v1/token -d '{"token":"ya29.xxx"}'`

-## Stealth Features
+## `proxyctl` Commands

- **TLS fingerprint** -- BoringSSL with Chrome JA3/JA4 + H2 fingerprint via `wreq` (version auto-detected)
- **Protobuf** -- Hand-rolled encoder producing byte-exact match to real webview traffic
- **Warmup** -- Mimics real webview startup RPC calls
- **Heartbeat** -- Periodic keep-alive matching real webview lifecycle
- **Reactive streaming** -- `StreamCascadeReactiveUpdates` for real-time state diffs (polling fallback)
- **Jitter** -- Randomized intervals to avoid automation fingerprint
- **Session reuse** -- Cascades reused for multi-turn, matching real webview behavior
- **Version detection** -- Auto-detects Antigravity/Chrome/Electron versions from installed app
+| Command               | Description                    |
+| --------------------- | ------------------------------ |
+| `proxyctl start`      | Start the proxy daemon         |
+| `proxyctl stop`       | Stop the proxy daemon          |
+| `proxyctl restart`    | Rebuild + restart              |
+| `proxyctl rebuild`    | Build release binary only      |
+| `proxyctl status`     | Service status + quota + usage |
+| `proxyctl logs [N]`   | Tail last N lines + follow     |
+| `proxyctl test [msg]` | Quick test request             |
+| `proxyctl health`     | Health check                   |

-## CLI Reference
+## Documentation

-### `proxyctl` -- Daemon Manager
-
-Symlinked to `~/.local/bin/proxyctl` for global access.
-
-| Command               | Description                             |
-| --------------------- | --------------------------------------- |
-| `proxyctl start`      | Start the proxy daemon                  |
-| `proxyctl stop`       | Stop the proxy daemon                   |
-| `proxyctl restart`    | Rebuild + restart                       |
-| `proxyctl rebuild`    | Build release binary only               |
-| `proxyctl status`     | Service status + quota + usage          |
-| `proxyctl logs [N]`   | Tail last N lines (default 30) + follow |
-| `proxyctl logs-all`   | Full log dump (no follow)               |
-| `proxyctl test [msg]` | Quick test request (gemini-3-flash)     |
-| `proxyctl health`     | Health check                            |
-
-### `mitm-redirect.sh` -- MITM Setup
-
-One-time setup script for UID-scoped iptables traffic redirection.
-
-```bash
-sudo ./scripts/mitm-redirect.sh install    # create user + iptables rule
-sudo ./scripts/mitm-redirect.sh uninstall  # remove user + iptables rule
-sudo ./scripts/mitm-redirect.sh status     # check current state
-```
-
-### Proxy Binary
-
-```
-antigravity-proxy [OPTIONS]
-
-Options:
-  --port <PORT>          API server port (default: 8741)
-  --no-standalone        Attach to existing LS instead of spawning standalone
-  --no-mitm              Disable MITM proxy entirely
-  --mitm-port <PORT>     Override MITM proxy port (default: auto-assign)
-```
-
-## MITM Proxy
-
-### How It Works
-
-```mermaid
-%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'primaryBorderColor': '#e94560', 'lineColor': '#e94560', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460'}}}%%
-graph LR
-    subgraph proxy_layer["Proxy :8741"]
-        style proxy_layer fill:#16213e,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        P["API Handler"]
-        S["MitmStore"]
-    end
-
-    subgraph ls_layer["Standalone LS"]
-        style ls_layer fill:#0f3460,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        LS["language_server<br/>UID: antigravity-ls"]
-    end
-
-    subgraph mitm_layer["MITM :8742"]
-        style mitm_layer fill:#1a1a2e,stroke:#e94560,stroke-width:2px,color:#e0e0e0
-        M["TLS Decrypt"]
-        MOD["Modify Request<br/>tools | images | params"]
-        CAP["Capture Response<br/>usage | errors | calls"]
-    end
-
-    subgraph google_layer["Google API"]
-        style google_layer fill:#0f3460,stroke:#7c3aed,stroke-width:2px,color:#e0e0e0
-        G["streamGenerateContent"]
-    end
-
-    P -->|"image, tools,<br/>params"| S
-    P -->|"protobuf"| LS
-    LS -->|":443 traffic"| M
-    M --> MOD
-    MOD -->|"modified request"| G
-    G -->|"SSE response"| CAP
-    CAP -->|"usage, errors"| S
-    S -->|"error or result"| P
-
-    linkStyle 2 stroke:#e94560,stroke-width:2px
-```
-
- **UID-scoped iptables** -- only the standalone LS's traffic is intercepted (zero side effects)
- **Combined CA bundle** -- system CAs + MITM CA written to `/tmp/antigravity-mitm-combined-ca.pem`
- **Google SSE parsing** -- extracts `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`
- **Request modification** -- strips LS bloat, injects client tools/images/params (97%+ size reduction typical)
- **Error capture** -- upstream errors stored in MitmStore for instant client forwarding
- **Init metadata** -- protobuf field 34 `detect_and_use_proxy` set to ENABLED (1)
-
-## Development
-
- **Dev/testing model**: `gemini-3-flash` -- use for all development and iterative testing
- **Production model**: `opus-4.6` -- use sparingly (quota limited)
- See `docs/ls-binary-analysis.md` for reverse-engineered model catalog and proto enum mappings
- See `docs/endpoint-gap-analysis.md` for full API coverage audit
- See `docs/mitm-interception-status.md` for MITM technical details
+| Doc                                                               | Contents                                                             |
+| ----------------------------------------------------------------- | -------------------------------------------------------------------- |
+| [architecture.md](docs/architecture.md)                           | System overview, module map, request lifecycle (mermaid)             |
+| [mitm.md](docs/mitm.md)                                           | MITM proxy internals, event flow, request modification               |
+| [traces.md](docs/traces.md)                                       | Per-call debug trace system                                          |
+| [extension-server-analysis.md](docs/extension-server-analysis.md) | Extension server protocol reverse engineering                        |
+| [ls-binary-analysis.md](docs/ls-binary-analysis.md)               | LS binary reverse engineering — model catalog, gRPC services, protos |

 ## License