- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation, endpoint-gap-analysis, request-comparison
# Architecture

## System Overview

```mermaid
flowchart LR
    Client["Client\n(curl, SDK, etc.)"]
    Proxy["Proxy\n:8741"]
    LS["Standalone LS\n:random"]
    MITM["MITM Proxy\n:8742"]
    Google["Google API\ndaily-cloudcode-pa\n.googleapis.com"]

    Client -- "OpenAI / Gemini\nHTTP API" --> Proxy
    Proxy -- "gRPC\n(protobuf)" --> LS
    LS -- "HTTPS :443\n(iptables redirect)" --> MITM
    MITM -- "TLS\n(BoringSSL)" --> Google

    style Proxy fill:#7c3aed,color:#fff
    style MITM fill:#dc2626,color:#fff
    style LS fill:#2563eb,color:#fff
    style Google fill:#059669,color:#fff
```

The proxy translates OpenAI/Gemini API requests into gRPC calls to a standalone Language Server (LS) binary. A MITM proxy sits between the LS and Google's API to intercept traffic, inject tools/params, and capture real token usage.

---

## Request Lifecycle

```mermaid
sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant S as MitmStore
    participant LS as Standalone LS
    participant M as MITM Proxy
    participant G as Google API

    C->>P: POST /v1/chat/completions
    P->>P: Parse request, resolve model
    P->>S: register_request(cascade_id, tools, params, image)
    P->>LS: SendMessage(cascade_id, ".")
    Note over P: Waits on MITM channel

    LS->>M: HTTPS POST streamGenerateContent
    M->>S: take_request(cascade_id)
    M->>M: modify_request(inject tools, params, user text)
    M->>G: Forward modified request
    G-->>M: SSE stream (text deltas + usage)
    M->>S: dispatch TextDelta, Usage events
    M-->>LS: Forward (original) response

    S-->>P: MitmEvent::TextDelta
    S-->>P: MitmEvent::Usage
    S-->>P: MitmEvent::ResponseComplete
    P-->>C: OpenAI-format JSON/SSE response
```
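
The register/take handshake in the diagram can be sketched with the standard library alone. This is a minimal illustration, not the project's `store.rs`. The `PendingRequest` fields, the `Usage` payload shape, and the synchronous `std::sync::mpsc` channel are assumptions made for the example; the real store is likely asynchronous and carries more state. The handler parks the real payload under a `cascade_id`, the MITM side claims it exactly once when the LS request arrives, and events flow back over the channel.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Events reported back to the waiting API handler (names from the diagram).
#[derive(Debug, Clone, PartialEq)]
pub enum MitmEvent {
    TextDelta(String),
    // Payload shape is an assumption for this sketch.
    Usage { input_tokens: u64, output_tokens: u64 },
    ResponseComplete,
}

/// Hypothetical payload the handler wants injected into the LS request.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingRequest {
    pub user_text: String,
    pub tools: Vec<String>,
}

/// Simplified stand-in for MitmStore: pending requests keyed by cascade id.
#[derive(Default)]
pub struct MitmStore {
    pending: HashMap<String, (PendingRequest, Sender<MitmEvent>)>,
}

impl MitmStore {
    /// Handler side: park the payload and receive a channel to wait on.
    pub fn register_request(&mut self, cascade_id: &str, req: PendingRequest) -> Receiver<MitmEvent> {
        let (tx, rx) = channel();
        self.pending.insert(cascade_id.to_string(), (req, tx));
        rx
    }

    /// MITM side: claim the payload once; a second call returns None.
    pub fn take_request(&mut self, cascade_id: &str) -> Option<(PendingRequest, Sender<MitmEvent>)> {
        self.pending.remove(cascade_id)
    }
}
```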

---

## Module Map

```mermaid
graph TD
    subgraph "API Layer"
        mod_api["api/mod.rs\n(router)"]
        completions["completions.rs"]
        responses["responses.rs"]
        gemini["gemini.rs"]
        search["search.rs"]
        models["models.rs"]
        types["types.rs"]
        util["util.rs"]
        polling["polling.rs"]
    end

    subgraph "MITM Layer"
        proxy_mitm["proxy.rs\n(TLS termination)"]
        h2["h2_handler.rs\n(HTTP/2 framing)"]
        intercept["intercept.rs\n(SSE parsing)"]
        modify["modify.rs\n(request injection)"]
        store["store.rs\n(MitmStore)"]
        proto_mitm["proto.rs\n(protobuf codec)"]
        ca["ca.rs\n(cert generation)"]
    end

    subgraph "Core"
        main["main.rs"]
        backend["backend.rs\n(gRPC client)"]
        session["session.rs"]
        trace["trace.rs"]
        warmup["warmup.rs"]
        constants["constants.rs"]
        quota["quota.rs"]
    end

    subgraph "Standalone LS"
        spawn["spawn.rs"]
        discovery["discovery.rs"]
        stub["stub.rs\n(extension server)"]
    end

    subgraph "Protobuf"
        proto_mod["proto/mod.rs"]
        wire["proto/wire.rs"]
    end

    main --> mod_api
    main --> backend
    main --> store
    main --> spawn
    mod_api --> completions & responses & gemini & search
    completions & responses & gemini --> store
    completions & responses & gemini --> backend
    store --> intercept
    proxy_mitm --> h2 --> intercept & modify
    modify --> store
    intercept --> store
    spawn --> discovery & stub
    backend --> proto_mod --> wire

    style store fill:#dc2626,color:#fff
    style mod_api fill:#7c3aed,color:#fff
    style proxy_mitm fill:#ea580c,color:#fff
    style main fill:#0d9488,color:#fff
```

---

## Endpoints

| Method     | Path                   | Handler                           | Description                             |
| ---------- | ---------------------- | --------------------------------- | --------------------------------------- |
| `POST`     | `/v1/responses`        | `responses::handle_responses`     | OpenAI Responses API (streaming + sync) |
| `POST`     | `/v1/chat/completions` | `completions::handle_completions` | OpenAI Chat Completions API             |
| `POST`     | `/v1/gemini`           | `gemini::handle_gemini`           | Custom Gemini endpoint                  |
| `POST`     | `/v1beta/{*path}`      | `gemini::handle_gemini_v1beta`    | Official Gemini v1beta routes           |
| `GET/POST` | `/v1/search`           | `search::handle_search_*`         | Web search via Google grounding         |
| `GET`      | `/v1/models`           | `handle_models`                   | List available models                   |
| `GET`      | `/v1/sessions`         | `handle_list_sessions`            | List active sessions                    |
| `DELETE`   | `/v1/sessions/{id}`    | `handle_delete_session`           | Delete a session                        |
| `POST`     | `/v1/token`            | `handle_set_token`                | Set OAuth token at runtime              |
| `GET`      | `/v1/usage`            | `handle_usage`                    | MITM-intercepted token usage            |
| `GET`      | `/v1/quota`            | `handle_quota`                    | LS quota (credits, rate limits)         |
| `GET`      | `/health`              | `handle_health`                   | Health check                            |
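
As a rough illustration of calling one of these routes, the sketch below builds and sends a raw HTTP/1.1 request to `/v1/chat/completions` using only the standard library. The JSON body follows the usual OpenAI Chat Completions shape (`model`, `messages`), which is an assumption about what the handler accepts. The function names are hypothetical helpers, and a real client would more likely use an HTTP library or an OpenAI SDK pointed at the proxy port.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a raw request for the Chat Completions route (body shape assumed).
pub fn chat_completions_request(host: &str, port: u16, model: &str, prompt: &str) -> String {
    let body = format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        json_escape(model),
        json_escape(prompt)
    );
    format!(
        "POST /v1/chat/completions HTTP/1.1\r\nHost: {}:{}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        host,
        port,
        body.len(),
        body
    )
}

/// Send the request and read the whole response (needs a running proxy).
pub fn send_request(addr: &str, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```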

---

## MITM Event Flow

```mermaid
stateDiagram-v2
    [*] --> Registered: register_request()

    Registered --> GateWait: LS sends HTTPS request
    GateWait --> Matched: MITM matches cascade_id

    Matched --> Modifying: modify_request()
    Modifying --> Streaming: Forward to Google

    Streaming --> Streaming: TextDelta / ThinkingDelta
    Streaming --> UsageCaptured: Usage event
    UsageCaptured --> Complete: ResponseComplete
    Streaming --> Error: UpstreamError
    Streaming --> FnCall: FunctionCall

    Complete --> [*]
    Error --> [*]
    FnCall --> Registered: Tool round (re-register)
```
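
The state diagram above maps naturally onto a transition function. The following is only a sketch of that table: the state names come from the diagram, while the `Trigger` enum and `next_state` function are hypothetical names introduced here, and the project may not model the flow as an explicit state machine at all. Any pair not drawn in the diagram is treated as invalid and returns `None`.

```rust
/// States taken from the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Registered,
    GateWait,
    Matched,
    Modifying,
    Streaming,
    UsageCaptured,
    Complete,
    Error,
    FnCall,
}

/// Hypothetical names for the edge labels in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    LsRequest,
    CascadeMatched,
    ModifyRequest,
    Forwarded,
    Delta, // TextDelta or ThinkingDelta
    Usage,
    ResponseComplete,
    UpstreamError,
    FunctionCall,
    ToolRound,
}

/// Return the next state, or None if the diagram has no such edge.
pub fn next_state(state: State, trigger: Trigger) -> Option<State> {
    use State::*;
    use Trigger as T;
    match (state, trigger) {
        (Registered, T::LsRequest) => Some(GateWait),
        (GateWait, T::CascadeMatched) => Some(Matched),
        (Matched, T::ModifyRequest) => Some(Modifying),
        (Modifying, T::Forwarded) => Some(Streaming),
        (Streaming, T::Delta) => Some(Streaming),
        (Streaming, T::Usage) => Some(UsageCaptured),
        (UsageCaptured, T::ResponseComplete) => Some(Complete),
        (Streaming, T::UpstreamError) => Some(Error),
        (Streaming, T::FunctionCall) => Some(FnCall),
        // A tool round re-registers the request and starts over.
        (FnCall, T::ToolRound) => Some(Registered),
        _ => None,
    }
}
```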

---

## CLI Flags

| Flag                 | Default | Description                                               |
| -------------------- | ------- | --------------------------------------------------------- |
| `--port <PORT>`      | `8741`  | Proxy listen port                                         |
| `--headless`         | `true`  | Fully standalone; no running Antigravity app needed       |
| `--classic`          | `false` | Attach to running Antigravity (alias for `--no-headless`) |
| `--no-mitm`          | `false` | Disable MITM proxy entirely                               |
| `--mitm-port <PORT>` | `8742`  | MITM proxy port                                           |
| `--no-standalone`    | `false` | Attach to real LS instead of spawning standalone          |
| `--no-trace`         | `false` | Disable per-call debug traces                             |
| `-v, --verbose`      | `false` | Info-level logging                                        |
| `-d, --debug`        | `false` | Debug-level logging                                       |
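
To make the defaults and the `--classic` alias concrete, here is a hand-rolled parser sketch over the flags in the table. It is an illustration only: the `Config` struct, its field names, and `parse_args` are hypothetical, the long forms `--verbose` and `--debug` are inferred from the table's `-v, --verbose` notation, and the actual binary presumably uses a CLI parsing library in `main.rs` rather than code like this.

```rust
/// Hypothetical settings struct mirroring the flag table.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub port: u16,
    pub headless: bool,
    pub mitm: bool,
    pub mitm_port: u16,
    pub standalone: bool,
    pub trace: bool,
    pub verbose: bool,
    pub debug: bool,
}

impl Default for Config {
    /// Defaults as listed in the table (the `--no-*` flags default to off).
    fn default() -> Self {
        Config {
            port: 8741,
            headless: true,
            mitm: true,
            mitm_port: 8742,
            standalone: true,
            trace: true,
            verbose: false,
            debug: false,
        }
    }
}

/// Parse a port value that must follow `flag`.
fn port_value<S: AsRef<str>>(value: Option<S>, flag: &str) -> Result<u16, String> {
    let raw = value.ok_or_else(|| format!("{} requires a value", flag))?;
    raw.as_ref()
        .parse::<u16>()
        .map_err(|e| format!("invalid value for {}: {}", flag, e))
}

/// Parse arguments (without the program name) into a Config.
pub fn parse_args<I, S>(args: I) -> Result<Config, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut cfg = Config::default();
    let mut it = args.into_iter();
    while let Some(arg) = it.next() {
        match arg.as_ref() {
            "--port" => cfg.port = port_value(it.next(), "--port")?,
            "--mitm-port" => cfg.mitm_port = port_value(it.next(), "--mitm-port")?,
            "--headless" => cfg.headless = true,
            // `--classic` is documented as an alias for `--no-headless`.
            "--classic" | "--no-headless" => cfg.headless = false,
            "--no-mitm" => cfg.mitm = false,
            "--no-standalone" => cfg.standalone = false,
            "--no-trace" => cfg.trace = false,
            "-v" | "--verbose" => cfg.verbose = true,
            "-d" | "--debug" => cfg.debug = true,
            other => return Err(format!("unknown flag: {}", other)),
        }
    }
    Ok(cfg)
}
```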

---

## Source Files

| File                      | Lines | Purpose                                                    |
| ------------------------- | ----: | ---------------------------------------------------------- |
| `api/responses.rs`        |  1796 | Responses API handler (sync, streaming, multi-turn, tools) |
| `mitm/modify.rs`          |  1418 | Request modification (tool/image/param injection)          |
| `api/completions.rs`      |  1241 | Chat Completions handler (OpenAI compat)                   |
| `mitm/proxy.rs`           |  1165 | TLS-terminating MITM proxy                                 |
| `api/gemini.rs`           |  1055 | Gemini API handler (native format)                         |
| `snapshot.rs`             |   695 | State snapshots                                            |
| `backend.rs`              |   660 | gRPC client to LS                                          |
| `mitm/store.rs`           |   651 | Central state store + event channels                       |
| `mitm/proto.rs`           |   649 | Protobuf encode/decode for MITM                            |
| `mitm/intercept.rs`       |   640 | SSE response parser + usage extraction                     |
| `main.rs`                 |   527 | CLI, startup, wiring                                       |
| `trace.rs`                |   509 | Per-call debug trace system                                |
| `mitm/h2_handler.rs`      |   477 | HTTP/2 frame handling                                      |
| `standalone/spawn.rs`     |   464 | LS process spawning                                        |
| `api/search.rs`           |   443 | Web search endpoint                                        |
| `api/types.rs`            |   416 | Shared request/response types                              |
| `standalone/discovery.rs` |   340 | LS config discovery from `/proc`                           |
| `proto/mod.rs`            |   340 | Hand-rolled protobuf encoder                               |
| `api/polling.rs`          |   340 | Cascade polling fallback                                   |
| `standalone/stub.rs`      |  ~300 | Extension server gRPC stub                                 |
| `proto/wire.rs`           |  ~200 | Wire-format protobuf helpers                               |
| `constants.rs`            |  ~100 | Model IDs, service names                                   |
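
For readers unfamiliar with what "wire-format protobuf helpers" involve, the sketch below shows the standard protobuf building blocks: base-128 varints, field tags, and length-delimited fields. The encoding rules are the public protobuf wire format; the function names are made up for this example and are not taken from `proto/wire.rs`.

```rust
/// Append `value` as a protobuf base-128 varint (7 bits per byte, LSB first).
pub fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80); // continuation bit set
        value >>= 7;
    }
    out.push(value as u8);
}

/// Append a field tag: (field_number << 3) | wire_type.
pub fn encode_tag(field: u32, wire_type: u8, out: &mut Vec<u8>) {
    encode_varint((u64::from(field) << 3) | u64::from(wire_type), out);
}

/// Append a length-delimited (wire type 2) string field.
pub fn encode_string_field(field: u32, s: &str, out: &mut Vec<u8>) {
    encode_tag(field, 2, out);
    encode_varint(s.len() as u64, out);
    out.extend_from_slice(s.as_bytes());
}

/// Decode a varint from the front of `bytes`; returns (value, bytes consumed).
pub fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A u64 varint occupies at most 10 bytes.
    for (i, &b) in bytes.iter().enumerate().take(10) {
        value |= u64::from(b & 0x7f) << (7 * i);
        if (b & 0x80) == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated or over-long input
}
```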

---

## Models

| Proxy Name          | LS Placeholder          | Description                              |
| ------------------- | ----------------------- | ---------------------------------------- |
| `opus-4.6`          | `MODEL_PLACEHOLDER_M26` | Claude Opus 4.6 (Thinking), **default**  |
| `opus-4.5`          | `MODEL_PLACEHOLDER_M12` | Claude Opus 4.5 (Thinking)               |
| `gemini-3-pro-high` | `MODEL_PLACEHOLDER_M8`  | Gemini 3 Pro (High quality)              |
| `gemini-3-pro`      | `MODEL_PLACEHOLDER_M7`  | Gemini 3 Pro (Low quality)               |
| `gemini-3-flash`    | `MODEL_PLACEHOLDER_M18` | Gemini 3 Flash                           |
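
The mapping in this table amounts to a small lookup. The sketch below is one way to express it; the function names are hypothetical, and the fallback behavior is an assumption (a missing model name resolves to the default, an unknown name resolves to `None`), since the document does not say how the proxy handles either case.

```rust
/// Default proxy model name, per the table.
pub const DEFAULT_MODEL: &str = "opus-4.6";

/// Map a proxy model name to the LS placeholder listed in the table.
pub fn ls_placeholder(name: &str) -> Option<&'static str> {
    match name {
        "opus-4.6" => Some("MODEL_PLACEHOLDER_M26"),
        "opus-4.5" => Some("MODEL_PLACEHOLDER_M12"),
        "gemini-3-pro-high" => Some("MODEL_PLACEHOLDER_M8"),
        "gemini-3-pro" => Some("MODEL_PLACEHOLDER_M7"),
        "gemini-3-flash" => Some("MODEL_PLACEHOLDER_M18"),
        _ => None,
    }
}

/// Resolve an optional requested model (assumed policy: absent means default).
pub fn resolve_model(requested: Option<&str>) -> Option<&'static str> {
    ls_placeholder(requested.unwrap_or(DEFAULT_MODEL))
}
```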

---

## Stealth Features

| Feature            | Implementation                                                  |
| ------------------ | --------------------------------------------------------------- |
| TLS fingerprint    | BoringSSL via `wreq`: Chrome JA3/JA4 + H2 fingerprint           |
| Protobuf           | Hand-rolled encoder producing byte-exact match to real webview  |
| Warmup             | Mimics real webview startup RPC sequence                        |
| Heartbeat          | Periodic keep-alive matching real webview lifecycle             |
| Reactive streaming | `StreamCascadeReactiveUpdates` for real-time state diffs        |
| Jitter             | Randomized intervals on warmup/heartbeat                        |
| Session reuse      | Cascades reused for multi-turn (matches real webview)           |
| Version detection  | Auto-detects Chrome/Electron/app versions from installed binary |
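
The jitter row can be illustrated with a small helper that spreads an interval around its base value. This is a sketch under assumptions: the function name, the symmetric spread, and the 20% figure used in the comments are all hypothetical, and the random input is passed in by the caller (a value in `[0, 1]` from whatever RNG is available) so the example stays free of dependencies.

```rust
use std::time::Duration;

/// Scale `base` by a factor in [1 - spread, 1 + spread].
///
/// `unit` is a caller-supplied random value in [0, 1]; 0.5 leaves the
/// interval unchanged. With spread = 0.2, a 10 s base lands in 8 s..12 s.
pub fn jittered(base: Duration, spread: f64, unit: f64) -> Duration {
    let spread = spread.clamp(0.0, 1.0);
    let unit = unit.clamp(0.0, 1.0);
    // Map unit in [0, 1] onto a symmetric factor around 1.0.
    let factor = 1.0 + spread * (2.0 * unit - 1.0);
    base.mul_f64(factor)
}
```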