- Delete handle_gemini handler (identical to handle_gemini_v1beta) - Remove /v1/gemini route from router - Update root handler service name to zerogravity - Clean all doc references
11 KiB
11 KiB
Architecture
System Overview
flowchart LR
Client["Client<br/>(curl, SDK, etc.)"]
Proxy["Proxy<br/>:8741"]
LS["Standalone LS<br/>:random"]
MITM["MITM Proxy<br/>:8742"]
Google["Google API<br/>daily-cloudcode-pa<br/>.googleapis.com"]
Client -- "OpenAI / Gemini<br/>HTTP API" --> Proxy
Proxy -- "gRPC<br/>(protobuf)" --> LS
LS -- "HTTPS :443<br/>(iptables redirect)" --> MITM
MITM -- "TLS<br/>(BoringSSL)" --> Google
style Proxy fill:#7c3aed,color:#fff
style MITM fill:#dc2626,color:#fff
style LS fill:#2563eb,color:#fff
style Google fill:#059669,color:#fff
The proxy translates OpenAI/Gemini API requests into gRPC calls to a standalone Language Server (LS) binary. A MITM proxy sits between the LS and Google's API to intercept traffic, inject tools/params, and capture real token usage.
Request Lifecycle
sequenceDiagram
participant C as Client
participant P as Proxy
participant S as MitmStore
participant LS as Standalone LS
participant M as MITM Proxy
participant G as Google API
C->>P: POST /v1/chat/completions
P->>P: Parse request, resolve model
P->>S: register_request(cascade_id, tools, params, image)
P->>LS: SendMessage(cascade_id, ".")
Note over P: Waits on MITM channel
LS->>M: HTTPS POST streamGenerateContent
M->>S: take_request(cascade_id)
M->>M: modify_request(inject tools, params, user text)
M->>G: Forward modified request
G-->>M: SSE stream (text deltas + usage)
M->>S: dispatch TextDelta, Usage events
M-->>LS: Forward (original) response
S-->>P: MitmEvent::TextDelta
S-->>P: MitmEvent::Usage
S-->>P: MitmEvent::ResponseComplete
P-->>C: OpenAI-format JSON/SSE response
Module Map
graph TD
subgraph "API Layer"
mod_api["api/mod.rs<br/>(router)"]
completions["completions.rs"]
responses["responses.rs"]
gemini["gemini.rs"]
search["search.rs"]
models["models.rs"]
types["types.rs"]
util["util.rs"]
polling["polling.rs"]
end
subgraph "MITM Layer"
proxy_mitm["proxy.rs<br/>(TLS termination)"]
h2["h2_handler.rs<br/>(HTTP/2 framing)"]
intercept["intercept.rs<br/>(SSE parsing)"]
modify["modify.rs<br/>(request injection)"]
store["store.rs<br/>(MitmStore)"]
proto_mitm["proto.rs<br/>(protobuf codec)"]
ca["ca.rs<br/>(cert generation)"]
end
subgraph "Core"
main["main.rs"]
backend["backend.rs<br/>(gRPC client)"]
session["session.rs"]
trace["trace.rs"]
warmup["warmup.rs"]
constants["constants.rs"]
quota["quota.rs"]
end
subgraph "Standalone LS"
spawn["spawn.rs"]
discovery["discovery.rs"]
stub["stub.rs<br/>(extension server)"]
end
subgraph "Protobuf"
proto_mod["proto/mod.rs"]
wire["proto/wire.rs"]
end
main --> mod_api
main --> backend
main --> store
main --> spawn
mod_api --> completions & responses & gemini & search
completions & responses & gemini --> store
completions & responses & gemini --> backend
store --> intercept
proxy_mitm --> h2 --> intercept & modify
modify --> store
intercept --> store
spawn --> discovery & stub
backend --> proto_mod --> wire
style store fill:#dc2626,color:#fff
style mod_api fill:#7c3aed,color:#fff
style proxy_mitm fill:#ea580c,color:#fff
style main fill:#0d9488,color:#fff
Endpoints
| Method | Path | Handler | Description |
|---|---|---|---|
POST |
/v1/responses |
responses::handle_responses |
OpenAI Responses API (streaming + sync) |
POST |
/v1/chat/completions |
completions::handle_completions |
OpenAI Chat Completions API |
POST |
/v1beta/{*path} |
gemini::handle_gemini_v1beta |
Official Gemini v1beta routes |
GET/POST |
/v1/search |
search::handle_search_* |
Web search via Google grounding |
GET |
/v1/models |
handle_models |
List available models |
GET |
/v1/sessions |
handle_list_sessions |
List active sessions |
DELETE |
/v1/sessions/{id} |
handle_delete_session |
Delete a session |
POST |
/v1/token |
handle_set_token |
Set OAuth token at runtime |
GET |
/v1/usage |
handle_usage |
MITM-intercepted token usage |
GET |
/v1/quota |
handle_quota |
LS quota (credits, rate limits) |
GET |
/health |
handle_health |
Health check |
MITM Event Flow
stateDiagram-v2
[*] --> Registered: register_request()
Registered --> GateWait: LS sends HTTPS request
GateWait --> Matched: MITM matches cascade_id
Matched --> Modifying: modify_request()
Modifying --> Streaming: Forward to Google
Streaming --> Streaming: TextDelta / ThinkingDelta
Streaming --> UsageCaptured: Usage event
UsageCaptured --> Complete: ResponseComplete
Streaming --> Error: UpstreamError
Streaming --> FnCall: FunctionCall
Complete --> [*]
Error --> [*]
FnCall --> Registered: Tool round (re-register)
CLI Flags
| Flag | Default | Description |
|---|---|---|
--port <PORT> |
8741 |
Proxy listen port |
--headless |
true |
Fully standalone — no running Antigravity app needed |
--classic |
false |
Attach to running Antigravity (alias for --no-headless) |
--no-mitm |
false |
Disable MITM proxy entirely |
--mitm-port <PORT> |
8742 |
MITM proxy port |
--no-standalone |
false |
Attach to real LS instead of spawning standalone |
--no-trace |
false |
Disable per-call debug traces |
-v, --verbose |
false |
Info-level logging |
-d, --debug |
false |
Debug-level logging |
Source Files
| File | Lines | Purpose |
|---|---|---|
api/responses.rs |
1796 | Responses API handler (sync, streaming, multi-turn, tools) |
mitm/modify.rs |
1418 | Request modification (tool/image/param injection) |
api/completions.rs |
1241 | Chat Completions handler (OpenAI compat) |
mitm/proxy.rs |
1165 | TLS-terminating MITM proxy |
api/gemini.rs |
1055 | Gemini API handler (native format) |
snapshot.rs |
695 | State snapshots |
backend.rs |
660 | gRPC client to LS |
mitm/store.rs |
651 | Central state store + event channels |
mitm/proto.rs |
649 | Protobuf encode/decode for MITM |
mitm/intercept.rs |
640 | SSE response parser + usage extraction |
main.rs |
527 | CLI, startup, wiring |
trace.rs |
509 | Per-call debug trace system |
mitm/h2_handler.rs |
477 | HTTP/2 frame handling |
standalone/spawn.rs |
464 | LS process spawning |
api/search.rs |
443 | Web search endpoint |
api/types.rs |
416 | Shared request/response types |
standalone/discovery.rs |
340 | LS config discovery from /proc |
proto/mod.rs |
340 | Hand-rolled protobuf encoder |
api/polling.rs |
340 | Cascade polling fallback |
standalone/stub.rs |
~300 | Extension server gRPC stub |
proto/wire.rs |
~200 | Wire-format protobuf helpers |
constants.rs |
~100 | Model IDs, service names |
Models
| Proxy Name | LS Placeholder | Description |
|---|---|---|
opus-4.6 |
MODEL_PLACEHOLDER_M26 |
Claude Opus 4.6 (Thinking) — default |
opus-4.5 |
MODEL_PLACEHOLDER_M12 |
Claude Opus 4.5 (Thinking) |
gemini-3-pro-high |
MODEL_PLACEHOLDER_M8 |
Gemini 3 Pro (High quality) |
gemini-3-pro |
MODEL_PLACEHOLDER_M7 |
Gemini 3 Pro (Low quality) |
gemini-3-flash |
MODEL_PLACEHOLDER_M18 |
Gemini 3 Flash |
Stealth Features
| Feature | Implementation |
|---|---|
| TLS fingerprint | BoringSSL via wreq — Chrome JA3/JA4 + H2 fingerprint |
| Protobuf | Hand-rolled encoder producing byte-exact match to real webview |
| Warmup | Mimics real webview startup RPC sequence |
| Heartbeat | Periodic keep-alive matching real webview lifecycle |
| Reactive streaming | StreamCascadeReactiveUpdates for real-time state diffs |
| Jitter | Randomized intervals on warmup/heartbeat |
| Session reuse | Cascades reused for multi-turn (matches real webview) |
| Version detection | Auto-detects Chrome/Electron/app versions from installed binary |