zerogravity/docs/architecture.md
Nikketryhard 3d87c04d20 docs: overhaul docs, add architecture and traces, update README/GEMINI
- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation,
  endpoint-gap-analysis, request-comparison
2026-02-18 01:31:18 -06:00


Architecture

System Overview

flowchart LR
    Client["Client\n(curl, SDK, etc.)"]
    Proxy["Proxy\n:8741"]
    LS["Standalone LS\n:random"]
    MITM["MITM Proxy\n:8742"]
    Google["Google API\ndaily-cloudcode-pa\n.googleapis.com"]

    Client -- "OpenAI / Gemini\nHTTP API" --> Proxy
    Proxy -- "gRPC\n(protobuf)" --> LS
    LS -- "HTTPS :443\n(iptables redirect)" --> MITM
    MITM -- "TLS\n(BoringSSL)" --> Google

    style Proxy fill:#7c3aed,color:#fff
    style MITM fill:#dc2626,color:#fff
    style LS fill:#2563eb,color:#fff
    style Google fill:#059669,color:#fff

The proxy translates OpenAI/Gemini API requests into gRPC calls to a standalone Language Server (LS) binary. The LS's outbound HTTPS traffic is redirected (via iptables) to a MITM proxy, which sits between the LS and Google's API: it intercepts each request, injects tools and parameters, and captures real token usage from the response stream.


Request Lifecycle

sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant S as MitmStore
    participant LS as Standalone LS
    participant M as MITM Proxy
    participant G as Google API

    C->>P: POST /v1/chat/completions
    P->>P: Parse request, resolve model
    P->>S: register_request(cascade_id, tools, params, image)
    P->>LS: SendMessage(cascade_id, ".")
    Note over P: Waits on MITM channel

    LS->>M: HTTPS POST streamGenerateContent
    M->>S: take_request(cascade_id)
    M->>M: modify_request(inject tools, params, user text)
    M->>G: Forward modified request
    G-->>M: SSE stream (text deltas + usage)
    M->>S: dispatch TextDelta, Usage events
    M-->>LS: Forward (original) response

    S-->>P: MitmEvent::TextDelta
    S-->>P: MitmEvent::Usage
    S-->>P: MitmEvent::ResponseComplete
    P-->>C: OpenAI-format JSON/SSE response
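
The hand-off in the diagram above (the proxy registers a request, the MITM side claims it, and events flow back over a channel) can be sketched with the standard library alone. This is a minimal illustration, not the code in `mitm/store.rs`: the field names on `PendingRequest`, the `Usage` payload shape, and the use of `std::sync::mpsc` are all assumptions made for the example.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Events the MITM side dispatches back to the waiting proxy handler.
/// (Payload shapes are assumptions for this sketch.)
#[derive(Debug, Clone, PartialEq)]
pub enum MitmEvent {
    TextDelta(String),
    Usage { input_tokens: u64, output_tokens: u64 },
    ResponseComplete,
}

/// What the proxy parks before calling the LS (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
pub struct PendingRequest {
    pub user_text: String,
    pub tools: Vec<String>,
}

#[derive(Default)]
pub struct MitmStore {
    pending: HashMap<String, (PendingRequest, Sender<MitmEvent>)>,
}

impl MitmStore {
    /// Proxy side: park the real request and get a receiver to wait on.
    pub fn register_request(&mut self, cascade_id: &str, req: PendingRequest) -> Receiver<MitmEvent> {
        let (tx, rx) = channel();
        self.pending.insert(cascade_id.to_string(), (req, tx));
        rx
    }

    /// MITM side: claim the parked request; a second call for the same id returns None.
    pub fn take_request(&mut self, cascade_id: &str) -> Option<(PendingRequest, Sender<MitmEvent>)> {
        self.pending.remove(cascade_id)
    }
}

/// Walks one request through the store in the order the diagram shows.
pub fn demo_round_trip() -> Vec<MitmEvent> {
    let mut store = MitmStore::default();
    let rx = store.register_request(
        "cascade-1",
        PendingRequest { user_text: "hello".to_string(), tools: Vec::new() },
    );
    let (req, tx) = store.take_request("cascade-1").expect("registered above");
    tx.send(MitmEvent::TextDelta(format!("echo: {}", req.user_text))).unwrap();
    tx.send(MitmEvent::Usage { input_tokens: 1, output_tokens: 2 }).unwrap();
    tx.send(MitmEvent::ResponseComplete).unwrap();
    drop(tx); // closing the sender lets the receiver's iterator finish
    rx.iter().collect()
}
```

Keying the store by `cascade_id` is what lets the LS be sent a placeholder message (`"."`) while the real user text and tools are injected on the MITM side.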

Module Map

graph TD
    subgraph "API Layer"
        mod_api["api/mod.rs\n(router)"]
        completions["completions.rs"]
        responses["responses.rs"]
        gemini["gemini.rs"]
        search["search.rs"]
        models["models.rs"]
        types["types.rs"]
        util["util.rs"]
        polling["polling.rs"]
    end

    subgraph "MITM Layer"
        proxy_mitm["proxy.rs\n(TLS termination)"]
        h2["h2_handler.rs\n(HTTP/2 framing)"]
        intercept["intercept.rs\n(SSE parsing)"]
        modify["modify.rs\n(request injection)"]
        store["store.rs\n(MitmStore)"]
        proto_mitm["proto.rs\n(protobuf codec)"]
        ca["ca.rs\n(cert generation)"]
    end

    subgraph "Core"
        main["main.rs"]
        backend["backend.rs\n(gRPC client)"]
        session["session.rs"]
        trace["trace.rs"]
        warmup["warmup.rs"]
        constants["constants.rs"]
        quota["quota.rs"]
    end

    subgraph "Standalone LS"
        spawn["spawn.rs"]
        discovery["discovery.rs"]
        stub["stub.rs\n(extension server)"]
    end

    subgraph "Protobuf"
        proto_mod["proto/mod.rs"]
        wire["proto/wire.rs"]
    end

    main --> mod_api
    main --> backend
    main --> store
    main --> spawn
    mod_api --> completions & responses & gemini & search
    completions & responses & gemini --> store
    completions & responses & gemini --> backend
    store --> intercept
    proxy_mitm --> h2 --> intercept & modify
    modify --> store
    intercept --> store
    spawn --> discovery & stub
    backend --> proto_mod --> wire

    style store fill:#dc2626,color:#fff
    style mod_api fill:#7c3aed,color:#fff
    style proxy_mitm fill:#ea580c,color:#fff
    style main fill:#0d9488,color:#fff
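
The module map labels `intercept.rs` as the SSE parser. As a rough sketch of what pulling payloads out of an SSE chunk involves, the function below collects the contents of `data:` lines. It is a simplification and not taken from the project: it assumes each event carries a single `data:` line and that chunks arrive split on line boundaries, neither of which a real stream guarantees.

```rust
/// Extracts the payload of every non-empty `data:` line from a chunk of SSE text.
/// Comment lines (starting with ':') and other fields such as `event:` are ignored.
pub fn sse_data_lines(chunk: &str) -> Vec<&str> {
    chunk
        .lines() // handles both "\n" and "\r\n" line endings
        .filter_map(|line| line.strip_prefix("data:"))
        .map(|payload| payload.trim_start())
        .filter(|payload| !payload.is_empty())
        .collect()
}
```

Each returned slice would then be handed to a JSON parser to look for text deltas and usage metadata.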

Endpoints

| Method | Path | Handler | Description |
| --- | --- | --- | --- |
| POST | `/v1/responses` | `responses::handle_responses` | OpenAI Responses API (streaming + sync) |
| POST | `/v1/chat/completions` | `completions::handle_completions` | OpenAI Chat Completions API |
| POST | `/v1/gemini` | `gemini::handle_gemini` | Custom Gemini endpoint |
| POST | `/v1beta/{*path}` | `gemini::handle_gemini_v1beta` | Official Gemini v1beta routes |
| GET/POST | `/v1/search` | `search::handle_search_*` | Web search via Google grounding |
| GET | `/v1/models` | `handle_models` | List available models |
| GET | `/v1/sessions` | `handle_list_sessions` | List active sessions |
| DELETE | `/v1/sessions/{id}` | `handle_delete_session` | Delete a session |
| POST | `/v1/token` | `handle_set_token` | Set OAuth token at runtime |
| GET | `/v1/usage` | `handle_usage` | MITM-intercepted token usage |
| GET | `/v1/quota` | `handle_quota` | LS quota (credits, rate limits) |
| GET | `/health` | `handle_health` | Health check |
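
To make the Chat Completions route concrete, here is a sketch that builds a raw HTTP/1.1 request for `POST /v1/chat/completions`. The body follows the public OpenAI chat format (`model` plus a `messages` array); whether the proxy needs extra headers or fields is not stated in this document, so treat the exact shape as an assumption. `chat_completion_request` and `json_escape` are helper names invented for the example.

```rust
/// Escapes characters that must not appear raw inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the full request text (headers + JSON body) for a single-turn chat call.
pub fn chat_completion_request(host: &str, port: u16, model: &str, prompt: &str) -> String {
    let body = format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        json_escape(model),
        json_escape(prompt)
    );
    format!(
        "POST /v1/chat/completions HTTP/1.1\r\nHost: {}:{}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        host,
        port,
        body.len(), // byte length, as Content-Length requires
        body
    )
}
```

The resulting string could be written to a `std::net::TcpStream` connected to the proxy port (8741 by default) and the response read back until the connection closes.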

MITM Event Flow

stateDiagram-v2
    [*] --> Registered: register_request()

    Registered --> GateWait: LS sends HTTPS request
    GateWait --> Matched: MITM matches cascade_id

    Matched --> Modifying: modify_request()
    Modifying --> Streaming: Forward to Google

    Streaming --> Streaming: TextDelta / ThinkingDelta
    Streaming --> UsageCaptured: Usage event
    UsageCaptured --> Complete: ResponseComplete
    Streaming --> Error: UpstreamError
    Streaming --> FnCall: FunctionCall

    Complete --> [*]
    Error --> [*]
    FnCall --> Registered: Tool round (re-register)
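
The state diagram above translates directly into a transition function. The sketch below encodes exactly the edges drawn in the diagram and rejects everything else; the `State` and `Trigger` enums and the `next`/`run` helpers are names made up for this illustration and are not the types used in `mitm/store.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Registered,
    GateWait,
    Matched,
    Modifying,
    Streaming,
    UsageCaptured,
    Complete,
    Error,
    FnCall,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    LsRequest,
    CascadeMatched,
    ModifyRequest,
    Forwarded,
    TextDelta,
    ThinkingDelta,
    Usage,
    ResponseComplete,
    UpstreamError,
    FunctionCall,
    ToolRound,
}

/// Returns the next state, or None if the diagram has no such edge.
pub fn next(state: State, trigger: Trigger) -> Option<State> {
    match (state, trigger) {
        (State::Registered, Trigger::LsRequest) => Some(State::GateWait),
        (State::GateWait, Trigger::CascadeMatched) => Some(State::Matched),
        (State::Matched, Trigger::ModifyRequest) => Some(State::Modifying),
        (State::Modifying, Trigger::Forwarded) => Some(State::Streaming),
        (State::Streaming, Trigger::TextDelta)
        | (State::Streaming, Trigger::ThinkingDelta) => Some(State::Streaming),
        (State::Streaming, Trigger::Usage) => Some(State::UsageCaptured),
        (State::UsageCaptured, Trigger::ResponseComplete) => Some(State::Complete),
        (State::Streaming, Trigger::UpstreamError) => Some(State::Error),
        (State::Streaming, Trigger::FunctionCall) => Some(State::FnCall),
        // A tool round re-registers the request and the cycle starts over.
        (State::FnCall, Trigger::ToolRound) => Some(State::Registered),
        _ => None,
    }
}

/// Folds a sequence of triggers starting from Registered; None on the first invalid edge.
pub fn run(triggers: &[Trigger]) -> Option<State> {
    triggers
        .iter()
        .try_fold(State::Registered, |state, &trigger| next(state, trigger))
}
```

`Complete` and `Error` have no outgoing edges, so any trigger applied to them yields `None`, matching their role as terminal states.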

CLI Flags

| Flag | Default | Description |
| --- | --- | --- |
| `--port <PORT>` | `8741` | Proxy listen port |
| `--headless` | `true` | Fully standalone; no running Antigravity app needed |
| `--classic` | `false` | Attach to running Antigravity (alias for `--no-headless`) |
| `--no-mitm` | `false` | Disable MITM proxy entirely |
| `--mitm-port <PORT>` | `8742` | MITM proxy port |
| `--no-standalone` | `false` | Attach to real LS instead of spawning standalone |
| `--no-trace` | `false` | Disable per-call debug traces |
| `-v`, `--verbose` | `false` | Info-level logging |
| `-d`, `--debug` | `false` | Debug-level logging |
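
The defaults in the table can be captured in a small config struct. The parser below is a std-only sketch written for this document; the project's `main.rs` may well use an argument-parsing crate, and how the flags interact (for example `--classic` combined with `--no-standalone`) is not specified here, so each flag is treated as an independent switch. `Config`, `parse_args`, and `parse_port` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub port: u16,
    pub headless: bool,
    pub mitm: bool,
    pub mitm_port: u16,
    pub standalone: bool,
    pub trace: bool,
    pub verbose: bool,
    pub debug: bool,
}

impl Default for Config {
    /// Mirrors the "Default" column of the flag table.
    fn default() -> Self {
        Config {
            port: 8741,
            headless: true,
            mitm: true,
            mitm_port: 8742,
            standalone: true,
            trace: true,
            verbose: false,
            debug: false,
        }
    }
}

fn parse_port<S: AsRef<str>>(value: Option<S>, flag: &str) -> Result<u16, String> {
    let raw = value.ok_or_else(|| format!("{} requires a value", flag))?;
    raw.as_ref()
        .parse::<u16>()
        .map_err(|_| format!("{}: invalid port {:?}", flag, raw.as_ref()))
}

/// Parses flags (without the program name) into a Config, starting from the defaults.
pub fn parse_args<I, S>(args: I) -> Result<Config, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut cfg = Config::default();
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        match arg.as_ref() {
            "--port" => cfg.port = parse_port(iter.next(), "--port")?,
            "--mitm-port" => cfg.mitm_port = parse_port(iter.next(), "--mitm-port")?,
            "--headless" => cfg.headless = true,
            "--classic" | "--no-headless" => cfg.headless = false,
            "--no-mitm" => cfg.mitm = false,
            "--no-standalone" => cfg.standalone = false,
            "--no-trace" => cfg.trace = false,
            "-v" | "--verbose" => cfg.verbose = true,
            "-d" | "--debug" => cfg.debug = true,
            other => return Err(format!("unknown flag: {}", other)),
        }
    }
    Ok(cfg)
}
```

In a binary this would be called as `parse_args(std::env::args().skip(1))`.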

Source Files

| File | Lines | Purpose |
| --- | --- | --- |
| `api/responses.rs` | 1796 | Responses API handler (sync, streaming, multi-turn, tools) |
| `mitm/modify.rs` | 1418 | Request modification (tool/image/param injection) |
| `api/completions.rs` | 1241 | Chat Completions handler (OpenAI compat) |
| `mitm/proxy.rs` | 1165 | TLS-terminating MITM proxy |
| `api/gemini.rs` | 1055 | Gemini API handler (native format) |
| `snapshot.rs` | 695 | State snapshots |
| `backend.rs` | 660 | gRPC client to LS |
| `mitm/store.rs` | 651 | Central state store + event channels |
| `mitm/proto.rs` | 649 | Protobuf encode/decode for MITM |
| `mitm/intercept.rs` | 640 | SSE response parser + usage extraction |
| `main.rs` | 527 | CLI, startup, wiring |
| `trace.rs` | 509 | Per-call debug trace system |
| `mitm/h2_handler.rs` | 477 | HTTP/2 frame handling |
| `standalone/spawn.rs` | 464 | LS process spawning |
| `api/search.rs` | 443 | Web search endpoint |
| `api/types.rs` | 416 | Shared request/response types |
| `standalone/discovery.rs` | 340 | LS config discovery from `/proc` |
| `proto/mod.rs` | 340 | Hand-rolled protobuf encoder |
| `api/polling.rs` | 340 | Cascade polling fallback |
| `standalone/stub.rs` | ~300 | Extension server gRPC stub |
| `proto/wire.rs` | ~200 | Wire-format protobuf helpers |
| `constants.rs` | ~100 | Model IDs, service names |
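
For readers unfamiliar with what a hand-rolled protobuf encoder involves, the following sketch shows the two wire-format primitives such an encoder is built on: base-128 varints and tagged fields. It follows the standard Protocol Buffers encoding, but the function names are illustrative and are not the API of `proto/mod.rs` or `proto/wire.rs`; the field numbers in any usage are hypothetical because the LS message schema is not part of this document.

```rust
/// Appends `value` as a base-128 varint (7 payload bits per byte, MSB = "more follows").
pub fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Appends a field tag: (field_number << 3) | wire_type, itself varint-encoded.
pub fn encode_tag(field_number: u32, wire_type: u8, out: &mut Vec<u8>) {
    encode_varint((u64::from(field_number) << 3) | u64::from(wire_type), out);
}

/// Appends an unsigned integer field (wire type 0, varint).
pub fn encode_uint_field(field_number: u32, value: u64, out: &mut Vec<u8>) {
    encode_tag(field_number, 0, out);
    encode_varint(value, out);
}

/// Appends a string field (wire type 2, length-delimited): tag, byte length, UTF-8 bytes.
pub fn encode_string_field(field_number: u32, value: &str, out: &mut Vec<u8>) {
    encode_tag(field_number, 2, out);
    encode_varint(value.len() as u64, out);
    out.extend_from_slice(value.as_bytes());
}
```

Nested messages reuse the length-delimited form: encode the inner message into its own buffer, then emit it as the payload of a wire-type-2 field.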

Models

| Proxy Name | LS Placeholder | Description |
| --- | --- | --- |
| `opus-4.6` | `MODEL_PLACEHOLDER_M26` | Claude Opus 4.6 (Thinking), default |
| `opus-4.5` | `MODEL_PLACEHOLDER_M12` | Claude Opus 4.5 (Thinking) |
| `gemini-3-pro-high` | `MODEL_PLACEHOLDER_M8` | Gemini 3 Pro (High quality) |
| `gemini-3-pro` | `MODEL_PLACEHOLDER_M7` | Gemini 3 Pro (Low quality) |
| `gemini-3-flash` | `MODEL_PLACEHOLDER_M18` | Gemini 3 Flash |
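
Model resolution amounts to a lookup from the proxy-facing name to the LS placeholder. The sketch below reproduces the table as a `match`, falling back to the documented default when no model is given. Returning `None` for unknown names is an assumption of this example; the actual proxy may handle them differently (for instance by falling back to the default). `resolve_model` and `DEFAULT_MODEL` are hypothetical names.

```rust
/// The model marked as default in the table.
pub const DEFAULT_MODEL: &str = "opus-4.6";

/// Maps a proxy model name to its LS placeholder; None means "no model requested".
pub fn resolve_model(name: Option<&str>) -> Option<&'static str> {
    match name.unwrap_or(DEFAULT_MODEL) {
        "opus-4.6" => Some("MODEL_PLACEHOLDER_M26"),
        "opus-4.5" => Some("MODEL_PLACEHOLDER_M12"),
        "gemini-3-pro-high" => Some("MODEL_PLACEHOLDER_M8"),
        "gemini-3-pro" => Some("MODEL_PLACEHOLDER_M7"),
        "gemini-3-flash" => Some("MODEL_PLACEHOLDER_M18"),
        // Unknown names are rejected in this sketch (an assumption, see lead-in).
        _ => None,
    }
}
```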

Stealth Features

| Feature | Implementation |
| --- | --- |
| TLS fingerprint | BoringSSL via wreq: Chrome JA3/JA4 + H2 fingerprint |
| Protobuf | Hand-rolled encoder producing byte-exact match to real webview |
| Warmup | Mimics real webview startup RPC sequence |
| Heartbeat | Periodic keep-alive matching real webview lifecycle |
| Reactive streaming | `StreamCascadeReactiveUpdates` for real-time state diffs |
| Jitter | Randomized intervals on warmup/heartbeat |
| Session reuse | Cascades reused for multi-turn (matches real webview) |
| Version detection | Auto-detects Chrome/Electron/app versions from installed binary |
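
As an illustration of the jitter row, the sketch below randomizes an interval around a base value. Everything specific here is assumed: the symmetric spread, the 30-second base used in the usage note, and the small xorshift generator (chosen only to keep the example free of external crates). The project's actual intervals and random source are not described in this document.

```rust
use std::time::Duration;

/// Tiny xorshift64 generator; adequate for timing jitter, not for cryptography.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from zero, so substitute a fixed non-zero constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform value in [0, 1), built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Returns `base` scaled by a random factor in [1 - spread, 1 + spread).
/// `spread` is clamped to [0, 1]; NaN is treated as 0.
pub fn jittered(base: Duration, spread: f64, rng: &mut XorShift64) -> Duration {
    let spread = if spread.is_nan() { 0.0 } else { spread.clamp(0.0, 1.0) };
    let factor = 1.0 + spread * (2.0 * rng.next_f64() - 1.0);
    base.mul_f64(factor)
}
```

For example, `jittered(Duration::from_secs(30), 0.2, &mut rng)` yields a delay between roughly 24 and 36 seconds, so successive heartbeats do not land on a fixed period.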