
Nikketryhard 3d87c04d20 docs: overhaul docs, add architecture and traces, update README/GEMINI

- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation,
  endpoint-gap-analysis, request-comparison

2026-02-18 01:31:18 -06:00

11 KiB


Architecture

System Overview

flowchart LR
    Client["Client\n(curl, SDK, etc.)"]
    Proxy["Proxy\n:8741"]
    LS["Standalone LS\n:random"]
    MITM["MITM Proxy\n:8742"]
    Google["Google API\ndaily-cloudcode-pa\n.googleapis.com"]

    Client -- "OpenAI / Gemini\nHTTP API" --> Proxy
    Proxy -- "gRPC\n(protobuf)" --> LS
    LS -- "HTTPS :443\n(iptables redirect)" --> MITM
    MITM -- "TLS\n(BoringSSL)" --> Google

    style Proxy fill:#7c3aed,color:#fff
    style MITM fill:#dc2626,color:#fff
    style LS fill:#2563eb,color:#fff
    style Google fill:#059669,color:#fff

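The proxy accepts OpenAI- and Gemini-format HTTP requests and translates them into gRPC calls to a standalone Language Server (LS) binary. The LS's outbound HTTPS traffic is redirected by iptables to a MITM proxy, which sits between the LS and Google's API to intercept traffic, inject tools and parameters into each request, and capture the real token usage from the response stream.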

Request Lifecycle

sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant S as MitmStore
    participant LS as Standalone LS
    participant M as MITM Proxy
    participant G as Google API

    C->>P: POST /v1/chat/completions
    P->>P: Parse request, resolve model
    P->>S: register_request(cascade_id, tools, params, image)
    P->>LS: SendMessage(cascade_id, ".")
    Note over P: Waits on MITM channel

    LS->>M: HTTPS POST streamGenerateContent
    M->>S: take_request(cascade_id)
    M->>M: modify_request(inject tools, params, user text)
    M->>G: Forward modified request
    G-->>M: SSE stream (text deltas + usage)
    M->>S: dispatch TextDelta, Usage events
    M-->>LS: Forward (original) response

    S-->>P: MitmEvent::TextDelta
    S-->>P: MitmEvent::Usage
    S-->>P: MitmEvent::ResponseComplete
    P-->>C: OpenAI-format JSON/SSE response
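
The register/take handoff in the diagram above can be sketched with the standard library alone. This is a minimal illustration, not the project's `MitmStore`: the `Store` type, the `PendingRequest` fields, and the use of `std::sync::mpsc` are assumptions made for the example (the real store presumably uses async channels and carries more data, such as params and images).

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Events the MITM side reports back to the waiting API handler.
#[derive(Debug, Clone, PartialEq)]
pub enum MitmEvent {
    TextDelta(String),
    Usage { input_tokens: u64, output_tokens: u64 },
    ResponseComplete,
}

/// What the API handler wants injected into the LS request (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
pub struct PendingRequest {
    pub user_text: String,
    pub tools: Vec<String>,
}

/// Simplified stand-in for MitmStore: pending requests keyed by cascade ID.
#[derive(Default)]
pub struct Store {
    pending: Mutex<HashMap<String, (PendingRequest, Sender<MitmEvent>)>>,
}

impl Store {
    /// API handler side: park the real request, then wait on the returned receiver.
    pub fn register_request(&self, cascade_id: &str, req: PendingRequest) -> Receiver<MitmEvent> {
        let (tx, rx) = channel();
        self.pending
            .lock()
            .unwrap()
            .insert(cascade_id.to_string(), (req, tx));
        rx
    }

    /// MITM side: claim the parked request once the LS call for this cascade arrives.
    /// The entry is removed, so a second take for the same ID returns None.
    pub fn take_request(&self, cascade_id: &str) -> Option<(PendingRequest, Sender<MitmEvent>)> {
        self.pending.lock().unwrap().remove(cascade_id)
    }
}
```

In this sketch the API handler calls `register_request`, sends the placeholder message to the LS, and then reads events from the receiver until it sees `MitmEvent::ResponseComplete`. The MITM side calls `take_request`, rewrites the outgoing request from the `PendingRequest`, and pushes events through the `Sender` as the SSE stream arrives.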

Module Map

graph TD
    subgraph "API Layer"
        mod_api["api/mod.rs\n(router)"]
        completions["completions.rs"]
        responses["responses.rs"]
        gemini["gemini.rs"]
        search["search.rs"]
        models["models.rs"]
        types["types.rs"]
        util["util.rs"]
        polling["polling.rs"]
    end

    subgraph "MITM Layer"
        proxy_mitm["proxy.rs\n(TLS termination)"]
        h2["h2_handler.rs\n(HTTP/2 framing)"]
        intercept["intercept.rs\n(SSE parsing)"]
        modify["modify.rs\n(request injection)"]
        store["store.rs\n(MitmStore)"]
        proto_mitm["proto.rs\n(protobuf codec)"]
        ca["ca.rs\n(cert generation)"]
    end

    subgraph "Core"
        main["main.rs"]
        backend["backend.rs\n(gRPC client)"]
        session["session.rs"]
        trace["trace.rs"]
        warmup["warmup.rs"]
        constants["constants.rs"]
        quota["quota.rs"]
    end

    subgraph "Standalone LS"
        spawn["spawn.rs"]
        discovery["discovery.rs"]
        stub["stub.rs\n(extension server)"]
    end

    subgraph "Protobuf"
        proto_mod["proto/mod.rs"]
        wire["proto/wire.rs"]
    end

    main --> mod_api
    main --> backend
    main --> store
    main --> spawn
    mod_api --> completions & responses & gemini & search
    completions & responses & gemini --> store
    completions & responses & gemini --> backend
    store --> intercept
    proxy_mitm --> h2 --> intercept & modify
    modify --> store
    intercept --> store
    spawn --> discovery & stub
    backend --> proto_mod --> wire

    style store fill:#dc2626,color:#fff
    style mod_api fill:#7c3aed,color:#fff
    style proxy_mitm fill:#ea580c,color:#fff
    style main fill:#0d9488,color:#fff

Endpoints

Method	Path	Handler	Description
`POST`	`/v1/responses`	`responses::handle_responses`	OpenAI Responses API (streaming + sync)
`POST`	`/v1/chat/completions`	`completions::handle_completions`	OpenAI Chat Completions API
`POST`	`/v1/gemini`	`gemini::handle_gemini`	Custom Gemini endpoint
`POST`	`/v1beta/{*path}`	`gemini::handle_gemini_v1beta`	Official Gemini v1beta routes
`GET/POST`	`/v1/search`	`search::handle_search_*`	Web search via Google grounding
`GET`	`/v1/models`	`handle_models`	List available models
`GET`	`/v1/sessions`	`handle_list_sessions`	List active sessions
`DELETE`	`/v1/sessions/{id}`	`handle_delete_session`	Delete a session
`POST`	`/v1/token`	`handle_set_token`	Set OAuth token at runtime
`GET`	`/v1/usage`	`handle_usage`	MITM-intercepted token usage
`GET`	`/v1/quota`	`handle_quota`	LS quota (credits, rate limits)
`GET`	`/health`	`handle_health`	Health check
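
The route patterns in the table can be read as a small matching function. The sketch below is a std-only illustration of how the literal routes, the single-segment `{id}` parameter, and the `{*path}` wildcard differ; it is an assumption about matching behavior, not the project's router, and it simply returns the handler names from the table as strings.

```rust
/// Map a method and path to the handler name listed in the endpoint table.
/// Illustrative only: the real router lives in api/mod.rs.
pub fn route(method: &str, path: &str) -> Option<&'static str> {
    match (method, path) {
        ("POST", "/v1/responses") => Some("responses::handle_responses"),
        ("POST", "/v1/chat/completions") => Some("completions::handle_completions"),
        ("POST", "/v1/gemini") => Some("gemini::handle_gemini"),
        ("GET" | "POST", "/v1/search") => Some("search::handle_search_*"),
        ("GET", "/v1/models") => Some("handle_models"),
        ("GET", "/v1/sessions") => Some("handle_list_sessions"),
        ("POST", "/v1/token") => Some("handle_set_token"),
        ("GET", "/v1/usage") => Some("handle_usage"),
        ("GET", "/v1/quota") => Some("handle_quota"),
        ("GET", "/health") => Some("handle_health"),
        // `{id}` matches exactly one non-empty path segment.
        ("DELETE", p) if session_id(p).is_some() => Some("handle_delete_session"),
        // `{*path}` matches everything after the prefix, slashes included.
        ("POST", p) if wildcard_tail(p, "/v1beta/").is_some() => {
            Some("gemini::handle_gemini_v1beta")
        }
        _ => None,
    }
}

/// Extract the session ID from `/v1/sessions/{id}`.
fn session_id(path: &str) -> Option<&str> {
    let id = path.strip_prefix("/v1/sessions/")?;
    if id.is_empty() || id.contains('/') {
        None
    } else {
        Some(id)
    }
}

/// Extract the non-empty remainder after a wildcard prefix.
fn wildcard_tail<'a>(path: &'a str, prefix: &str) -> Option<&'a str> {
    let rest = path.strip_prefix(prefix)?;
    if rest.is_empty() {
        None
    } else {
        Some(rest)
    }
}
```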

MITM Event Flow

stateDiagram-v2
    [*] --> Registered: register_request()

    Registered --> GateWait: LS sends HTTPS request
    GateWait --> Matched: MITM matches cascade_id

    Matched --> Modifying: modify_request()
    Modifying --> Streaming: Forward to Google

    Streaming --> Streaming: TextDelta / ThinkingDelta
    Streaming --> UsageCaptured: Usage event
    UsageCaptured --> Complete: ResponseComplete
    Streaming --> Error: UpstreamError
    Streaming --> FnCall: FunctionCall

    Complete --> [*]
    Error --> [*]
    FnCall --> Registered: Tool round (re-register)
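
The state diagram above can be written down as a transition function. This is a sketch under assumptions: the state names come from the diagram, but the `Trigger` enum and the function itself are hypothetical and do not claim to mirror how `store.rs` tracks request state.

```rust
/// States from the MITM event flow diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Registered,
    GateWait,
    Matched,
    Modifying,
    Streaming,
    UsageCaptured,
    Complete,
    Error,
    FnCall,
}

/// Edge labels from the diagram (names are assumptions for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    LsRequest,
    CascadeMatched,
    ModifyRequest,
    Forwarded,
    TextDelta,
    ThinkingDelta,
    Usage,
    ResponseComplete,
    UpstreamError,
    FunctionCall,
    ToolRound,
}

/// Return the next state, or None if the diagram has no such edge.
pub fn next(state: State, trigger: Trigger) -> Option<State> {
    match (state, trigger) {
        (State::Registered, Trigger::LsRequest) => Some(State::GateWait),
        (State::GateWait, Trigger::CascadeMatched) => Some(State::Matched),
        (State::Matched, Trigger::ModifyRequest) => Some(State::Modifying),
        (State::Modifying, Trigger::Forwarded) => Some(State::Streaming),
        // Deltas keep the request in the streaming state.
        (State::Streaming, Trigger::TextDelta) => Some(State::Streaming),
        (State::Streaming, Trigger::ThinkingDelta) => Some(State::Streaming),
        (State::Streaming, Trigger::Usage) => Some(State::UsageCaptured),
        (State::Streaming, Trigger::UpstreamError) => Some(State::Error),
        (State::Streaming, Trigger::FunctionCall) => Some(State::FnCall),
        (State::UsageCaptured, Trigger::ResponseComplete) => Some(State::Complete),
        // A tool round re-registers the request and the cycle starts again.
        (State::FnCall, Trigger::ToolRound) => Some(State::Registered),
        // Complete and Error are terminal; everything else is not in the diagram.
        _ => None,
    }
}
```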

CLI Flags

Flag	Default	Description
`--port <PORT>`	`8741`	Proxy listen port
`--headless`	`true`	Fully standalone — no running Antigravity app needed
`--classic`	`false`	Attach to running Antigravity (alias for `--no-headless`)
`--no-mitm`	`false`	Disable MITM proxy entirely
`--mitm-port <PORT>`	`8742`	MITM proxy port
`--no-standalone`	`false`	Attach to real LS instead of spawning standalone
`--no-trace`	`false`	Disable per-call debug traces
`-v, --verbose`	`false`	Info-level logging
`-d, --debug`	`false`	Debug-level logging
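
To make the defaults and flag interactions concrete, here is a hand-rolled sketch of how the table could map onto a config struct. The `Config` struct, its field names, and the parser are assumptions for illustration; the actual binary may well use an argument-parsing crate and a different layout.

```rust
/// Hypothetical config mirroring the flag table; defaults match the Default column.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub port: u16,
    pub headless: bool,
    pub mitm: bool,
    pub mitm_port: u16,
    pub standalone: bool,
    pub trace: bool,
    pub verbose: bool,
    pub debug: bool,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            port: 8741,
            headless: true,
            mitm: true,
            mitm_port: 8742,
            standalone: true,
            trace: true,
            verbose: false,
            debug: false,
        }
    }
}

/// Parse arguments (without the program name) into a Config.
pub fn parse_args(args: &[&str]) -> Result<Config, String> {
    let mut cfg = Config::default();
    let mut it = args.iter();
    while let Some(&arg) = it.next() {
        match arg {
            "--port" => cfg.port = port_value(arg, it.next())?,
            "--mitm-port" => cfg.mitm_port = port_value(arg, it.next())?,
            "--headless" => cfg.headless = true,
            // The table lists --classic as an alias for --no-headless.
            "--classic" | "--no-headless" => cfg.headless = false,
            "--no-mitm" => cfg.mitm = false,
            "--no-standalone" => cfg.standalone = false,
            "--no-trace" => cfg.trace = false,
            "-v" | "--verbose" => cfg.verbose = true,
            "-d" | "--debug" => cfg.debug = true,
            other => return Err(format!("unknown flag: {other}")),
        }
    }
    Ok(cfg)
}

/// Read and validate the value that follows a port flag.
fn port_value(flag: &str, value: Option<&&str>) -> Result<u16, String> {
    let raw = value.ok_or_else(|| format!("{flag} requires a value"))?;
    raw.parse::<u16>()
        .map_err(|e| format!("invalid value for {flag}: {e}"))
}
```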

Source Files

File	Lines	Purpose
`api/responses.rs`	1796	Responses API handler (sync, streaming, multi-turn, tools)
`mitm/modify.rs`	1418	Request modification (tool/image/param injection)
`api/completions.rs`	1241	Chat Completions handler (OpenAI compat)
`mitm/proxy.rs`	1165	TLS-terminating MITM proxy
`api/gemini.rs`	1055	Gemini API handler (native format)
`snapshot.rs`	695	State snapshots
`backend.rs`	660	gRPC client to LS
`mitm/store.rs`	651	Central state store + event channels
`mitm/proto.rs`	649	Protobuf encode/decode for MITM
`mitm/intercept.rs`	640	SSE response parser + usage extraction
`main.rs`	527	CLI, startup, wiring
`trace.rs`	509	Per-call debug trace system
`mitm/h2_handler.rs`	477	HTTP/2 frame handling
`standalone/spawn.rs`	464	LS process spawning
`api/search.rs`	443	Web search endpoint
`api/types.rs`	416	Shared request/response types
`standalone/discovery.rs`	340	LS config discovery from `/proc`
`proto/mod.rs`	340	Hand-rolled protobuf encoder
`api/polling.rs`	340	Cascade polling fallback
`standalone/stub.rs`	~300	Extension server gRPC stub
`proto/wire.rs`	~200	Wire-format protobuf helpers
`constants.rs`	~100	Model IDs, service names

Models

Proxy Name	LS Placeholder	Description
`opus-4.6`	`MODEL_PLACEHOLDER_M26`	Claude Opus 4.6 (Thinking) — default
`opus-4.5`	`MODEL_PLACEHOLDER_M12`	Claude Opus 4.5 (Thinking)
`gemini-3-pro-high`	`MODEL_PLACEHOLDER_M8`	Gemini 3 Pro (High quality)
`gemini-3-pro`	`MODEL_PLACEHOLDER_M7`	Gemini 3 Pro (Low quality)
`gemini-3-flash`	`MODEL_PLACEHOLDER_M18`	Gemini 3 Flash

Stealth Features

Feature	Implementation
TLS fingerprint	BoringSSL via `wreq` — Chrome JA3/JA4 + H2 fingerprint
Protobuf	Hand-rolled encoder producing a byte-exact match to the real webview's output
Warmup	Mimics real webview startup RPC sequence
Heartbeat	Periodic keep-alive matching real webview lifecycle
Reactive streaming	`StreamCascadeReactiveUpdates` for real-time state diffs
Jitter	Randomized intervals on warmup/heartbeat
Session reuse	Cascades reused for multi-turn (matches real webview)
Version detection	Auto-detects Chrome/Electron/app versions from installed binary
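
For readers unfamiliar with what a hand-rolled protobuf encoder involves, the sketch below shows the two wire-format primitives most messages need: base-128 varints and length-delimited fields. It follows the public protobuf wire format; the function names are hypothetical and are not taken from `proto/mod.rs` or `proto/wire.rs`.

```rust
/// Append `value` as a protobuf base-128 varint (7 bits per byte, low group first).
pub fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        // Set the continuation bit on every byte except the last.
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Append a varint field: tag = (field_number << 3) | wire type 0.
pub fn encode_varint_field(field: u32, value: u64, out: &mut Vec<u8>) {
    encode_varint(u64::from(field) << 3, out);
    encode_varint(value, out);
}

/// Append a length-delimited field (strings, bytes, nested messages): wire type 2.
pub fn encode_bytes_field(field: u32, data: &[u8], out: &mut Vec<u8>) {
    encode_varint((u64::from(field) << 3) | 2, out);
    encode_varint(data.len() as u64, out);
    out.extend_from_slice(data);
}
```

Writing fields by hand like this gives full control over field order and over which default-valued fields are emitted, which is what a byte-exact match against another client's output requires.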
