zerogravity/GEMINI.md
Nikketryhard 1a5075dd20 refactor: remove /v1/gemini endpoint, replaced by /v1beta routes
- Delete handle_gemini handler (identical to handle_gemini_v1beta)
- Remove /v1/gemini route from router
- Update root handler service name to zerogravity
- Clean all doc references
2026-02-18 01:59:22 -06:00

Antigravity Rust Proxy

OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview.

Core Philosophy

Stealth Goal

The primary objective is to make Google's upstream API unable to distinguish proxy requests from real Antigravity webview traffic. Unlike cliProxyApi or other known proxy patterns, this proxy:

  • Produces byte-exact protobuf matching real webview format
  • Uses BoringSSL TLS fingerprinting with Chrome JA3/JA4 + H2 signatures (version auto-detected)
  • Performs warmup and heartbeat RPCs mimicking real webview lifecycle
  • Applies jitter to all intervals to avoid automation fingerprints
  • Reuses cascades for multi-turn just like the real webview
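The jitter bullet above can be illustrated with a small shell sketch. The helper name `jittered_interval`, the default 20% spread, and the uniform distribution are all assumptions made for illustration; this document does not state the jitter parameters the proxy actually uses.

```shell
#!/usr/bin/env bash
# Illustrative only: prints base_ms perturbed by up to +/- pct percent.
# jittered_interval is a hypothetical helper, not part of zg or the proxy.
jittered_interval() {
  local base_ms=$1 pct=${2:-20}
  local spread=$(( base_ms * pct / 100 ))   # maximum deviation in ms
  if (( spread == 0 )); then
    echo "$base_ms"
    return
  fi
  # RANDOM is 0..32767, so the offset stays within [-spread, +spread]
  # (coverage of that range is only complete for spread <= 16383).
  local offset=$(( RANDOM % (2 * spread + 1) - spread ))
  echo $(( base_ms + offset ))
}
```

For example, `jittered_interval 1000 20` prints a value between 800 and 1200, so a heartbeat loop that sleeps for that long never fires on a fixed period.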

Stability Approach

The Language Server (LS) binary is a closed-source Go program with many unknown mechanics. To avoid instability:

  1. Send dummy prompts to the LS — the proxy sends "." as the cascade message. The LS receives minimal input to reduce the chance of panics or unexpected behavior.
  2. All real content goes through MITM — the MITM proxy intercepts the LS's outgoing request and replaces the dummy prompt with the real user input, injects tools, images, generation params, etc.
  3. Never send results back to the LS — tool results, function responses, and follow-ups are injected into the next MITM-intercepted request. The LS is used as a dumb relay that triggers API calls — nothing more.
  4. Pass as little as possible — the LS only needs a cascade ID and a dummy message. Everything else is handled by the MITM layer.

This "LS as dumb relay" pattern keeps the LS interactions minimal and predictable, avoiding the many unknown edge cases in its internal state machine.

Agent Quick Reference

zg — Daemon Manager

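zg commands return immediately instead of running in the foreground, so they are safe for agent use via fast-bash MCP. The exception appears to be zg logs [N], which is listed below as following the log.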

# Rebuild and restart after code changes
zg restart

# Quick test
zg test "say hi in 3 words"

# Check status
zg status

# Check health
zg health

| Command | Description |
| --- | --- |
| `zg start` | Start the proxy daemon |
| `zg stop` | Stop the proxy daemon |
| `zg restart` | Rebuild + restart |
| `zg rebuild` | Build release binary only |
| `zg status` | Service status + quota + usage |
| `zg logs [N]` | Tail last N lines + follow |
| `zg logs-all` | Full log dump (no follow) |
| `zg test [msg]` | Quick test request (gemini-3-flash) |
| `zg health` | Health check |

Testing After Changes

# 1. Rebuild + restart
zg restart

# 2. Test an endpoint
curl -s http://localhost:8741/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-3-flash", "messages": [{"role": "user", "content": "Say hi"}]}' | jq .

# 3. Inspect latest trace
TRACE_DIR=~/.config/zerogravity/traces/$(date +%Y-%m-%d)
cat "$TRACE_DIR/$(ls -t "$TRACE_DIR" | head -1)/summary.md"
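
Because zg returns immediately, the request in step 2 may race the daemon's startup. A minimal sketch of a retry helper is shown here. `wait_until` is a hypothetical name, and using `zg health` as the probe assumes it exits non-zero while the proxy is down, which this document does not confirm.

```shell
#!/usr/bin/env bash
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Hypothetical helper, not part of zg.
wait_until() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (not run here): zg restart && wait_until 10 1 zg health
```

The helper takes the probe as an argument, so `curl -sf http://localhost:8741/health` can be used instead if `zg health` does not behave as assumed.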

Dev vs Production Models

  • gemini-3-flash — use for all development and testing
  • opus-4.6 — production only, has quota limits

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | `/v1/responses` | Responses API (sync + streaming) |
| POST | `/v1/chat/completions` | Chat Completions API (OpenAI compat) |
| POST | `/v1beta/models/{model}:{action}` | Official Gemini v1beta routes |
| GET/POST | `/v1/search` | Web Search via Google grounding |
| GET | `/v1/models` | List available models |
| GET | `/v1/sessions` | List active sessions |
| DELETE | `/v1/sessions/{id}` | Delete a session |
| POST | `/v1/token` | Set OAuth token at runtime |
| GET | `/v1/usage` | MITM-intercepted token usage |
| GET | `/v1/quota` | LS quota and rate limits |
| GET | `/health` | Health check |
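
As a sketch of how the `{model}:{action}` template expands, the helpers below build and call a v1beta route. The function names and the `BASE_URL` variable are hypothetical. `generateContent` and the request body follow the public Gemini API; that the proxy accepts this particular action is an assumption based on the "Official Gemini v1beta routes" description.

```shell
#!/usr/bin/env bash
# Default proxy address, taken from the examples in this document.
BASE_URL=${BASE_URL:-http://localhost:8741}

# Builds the URL for a Gemini v1beta route from the path template.
gemini_url() {
  local model=$1 action=$2
  printf '%s/v1beta/models/%s:%s\n' "$BASE_URL" "$model" "$action"
}

# Sends a single-turn Gemini-format request.
# The prompt is interpolated as-is, so it must not contain characters
# that need JSON escaping (quotes, backslashes, newlines).
gemini_generate() {
  local model=$1 prompt=$2
  curl -s "$(gemini_url "$model" generateContent)" \
    -H "Content-Type: application/json" \
    -d "{\"contents\": [{\"parts\": [{\"text\": \"${prompt}\"}]}]}"
}
```

With the proxy running, `gemini_generate gemini-3-flash "Say hi" | jq .` would exercise the route in the same way as the Chat Completions test above.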

Authentication

The proxy needs an OAuth token, which can be supplied in any of three ways:

  1. Env var: ZEROGRAVITY_TOKEN=ya29.xxx
  2. Token file: ~/.config/zerogravity-token
  3. Runtime: curl -X POST http://localhost:8741/v1/token -H "Content-Type: application/json" -d '{"token":"ya29.xxx"}'
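
A client-side script that needs the same token could look it up as sketched below. `resolve_token` is a hypothetical helper, and checking the env var before the token file is an assumption; this document does not say which source the proxy prefers when both are present.

```shell
#!/usr/bin/env bash
# resolve_token [TOKEN_FILE]
# Prints the first token found: env var, then token file. Returns 1 if none.
# The env-before-file order is an assumption, not documented proxy behavior.
resolve_token() {
  local token_file=${1:-$HOME/.config/zerogravity-token}
  if [ -n "${ZEROGRAVITY_TOKEN:-}" ]; then
    printf '%s\n' "$ZEROGRAVITY_TOKEN"
  elif [ -s "$token_file" ]; then
    # Remove all whitespace, including the trailing newline.
    tr -d '[:space:]' < "$token_file"
    printf '\n'
  else
    return 1
  fi
}
```

The result can then be pushed to a running proxy through the `/v1/token` endpoint, for example by substituting `$(resolve_token)` for `ya29.xxx` in the runtime command above.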

CLI Flags

| Flag | Default | Description |
| --- | --- | --- |
| `--headless` | true | Fully standalone; no running Antigravity app needed |
| `--classic` | false | Attach to running Antigravity (alias for `--no-headless`) |
| `--port <PORT>` | 8741 | Proxy listen port |
| `--no-mitm` | false | Disable MITM proxy |
| `--mitm-port <PORT>` | 8742 | MITM proxy port |
| `--no-standalone` | false | Attach to real LS instead of spawning standalone |
| `--no-trace` | false | Disable per-call debug traces |

Documentation

See docs/ for detailed documentation:

  • architecture.md — system overview, module map, request lifecycle (mermaid diagrams)
  • mitm.md — MITM proxy internals, event flow, request modification
  • traces.md — per-call debug trace system
  • extension-server-analysis.md — extension server protocol reverse engineering
  • ls-binary-analysis.md — LS binary reverse engineering, model catalog, gRPC services