# Antigravity Rust Proxy

OpenAI-compatible proxy that intercepts and relays requests to Google's Antigravity language server, impersonating the real Electron webview.

## Quick Start

```bash
# First-time setup (creates user + iptables for MITM)
sudo ./scripts/mitm-redirect.sh install

# Start as daemon (builds if needed)
proxyctl start

# Or run directly
RUST_LOG=info ./target/release/antigravity-proxy
```

Default port: **8741**

## CLI Tools

### `proxyctl` — Daemon Manager

Symlinked to `~/.local/bin/proxyctl` for global access. Manages the proxy as a systemd user service.

| Command               | Description                             |
| --------------------- | --------------------------------------- |
| `proxyctl start`      | Start the proxy daemon                  |
| `proxyctl stop`       | Stop the proxy daemon                   |
| `proxyctl restart`    | Rebuild + restart                       |
| `proxyctl rebuild`    | Build release binary only               |
| `proxyctl status`     | Service status + quota + usage          |
| `proxyctl logs [N]`   | Tail last N lines (default 30) + follow |
| `proxyctl logs-all`   | Full log dump (no follow)               |
| `proxyctl test [msg]` | Quick test request (gemini-3-flash)     |
| `proxyctl health`     | Health check                            |

### `mitm-redirect.sh` — MITM Setup

One-time setup script for UID-scoped iptables traffic redirection.

```bash
sudo ./scripts/mitm-redirect.sh install    # create user + iptables rule
sudo ./scripts/mitm-redirect.sh uninstall  # remove user + iptables rule
sudo ./scripts/mitm-redirect.sh status     # check current state
```

## Endpoints

| Method     | Path                   | Description                                                 |
| ---------- | ---------------------- | ----------------------------------------------------------- |
| `POST`     | `/v1/responses`        | **Responses API** (primary) — supports `stream: true/false` |
| `POST`     | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim)                   |
| `POST`     | `/v1/gemini`           | Gemini-style API (used in the inline grounding examples)    |
| `GET/POST` | `/v1/search`           | **Web Search** — Google Search grounding, returns results   |
| `GET`      | `/v1/models`           | List available models                                       |
| `GET`      | `/v1/sessions`         | List active sessions                                        |
| `DELETE`   | `/v1/sessions/:id`     | Delete a session                                            |
| `POST`     | `/v1/token`            | Set OAuth token at runtime                                  |
| `GET`      | `/v1/usage`            | MITM-intercepted token usage stats                          |
| `GET`      | `/v1/quota`            | LS quota — credits, per-model rate limits, reset timers     |
| `GET`      | `/health`              | Health check                                                |

## Available Models

| Name                | Label                                    |
| ------------------- | ---------------------------------------- |
| `opus-4.6`          | Claude Opus 4.6 (Thinking) — **default** |
| `opus-4.5`          | Claude Opus 4.5 (Thinking)               |
| `gemini-3-pro-high` | Gemini 3 Pro (High)                      |
| `gemini-3-pro`      | Gemini 3 Pro (Low)                       |
| `gemini-3-flash`    | Gemini 3 Flash                           |

## Development & Testing

- **Dev/testing model**: `gemini-3-flash`. Use this for all development, debugging, and iterative testing.
- **Production model**: `opus-4.6`. Use sparingly, for real-world validation only, because it is subject to a quota limit.
- See `docs/ls-binary-analysis.md` for full reverse-engineered model catalog and proto enum mappings

## Example: Responses API

### Sync

```bash
curl -s http://localhost:8741/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "input": "Say hello in exactly 3 words",
    "stream": false,
    "timeout": 60
  }' | jq .
```

### Streaming

```bash
curl -N http://localhost:8741/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "input": "Say hello in exactly 3 words",
    "stream": true,
    "timeout": 60
  }'
```
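
For clients other than `curl`, the stream can be read line by line. The sketch below uses only the Python standard library. It assumes the proxy emits standard server-sent events with a JSON payload on each `data:` line and possibly an OpenAI-style `[DONE]` terminator; this README does not specify the event format, and the helper names are hypothetical.

```python
import json
import urllib.request


def iter_sse_data(lines):
    """Yield decoded JSON payloads from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip event names, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":  # terminator is an assumption
            continue
        yield json.loads(payload)


def stream_response(prompt, model="gemini-3-flash", base_url="http://localhost:8741"):
    """POST a streaming request and yield each event as a dict."""
    body = json.dumps(
        {"model": model, "input": prompt, "stream": True, "timeout": 60}
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The HTTP response object iterates over the body line by line.
        yield from iter_sse_data(resp)
```

With the proxy running, `for event in stream_response("Say hello in exactly 3 words"): print(event)` prints each event as it arrives.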

### Multi-turn (session reuse)

```bash
curl -s http://localhost:8741/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "input": "What is 2+2?",
    "conversation": "my-session-1",
    "stream": false
  }' | jq .

# Follow-up in same cascade:
curl -s http://localhost:8741/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "input": "Now multiply that by 10",
    "conversation": "my-session-1",
    "stream": false
  }' | jq .
```
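
The same two-turn exchange can be scripted. This is a minimal Python sketch of the requests above, using only the standard library. It reuses the `conversation` field exactly as shown in the `curl` calls; the helper names are hypothetical and the replies are returned as raw dicts because the response schema is not described here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8741"  # default port from this README


def build_turn(prompt, conversation, model="gemini-3-flash"):
    """Build a non-streaming Responses API payload tied to a conversation ID."""
    return {
        "model": model,
        "input": prompt,
        "conversation": conversation,
        "stream": False,
    }


def send(payload, base_url=BASE_URL):
    """POST one payload to /v1/responses and return the decoded JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def ask_in_session(prompts, conversation):
    """Send prompts in order under one conversation ID; return the raw replies."""
    return [send(build_turn(p, conversation)) for p in prompts]
```

For example, `ask_in_session(["What is 2+2?", "Now multiply that by 10"], "my-session-1")` sends both turns in the same session.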

## Web Search

The proxy supports Google Search grounding in two ways:

### 1. Dedicated Search Endpoint (`/v1/search`)

Returns structured search results with citations:

```bash
# Quick GET search
curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | jq .

# Full POST search with options
curl -s http://localhost:8741/v1/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "latest Rust programming news",
    "model": "gemini-3-flash",
    "timeout": 30
  }' | jq .
```

Response includes `summary`, `results[]` (title + URL), `citations[]`, and raw `grounding_metadata`.
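
As an illustration of working with that response, the Python sketch below renders the summary and result list as plain text. The top-level field names come from the description above; the per-result keys `title` and `url` are assumptions, and the sample data is made up rather than captured from the proxy.

```python
def format_search(response):
    """Render a /v1/search response dict as plain text lines."""
    lines = [response.get("summary", "")]
    for i, item in enumerate(response.get("results", []), start=1):
        # "title" and "url" are assumed key names for each result entry.
        lines.append(f"{i}. {item.get('title', '(untitled)')} <{item.get('url', '')}>")
    return "\n".join(lines)


sample = {  # hypothetical data, not real proxy output
    "summary": "Example summary.",
    "results": [{"title": "Example", "url": "https://example.com"}],
    "citations": [],
    "grounding_metadata": {},
}
print(format_search(sample))
```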

### 2. Inline Grounding (on any endpoint)

Enable Google Search grounding on regular requests:

```bash
# Chat Completions API
curl -s http://localhost:8741/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "messages": [{"role": "user", "content": "What happened in tech today?"}],
    "web_search": true
  }' | jq .

# Responses API (OpenAI-style tool)
curl -s http://localhost:8741/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "input": "What happened in tech today?",
    "tools": [{"type": "web_search_preview"}],
    "stream": false
  }' | jq .

# Gemini API
curl -s http://localhost:8741/v1/gemini \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash",
    "message": "What happened in tech today?",
    "google_search": true
  }' | jq .
```
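
Each API style uses a different switch for grounding. The Python sketch below collects the three switches from the examples above in one place, which may help when a client targets more than one endpoint. The function and the API-style labels are hypothetical.

```python
# Grounding switch per API style, taken from the curl examples above.
GROUNDING_FIELDS = {
    "chat_completions": ("web_search", True),
    "responses": ("tools", [{"type": "web_search_preview"}]),
    "gemini": ("google_search", True),
}


def with_grounding(api, payload):
    """Return a copy of payload with the grounding switch for the given API style."""
    key, value = GROUNDING_FIELDS[api]
    grounded = dict(payload)
    if key == "tools":
        # Append rather than overwrite, so tools already in the payload are kept.
        grounded["tools"] = list(payload.get("tools", [])) + value
    else:
        grounded[key] = value
    return grounded
```

For instance, `with_grounding("responses", {"model": "gemini-3-flash", "input": "What happened in tech today?", "stream": False})` yields the body of the second request above.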

## Authentication

The proxy needs an OAuth token. Three ways to provide it:

1. **Environment variable**: `export ANTIGRAVITY_OAUTH_TOKEN=ya29.xxx`
2. **Token file**: `echo 'ya29.xxx' > ~/.config/antigravity-proxy-token`
3. **Runtime API**: `curl -X POST http://localhost:8741/v1/token -H "Content-Type: application/json" -d '{"token":"ya29.xxx"}'`
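
A wrapper script can check the first two sources before starting the proxy. The Python sketch below does that. The lookup order (environment variable first, then the token file) is an assumption, since this README does not state which source takes precedence, and `resolve_token` is a hypothetical helper.

```python
import os
from pathlib import Path

TOKEN_FILE = Path.home() / ".config" / "antigravity-proxy-token"


def resolve_token(env=None, token_file=TOKEN_FILE):
    """Return the first non-empty token found, or None.

    The order (environment variable, then file) is an assumption.
    """
    env = os.environ if env is None else env
    token = env.get("ANTIGRAVITY_OAUTH_TOKEN", "").strip()
    if token:
        return token
    try:
        token = Path(token_file).read_text().strip()
    except OSError:
        return None  # file missing or unreadable
    return token or None
```

If `resolve_token()` returns `None`, the token can still be supplied later through `/v1/token`.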

## Version Detection

Version strings (Antigravity, Chrome, Electron, Client) are **auto-detected** at startup from the installed Antigravity app:

- `product.json` → app version + client/IDE version
- Binary → Chrome + Electron versions via `strings`

Falls back to hardcoded values if the app isn't installed. No manual updates needed when Antigravity updates.
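
The detection logic can be pictured roughly as follows. This Python sketch is not the proxy's implementation: the `version` key in `product.json`, the `Chrome/x.y.z.w` and `Electron/x.y.z` byte patterns, and the placeholder fallback values are all assumptions made for illustration.

```python
import json
import re

# User-agent-style markers assumed to appear in the binary.
CHROME_RE = re.compile(rb"Chrome/(\d+\.\d+\.\d+\.\d+)")
ELECTRON_RE = re.compile(rb"Electron/(\d+\.\d+\.\d+)")

# Placeholder fallbacks, not the proxy's real hardcoded values.
FALLBACK = {"app": "0.0.0", "chrome": "0.0.0.0", "electron": "0.0.0"}


def detect_versions(product_json_text, binary_bytes):
    """Merge detected versions over the fallbacks; undetected entries keep defaults."""
    versions = dict(FALLBACK)
    try:
        product = json.loads(product_json_text)
    except (TypeError, ValueError):
        product = None  # missing or malformed product.json
    if isinstance(product, dict):
        versions["app"] = product.get("version", versions["app"])  # key name assumed
    for name, pattern in (("chrome", CHROME_RE), ("electron", ELECTRON_RE)):
        match = pattern.search(binary_bytes)
        if match:
            versions[name] = match.group(1).decode("ascii")
    return versions
```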

## Standalone LS

By default, the proxy spawns its own Language Server instance for full isolation:

1. Discovers the main LS config (`extension_server_port`, `csrf_token`) from the running Antigravity app
2. Spawns a standalone LS binary on a random port
3. Builds init metadata protobuf (model config, `detect_and_use_proxy=ENABLED`)
4. If MITM is active, runs the LS as the `antigravity-ls` user so that the UID-scoped iptables rule can intercept its traffic
5. Kills the child on proxy shutdown

Disable with `--no-standalone` to attach to the real LS instead.

**Module:** `src/standalone.rs`
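
Two of the steps above, choosing a random port and killing the child on shutdown, follow a common pattern that can be sketched in Python. This is an illustration of the pattern only, not a port of `src/standalone.rs`; the LS command line is not documented here, so the child's `argv` is left to the caller.

```python
import socket
import subprocess
from contextlib import contextmanager


def pick_free_port():
    """Ask the OS for an unused TCP port by binding to port 0.

    The port is released before returning, so another process could
    claim it in the short window before the child binds it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


@contextmanager
def managed_child(argv):
    """Run a child process and make sure it is stopped when the block exits."""
    child = subprocess.Popen(argv)
    try:
        yield child
    finally:
        child.terminate()
        try:
            child.wait(timeout=5)
        except subprocess.TimeoutExpired:
            child.kill()  # escalate if the child ignores the first signal
            child.wait()
```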

## Stealth Features

- **TLS fingerprint**: BoringSSL with Chrome JA3/JA4 + H2 fingerprint via `wreq` (version auto-detected)
- **Protobuf**: Hand-rolled encoder producing byte-exact match to real webview traffic
- **Warmup**: Mimics real webview startup RPC calls
- **Heartbeat**: Periodic keep-alive matching real webview lifecycle
- **Reactive streaming**: `StreamCascadeReactiveUpdates` for real-time state diffs (polling fallback)
- **Jitter**: Randomized intervals to avoid automation fingerprint
- **Session reuse**: Cascades reused for multi-turn, matching real webview behavior
- **MITM proxy**: TLS-intercepting proxy for real token usage capture
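
As a rough picture of the jitter idea, the Python sketch below scales a base interval by a random factor so that periodic calls do not land on a fixed schedule. The 20% spread and the function name are assumptions; the actual intervals and distribution used by the proxy are not documented here.

```python
import random


def jittered(base_seconds, spread=0.2, rng=random):
    """Return base_seconds scaled by a random factor in [1 - spread, 1 + spread]."""
    if not 0 <= spread < 1:
        raise ValueError("spread must be in [0, 1)")
    return base_seconds * rng.uniform(1 - spread, 1 + spread)
```

A heartbeat loop would then sleep for `jittered(interval)` seconds between calls instead of a constant `interval`.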

## MITM Proxy

Built-in MITM proxy intercepts LS ↔ Google API traffic to capture **real** token usage (input, output, thinking tokens). Enabled by default with the standalone LS. Disable with `--no-mitm`.

### How It Works

```
Client → Proxy (8741) → Standalone LS (as antigravity-ls user)
                           ↓ (port 443 traffic)
                        iptables REDIRECT (UID-scoped)
                           ↓
                        MITM Proxy (8742)
                           ↓ (TLS decrypt + parse SSE)
                        Google API (daily-cloudcode-pa.googleapis.com)
```

### Setup

```bash
# One-time setup (creates user + iptables rule)
sudo ./scripts/mitm-redirect.sh install

# Run proxy (standalone LS + MITM are both on by default)
RUST_LOG=info ./target/release/antigravity-proxy

# Check intercepted usage
curl -s http://localhost:8741/v1/usage | jq .

# Cleanup
sudo ./scripts/mitm-redirect.sh uninstall
```

### Details

- **UID-scoped iptables**: Only the standalone LS's traffic is intercepted; traffic from other users and processes is left untouched
- **Combined CA bundle**: System CAs + MITM CA → `/tmp/antigravity-mitm-combined-ca.pem`
- **Google SSE parsing**: Extracts `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`
- **Init metadata**: Protobuf field 34 `detect_and_use_proxy` set to ENABLED (1)
- See `docs/mitm-interception-status.md` for full technical details
- See `docs/ls-binary-analysis.md` for proto enum mappings and model IDs
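
The SSE parsing step can be sketched as follows. This Python example searches each `data:` payload for the three counters named above. Where those counters sit inside the JSON is an assumption (the sketch searches recursively rather than relying on a fixed path), as is the choice to keep the last values seen in the stream.

```python
import json

USAGE_KEYS = ("promptTokenCount", "candidatesTokenCount", "thoughtsTokenCount")


def find_usage(node):
    """Depth-first search for the first dict that carries any usage counter."""
    if isinstance(node, dict):
        if any(k in node for k in USAGE_KEYS):
            return {k: int(node.get(k, 0)) for k in USAGE_KEYS}
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_usage(child)
        if found:
            return found
    return None


def usage_from_sse(text):
    """Return the last usage counters seen in an SSE body, or None."""
    latest = None
    for line in text.splitlines():
        if not line.startswith("data:"):
            continue
        try:
            event = json.loads(line[len("data:"):].strip())
        except ValueError:
            continue  # ignore non-JSON payloads
        latest = find_usage(event) or latest
    return latest
```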

### CLI Flags

- `--no-mitm`: Disable MITM proxy entirely
- `--no-standalone`: Attach to existing LS instead of spawning standalone
- `--mitm-port <PORT>`: Override MITM proxy port (default: auto-assign)
- `--port <PORT>`: Override proxy listen port (default: 8741)
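
When launching the proxy from a script, the flags can be assembled programmatically. The Python sketch below builds an argument list from the four flags listed above and nothing else; `proxy_argv` is a hypothetical helper, and the binary path is the one used earlier in this README.

```python
def proxy_argv(binary="./target/release/antigravity-proxy", port=None,
               mitm_port=None, mitm=True, standalone=True):
    """Build an argument list using only the flags listed above."""
    argv = [binary]
    if port is not None:
        argv += ["--port", str(port)]
    if not mitm:
        argv.append("--no-mitm")
    elif mitm_port is not None:
        # A MITM port only makes sense while MITM is enabled.
        argv += ["--mitm-port", str(mitm_port)]
    if not standalone:
        argv.append("--no-standalone")
    return argv
```

The result can be passed to `subprocess.run`, for example `subprocess.run(proxy_argv(port=9000, mitm=False))`.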