feat: MITM interception for standalone LS with UID isolation

- Spawn standalone LS as dedicated 'antigravity-ls' user via sudo
- UID-scoped iptables redirect (port 443 → MITM proxy) via mitm-redirect.sh
- Combined CA bundle (system CAs + MITM CA) for Go TLS trust
- Transparent TLS interception with chunked response detection
- Google SSE parser for streamGenerateContent usage extraction
- Timeouts on all MITM operations (TLS handshake, upstream, idle)
- Forward response data immediately (no buffering)
- Per-model token usage capture (input, output, thinking)
- Update docs and known issues to reflect resolved TLS blocker
2026-02-14 17:50:12 -06:00
parent 6842bfeaa5
commit d4de436856
10 changed files with 1156 additions and 478 deletions
--- a/KNOWN_ISSUES.md
+++ b/KNOWN_ISSUES.md
@@ -1,92 +1,62 @@
 # Known Issues & Future Work

-All fixable issues from the original report have been resolved. The remaining
-items require either architectural changes, new features, or deep investigation
-of the Go language server binary.
+All critical blockers have been resolved. MITM interception is fully working
+in standalone mode with UID-scoped iptables redirection.

 ---

-## 🔴 Blockers (Require Deep Investigation)
+## ✅ Resolved

-### 1. LS Go LLM Client Ignores System TLS Trust Store
+### ~~LS Go LLM Client Ignores System TLS Trust Store~~

-**File:** `docs/mitm-interception-status.md`
+**Status: SOLVED (2026-02-14)**

-The LS binary's Go HTTP client for LLM API calls uses a custom `tls.Config` that
-does **not** trust system CAs or honor `SSL_CERT_FILE`. Our MITM proxy can route
-traffic but not decrypt it.
+Previously the #1 blocker. The standalone LS (`--standalone` flag) now routes
+all LLM API traffic through the MITM proxy with full decryption.

-**Investigation status:** All practical approaches have been tried and failed:
+**Solution:**

- iptables REDIRECT → redirect loop + broke all HTTPS traffic
- DNS redirect → same TLS trust failure
- LD_PRELOAD → Go doesn't use libc for syscalls
- SSLKEYLOGFILE → Go doesn't support it
+1. **UID-scoped iptables** — `scripts/mitm-redirect.sh` creates an `antigravity-ls`
+   system user. iptables redirects only that UID's port-443 traffic → MITM port.
+2. **Combined CA bundle** — The Go client honors `SSL_CERT_FILE` when set on
+   the standalone process. A combined bundle (system CAs + MITM CA) is written
+   to `/tmp/antigravity-mitm-combined-ca.pem`.
+3. **`sudo -u` spawning** — The proxy spawns the LS as the `antigravity-ls` user,
+   so only the standalone LS traffic is intercepted. No impact on other software.
+4. **Google SSE parsing** — MITM parses `streamGenerateContent?alt=sse` responses
+   and extracts `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`.

-**Remaining options (untried):**
-
- Binary patching Go TLS verification (fragile, breaks on updates)
- Full standalone LS control (see issue #2)
- eBPF/ptrace syscall interception (complex)
- Network namespace isolation (complex setup)
-
-**Confidence: <30%** — all easy paths exhausted. Requires reverse engineering the Go binary's TLS setup.
-
-**See:** `docs/mitm-interception-status.md` for full analysis
+**Verified:** `/v1/usage` returns per-model token usage from intercepted traffic.

 ---

-### 2. Standalone LS Cascades Silently Fail
+## 🟡 Medium (Architecture / Future Work)

-**File:** `docs/standalone-ls-todo.md`
-
-Standalone LS (outside Antigravity) accepts `StartCascade` RPCs without error
-but cascade never progresses. No output.
-
-**Suspected blockers:**
-
- Missing auth context (OAuth token propagation)
- Different Unleash feature flags between main and standalone instances
- Missing initialization steps (`LoadCodeAssist`, `OnboardUser`)
- Missing extension server callbacks (`WriteCascadeEdit`, `ExecuteCommand`)
-
-**Confidence: <30%** — too many unknowns. Needs systematic debugging with the standalone LS.
-
-**See:** `docs/standalone-ls-todo.md` for investigation plan
-
---
-
-## Medium (Architecture / Future Work)
-
-### 3. Cascade Correlation Is Heuristic
+### 1. Cascade Correlation Is Heuristic

 **File:** `src/mitm/intercept.rs` — `extract_cascade_hint()`

-The MITM proxy matches intercepted API traffic to cascade IDs heuristically:
+The MITM proxy matches intercepted API traffic to cascade IDs heuristically.
+Currently all intercepted usage is stored under `_latest` because the Google
+SSE request body is empty (`content_length=0` — the LS sends the request body
+via chunked encoding that isn't captured in the hint extractor).

- HTTP/1.1 path: scans JSON body for `metadata.user_id` or `workspace_id`
- gRPC/H2 path: recursively searches proto fields for UUID strings
-
-If neither method finds a match, usage is stored under `_latest` but never
-consumed (since `take_usage()` requires exact cascade ID match).
-
-**Confidence: <50%** — can't test without working MITM interception (blocked by issue #1). The heuristic is reasonable but unverified against real traffic.
+**Impact:** Usage shows up in `/v1/usage` aggregate stats but isn't correlated
+to specific cascades. Not blocking — aggregate usage is the primary use case.

 ---

-### 4. Request Modification Not Implemented
+### 2. Request Modification Not Implemented

 **File:** `src/mitm/proxy.rs` — `modify_requests: bool`

-The `MitmConfig.modify_requests` flag is plumbed through the entire call chain
-but hardcoded to `false`. No modification logic exists. This is intentional
-scaffolding for future use.
-
-**Status:** Not a bug — reserved for potential request mutation features.
+The `MitmConfig.modify_requests` flag is plumbed through but hardcoded to `false`.
+Reserved for future request mutation features (e.g., injecting custom system
+prompts, modifying model selection).

 ---

-### 5. Polling-Based Cascade Updates vs Streaming RPC
+### 3. Polling-Based Cascade Updates vs Streaming RPC

 **File:** `src/api/polling.rs`

@@ -94,23 +64,26 @@ We poll `GetCascadeTrajectorySteps` on a timer. The LS has a
 `StreamCascadeReactiveUpdates` streaming gRPC method that pushes updates
 in real-time. Polling works but adds latency.

-**Status:** Functional but suboptimal. Switching to streaming requires
-implementing a gRPC streaming client with reconnection handling. Not blocking.
+**Status:** Functional but suboptimal.

 ---

 ## 🟢 Low

-### 6. No Integration Tests for MITM Module
+### 4. MITM Integration Tests

-Unit tests cover protobuf decoding and intercept parsing (17 tests pass), but
-no integration tests for:
+Unit tests cover protobuf decoding and intercept parsing (18 tests pass).
+Integration tests for the full MITM pipeline (TLS interception, response
+parsing, usage recording) would be valuable now that interception works.

- TLS interception end-to-end with the generated CA
- Full HTTP/1.1 request/response cycle through the proxy
- gRPC (HTTP/2) request/response cycle through `h2_handler`
- Store recording and retrieval under concurrency
+### 5. MITM for Main Antigravity Session

-**Status:** The MITM can't intercept real traffic anyway (blocked by issue #1),
-so integration tests would be somewhat hypothetical. Worth adding when the TLS
-blocker is resolved.
+The current MITM only works for the standalone LS (`--standalone` mode).
+Intercepting the main Antigravity session's LS is harder because:
+
+- The main LS is managed by the Antigravity app, not by us
+- UID-scoped iptables can't target it without affecting all user traffic
+- The `mitm-wrapper.sh` approach sets env vars but the LLM client ignores
+  `HTTPS_PROXY` unless `detect_and_use_proxy` is ENABLED via init metadata
+
+**Workaround:** Use `--standalone` mode for all proxy traffic.
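The combined CA bundle from the resolved TLS item above can be sketched in shell. This is a minimal illustration only: per the docs, the proxy builds the bundle in `src/standalone.rs`, and the function name `build_combined_ca` and its three path arguments are hypothetical.

```bash
# Hypothetical helper: concatenate a system CA bundle and the MITM CA
# into one PEM file suitable for SSL_CERT_FILE.
build_combined_ca() {
    local system_bundle="$1"
    local mitm_ca="$2"
    local out="$3"

    # Refuse to write a partial bundle if either input is unreadable.
    [ -r "$system_bundle" ] && [ -r "$mitm_ca" ] || return 1

    # Write to a temp file first so a reader never sees a half-written bundle.
    local tmp="$out.tmp.$$"
    {
        cat "$system_bundle"
        # Guard against a missing trailing newline, which would fuse the
        # END/BEGIN CERTIFICATE markers of adjacent certificates.
        echo
        cat "$mitm_ca"
    } > "$tmp" && mv "$tmp" "$out"
}
```

A call might look like `build_combined_ca /etc/ssl/certs/ca-certificates.crt ./mitm-ca.pem /tmp/antigravity-mitm-combined-ca.pem`, where the first two paths are assumptions (a Debian-style system bundle and a placeholder MITM CA location). The system CAs are included because, as the status doc notes, `SSL_CERT_FILE` replaces the system trust store rather than appending to it.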
--- a/docs/mitm-interception-status.md
+++ b/docs/mitm-interception-status.md
@@ -1,275 +1,144 @@
-# MITM Traffic Interception — Research & Status
+# MITM Traffic Interception — Status

-## Goal
+## Status: ✅ FULLY WORKING (Standalone Mode)

-Capture the LS's LLM API traffic (requests + responses, including system prompts
-and token usage) by routing it through our MITM proxy.
+MITM interception is operational for the standalone LS. The proxy intercepts,
+decrypts, and parses all LLM API traffic with per-model token usage capture.

-## Key Discovery: How the LS Makes LLM API Calls
+## How It Works

-The LS does **NOT** use gRPC for LLM API calls. It uses:
-
- **Protocol**: Standard HTTPS POST with Server-Sent Events (SSE)
- **Endpoint**: `https://daily-cloudcode-pa.googleapis.com/v1internal:streamGenerateContent?alt=sse`
- **HTTP client**: `ApiServerClientV2` — a Go HTTP client that creates its own `tls.Config`
-  and transport, **ignoring `HTTPS_PROXY` by default**
-
-The Go HTTP client for LLM API calls is separate from the one used for Unleash
-(feature flags) and other auxiliary traffic. The Unleash client respects proxy
-settings, but the LLM client does not.
-
-## What We Tried
-
-### 1. Extension Patch — `detectAndUseProxy` ✅ Partial
-
-**Status**: Applied and still active. Harmless.
-
-The extension sends a protobuf field `detect_and_use_proxy` (field 34) to the LS
-during initialization. By default, it's set to `UNSPECIFIED` (0), meaning the LS
-ignores proxy env vars.
-
-**Patch applied:**
-
-```bash
-sudo sed -i -E 's/detectAndUseProxy=[^,;)]+/detectAndUseProxy=1/g' \
-  /usr/share/antigravity/resources/app/extensions/antigravity/dist/extension.js
+```
+Client → Proxy (8741) → Standalone LS (as antigravity-ls user)
+                           ↓ (port 443 traffic)
+                        iptables REDIRECT (UID-scoped)
+                           ↓
+                        MITM Proxy (8742)
+                           ↓ (TLS decrypt + parse SSE)
+                        Google API (daily-cloudcode-pa.googleapis.com)
 ```

-**Enum values:**
+### Components

- 0 = `DETECT_AND_USE_PROXY_UNSPECIFIED` (default, ignore proxy)
- 1 = `DETECT_AND_USE_PROXY_ENABLED`
- 2 = `DETECT_AND_USE_PROXY_DISABLED`
+1. **UID-scoped iptables** (`scripts/mitm-redirect.sh`)
+   - Creates `antigravity-ls` system user
+   - iptables rule: redirect UID's port-443 → MITM port
+   - Only the standalone LS is affected — no side effects on other software

-**Result:** Unleash/aux traffic now routes through `HTTPS_PROXY`. But the LLM API
-client (`ApiServerClientV2`) has its own transport that ignores this flag. LLM
-calls still go direct to Google.
+2. **Combined CA bundle** (`src/standalone.rs`)
+   - Go's `SSL_CERT_FILE` replaces (not appends) the system trust store
+   - Proxy concatenates system CAs + MITM CA → `/tmp/antigravity-mitm-combined-ca.pem`
+   - Set as `SSL_CERT_FILE` on the standalone LS process

-**Verify:** `grep -o 'detectAndUseProxy=[^;]*' /usr/share/antigravity/resources/app/extensions/antigravity/dist/extension.js`
-→ should show `detectAndUseProxy=1`
+3. **`sudo -u` spawning** (`src/standalone.rs`)
+   - If `antigravity-ls` user exists, LS is spawned via `sudo -n -u antigravity-ls`
+   - Env vars passed via `/usr/bin/env KEY=VALUE` args
+   - Falls back to current user if the dedicated user doesn't exist

-**Re-apply after updates:** Yes, must re-apply after every Antigravity update.
+4. **Google SSE parser** (`src/mitm/intercept.rs`)
+   - Parses `data: {"response": {"usageMetadata": {...}}}` events
+   - Extracts `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`
+   - Handles both Google and Anthropic SSE formats

-### 2. MITM Wrapper (`mitm-wrapper.sh`) ✅ Works for Env Vars
+5. **Transparent proxy** (`src/mitm/proxy.rs`)
+   - Detects iptables-redirected connections via TLS ClientHello SNI
+   - Terminates TLS with dynamically generated certs
+   - Forwards HTTP/1.1 requests upstream with real DNS resolution (`dig @8.8.8.8`)
+   - Chunked response detection for fast completion

-Sets `HTTPS_PROXY` and `SSL_CERT_FILE` on the LS process by wrapping the binary.
+## What We Tried (Historical)

-**How it works:**
+### 1. Extension Patch — `detectAndUseProxy` ✅ Still Active

-1. Renames real binary to `.real`
-2. Places a shell script wrapper at the original path
-3. Wrapper sets env vars and execs the real binary with all original args
+Patches `detectAndUseProxy=1` in the extension JS. Makes auxiliary traffic
+(Unleash, etc.) honor `HTTPS_PROXY`. Harmless, still applied.

-**Result:** The wrapper correctly sets env vars on the LS process (verified via
-`/proc/<PID>/environ`). Combined with the extension patch, Unleash traffic routes
-through the proxy. But LLM API calls still bypass — the `ApiServerClientV2` Go
-HTTP client doesn't honor `HTTPS_PROXY`.
+### 2. MITM Wrapper (`mitm-wrapper.sh`) ⚠️ Superseded

-### 3. iptables REDIRECT — ALL Port 443 ❌ Failed
+Sets env vars on the main LS process. Works for routing but the main LS's
+LLM client ignores `HTTPS_PROXY`. Superseded by standalone mode.

-Redirected all outbound port 443 traffic from the user's UID to the MITM proxy.
+### 3. iptables REDIRECT (All Traffic) ❌ Abandoned

-**Problems encountered:**
+Redirected ALL port-443 traffic. Caused redirect loops, broke other HTTPS
+traffic. Replaced by UID-scoped redirect.

-1. **Redirect loop** — proxy's own upstream connections got caught by iptables,
-   creating infinite loops → fd exhaustion → crash
-2. **Fixed loop with GID bypass** — running proxy with `sg mitm-bypass` and
-   excluding GID in iptables. This fixed the loop.
-3. **Broke Antigravity** — ALL HTTPS traffic (telegram, discord, microsoft
-   telemetry, extension marketplace, etc.) went through the proxy. The TLS
-   passthrough worked technically but was too disruptive.
-4. **TLS trust failure** — even with the MITM wrapper setting `SSL_CERT_FILE`,
-   the LS's Go LLM client likely uses a custom `tls.Config` with its own root
-   CAs, not the system pool. So it rejected our MITM CA cert.
+### 4. DNS Redirect (`/etc/hosts`) ❌ Abandoned

-**Abandoned.** Too disruptive, and the fundamental TLS trust issue remained.
+Same TLS trust issue as #3. Unnecessary with UID-scoped iptables.

-### 4. DNS Redirect (`/etc/hosts`) ❌ Failed
+### 5. Standalone LS + UID-scoped iptables ✅ WORKING

-Redirected only `daily-cloudcode-pa.googleapis.com` to 127.0.0.1 via `/etc/hosts`,
-then used a targeted iptables rule for `127.0.0.1:443` only.
+Current solution. Full MITM interception with zero side effects.

-**Problems:**
+## The Original Blocker (SOLVED)

- Same TLS trust issue — the Go LLM client rejected our MITM CA
- Needed `dig @8.8.8.8` bypass for upstream resolution (implemented but untested)
+> The LS's Go LLM HTTP client uses a custom `tls.Config` that does NOT read
+> from `SSL_CERT_FILE` or the system CA store.

-**Abandoned.** TLS trust is the blocker.
+**This turned out to be wrong.** The Go client DOES honor `SSL_CERT_FILE` when:

-## The Core Blocker
+- The env var is set BEFORE the process starts (not injected later)
+- The value contains a combined bundle (system CAs + custom CA)
+- `SSL_CERT_DIR` is set to `/dev/null` to force exclusive use of `SSL_CERT_FILE`

-**The LS's Go LLM HTTP client (`ApiServerClientV2`) uses a custom `tls.Config`
-that does NOT read from `SSL_CERT_FILE` or the system CA store.** It likely has
-its own hardcoded/embedded root CAs.
-
-This means:
-
- Even if we redirect traffic to our MITM proxy ✅
- Even if the MITM generates valid certs for the domain ✅
- The LS rejects the cert because it doesn't trust our CA ❌
-
-## Potential Solutions (Untried)
-
-### A. Binary Patching
-
-Patch the Go binary to accept our CA or disable cert verification.
-
- Find the `tls.Config` setup in the binary
- Modify `InsecureSkipVerify` to `true`, or inject our CA cert DER bytes
- Very fragile, breaks on updates
-
-### B. LD_PRELOAD Hook
-
-Hook `connect()` syscall to redirect traffic.
-
- **Won't work** for Go — Go uses raw syscalls, not libc wrappers
-
-### C. Network Namespace
-
-Run the LS in an isolated network namespace with custom routing.
-
- Complex setup, but clean isolation
- The standalone LS work would feed into this
-
-### D. Standalone LS with Full Control
-
-Get standalone LS cascades working (see `docs/standalone-ls-todo.md`), then
-have full control over the process environment, including:
-
- Custom CA trust
- Custom DNS resolution
- Custom proxy settings
- Network namespace isolation
-  **This is probably the best long-term approach.**
-
-### E. Kernel-level TLS Interception (eBPF)
-
-Use eBPF to intercept TLS records pre-encryption.
-
- Very powerful, can read plaintext before encryption
- Complex, requires kernel support (>= 4.18)
- Tools: `bpftrace`, custom eBPF programs, `ecapture`
-
-### F. `SSLKEYLOGFILE` + Passive Capture
-
- Go doesn't support `SSLKEYLOGFILE` (confirmed by testing)
- Could patch the binary to enable it, but same fragility as option A
-
-### G. ptrace-based Interception
-
-Use `ptrace` to intercept `write()`/`sendmsg()` syscalls on TLS sockets.
-
- Can read plaintext data being written to TLS connections
- Tools: `strace -e trace=write -p <PID>` (but output is messy)
- Better: custom ptrace tool that filters for TLS socket FDs
+The standalone LS gives us full control over the process environment at spawn
+time, which is why this approach works while the wrapper approach didn't.

 ## Technical Details

-### Model IDs
-
-| Placeholder               | Model               |
-| ------------------------- | ------------------- |
-| `MODEL_PLACEHOLDER_M18`   | Gemini 3 Flash      |
-| `MODEL_PLACEHOLDER_M8`    | Gemini 3 Pro (High) |
-| `MODEL_PLACEHOLDER_M7`    | Gemini 3 Pro (Low)  |
-| `MODEL_PLACEHOLDER_M26`   | Claude Opus 4.6     |
-| `MODEL_PLACEHOLDER_M12`   | Claude Opus 4.5     |
-| `MODEL_CLAUDE_4_5_SONNET` | Claude Sonnet 4.5   |
-
-### LS Binary Location
-
-`/usr/share/antigravity/resources/app/extensions/antigravity/bin/language_server_linux_x64`
-
 ### API Endpoint

-`https://daily-cloudcode-pa.googleapis.com/v1internal:streamGenerateContent?alt=sse`
+`POST https://daily-cloudcode-pa.googleapis.com/v1internal:streamGenerateContent?alt=sse`

-### Protobuf Field 34 — `detect_and_use_proxy`
+### SSE Response Format

- Part of the init metadata sent from extension to LS via stdin
- Enum: `DetectAndUseProxy` (0=UNSPECIFIED, 1=ENABLED, 2=DISABLED)
- Controls whether auxiliary HTTP clients honor `HTTPS_PROXY`
- Does NOT control the LLM API client
-
-### Unleash Feature Flags
-
- Authorization: `*:production.e44558998bfc35ea9584dc65858e4485fdaa5d7ef46903e0c67712d1`
- Endpoint: `antigravity-unleash.goog`
- App name: `codeium-language-server`
-
-### Files Modified (Current State)
-
- `extension.js` — `detectAndUseProxy=1` (harmless, keeps working)
- Everything else — clean/reverted
-
-## Code Changes Made (in the proxy)
-
-1. **Transparent proxy mode** (`src/mitm/proxy.rs`) — supports iptables REDIRECT
-   by detecting raw TLS ClientHello and extracting SNI
-2. **CryptoProvider init** (`src/main.rs`) — prevents rustls panic under load
-3. **PID detection fix** (`src/backend.rs`) — prefers `.real` binary PID over
-   wrapper shell script PID
-4. **SS fallback** (`src/backend.rs`) — discovers LS port via `ss` when log file
-   doesn't have it
-5. **DNS bypass** (`src/mitm/proxy.rs`) — `connect_upstream` resolves via
-   `dig @8.8.8.8` to bypass `/etc/hosts`
-6. **Scripts** — `dns-redirect.sh`, `iptables-redirect.sh` (both functional)
-
-## Cleanup Checklist
-
-If things are broken, undo in this order:
-
-```bash
-# 1. Remove iptables rules
-sudo ./scripts/iptables-redirect.sh uninstall
-sudo ./scripts/dns-redirect.sh uninstall
-
-# 2. Remove /etc/hosts entries (verify manually)
-sudo grep -v "antigravity-mitm" /etc/hosts | sudo tee /etc/hosts.tmp && sudo mv /etc/hosts.tmp /etc/hosts
-
-# 3. Uninstall wrapper
-sudo ./scripts/mitm-wrapper.sh uninstall
-
-# 4. Remove system CA
-sudo rm -f /usr/local/share/ca-certificates/antigravity-mitm.crt
-sudo update-ca-certificates
-
-# 5. Restart Antigravity
+```
+data: {"response": {"candidates": [{"content": {"role": "model", "parts": [{"text": "..."}]}}],
+       "usageMetadata": {"promptTokenCount": 1514, "candidatesTokenCount": 25,
+                         "totalTokenCount": 1539, "thoughtsTokenCount": 52},
+       "modelVersion": "gemini-3-flash"}, "traceId": "...", "metadata": {}}
 ```

-## Next Steps
+Last event includes `"finishReason": "STOP"` in the candidate.

-→ See `docs/standalone-ls-todo.md` for standalone LS isolation work
-→ See `docs/ls-binary-analysis.md` for comprehensive binary reverse engineering
+### Other Intercepted Endpoints

-## New Findings (from binary analysis)
+| Endpoint                    | Type     | Content          |
+| --------------------------- | -------- | ---------------- |
+| `fetchUserInfo`             | Protobuf | User info        |
+| `loadCodeAssist`            | Protobuf | Extension config |
+| `fetchAvailableModels`      | Protobuf | Model catalog    |
+| `webDocsOptions`            | Protobuf | Docs config      |
+| `streamGenerateContent`     | SSE/JSON | LLM responses ✅ |
+| `recordCodeAssistMetrics`   | Protobuf | Telemetry        |
+| `recordTrajectoryAnalytics` | Protobuf | Telemetry        |

-### Alternative to Polling: `StreamCascadeReactiveUpdates`
+### Model IDs

-The LS has a streaming gRPC method `StreamCascadeReactiveUpdates` that pushes
-cascade state changes in real-time via server-sent streaming. The extension uses
-this instead of polling `GetCascadeTrajectorySteps`.
+| Placeholder             | Model               |
+| ----------------------- | ------------------- |
+| `MODEL_PLACEHOLDER_M18` | Gemini 3 Flash      |
+| `MODEL_PLACEHOLDER_M8`  | Gemini 3 Pro (High) |
+| `MODEL_PLACEHOLDER_M7`  | Gemini 3 Pro (Low)  |
+| `MODEL_PLACEHOLDER_M26` | Claude Opus 4.6     |
+| `MODEL_PLACEHOLDER_M12` | Claude Opus 4.5     |

-**Potential improvement:** If we switch from polling to this streaming RPC, we'd
-get lower latency and less backend traffic. However, our current polling approach
-works reliably and doesn't require maintaining a long-lived gRPC stream.
+### Setup

-### Quota Endpoint: `retrieveUserQuota`
+```bash
+# One-time setup (creates user + iptables rule)
+sudo ./scripts/mitm-redirect.sh install

-The `PredictionService/RetrieveUserQuota` gRPC method and
-`v1internal:retrieveUserQuota` REST endpoint provide quota/credit information.
-This could be used to implement a proper `/v1/quota` endpoint instead of
-scraping the LS's own quota tracking.
+# Run proxy with standalone LS + MITM
+RUST_LOG=info ./target/release/antigravity-proxy --standalone

-### `internalAtomicAgenticChat`
+# Check usage
+curl -s http://localhost:8741/v1/usage | jq .
+```

-A REST endpoint that appears to handle the entire agentic chat loop atomically
-(tool calls + responses in one request?). Investigation needed to understand
-the request/response format.
+### Cleanup

-### Credits System
-
-The `google/internal/cloud/code/v1internal/credits` proto package exists with
-`Credits_CreditType` enum. The `CASCADE_ENFORCE_QUOTA` config key controls
-whether quotas are enforced. Related methods: `AddExtraFlexCreditsInternal`,
-`GetTeamCreditEntries`, `GetPlanStatus`.
+```bash
+# Remove iptables rule + user
+sudo ./scripts/mitm-redirect.sh uninstall
+```
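The usage extraction described under "SSE Response Format" can be approximated with standard text tools. This is a rough sketch, not the parser in `src/mitm/intercept.rs`: the function name `extract_usage_field` is hypothetical, and it assumes each event arrives as a single `data: ` line with numeric token counts.

```bash
# Hypothetical helper: read SSE text on stdin and print the last value seen
# for one numeric usageMetadata field (e.g. promptTokenCount).
extract_usage_field() {
    local field="$1"
    # 1. keep only SSE data events
    # 2. isolate `"<field>": <digits>`
    # 3. strip everything but the digits
    # 4. keep the final occurrence, since later events supersede earlier ones
    grep '^data: ' \
        | grep -o "\"$field\": *[0-9][0-9]*" \
        | grep -o '[0-9][0-9]*$' \
        | tail -n 1
}
```

Given a captured stream saved to a file, `extract_usage_field thoughtsTokenCount < capture.sse` would print the thinking-token count from the last event that carries one (the file name is a placeholder). A pattern match like this is fragile against nested or reordered JSON, so a real JSON parser is the safer choice for anything beyond ad hoc inspection.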
--- a/docs/standalone-ls-todo.md
+++ b/docs/standalone-ls-todo.md
@@ -1,87 +1,78 @@
 # Standalone LS for Proxy Isolation

-## Goal
+## Status: ✅ FULLY IMPLEMENTED (incl. MITM interception)

-Route ALL proxy traffic through a standalone LS instance instead of the real one,
-so development/testing/proxying never interferes with active coding sessions.
+The standalone LS is fully working via `--standalone` flag on the proxy.
+All cascade types (sync, streaming, multi-turn) and all endpoints work.
+MITM interception captures real token usage from Google's API.

-## Current State
+## Implementation

-The proxy currently talks to the **real** LS spawned by Antigravity.
-This is risky — a bad cascade or proxy bug can disrupt the coding conversation.
+**Module:** `src/standalone.rs`

-## What Works
+The proxy spawns a standalone LS as a child process:

- Standalone LS starts fine with custom init metadata via stdin protobuf
- Connects to the main extension server (`-extension_server_port`)
- Accepts cascade requests (returns cascadeId)
- With `detect_and_use_proxy = ENABLED` (field 34 = 2), honors `HTTPS_PROXY`
+1. Discovers `extension_server_port` and `csrf_token` from the real LS (via `/proc/PID/cmdline`)
+2. Picks a random free port
+3. Builds init metadata protobuf (via `proto::build_init_metadata()`)
+4. Spawns the LS binary with correct args and env vars
+5. Feeds init metadata via stdin, then closes it
+6. Waits for TCP readiness (retry loop)
+7. Kills the child on proxy shutdown (via `Drop`)

-## What Doesn't Work
+### UID Isolation (MITM mode)

- **Cascades silently fail** — the LS accepts the request but never processes it
-  - No planner invocation, no upstream API call, no logs beyond startup
-  - 9 lines of log after 40s wait
-  - Main LS logs show zero trace of the standalone's cascade
+When `scripts/mitm-redirect.sh install` has been run:

-## Suspected Blockers (investigate in order)
+1. The `antigravity-ls` system user exists
+2. iptables redirects that UID's port-443 traffic → MITM proxy port
+3. The proxy spawns the LS via `sudo -n -u antigravity-ls`
+4. Environment variables (`SSL_CERT_FILE`, etc.) are passed via `/usr/bin/env`
+5. A combined CA bundle (system CAs + MITM CA) is written to `/tmp/antigravity-mitm-combined-ca.pem`
+6. Only the standalone LS traffic is intercepted — no impact on other software

-1. **Auth context** — standalone may not receive OAuth token from extension server
-   - Check: does the standalone's `GetUserStatus` return valid auth?
-   - The extension server might only share tokens with the "primary" LS
-
-2. **Unleash feature flags** — cascade processing gated by flags the standalone doesn't fetch
-   - The standalone connects to Unleash via the proxy, but might not get the right flags
-   - Check: compare Unleash responses between main and standalone
-
-3. **Workspace indexing** — planner might require indexed workspace state
-   - The standalone's workspace (`/tmp/antigravity-standalone`) is empty
-   - Try: point it at a real workspace with actual files
-
-4. **Extension server coupling** — cascade might need the extension to "drive" it
-   - The chat panel in the extension might send additional RPCs to progress the cascade
-   - Check: trace what RPCs the extension sends after StartCascade
-
-## Investigation Plan
+## Usage

 ```bash
-# 1. Launch with max verbosity
-echo "$METADATA" | base64 -d | \
-    timeout 90 "$LS_BIN" \
-    -v 5 \
-    -server_port 42200 \
-    ... > /tmp/standalone-verbose.log 2>&1 &
+# Setup (one-time, requires sudo)
+sudo ./scripts/mitm-redirect.sh install

-# 2. Check auth status
-curl -sk "https://127.0.0.1:42200/exa.language_server_pb.LanguageServerService/GetUserStatus" \
-    -H "Content-Type: application/json" \
-    -H "x-codeium-csrf-token: $CSRF" \
-    -d '{}'
+# Run
+RUST_LOG=info ./target/release/antigravity-proxy --standalone

-# 3. Send cascade and watch logs in real-time
-tail -f /tmp/standalone-verbose.log &
-curl -sk "https://127.0.0.1:42200/.../StartCascade" ...
-
-# 4. Compare Unleash flags
-# Main LS unleash vs standalone unleash
+# Check intercepted usage
+curl -s http://localhost:8741/v1/usage | jq .
 ```

+## Root Cause of Original Failure
+
+The bash script (`scripts/standalone-ls.sh`) used `MODEL_PLACEHOLDER_M3` — an
+unassigned/invalid model enum. The LS silently drops cascades with unknown models.
+
+**Fix:** Use correct model enums (M18=Flash, M26=Opus4.6) via the proxy's
+byte-exact protobuf encoder.
+
 ## Key Technical Details

- Init metadata protobuf field 34 = `detect_and_use_proxy` (enum: 0=UNSPECIFIED, 1=ENABLED, 2=DISABLED)
+- Init metadata protobuf field 34 = `detect_and_use_proxy` (1=ENABLED)
 - Model IDs: M18=Flash, M8=Pro-High, M7=Pro-Low, M26=Opus4.6, M12=Opus4.5
 - LS binary: `/usr/share/antigravity/resources/app/extensions/antigravity/bin/language_server_linux_x64`
 - API endpoint: `daily-cloudcode-pa.googleapis.com/v1internal:streamGenerateContent?alt=sse`
+- SSE response format: `{"response": {"usageMetadata": {"promptTokenCount", "candidatesTokenCount", "thoughtsTokenCount"}, "modelVersion": "..."}}`

-## New Leads (from binary analysis)
+## Test Results (2026-02-14)

- **`GetUnleashData`** — LS method to fetch Unleash flags directly. Could compare
-  main vs standalone to check if flags differ.
- **`GetStaticExperimentStatus`** / `SetBaseExperiments` / `UpdateDevExperiments` —
-  experiment management. Standalone might be missing experiment overrides.
- **`FetchAdminControls`** — admin-level controls that might gate cascade execution.
- **`LoadCodeAssist`** — initialization step that might be required before cascades work.
- **`GetUserStatus` vs `GetUserMemories`** — check if standalone has auth context
-  by calling both.
-
-→ See `docs/ls-binary-analysis.md` for full RPC method catalog.
+| Endpoint                          | Result                    |
+| --------------------------------- | ------------------------- |
+| `GET /health`                     | ✅                        |
+| `GET /v1/models`                  | ✅ 5 models               |
+| `GET /v1/sessions`                | ✅                        |
+| `GET /v1/quota`                   | ✅ real plan/credits      |
+| `GET /v1/usage`                   | ✅ real MITM tokens       |
+| `POST /v1/responses` (sync)       | ✅                        |
+| `POST /v1/responses` (stream)     | ✅ SSE events             |
+| `POST /v1/responses` (multi-turn) | ✅ context preserved      |
+| `POST /v1/chat/completions`       | ✅                        |
+| MITM interception                 | ✅ TLS decrypt + parse    |
+| MITM usage capture                | ✅ per-model token counts |
+| UID isolation                     | ✅ no side effects        |
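The spawn step from "UID Isolation (MITM mode)" can be illustrated as a dry run that prints the argument vector instead of executing it. This is a sketch under assumptions: the function name `build_ls_command` is hypothetical, the real logic lives in `src/standalone.rs`, and keeping the `/usr/bin/env` wrapper in the fallback path is a guess rather than confirmed behavior.

```bash
# Hypothetical helper: print, one argument per line, the command used to
# launch the LS. Extra arguments after the first three are passed through.
build_ls_command() {
    local ls_user="$1"
    local ca_bundle="$2"
    local ls_binary="$3"
    shift 3

    # Use the dedicated user only if it exists; otherwise fall back to
    # running as the current user (no sudo prefix).
    if id "$ls_user" >/dev/null 2>&1; then
        printf '%s\n' sudo -n -u "$ls_user"
    fi

    # sudo normally resets the environment, so variables are handed over
    # as KEY=VALUE arguments to /usr/bin/env.
    printf '%s\n' /usr/bin/env \
        "SSL_CERT_FILE=$ca_bundle" \
        "SSL_CERT_DIR=/dev/null" \
        "$ls_binary" "$@"
}
```

Running `build_ls_command antigravity-ls /tmp/antigravity-mitm-combined-ca.pem "$LS_BINARY"` after `mitm-redirect.sh install` should list `sudo`, `-n`, `-u`, `antigravity-ls` first, followed by the `env` invocation. The `-n` flag makes sudo fail rather than prompt for a password, which suits a non-interactive child process.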
--- a/scripts/mitm-redirect.sh
+++ b/scripts/mitm-redirect.sh
@@ -0,0 +1,181 @@
+#!/usr/bin/env bash
+# mitm-redirect.sh — UID-scoped iptables redirect for MITM interception
+#
+# Creates a dedicated system user for the standalone LS and adds an iptables
+# rule that ONLY redirects traffic from that user's UID. No /etc/hosts
+# modification, no system-wide changes.
+#
+# Flow:
+#   1. Standalone LS runs as 'antigravity-ls' user (via sudo -u)
+#   2. iptables catches :443 traffic from that UID only → REDIRECT to MITM port
+#   3. MITM terminates TLS (Go client trusts our CA via SSL_CERT_FILE)
+#   4. MITM forwards upstream, captures usage
+#
+# What this does NOT affect:
+#   - Your real Antigravity session (different UID)
+#   - Any other software on your PC (different UID)
+#   - DNS resolution (no /etc/hosts changes)
+#
+# Usage:
+#   sudo ./scripts/mitm-redirect.sh install [mitm_port]
+#   sudo ./scripts/mitm-redirect.sh uninstall [mitm_port]
+#   sudo ./scripts/mitm-redirect.sh status
+
+set -euo pipefail
+
+MITM_PORT="${2:-8742}"
+LS_USER="antigravity-ls"
+DATA_DIR="/tmp/antigravity-standalone"
+LS_BINARY="/usr/share/antigravity/resources/app/extensions/antigravity/bin/language_server_linux_x64"
+SUDOERS_FILE="/etc/sudoers.d/antigravity-ls"
+
+install() {
+    if [[ $EUID -ne 0 ]]; then
+        echo "Error: must run as root (sudo)"
+        exit 1
+    fi
+
+    echo "[mitm-redirect] Installing UID-scoped iptables redirect → :$MITM_PORT"
+    echo
+
+    # ── 1. Create system user ───────────────────────────────────────────
+    if id "$LS_USER" &>/dev/null; then
+        echo "  ✓ user '$LS_USER' already exists (uid=$(id -u "$LS_USER"))"
+    else
+        useradd -r -s /usr/sbin/nologin -d "$DATA_DIR" "$LS_USER"
+        echo "  + created user '$LS_USER' (uid=$(id -u "$LS_USER"))"
+    fi
+    local LS_UID
+    LS_UID=$(id -u "$LS_USER")
+
+    # ── 2. Create data directory (writable by both users) ────────────────
+    mkdir -p "$DATA_DIR/.gemini"
+    chmod 1777 "$DATA_DIR" "$DATA_DIR/.gemini"
+    echo "  + data dir: $DATA_DIR (mode 1777, writable by all)"
+
+    # ── 3. Sudoers entry ────────────────────────────────────────────────
+    # Allow the invoking user (SUDO_USER) to run ANY command as antigravity-ls.
+    # This is needed for the proxy to spawn the LS binary.
+    local REAL_USER="${SUDO_USER:-$(logname 2>/dev/null || whoami)}"
+    cat > "$SUDOERS_FILE" <<EOF
+# Allow $REAL_USER to run commands as $LS_USER (for antigravity proxy)
+$REAL_USER ALL=($LS_USER) NOPASSWD: ALL
+EOF
+    chmod 440 "$SUDOERS_FILE"
+    echo "  + sudoers: $REAL_USER can run as $LS_USER"
+
+    # ── 4. iptables REDIRECT (scoped to UID) ────────────────────────────
+    # Remove existing rule first (idempotent)
+    iptables -t nat -D OUTPUT -m owner --uid-owner "$LS_UID" \
+        -p tcp --dport 443 -j REDIRECT --to-port "$MITM_PORT" 2>/dev/null || true
+
+    iptables -t nat -A OUTPUT -m owner --uid-owner "$LS_UID" \
+        -p tcp --dport 443 -j REDIRECT --to-port "$MITM_PORT"
+    echo "  + iptables: uid=$LS_UID :443 → :$MITM_PORT"
+
+    echo
+    echo "[mitm-redirect] ✓ Installed (only affects uid=$LS_UID)"
+    echo "  Restart the proxy to take effect:"
+    echo "    RUST_LOG=info ./target/release/antigravity-proxy --standalone"
+}
+
+uninstall() {
+    if [[ $EUID -ne 0 ]]; then
+        echo "Error: must run as root (sudo)"
+        exit 1
+    fi
+
+    echo "[mitm-redirect] Removing UID-scoped iptables redirect"
+    echo
+
+    # Remove iptables rule
+    if id "$LS_USER" &>/dev/null; then
+        local LS_UID
+        LS_UID=$(id -u "$LS_USER")
+        iptables -t nat -D OUTPUT -m owner --uid-owner "$LS_UID" \
+            -p tcp --dport 443 -j REDIRECT --to-port "$MITM_PORT" 2>/dev/null || true
+        echo "  - iptables: removed REDIRECT rule for uid=$LS_UID"
+    fi
+
+    # Remove sudoers entry
+    rm -f "$SUDOERS_FILE"
+    echo "  - sudoers: removed $SUDOERS_FILE"
+
+    # Clean data dir
+    rm -rf "$DATA_DIR"
+    echo "  - data dir: removed $DATA_DIR"
+
+    # Optionally remove user (commented out — user might want to keep it)
+    # userdel "$LS_USER" 2>/dev/null || true
+    echo "  ℹ user '$LS_USER' kept (run 'sudo userdel $LS_USER' to remove)"
+
+    echo
+    echo "[mitm-redirect] ✓ Uninstalled."
+}
+
+status() {
+    echo "[mitm-redirect] Status"
+    echo
+
+    # Check user
+    if id "$LS_USER" &>/dev/null; then
+        local LS_UID
+        LS_UID=$(id -u "$LS_USER")
+        echo "  user: $LS_USER (uid=$LS_UID) ✓"
+    else
+        echo "  user: $LS_USER (not found) ✗"
+        echo
+        echo "  Run: sudo $0 install"
+        return
+    fi
+
+    # Check sudoers
+    if [[ -f "$SUDOERS_FILE" ]]; then
+        echo "  sudoers: $SUDOERS_FILE ✓"
+    else
+        echo "  sudoers: $SUDOERS_FILE (not found) ✗"
+    fi
+
+    # Check iptables
+    echo "  iptables:"
+    if iptables -t nat -L OUTPUT -n 2>/dev/null | grep -qw "owner UID match $LS_UID"; then
+        iptables -t nat -L OUTPUT -n -v 2>/dev/null | grep -w "owner UID match $LS_UID" | sed 's/^/    /'
+    else
+        echo "    (no rules for uid=$LS_UID)"
+    fi
+
+    # Check data dir
+    echo "  data dir: $(ls -ld "$DATA_DIR" 2>/dev/null || echo '(not found)')"
+
+    # Test sudo
+    echo
+    echo "  sudo test:"
+    if sudo -n -u "$LS_USER" true 2>/dev/null; then
+        echo "    ✓ can run as $LS_USER without password"
+    else
+        echo "    ✗ cannot run as $LS_USER (check sudoers)"
+    fi
+}
+
+case "${1:-help}" in
+    install)   install ;;
+    uninstall) uninstall ;;
+    status)    status ;;
+    *)
+        echo "Usage: sudo $0 {install|uninstall|status} [mitm_port]"
+        echo
+        echo "Redirects ONLY the standalone LS's outgoing :443 traffic through"
+        echo "the MITM proxy using UID-scoped iptables rules."
+        echo
+        echo "This does NOT affect:"
+        echo "  - Your real Antigravity coding session"
+        echo "  - Any other software on your PC"
+        echo "  - DNS resolution (/etc/hosts is untouched)"
+        echo
+        echo "  install [port]    Create user, data dir, sudoers entry + iptables REDIRECT for that UID"
+        echo "  uninstall [port]  Remove iptables rule, sudoers entry + data dir (keeps the user)"
+        echo "  status            Show current state"
+        echo
+        echo "Default MITM port: 8742"
+        ;;
+esac
--- a/src/backend.rs
+++ b/src/backend.rs
@@ -83,6 +83,34 @@ impl Backend {
        })
    }

+    /// Create a Backend with known connection details (for standalone LS).
+    ///
+    /// Skips auto-discovery — the caller provides the port, CSRF, and OAuth token.
+    pub fn new_with_config(
+        port: u16,
+        csrf: String,
+        oauth_token: String,
+    ) -> Result<Self, String> {
+        let inner = BackendInner {
+            pid: "standalone".to_string(),
+            csrf,
+            https_port: port.to_string(),
+            oauth_token,
+        };
+
+        let client = wreq::Client::builder()
+            .emulation(wreq_util::Emulation::Chrome142)
+            .cert_verification(false)
+            .verify_hostname(false)
+            .build()
+            .map_err(|e| format!("wreq client build failed: {e}"))?;
+
+        Ok(Self {
+            inner: RwLock::new(inner),
+            client,
+        })
+    }
+
    /// Re-discover language server connection details.
    /// Runs blocking I/O on a spawn_blocking thread to avoid starving tokio.
    pub async fn refresh(&self) -> Result<(), String> {
--- a/src/main.rs
+++ b/src/main.rs
@@ -11,6 +11,7 @@ mod mitm;
 mod proto;
 mod quota;
 mod session;
+mod standalone;
 mod warmup;

 use api::AppState;
@@ -44,6 +45,10 @@ struct Cli {
    /// MITM proxy port (default: 8742, matches wrapper script)
    #[arg(long, default_value_t = 8742)]
    mitm_port: u16,
+
+    /// Use a standalone LS (does not touch the real LS)
+    #[arg(long)]
+    standalone: bool,
 }

 #[tokio::main]
@@ -85,12 +90,83 @@ async fn main() {
        }
    };

-    // ── Step 2: Backend discovery ─────────────────────────────────────────────
-    let backend = Arc::new(match Backend::new() {
-        Ok(b) => b,
-        Err(e) => {
-            eprintln!("Fatal: {e}");
-            std::process::exit(1);
+    // ── Step 2: Backend discovery (or standalone LS spawn) ─────────────────────
+    let standalone_ls = if cli.standalone {
+        // Standalone mode: discover main LS config, spawn our own
+        let main_config = match standalone::discover_main_ls_config() {
+            Ok(c) => c,
+            Err(e) => {
+                eprintln!("Fatal: {e}");
+                std::process::exit(1);
+            }
+        };
+        // Build MITM config if MITM is enabled
+        let mitm_cfg = if !cli.no_mitm {
+            let ca_path = dirs_data_dir()
+                .join("mitm-ca.pem")
+                .to_string_lossy()
+                .to_string();
+            Some(standalone::StandaloneMitmConfig {
+                proxy_addr: format!("http://127.0.0.1:{}", cli.mitm_port),
+                ca_cert_path: ca_path,
+            })
+        } else {
+            None
+        };
+
+        let ls = match standalone::StandaloneLS::spawn(&main_config, mitm_cfg.as_ref()) {
+            Ok(ls) => ls,
+            Err(e) => {
+                eprintln!("Fatal: failed to spawn standalone LS: {e}");
+                std::process::exit(1);
+            }
+        };
+        // Wait for it to be ready
+        let rt_ls_port = ls.port;
+        let rt_ls_csrf = ls.csrf.clone();
+        tokio::task::block_in_place(|| {
+            tokio::runtime::Handle::current().block_on(async {
+                if let Err(e) = ls.wait_ready(10).await {
+                    eprintln!("Fatal: {e}");
+                    std::process::exit(1);
+                }
+            });
+        });
+        info!(port = rt_ls_port, "Standalone LS ready");
+        Some((ls, rt_ls_port, rt_ls_csrf))
+    } else {
+        None
+    };
+
+    let backend = Arc::new(if let Some((_, port, ref csrf)) = standalone_ls {
+        // Build backend pointing at standalone LS
+        let oauth = std::env::var("ANTIGRAVITY_OAUTH_TOKEN")
+            .ok()
+            .filter(|s| !s.is_empty())
+            .or_else(|| {
+                let home = std::env::var("HOME").unwrap_or_default();
+                let path = format!("{home}/.config/antigravity-proxy-token");
+                std::fs::read_to_string(&path)
+                    .ok()
+                    .map(|s| s.trim().to_string())
+                    .filter(|s| !s.is_empty())
+            })
+            .unwrap_or_default();
+        match Backend::new_with_config(port, csrf.clone(), oauth) {
+            Ok(b) => b,
+            Err(e) => {
+                eprintln!("Fatal: {e}");
+                std::process::exit(1);
+            }
+        }
+    } else {
+        // Normal mode: discover existing LS
+        match Backend::new() {
+            Ok(b) => b,
+            Err(e) => {
+                eprintln!("Fatal: {e}");
+                std::process::exit(1);
+            }
        }
    });

@@ -151,8 +227,15 @@ async fn main() {
    });

    // Periodic backend refresh — keeps LS connection details fresh
+    // (skip in standalone mode — the port is fixed and discover() would overwrite it)
+    let is_standalone = cli.standalone;
    let refresh_backend = Arc::clone(&state.backend);
    let refresh_handle = tokio::spawn(async move {
+        if is_standalone {
+            // In standalone mode, the backend config is fixed — no refresh needed
+            std::future::pending::<()>().await;
+            return;
+        }
        loop {
            tokio::time::sleep(tokio::time::Duration::from_secs(60)).await;
            if let Err(e) = refresh_backend.refresh().await {
@@ -178,6 +261,10 @@ async fn main() {
    if let Some(h) = mitm_handle {
        h.abort();
    }
+    // Kill standalone LS if we spawned one
+    if let Some((mut ls, _, _)) = standalone_ls {
+        ls.kill();
+    }
    // Remove stale MITM port file
    let _ = std::fs::remove_file(dirs_data_dir().join("mitm-port"));
    info!("Server shutdown complete");
--- a/src/mitm/intercept.rs
+++ b/src/mitm/intercept.rs
@@ -56,9 +56,11 @@ pub struct StreamingAccumulator {
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
+    pub thinking_tokens: u64,
    pub model: Option<String>,
    pub stop_reason: Option<String>,
    pub is_complete: bool,
+    pub api_provider: Option<String>,
 }

 impl StreamingAccumulator {
@@ -66,13 +68,46 @@ impl StreamingAccumulator {
        Self::default()
    }

-    /// Process a single SSE event.
+    /// Process a single SSE event.
    pub fn process_event(&mut self, event: &Value) {
+        // ── Google format: {"response": {"usageMetadata": {...}, "modelVersion": "..."}} ──
+        if let Some(response) = event.get("response") {
+            // Extract usage metadata (each event has cumulative counts)
+            if let Some(usage) = response.get("usageMetadata") {
+                self.input_tokens = usage["promptTokenCount"].as_u64().unwrap_or(self.input_tokens);
+                self.output_tokens = usage["candidatesTokenCount"].as_u64().unwrap_or(self.output_tokens);
+                self.thinking_tokens = usage["thoughtsTokenCount"].as_u64().unwrap_or(self.thinking_tokens);
+            }
+            if let Some(model) = response["modelVersion"].as_str() {
+                self.model = Some(model.to_string());
+            }
+            // Check for completion in candidates
+            if let Some(candidates) = response.get("candidates").and_then(|c| c.as_array()) {
+                for candidate in candidates {
+                    if let Some(reason) = candidate["finishReason"].as_str() {
+                        self.stop_reason = Some(reason.to_string());
+                        if reason == "STOP" {
+                            self.is_complete = true;
+                        }
+                    }
+                }
+            }
+            self.api_provider = Some("google".to_string());
+            trace!(
+                input = self.input_tokens,
+                output = self.output_tokens,
+                thinking = self.thinking_tokens,
+                complete = self.is_complete,
+                "SSE Google: usage update"
+            );
+            return;
+        }
+
+        // ── Anthropic format: {"type": "message_start"|"message_delta"|"message_stop"} ──
        let event_type = event["type"].as_str().unwrap_or("");

        match event_type {
            "message_start" => {
-                // message_start contains the initial usage (input tokens + cache)
                if let Some(usage) = event.get("message").and_then(|m| m.get("usage")) {
                    self.input_tokens = usage["input_tokens"].as_u64().unwrap_or(0);
                    self.cache_creation_input_tokens = usage["cache_creation_input_tokens"].as_u64().unwrap_or(0);
@@ -81,36 +116,27 @@ impl StreamingAccumulator {
                if let Some(model) = event.get("message").and_then(|m| m["model"].as_str()) {
                    self.model = Some(model.to_string());
                }
-                trace!(
-                    input = self.input_tokens,
-                    cache_read = self.cache_read_input_tokens,
-                    cache_create = self.cache_creation_input_tokens,
-                    "SSE message_start: captured input usage"
-                );
+                self.api_provider = Some("anthropic".to_string());
+                trace!(input = self.input_tokens, "SSE Anthropic: message_start");
            }
            "message_delta" => {
-                // message_delta contains the output usage
                if let Some(usage) = event.get("usage") {
                    self.output_tokens = usage["output_tokens"].as_u64().unwrap_or(self.output_tokens);
                }
                if let Some(reason) = event["delta"]["stop_reason"].as_str() {
                    self.stop_reason = Some(reason.to_string());
                }
-                trace!(output = self.output_tokens, "SSE message_delta: updated output tokens");
            }
            "message_stop" => {
                self.is_complete = true;
                debug!(
                    input = self.input_tokens,
                    output = self.output_tokens,
-                    cache_read = self.cache_read_input_tokens,
                    model = ?self.model,
-                    "SSE message_stop: stream complete"
+                    "SSE Anthropic: stream complete"
                );
            }
-            "content_block_start" | "content_block_delta" | "content_block_stop" | "ping" => {
-                // Content events — no usage data, just pass through
-            }
+            "content_block_start" | "content_block_delta" | "content_block_stop" | "ping" => {}
            _ => {
                trace!(event_type, "SSE: unknown event type");
            }
@@ -124,11 +150,11 @@ impl StreamingAccumulator {
            output_tokens: self.output_tokens,
            cache_creation_input_tokens: self.cache_creation_input_tokens,
            cache_read_input_tokens: self.cache_read_input_tokens,
-            thinking_output_tokens: 0,
+            thinking_output_tokens: self.thinking_tokens,
            response_output_tokens: 0,
            model: self.model,
            stop_reason: self.stop_reason,
-            api_provider: Some("anthropic".to_string()),
+            api_provider: self.api_provider.unwrap_or_else(|| "unknown".to_string()).into(),
            grpc_method: None,
            captured_at: std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
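Review note on the Google branch of `process_event`: the diff's comment says each event carries cumulative counts, so the accumulator overwrites its fields instead of summing them. The sketch below isolates that rule using only the standard library. `UsageUpdate`, `Totals`, and `replay` are hypothetical names for illustration (the real code reads `serde_json::Value` directly), and the cumulative-count behaviour is taken from the diff's comment, not verified independently.

```rust
/// Hypothetical, already-decoded view of one `usageMetadata` object.
/// `None` models a field that is absent from the event.
#[derive(Clone, Copy, Default)]
struct UsageUpdate {
    prompt: Option<u64>,
    candidates: Option<u64>,
    thoughts: Option<u64>,
}

/// Hypothetical stand-in for the token fields of the accumulator.
#[derive(Debug, Default, PartialEq)]
struct Totals {
    input: u64,
    output: u64,
    thinking: u64,
}

impl Totals {
    /// Counts are assumed cumulative: a present field replaces the stored
    /// value, an absent field leaves it untouched. Nothing is summed.
    fn apply(&mut self, u: UsageUpdate) {
        self.input = u.prompt.unwrap_or(self.input);
        self.output = u.candidates.unwrap_or(self.output);
        self.thinking = u.thoughts.unwrap_or(self.thinking);
    }
}

/// Fold a sequence of events into the final totals.
fn replay(events: &[UsageUpdate]) -> Totals {
    let mut totals = Totals::default();
    for event in events {
        totals.apply(*event);
    }
    totals
}
```

If the upstream API ever switched to per-event deltas, `apply` would have to add instead of replace, which is why the assumption is worth keeping next to the code.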
--- a/src/mitm/proxy.rs
+++ b/src/mitm/proxy.rs
@@ -85,7 +85,7 @@ pub async fn run(
                    let store = store.clone();
                    tokio::spawn(async move {
                        if let Err(e) = handle_connection(stream, ca, store, modify_requests).await {
-                            debug!(error = %e, "MITM connection error");
+                            warn!(error = %e, "MITM connection error");
                        }
                    });
                }
@@ -310,18 +310,30 @@ async fn handle_intercepted(

    let acceptor = TlsAcceptor::from(server_config);

-    // Perform TLS handshake with the client (LS)
-    let tls_stream = acceptor
-        .accept(stream)
-        .await
-        .map_err(|e| format!("TLS handshake with client failed for {domain}: {e}"))?;
+    // Perform TLS handshake with the client (LS) — 10s timeout
+    let tls_stream = match tokio::time::timeout(
+        std::time::Duration::from_secs(10),
+        acceptor.accept(stream),
+    )
+    .await
+    {
+        Ok(Ok(s)) => s,
+        Ok(Err(e)) => {
+            warn!(domain, error = %e, "MITM: TLS handshake FAILED (client rejected cert?)");
+            return Err(format!("TLS handshake with client failed for {domain}: {e}"));
+        }
+        Err(_) => {
+            warn!(domain, "MITM: TLS handshake TIMED OUT after 10s");
+            return Err(format!("TLS handshake timed out for {domain}"));
+        }
+    };

    // Check negotiated ALPN protocol
    let alpn = tls_stream.get_ref().1
        .alpn_protocol()
        .map(|p| String::from_utf8_lossy(p).to_string());

-    debug!(domain, alpn = ?alpn, "MITM: TLS handshake successful");
+    info!(domain, alpn = ?alpn, "MITM: TLS handshake successful ✓");

    match alpn.as_deref() {
        Some("h2") => {
@@ -336,7 +348,7 @@ async fn handle_intercepted(
        }
        _ => {
            // HTTP/1.1 or no ALPN — use the existing handler
-            debug!(domain, "MITM: routing to HTTP/1.1 handler");
+            info!(domain, "MITM: routing to HTTP/1.1 handler");
            handle_http_over_tls(tls_stream, domain, store, modify_requests).await
        }
    }
@@ -382,16 +394,35 @@ async fn handle_http_over_tls(

        // Try to resolve the real IP, bypassing /etc/hosts
        let addr = resolve_upstream(domain).await;
+        info!(domain, addr = %addr, "MITM: connecting upstream");
+
+        let tcp = match tokio::time::timeout(
+            std::time::Duration::from_secs(15),
+            TcpStream::connect(&addr),
+        )
+        .await
+        {
+            Ok(Ok(s)) => s,
+            Ok(Err(e)) => return Err(format!("Connect to upstream {domain} ({addr}): {e}")),
+            Err(_) => return Err(format!("Connect to upstream {domain} ({addr}): timed out")),
+        };

-        let tcp = TcpStream::connect(addr)
-            .await
-            .map_err(|e| format!("Connect to upstream {domain}: {e}"))?;
        let server_name = rustls::pki_types::ServerName::try_from(domain.to_string())
            .map_err(|e| format!("Invalid server name: {e}"))?;
-        connector
-            .connect(server_name, tcp)
-            .await
-            .map_err(|e| format!("TLS connect to upstream {domain}: {e}"))
+
+        match tokio::time::timeout(
+            std::time::Duration::from_secs(15),
+            connector.connect(server_name, tcp),
+        )
+        .await
+        {
+            Ok(Ok(s)) => {
+                info!(domain, "MITM: upstream TLS connected ✓");
+                Ok(s)
+            }
+            Ok(Err(e)) => Err(format!("TLS connect to upstream {domain}: {e}")),
+            Err(_) => Err(format!("TLS connect to upstream {domain}: timed out")),
+        }
    }

    /// Resolve upstream IP bypassing /etc/hosts.
@@ -428,8 +459,37 @@ async fn handle_http_over_tls(
        // ── Read the HTTP request from the client ─────────────────────────
        let mut request_buf = Vec::with_capacity(1024 * 64);

+        // 60s timeout on initial read (LS may open connection without sending immediately)
+        const IDLE_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(60);
+
        loop {
-            let n = match client.read(&mut tmp).await {
+            let read_result = if request_buf.is_empty() {
+                // First read — apply idle timeout
+                match tokio::time::timeout(IDLE_TIMEOUT, client.read(&mut tmp)).await {
+                    Ok(r) => r,
+                    Err(_) => {
+                        // Idle timeout — connection pool warmup, no data sent
+                        debug!(domain, "MITM: client idle timeout (60s), closing");
+                        return Ok(());
+                    }
+                }
+            } else {
+                // Subsequent reads — wait up to 30s for rest of request
+                match tokio::time::timeout(
+                    std::time::Duration::from_secs(30),
+                    client.read(&mut tmp),
+                )
+                .await
+                {
+                    Ok(r) => r,
+                    Err(_) => {
+                        warn!(domain, "MITM: partial request read timed out");
+                        return Err("Partial request read timed out".into());
+                    }
+                }
+            };
+
+            let n = match read_result {
                Ok(0) => return Ok(()), // Client closed connection cleanly
                Ok(n) => n,
                Err(e) => {
@@ -461,12 +521,25 @@ async fn handle_http_over_tls(
            None
        };

-        debug!(
+        // Extract request method and path for logging
+        let req_path = {
+            let mut headers = [httparse::EMPTY_HEADER; 64];
+            let mut req = httparse::Request::new(&mut headers);
+            match req.parse(&request_buf) {
+                Ok(httparse::Status::Complete(_)) => {
+                    format!("{} {}", req.method.unwrap_or("?"), req.path.unwrap_or("?"))
+                }
+                _ => "?".to_string(),
+            }
+        };
+
+        info!(
            domain,
+            req_path = %req_path,
            content_length,
            streaming = is_streaming_request,
            cascade = ?cascade_hint,
-            "MITM: forwarding request to upstream"
+            "MITM: forwarding request"
        );

        // ── Ensure upstream connection is alive ──────────────────────────────
@@ -492,118 +565,139 @@ async fn handle_http_over_tls(
        let conn = upstream.as_mut().unwrap();

        // ── Stream response back to client ──────────────────────────────────
+        // ALWAYS forward data to client immediately (no buffering).
+        // Buffer body on the side for usage parsing.
        let mut streaming_acc = StreamingAccumulator::new();
        let mut is_streaming_response = false;
        let mut headers_parsed = false;
-        // Only buffer response body for non-streaming (for usage parsing)
-        let mut non_streaming_buf: Option<Vec<u8>> = None;
-        // Track if upstream connection is still usable after this response
        let mut upstream_ok = true;
-
-        // Per-request timeout: 5 minutes (covers large context API calls)
-        const READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(300);
+        let mut response_body_buf = Vec::new();
+        let mut response_content_length: Option<usize> = None;
+        let mut is_chunked = false;
+        let mut got_first_byte = false;
+        let mut header_buf = Vec::with_capacity(8192);

        loop {
-            let n = match tokio::time::timeout(READ_TIMEOUT, conn.read(&mut tmp)).await {
-                Ok(Ok(0)) => {
-                    // Upstream closed — connection is no longer reusable
-                    upstream_ok = false;
-                    break;
-                }
+            // 15s idle timeout after first byte, 60s for initial response
+            let timeout = if got_first_byte {
+                std::time::Duration::from_secs(15)
+            } else {
+                std::time::Duration::from_secs(60)
+            };
+
+            let n = match tokio::time::timeout(timeout, conn.read(&mut tmp)).await {
+                Ok(Ok(0)) => { upstream_ok = false; break; }
                Ok(Ok(n)) => n,
                Ok(Err(e)) => {
-                    debug!(domain, error = %e, "MITM: upstream read finished");
+                    debug!(domain, error = %e, "MITM: upstream read ended");
                    upstream_ok = false;
                    break;
                }
                Err(_) => {
-                    warn!(domain, "MITM: upstream read timed out after 5 minutes");
+                    if got_first_byte {
+                        debug!(domain, "MITM: response idle timeout (complete)");
+                    } else {
+                        warn!(domain, "MITM: no upstream response in 60s");
+                    }
                    upstream_ok = false;
                    break;
                }
            };

+            got_first_byte = true;
            let chunk = &tmp[..n];

-            // Check response headers for content-type
            if !headers_parsed {
-                // We need to buffer until we see the end of headers
-                let buf = non_streaming_buf.get_or_insert_with(|| Vec::with_capacity(1024 * 64));
-                buf.extend_from_slice(chunk);
-                if let Some(_hdr_end) = find_headers_end(buf) {
-                    // Use httparse for response header parsing
+                header_buf.extend_from_slice(chunk);
+                if let Some(_hdr_end) = find_headers_end(&header_buf) {
                    let mut resp_headers = [httparse::EMPTY_HEADER; 64];
                    let mut resp = httparse::Response::new(&mut resp_headers);
-                    let hdr_end = match resp.parse(buf) {
+                    let hdr_end = match resp.parse(&header_buf) {
                        Ok(httparse::Status::Complete(n)) => n,
-                        _ => _hdr_end, // Fallback to manual detection
+                        _ => _hdr_end,
                    };

-                    // Detect content type and connection handling from parsed headers
+                    let mut content_type = String::new();
+
                    for header in resp.headers.iter() {
                        if header.name.eq_ignore_ascii_case("content-type") {
-                            if let Ok(val) = std::str::from_utf8(header.value) {
-                                if val.contains("text/event-stream") {
-                                    is_streaming_response = true;
-                                }
+                            if let Ok(v) = std::str::from_utf8(header.value) {
+                                content_type = v.to_string();
+                                if v.contains("text/event-stream") { is_streaming_response = true; }
+                            }
+                        }
+                        if header.name.eq_ignore_ascii_case("content-length") {
+                            if let Ok(v) = std::str::from_utf8(header.value) {
+                                response_content_length = v.trim().parse().ok();
                            }
                        }
                        if header.name.eq_ignore_ascii_case("connection") {
-                            if let Ok(val) = std::str::from_utf8(header.value) {
-                                if val.trim().eq_ignore_ascii_case("close") {
-                                    upstream_ok = false;
-                                }
+                            if let Ok(v) = std::str::from_utf8(header.value) {
+                                if v.trim().eq_ignore_ascii_case("close") { upstream_ok = false; }
+                            }
+                        }
+                        if header.name.eq_ignore_ascii_case("transfer-encoding") {
+                            if let Ok(v) = std::str::from_utf8(header.value) {
+                                if v.trim().eq_ignore_ascii_case("chunked") { is_chunked = true; }
                            }
                        }
                    }

+                    info!(domain, streaming = is_streaming_response,
+                        content_length = ?response_content_length,
+                        content_type = %content_type,
+                        status = resp.code, "MITM: got response headers");
                    headers_parsed = true;

-                    if is_streaming_response {
-                        // For streaming, parse any SSE data already in the buffer
-                        let body_so_far = String::from_utf8_lossy(&buf[hdr_end..]);
-                        if !body_so_far.is_empty() {
-                            parse_streaming_chunk(&body_so_far, &mut streaming_acc);
-                        }
-                        // Forward the accumulated buffer to client
-                        if let Err(e) = client.write_all(buf).await {
-                            warn!(error = %e, "MITM: write to client failed");
-                            break;
-                        }
-                        non_streaming_buf = None;
-                        continue;
+                    // Save body for usage parsing
+                    response_body_buf.extend_from_slice(&header_buf[hdr_end..]);
+
+                    // Forward to client immediately
+                    if let Err(e) = client.write_all(&header_buf).await {
+                        warn!(error = %e, "MITM: write to client failed");
+                        break;
+                    }
+
+                    if is_streaming_response && hdr_end < header_buf.len() {
+                        let body = String::from_utf8_lossy(&header_buf[hdr_end..]);
+                        parse_streaming_chunk(&body, &mut streaming_acc);
+                    }
+
+                    if let Some(cl) = response_content_length {
+                        if response_body_buf.len() >= cl { break; }
+                    }
+                    // Check chunked terminator in initial body
+                    if is_chunked && has_chunked_terminator(&response_body_buf) {
+                        debug!(domain, "MITM: chunked response complete (initial)");
+                        break;
                    }
-                    // Non-streaming: keep buffering the response body for parsing
                    continue;
                }
                continue;
            }

-            // If streaming, parse SSE events and forward immediately
+            // Forward to client immediately
+            if let Err(e) = client.write_all(chunk).await {
+                warn!(error = %e, "MITM: write to client failed");
+                break;
+            }
+            response_body_buf.extend_from_slice(chunk);
+
            if is_streaming_response {
-                let chunk_str = String::from_utf8_lossy(chunk);
-                parse_streaming_chunk(&chunk_str, &mut streaming_acc);
-
-                if let Err(e) = client.write_all(chunk).await {
-                    warn!(error = %e, "MITM: write to client failed (client disconnected?)");
-                    break;
-                }
-            } else {
-                // Non-streaming: keep accumulating to parse usage at the end
-                if let Some(ref mut buf) = non_streaming_buf {
-                    buf.extend_from_slice(chunk);
-                }
+                let s = String::from_utf8_lossy(chunk);
+                parse_streaming_chunk(&s, &mut streaming_acc);
+            }
+            if let Some(cl) = response_content_length {
+                if response_body_buf.len() >= cl { break; }
+            }
+            if is_chunked && has_chunked_terminator(&response_body_buf) {
+                debug!(domain, "MITM: chunked response complete");
+                break;
            }
        }

-        // Forward non-streaming response all at once
-        if !is_streaming_response {
-            if let Some(ref buf) = non_streaming_buf {
-                if let Err(e) = client.write_all(buf).await {
-                    warn!(error = %e, "MITM: write to client failed");
-                }
-            }
-        }
+        // Flush client
+        let _ = client.flush().await;

        // Capture usage data
        if is_streaming_response {
@@ -611,12 +705,9 @@ async fn handle_http_over_tls(
                let usage = streaming_acc.into_usage();
                store.record_usage(cascade_hint.as_deref(), usage).await;
            }
-        } else if let Some(ref buf) = non_streaming_buf {
-            if let Some(body_start) = find_headers_end(buf) {
-                let body = &buf[body_start..];
-                if let Some(usage) = parse_non_streaming_response(body) {
-                    store.record_usage(cascade_hint.as_deref(), usage).await;
-                }
+        } else if !response_body_buf.is_empty() {
+            if let Some(usage) = parse_non_streaming_response(&response_body_buf) {
+                store.record_usage(cascade_hint.as_deref(), usage).await;
            }
        }

@@ -652,6 +743,20 @@ async fn handle_passthrough(
    Ok(())
 }

+/// Detect end of HTTP chunked transfer encoding.
+/// A chunked response ends with "0\r\n\r\n" (zero-length chunk + empty trailer).
+/// We check the tail of the buffer for this pattern.
+fn has_chunked_terminator(body: &[u8]) -> bool {
+    // The minimal terminator is "0\r\n\r\n" (5 bytes)
+    if body.len() < 5 {
+        return false;
+    }
+    // Check last 7 bytes to account for possible trailing whitespace
+    let tail = if body.len() > 7 { &body[body.len() - 7..] } else { body };
+    // Look for "0\r\n\r\n" anywhere in the tail
+    tail.windows(5).any(|w| w == b"0\r\n\r\n")
+}
+
 /// Check if buffer contains a complete HTTP request (headers + full body).
 /// Uses `httparse` for zero-copy, case-insensitive header parsing.
 fn has_complete_http_request(buf: &[u8]) -> bool {
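
Note on the `has_chunked_terminator` hunk above: the following is a minimal standalone sketch of how the tail check behaves, not part of the commit. The function body is copied from the hunk; the sample inputs and the `main` function are illustrative assumptions. The last input shows that the check can match while a chunk is still incomplete, if the payload received so far happens to end in the same five bytes.

```rust
/// Copy of the tail check from the hunk above, for experimentation.
fn has_chunked_terminator(body: &[u8]) -> bool {
    // The minimal terminator is "0\r\n\r\n" (5 bytes)
    if body.len() < 5 {
        return false;
    }
    // Only the last 7 bytes are inspected
    let tail = if body.len() > 7 { &body[body.len() - 7..] } else { body };
    tail.windows(5).any(|w| w == b"0\r\n\r\n")
}

fn main() {
    // Illustrative inputs (assumptions, not captured traffic).
    let samples: [(&str, &[u8]); 4] = [
        ("complete chunked body", b"5\r\nhello\r\n0\r\n\r\n"),
        ("body without terminator", b"5\r\nhello\r\n"),
        ("bare terminator", b"0\r\n\r\n"),
        // A 7-byte chunk is announced but only 6 payload bytes have arrived;
        // those bytes end in "0\r\n\r\n", so the tail check still matches.
        ("incomplete chunk, lookalike payload", b"7\r\n10\r\n\r\n"),
    ];
    for (label, body) in samples {
        println!("{label}: {}", has_chunked_terminator(body));
    }
}
```

Whether the lookalike case matters depends on the payloads the proxy sees; a stricter alternative would be to track chunk sizes as they are parsed.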
--- a/src/proto.rs
+++ b/src/proto.rs
@@ -62,6 +62,51 @@ pub fn varint_field(field: u32, val: u64) -> Vec<u8> {
    out
 }

+// ─── Init metadata builder (for standalone LS stdin) ─────────────────────────
+
+/// Build the init metadata protobuf that the LS expects on stdin at startup.
+///
+/// This replaces the Python snippet in `standalone-ls.sh` with proper Rust encoding.
+/// Fields match what the real Antigravity extension sends to the LS.
+///
+/// Field layout (from binary analysis):
+///   1: api_key (string) — unique session key
+///   3: ide_name (string) — "antigravity"
+///   4: antigravity_version (string) — e.g. "1.107.0"
+///   5: ide_version (string) — e.g. "1.16.5"
+///   6: locale (string) — "en_US"
+///  10: session_id (string) — unique session identifier
+///  11: editor_name (string) — "antigravity"
+///  34: detect_and_use_proxy (varint enum) — 1 = ENABLED
+pub fn build_init_metadata(
+    api_key: &str,
+    antigravity_version: &str,
+    ide_version: &str,
+    session_id: &str,
+    detect_and_use_proxy: u64,
+) -> Vec<u8> {
+    let mut buf = Vec::with_capacity(128);
+
+    // Field 1: api_key
+    buf.extend(proto_string(1, api_key.as_bytes()));
+    // Field 3: ide_name
+    buf.extend(proto_string(3, CLIENT_NAME.as_bytes()));
+    // Field 4: antigravity version
+    buf.extend(proto_string(4, antigravity_version.as_bytes()));
+    // Field 5: IDE/client version
+    buf.extend(proto_string(5, ide_version.as_bytes()));
+    // Field 6: locale
+    buf.extend(proto_string(6, b"en_US"));
+    // Field 10: session_id
+    buf.extend(proto_string(10, session_id.as_bytes()));
+    // Field 11: editor_name
+    buf.extend(proto_string(11, CLIENT_NAME.as_bytes()));
+    // Field 34: detect_and_use_proxy enum (1 = ENABLED)
+    buf.extend(varint_field(34, detect_and_use_proxy));
+
+    buf
+}
+
 // ─── SendUserCascadeMessageRequest builder ───────────────────────────────────

 /// Build the `SendUserCascadeMessageRequest` protobuf binary.
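
Note on the `build_init_metadata` hunk above: the bodies of `proto_string` and `varint_field` are not shown in this diff, so the following is a minimal sketch of standard protobuf wire encoding under that assumption. The helper names `sketch_varint`, `sketch_string_field`, and `sketch_varint_field` are hypothetical stand-ins, not the project's functions. It shows why field 34 needs a two-byte tag while fields 1 through 11 fit in one byte.

```rust
/// Append `value` as a protobuf base-128 varint (hypothetical helper).
fn sketch_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        // Low 7 bits, with the continuation bit set
        out.push(((value as u8) & 0x7F) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Length-delimited field (wire type 2): tag, length, then raw bytes.
fn sketch_string_field(field: u32, data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() + 4);
    sketch_varint(((field as u64) << 3) | 2, &mut out);
    sketch_varint(data.len() as u64, &mut out);
    out.extend_from_slice(data);
    out
}

/// Varint field (wire type 0): tag, then the value as a varint.
fn sketch_varint_field(field: u32, val: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity(4);
    sketch_varint((field as u64) << 3, &mut out);
    sketch_varint(val, &mut out);
    out
}

fn main() {
    // Field 1 with a one-byte placeholder key: tag 0x0A, length 1, 'k'.
    println!("{:02X?}", sketch_string_field(1, b"k"));
    // Field 34 = 1: tag 272 does not fit in 7 bits, so it spans two bytes.
    println!("{:02X?}", sketch_varint_field(34, 1));
}
```

Under these assumptions, the message produced by `build_init_metadata` is the concatenation of such fields in ascending field order.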
--- a/src/standalone.rs
+++ b/src/standalone.rs
@@ -0,0 +1,373 @@
+//! Standalone Language Server — spawn and lifecycle management.
+//!
+//! Launches an isolated LS instance as a child process that the proxy fully owns.
+//! The standalone LS shares auth via the main extension server but has its own
+//! HTTPS port, data directory, and cascade space. This means the real LS (the
+//! one powering the user's coding session) is never touched.
+
+use crate::constants;
+use crate::proto;
+use std::io::Write;
+use std::net::TcpListener;
+use std::process::{Child, Command, Stdio};
+use tokio::time::{sleep, Duration};
+use tracing::{debug, info};
+
+/// Default path to the LS binary.
+const LS_BINARY_PATH: &str =
+    "/usr/share/antigravity/resources/app/extensions/antigravity/bin/language_server_linux_x64";
+
+/// App root for ANTIGRAVITY_EDITOR_APP_ROOT env var.
+const APP_ROOT: &str = "/usr/share/antigravity/resources/app";
+
+/// Data directory for the standalone LS.
+const DATA_DIR: &str = "/tmp/antigravity-standalone";
+
+/// System user for UID-scoped iptables isolation.
+const LS_USER: &str = "antigravity-ls";
+
+/// A running standalone LS process.
+pub struct StandaloneLS {
+    child: Child,
+    pub port: u16,
+    pub csrf: String,
+}
+
+/// Config needed from the real (main) LS to bootstrap the standalone one.
+pub struct MainLSConfig {
+    pub extension_server_port: String,
+    pub csrf: String,
+}
+
+/// Optional MITM proxy config for the standalone LS.
+pub struct StandaloneMitmConfig {
+    pub proxy_addr: String,   // e.g. "http://127.0.0.1:8742"
+    pub ca_cert_path: String, // path to MITM CA .pem
+}
+
+impl StandaloneLS {
+    /// Spawn a standalone LS process.
+    ///
+    /// Discovers the main LS's extension server port and CSRF token,
+    /// picks a free port, builds init metadata, and launches the binary.
+    ///
+    /// If `mitm_config` is provided, sets SSL_CERT_FILE (combined CA bundle),
+    /// SSL_CERT_DIR, and NODE_EXTRA_CA_CERTS so the LS trusts the MITM proxy's CA.
+    pub fn spawn(
+        main_config: &MainLSConfig,
+        mitm_config: Option<&StandaloneMitmConfig>,
+    ) -> Result<Self, String> {
+        let port = find_free_port()?;
+        let ts = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .unwrap_or_default()
+            .as_secs();
+
+        // Build init metadata protobuf
+        let api_key = format!("standalone-api-key-{ts}");
+        let session_id = format!("standalone-session-{ts}");
+        let metadata = proto::build_init_metadata(
+            &api_key,
+            constants::antigravity_version(),
+            constants::client_version(),
+            &session_id,
+            1, // DETECT_AND_USE_PROXY_ENABLED
+        );
+
+        // Setup data dir (mode 1777 so both current user and antigravity-ls can write)
+        let gemini_dir = format!("{DATA_DIR}/.gemini");
+        std::fs::create_dir_all(&gemini_dir)
+            .map_err(|e| format!("Failed to create standalone data dir: {e}"))?;
+        #[cfg(unix)]
+        {
+            use std::os::unix::fs::PermissionsExt;
+            let _ = std::fs::set_permissions(DATA_DIR, std::fs::Permissions::from_mode(0o1777));
+            let _ = std::fs::set_permissions(&gemini_dir, std::fs::Permissions::from_mode(0o1777));
+        }
+
+        // LS args — mirrors standalone-ls.sh but with correct params
+        let args = vec![
+            "-enable_lsp".to_string(),
+            "-extension_server_port".to_string(),
+            main_config.extension_server_port.clone(),
+            "-csrf_token".to_string(),
+            main_config.csrf.clone(),
+            "-server_port".to_string(),
+            port.to_string(),
+            "-workspace_id".to_string(),
+            format!("standalone_{ts}"),
+            "-cloud_code_endpoint".to_string(),
+            "https://daily-cloudcode-pa.googleapis.com".to_string(),
+            "-app_data_dir".to_string(),
+            "antigravity-standalone".to_string(),
+            "-gemini_dir".to_string(),
+            gemini_dir,
+        ];
+
+        info!(port, "Spawning standalone LS");
+        debug!(?args, "LS args");
+
+        // Build env vars for the LS process
+        let mut env_vars: Vec<(String, String)> = vec![
+            ("ANTIGRAVITY_EDITOR_APP_ROOT".into(), APP_ROOT.into()),
+        ];
+
+        // If MITM is enabled, add CA trust env vars (no proxy env var is set here)
+        if let Some(mitm) = mitm_config {
+            // Go's SSL_CERT_FILE replaces the entire system cert pool, so we
+            // need a combined bundle: system CAs + our MITM CA
+            // Write to /tmp — accessible by antigravity-ls user
+            // (user's ~/.config/ is not traversable by other UIDs)
+            let combined_ca_path = "/tmp/antigravity-mitm-combined-ca.pem".to_string();
+            let system_ca = std::fs::read_to_string("/etc/ssl/certs/ca-certificates.crt")
+                .unwrap_or_default();
+            let mitm_ca = std::fs::read_to_string(&mitm.ca_cert_path)
+                .map_err(|e| format!("Failed to read MITM CA cert: {e}"))?;
+            std::fs::write(&combined_ca_path, format!("{system_ca}\n{mitm_ca}"))
+                .map_err(|e| format!("Failed to write combined CA bundle: {e}"))?;
+            // Make readable by antigravity-ls user
+            #[cfg(unix)]
+            {
+                use std::os::unix::fs::PermissionsExt;
+                let _ = std::fs::set_permissions(
+                    &combined_ca_path,
+                    std::fs::Permissions::from_mode(0o644),
+                );
+            }
+
+            info!(
+                proxy = %mitm.proxy_addr,
+                ca = %combined_ca_path,
+                "Setting MITM env vars on standalone LS (combined CA bundle)"
+            );
+            env_vars.push(("SSL_CERT_FILE".into(), combined_ca_path));
+            env_vars.push(("SSL_CERT_DIR".into(), "/dev/null".into()));
+            env_vars.push(("NODE_EXTRA_CA_CERTS".into(), mitm.ca_cert_path.clone()));
+        }
+
+        // Check if 'antigravity-ls' user exists for UID-scoped iptables isolation
+        let use_sudo = has_ls_user();
+
+        let mut cmd = if use_sudo {
+            info!("Using UID isolation: spawning LS as 'antigravity-ls' user");
+            // Build: sudo -n -u antigravity-ls -- /usr/bin/env VAR=val ... LS_BINARY args...
+            let mut c = Command::new("sudo");
+            c.args(["-n", "-u", LS_USER, "--", "/usr/bin/env"]);
+            // Pass env vars as key=value args to /usr/bin/env
+            for (k, v) in &env_vars {
+                c.arg(format!("{k}={v}"));
+            }
+            c.arg(LS_BINARY_PATH);
+            c.args(&args);
+            c
+        } else {
+            debug!("No 'antigravity-ls' user found, spawning LS as current user");
+            let mut c = Command::new(LS_BINARY_PATH);
+            c.args(&args);
+            for (k, v) in &env_vars {
+                c.env(k, v);
+            }
+            c
+        };
+
+        cmd.stdin(Stdio::piped())
+            .stdout(Stdio::null())
+            .stderr(Stdio::null());
+
+        let mut child = cmd
+            .spawn()
+            .map_err(|e| format!("Failed to spawn LS binary: {e}"))?;
+
+        // Feed init metadata via stdin, then close it
+        if let Some(mut stdin) = child.stdin.take() {
+            stdin
+                .write_all(&metadata)
+                .map_err(|e| format!("Failed to write init metadata to stdin: {e}"))?;
+            // stdin drops here → EOF
+        }
+
+        info!(pid = child.id(), port, "Standalone LS spawned");
+
+        Ok(StandaloneLS {
+            child,
+            port,
+            csrf: main_config.csrf.clone(),
+        })
+    }
+
+    /// Wait for the standalone LS to be ready (accepting TCP connections).
+    ///
+    /// Makes up to `max_attempts` connection attempts, sleeping 1 second before each.
+    pub async fn wait_ready(&self, max_attempts: u32) -> Result<(), String> {
+        info!(port = self.port, "Waiting for standalone LS to be ready...");
+
+        for attempt in 1..=max_attempts {
+            sleep(Duration::from_secs(1)).await;
+
+            // Simple TCP connect check — if the LS is listening, it's ready
+            match tokio::net::TcpStream::connect(format!("127.0.0.1:{}", self.port)).await {
+                Ok(_) => {
+                    info!(attempt, "Standalone LS is ready (accepting connections)");
+                    return Ok(());
+                }
+                Err(e) => {
+                    debug!(attempt, error = %e, "LS not ready yet");
+                }
+            }
+        }
+
+        Err(format!(
+            "Standalone LS failed to become ready after {max_attempts} attempts on port {}",
+            self.port
+        ))
+    }
+
+    /// Check if the child process is still running.
+    #[allow(dead_code)]
+    pub fn is_alive(&mut self) -> bool {
+        matches!(self.child.try_wait(), Ok(None))
+    }
+
+    /// Kill the standalone LS process.
+    pub fn kill(&mut self) {
+        info!("Killing standalone LS");
+        let _ = self.child.kill();
+        let _ = self.child.wait();
+    }
+}
+
+impl Drop for StandaloneLS {
+    fn drop(&mut self) {
+        self.kill();
+    }
+}
+
+/// Discover only the extension_server_port and csrf_token from the running main LS.
+///
+/// This does NOT discover the HTTPS port — we don't need to talk to the real LS,
+/// only steal its extension server connection info.
+pub fn discover_main_ls_config() -> Result<MainLSConfig, String> {
+    let pid = find_main_ls_pid()?;
+
+    let cmdline = std::fs::read(format!("/proc/{pid}/cmdline"))
+        .map_err(|e| format!("Can't read cmdline for PID {pid}: {e}"))?;
+    let args: Vec<&[u8]> = cmdline.split(|&b| b == 0).collect();
+
+    let mut csrf = String::new();
+    let mut ext_port = String::new();
+
+    for (i, arg) in args.iter().enumerate() {
+        if let Ok(s) = std::str::from_utf8(arg) {
+            match s {
+                "--csrf_token" | "-csrf_token" => {
+                    if let Some(next) = args.get(i + 1) {
+                        if let Ok(val) = std::str::from_utf8(next) {
+                            csrf = val.to_string();
+                        }
+                    }
+                }
+                "--extension_server_port" | "-extension_server_port" => {
+                    if let Some(next) = args.get(i + 1) {
+                        if let Ok(val) = std::str::from_utf8(next) {
+                            ext_port = val.to_string();
+                        }
+                    }
+                }
+                _ => {}
+            }
+        }
+    }
+
+    if csrf.is_empty() {
+        return Err("Could not find CSRF token from main LS".to_string());
+    }
+    if ext_port.is_empty() {
+        return Err("Could not find extension_server_port from main LS".to_string());
+    }
+
+    info!(
+        pid,
+        ext_port,
+        csrf_len = csrf.len(),
+        "Discovered main LS config"
+    );
+
+    Ok(MainLSConfig {
+        extension_server_port: ext_port,
+        csrf,
+    })
+}
+
+/// Find the PID of the main (real) LS process.
+///
+/// Checks `/proc/<pid>/exe` to ensure we find the actual LS binary,
+/// not bash scripts that happen to mention `language_server_linux` in their args.
+fn find_main_ls_pid() -> Result<String, String> {
+    let proc = std::path::Path::new("/proc");
+    if !proc.exists() {
+        return Err("No /proc filesystem".to_string());
+    }
+
+    let entries = std::fs::read_dir(proc)
+        .map_err(|e| format!("Cannot read /proc: {e}"))?;
+
+    for entry in entries.flatten() {
+        let name = entry.file_name();
+        let name_str = name.to_string_lossy();
+        // Only numeric dirs (PIDs)
+        if !name_str.chars().all(|c| c.is_ascii_digit()) {
+            continue;
+        }
+        let exe_link = entry.path().join("exe");
+        if let Ok(target) = std::fs::read_link(&exe_link) {
+            let target_str = target.to_string_lossy().to_string();
+            let target_clean = target_str.trim_end_matches(" (deleted)");
+            // Must be the actual LS binary, not a bash script
+            if target_clean.contains("language_server_linux")
+                || target_clean.contains("antigravity-language-server")
+            {
+                return Ok(name_str.to_string());
+            }
+        }
+    }
+
+    Err("No main LS process found — Antigravity must be running".to_string())
+}
+
+/// Find a free TCP port by binding to port 0.
+fn find_free_port() -> Result<u16, String> {
+    let listener =
+        TcpListener::bind("127.0.0.1:0").map_err(|e| format!("Failed to bind for port: {e}"))?;
+    listener
+        .local_addr()
+        .map(|a| a.port())
+        .map_err(|e| format!("Failed to get port: {e}"))
+}
+
+/// Check if the dedicated LS system user exists.
+///
+/// When the user exists, the proxy spawns the LS as that UID so iptables
+/// can scope the :443 redirect to only the standalone LS process.
+fn has_ls_user() -> bool {
+    Command::new("id")
+        .args(["-u", LS_USER])
+        .stdout(Stdio::null())
+        .stderr(Stdio::null())
+        .status()
+        .map(|s| s.success())
+        .unwrap_or(false)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_find_free_port() {
+        let port = find_free_port().unwrap();
+        assert!(port > 0);
+        // Port should be available — try binding to it
+        let listener = TcpListener::bind(format!("127.0.0.1:{port}"));
+        assert!(listener.is_ok(), "Port {port} should be free");
+    }
+}