chore: remove outdated planning documents and the known issues file.

2026-02-18 03:33:47 -06:00
parent 7577e28229
commit ea12127acb
4 changed files with 4 additions and 455 deletions
--- a/.gemini/plans/sync-and-latency.md
+++ b/.gemini/plans/sync-and-latency.md
@@ -1,46 +0,0 @@
-# Sync All Endpoints + Latency + Thinking Streaming
-
-## Phase 1: Sync Responses API (`/v1/responses`) with LS bypass
-
-Current state:
-
-- `handle_responses_stream` (line 529-859) polls LS steps for text
-- Doesn't use MitmStore bypass at all
-- Still suffers from LS multi-turn overhead when tools are active
-
-Fix:
-
-- Add MITM bypass path (same as completions) — check MitmStore for text + function calls
-- For function calls: emit `response.output_item.added` (function_call type) + done events
-- For text: stream from MitmStore `captured_response_text` + `response_complete`
-
-## Phase 2: Sync Gemini endpoint (`/v1/gemini`) with LS bypass
-
-Current state:
-
-- `handle_gemini` (line 57-236) uses `poll_for_response` then checks MitmStore
-- Already checks `take_any_function_calls()` after polling
-- But `poll_for_response` still goes through LS steps
-
-Fix:
-
-- When tools are active, poll MitmStore directly instead of `poll_for_response`
-
-## Phase 3: Latency improvements
-
-- Reduce poll intervals across all handlers
-- Add MITM store thinking_text capture for real-time streaming
-
-## Phase 4: Real-time thinking streaming investigation
-
-Current state:
-
-- Google SSE includes `thought: true` parts with thinking text
-- `streaming_acc.thinking_text` accumulates this
-- Currently only used for final usage stats, not streamed in real-time
-
-Investigation needed:
-
-- The MITM intercept already captures thinking_text per-chunk
-- Need to store thinking_text updates in MitmStore incrementally
-- Responses handler can then stream thinking deltas in real-time
--- a/.gemini/plans/tool-calls-implementation.md
+++ b/.gemini/plans/tool-calls-implementation.md
@@ -1,292 +0,0 @@
-# Tool Call Implementation Plan
-
-## Overview
-
-Add full tool call support to the Antigravity proxy. Primary endpoint is OpenAI Responses API (`/v1/responses`), with a Gemini-native backup endpoint (`/v1/gemini`). Tools are stored per-session, all `tool_choice` modes supported, parallel tool calls supported.
-
-## Data Flow
-
-```
-┌─────────┐      ┌───────────┐      ┌────┐      ┌──────┐      ┌────────┐
-│  Client  │─────▶│  Proxy    │─────▶│ LS │─────▶│ MITM │─────▶│ Google │
-│ (openai) │      │ (axum)    │      │    │      │      │      │        │
-│          │◀─────│           │◀─────│    │◀─────│      │◀─────│        │
-└─────────┘      └───────────┘      └────┘      └──────┘      └────────┘
-     │                │                             │              │
-     │  tools (OAI)   │  store tools (Gemini fmt)   │  inject      │
-     │───────────────▶│────────────▶ MitmStore ─────▶│  tools       │
-     │                │                             │──────────────▶│
-     │                │                             │              │
-     │                │                             │ functionCall  │
-     │                │◀──── capture ───────────────│◀──────────────│
-     │  tool_calls    │                             │ block follow  │
-     │◀───────────────│                             │  ups          │
-     │                │                             │              │
-     │  tool result   │  store result               │  inject      │
-     │───────────────▶│────────────▶ MitmStore ─────▶│ fn response  │
-     │                │                             │──────────────▶│
-     │  final text    │                             │              │
-     │◀───────────────│◀────────────────────────────│◀──────────────│
-```
-
-## Format Differences
-
-### Tool Definitions
-
-| Aspect       | OpenAI                                 | Gemini                             |
-| ------------ | -------------------------------------- | ---------------------------------- |
-| Wrapper      | `{"type":"function","function":{...}}` | `{"functionDeclarations":[{...}]}` |
-| Type strings | lowercase: `"object"`, `"string"`      | UPPERCASE: `"OBJECT"`, `"STRING"`  |
-| Parameters   | JSON Schema subset                     | Same schema, uppercase types       |
-
-### Tool Choice
-
-| OpenAI                                        | Gemini toolConfig                                                       |
-| --------------------------------------------- | ----------------------------------------------------------------------- |
-| `"auto"`                                      | `{"functionCallingConfig":{"mode":"AUTO"}}`                             |
-| `"required"`                                  | `{"functionCallingConfig":{"mode":"ANY"}}`                              |
-| `"none"`                                      | `{"functionCallingConfig":{"mode":"NONE"}}`                             |
-| `{"type":"function","function":{"name":"X"}}` | `{"functionCallingConfig":{"mode":"ANY","allowedFunctionNames":["X"]}}` |
-
-### Tool Call Response
-
-| OpenAI (what we return)                                                                            | Gemini (what Google returns)                                    |
-| -------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- |
-| `output: [{"type":"function_call","call_id":"call_xxx","name":"get_weather","arguments":"{...}"}]` | `parts: [{"functionCall":{"name":"get_weather","args":{...}}}]` |
-
-### Tool Result Submission
-
-| OpenAI (what client sends)                                                       | Gemini (what we inject into Google request)                                                                                  |
-| -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
-| `input: [{"type":"function_call_output","call_id":"call_xxx","output":"{...}"}]` | `contents: [{role:"model",parts:[{functionCall:...}]},{role:"user",parts:[{functionResponse:{name:"...",response:{...}}}]}]` |
-
----
-
-## Implementation Phases
-
-### Phase 1: Store Infrastructure (`store.rs`)
-
-Add to `MitmStore`:
-
-```rust
-/// Active tool definitions (Gemini format) for MITM injection.
-active_tools: Arc<RwLock<Option<Vec<Value>>>>,
-/// Active tool config (Gemini toolConfig format).
-active_tool_config: Arc<RwLock<Option<Value>>>,
-/// Pending tool results for MITM to inject as functionResponse.
-pending_tool_results: Arc<RwLock<Vec<PendingToolResult>>>,
-/// Mapping call_id → function name for tool result routing.
-call_id_to_name: Arc<RwLock<HashMap<String, String>>>,
-/// Last captured function calls (for conversation history rewriting).
-last_function_calls: Arc<RwLock<Vec<CapturedFunctionCall>>>,
-```
-
-New types:
-
-```rust
-pub struct PendingToolResult {
-    pub name: String,
-    pub result: serde_json::Value,
-}
-```
-
-New methods:
-
-- `set_tools(tools)` / `get_tools()` / `clear_tools()`
-- `set_tool_config(config)` / `get_tool_config()`
-- `add_tool_result(result)` / `take_tool_results()`
-- `register_call_id(call_id, name)` / `lookup_call_id(call_id)`
-- `set_last_function_calls(calls)` / `get_last_function_calls()`
-
-### Phase 2: Request Types (`types.rs`)
-
-Add to `ResponsesRequest`:
-
-```rust
-#[serde(default)]
-pub tools: Option<Vec<serde_json::Value>>,
-#[serde(default)]
-pub tool_choice: Option<serde_json::Value>,
-```
-
-New output builder:
-
-```rust
-pub fn build_function_call_output(call_id: &str, name: &str, arguments: &str) -> Value
-```
-
-### Phase 3: Format Conversion + Dynamic Injection (`modify.rs`)
-
-New public struct:
-
-```rust
-pub struct ToolContext {
-    pub tools: Option<Vec<Value>>,          // Gemini functionDeclarations
-    pub tool_config: Option<Value>,         // Gemini toolConfig
-    pub pending_results: Vec<PendingToolResult>,  // Tool results to inject
-    pub last_calls: Vec<CapturedFunctionCall>,    // For history rewriting
-}
-```
-
-New conversion functions:
-
-```rust
-pub fn openai_tools_to_gemini(tools: &[Value]) -> Vec<Value>     // OAI → Gemini format
-pub fn openai_tool_choice_to_gemini(choice: &Value) -> Value     // OAI → Gemini toolConfig
-fn uppercase_types(val: Value) -> Value                          // Recursive type case fix
-```
-
-Change `modify_request` signature:
-
-```rust
-pub fn modify_request(body: &[u8], tool_ctx: Option<&ToolContext>) -> Option<Vec<u8>>
-```
-
-Tool injection logic:
-
-1. Strip all LS tools (existing)
-2. If `tool_ctx.tools` provided → inject as Gemini `functionDeclarations`
-3. If `tool_ctx.tool_config` provided → inject as `toolConfig`
-4. If `tool_ctx.pending_results` not empty → rewrite conversation history:
-   - Find model turn with "Tool call completed" → replace with `functionCall` parts
-   - Find last user turn → prepend `functionResponse` part
-
-### Phase 4: MITM Plumbing (`proxy.rs`)
-
-In `handle_http_over_tls`, before calling `modify_request`:
-
-1. Read `get_tools()`, `get_tool_config()`, `take_tool_results()`, `get_last_function_calls()` from store
-2. Build `ToolContext`
-3. Pass to `modify_request(body, tool_ctx)`
-
-After response capture:
-
-1. Save captured function calls as `last_function_calls` (for future history rewriting)
-
-### Phase 5: API Handler (`responses.rs`)
-
-#### Request handling (in `handle_responses`):
-
-1. If `body.tools` provided:
-   - Convert OpenAI → Gemini format via `openai_tools_to_gemini()`
-   - Store in `MitmStore` via `set_tools()`
-2. If `body.tool_choice` provided:
-   - Convert via `openai_tool_choice_to_gemini()`
-   - Store in `MitmStore` via `set_tool_config()`
-3. Check `body.input` for `function_call_output` items:
-   - If found: look up `call_id` → function name via `lookup_call_id()`
-   - Store as `PendingToolResult` via `add_tool_result()`
-   - Extract any accompanying text (or use placeholder)
-
-#### Response handling (in `handle_responses_sync` / `handle_responses_stream`):
-
-After polling completes:
-
-1. Check `take_any_function_calls()` for captured tool calls
-2. If captured:
-   - Generate `call_id` for each (e.g., `"call_" + random`)
-   - Register `call_id → name` mapping via `register_call_id()`
-   - Build `function_call` output items via `build_function_call_output()`
-   - Return these INSTEAD of the text message output
-3. If no tool calls: existing text response behavior
-
-### Phase 6: Gemini-Native Endpoint (`gemini.rs` + `mod.rs`)
-
-New file `src/api/gemini.rs` with handler `handle_gemini`:
-
-- Accepts tools in Gemini `functionDeclarations` format directly (no conversion)
-- Accepts `toolConfig` directly
-- Returns `functionCall` in Gemini format directly
-- Same cascade/session management as responses.rs
-- Much simpler — no format translation
-
-Route: `POST /v1/gemini` in `mod.rs`
-
----
-
-## File Change Summary
-
-| File                   | Changes                                                                 | Complexity |
-| ---------------------- | ----------------------------------------------------------------------- | ---------- |
-| `src/mitm/store.rs`    | Add tool context storage (5 new fields, ~10 methods)                    | Medium     |
-| `src/api/types.rs`     | Add `tools`/`tool_choice` to request, add output builder                | Low        |
-| `src/mitm/modify.rs`   | `ToolContext`, format conversion, dynamic injection, history rewrite    | High       |
-| `src/mitm/proxy.rs`    | Read store → build ToolContext → pass to modify                         | Low        |
-| `src/api/responses.rs` | Store tools, detect tool results in input, return function_call outputs | High       |
-| `src/api/gemini.rs`    | New file — Gemini-native endpoint (passthrough)                         | Medium     |
-| `src/api/mod.rs`       | Add route + module declaration                                          | Low        |
-
-## Implementation Order
-
-1. `store.rs` — foundation, no dependencies
-2. `types.rs` — request/response types
-3. `modify.rs` — format conversion + injection (depends on store types)
-4. `proxy.rs` — plumbing (depends on modify signature)
-5. Build + verify compilation
-6. `responses.rs` — handler changes (depends on all above)
-7. Build + test with `get_weather` request
-8. `gemini.rs` + `mod.rs` — Gemini endpoint
-9. Build + test with Gemini format
-10. Tool result flow test (multi-turn)
-
-## Testing Strategy
-
-### Test 1: Basic tool call (sync)
-
-```bash
-curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
-  "model": "gemini-3-flash",
-  "input": "What is the weather in Tokyo?",
-  "tools": [{"type":"function","function":{"name":"get_weather","description":"Get weather","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}],
-  "tool_choice": "auto",
-  "conversation": "tool-test",
-  "stream": false
-}'
-# Expected: output contains function_call with name=get_weather, arguments={"city":"Tokyo"}
-```
-
-### Test 2: Tool result submission (multi-turn)
-
-```bash
-curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
-  "model": "gemini-3-flash",
-  "input": [{"type":"function_call_output","call_id":"call_xxx","output":"{\"temp\":72,\"unit\":\"F\"}"}],
-  "conversation": "tool-test",
-  "stream": false
-}'
-# Expected: output contains text response using the tool result
-```
-
-### Test 3: Gemini-native endpoint
-
-```bash
-curl -s http://localhost:8741/v1/gemini -H "Content-Type: application/json" -d '{
-  "model": "gemini-3-flash",
-  "input": "What is the weather in Tokyo?",
-  "tools": [{"functionDeclarations":[{"name":"get_weather","description":"Get weather","parameters":{"type":"OBJECT","properties":{"city":{"type":"STRING"}},"required":["city"]}}]}],
-  "conversation": "gemini-tool-test",
-  "stream": false
-}'
-# Expected: response contains functionCall in Gemini format
-```
-
-### Test 4: No tools (regression)
-
-```bash
-curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
-  "model": "gemini-3-flash",
-  "input": "What is 2+2?",
-  "stream": false
-}'
-# Expected: normal text response, no tool call behavior
-```
-
-## Risks & Mitigations
-
-| Risk                                                             | Impact | Mitigation                                                                |
-| ---------------------------------------------------------------- | ------ | ------------------------------------------------------------------------- |
-| History rewriting breaks conversation                            | High   | Only rewrite when pending_results non-empty; keep original as fallback    |
-| LS times out waiting for Google response during tool result turn | Medium | Increase timeout for tool result turns                                    |
-| Multiple parallel tool calls create race conditions              | Medium | AtomicBool + sequential processing already handles this                   |
-| `modify_request` test breakage                                   | Low    | Update existing tests for new signature                                   |
-| Global tool storage conflicts across concurrent requests         | Medium | Not an issue — LS processes one request at a time (single cascade active) |
--- a/.gitignore
+++ b/.gitignore
@@ -7,3 +7,7 @@
 !README.txt
 test_output.json
 captured-request-*.json
+
+# Agent artifacts
+.gemini/plans/
+KNOWN_ISSUES.md
--- a/KNOWN_ISSUES.md
+++ b/KNOWN_ISSUES.md
@@ -1,117 +0,0 @@
-# Known Issues & Future Work
-
-All critical blockers have been resolved. Standalone LS with MITM interception
-is fully working. Reactive streaming is implemented with polling fallback.
-All three API endpoints (Responses, Completions, Gemini) now bypass the LS
-when custom tools are active, reading directly from MitmStore.
-
----
-
-## ✅ Resolved
-
-### ~~LS Go LLM Client Ignores System TLS Trust Store~~
-
-**Status: SOLVED (2026-02-14)**
-
-Previously the #1 blocker. The standalone LS (`--standalone` flag, now default)
-routes all LLM API traffic through the MITM proxy with full decryption.
-
-**Solution:**
-
-1. **UID-scoped iptables** — `scripts/mitm-redirect.sh` creates an `antigravity-ls`
-   system user. iptables redirects only that UID's port-443 traffic → MITM port.
-2. **Combined CA bundle** — The Go client honors `SSL_CERT_FILE` when set on
-   the standalone process. A combined bundle (system CAs + MITM CA) is written
-   to `/tmp/antigravity-mitm-combined-ca.pem`.
-3. **`sudo -u` spawning** — The proxy spawns the LS as the `antigravity-ls` user,
-   so only the standalone LS traffic is intercepted. No impact on other software.
-4. **Google SSE parsing** — MITM parses `streamGenerateContent?alt=sse` responses
-   and extracts `promptTokenCount`, `candidatesTokenCount`, `thoughtsTokenCount`.
-
-**Verified:** `/v1/usage` returns per-model token usage from intercepted traffic.
-
-### ~~Polling-Based Cascade Updates~~
-
-**Status: SOLVED (2026-02-14)**
-
-`StreamCascadeReactiveUpdates` is now used for real-time cascade state
-notifications. Falls back to timer-based polling if the streaming RPC is
-unavailable. Reactive diffs also carry progressive response text and thinking
-content (see `docs/panel-stream-investigation.md`).
-
-### ~~StreamCascadePanelReactiveUpdates — Dead End~~
-
-**Status: INVESTIGATED & CLOSED (2026-02-14)**
-
-`CascadePanelState` only contains `plan_status` and `user_settings` — not
-thinking text. The panel reactive component uses a workspace-scoped ID, not
-cascade IDs. See `docs/panel-stream-investigation.md`.
-
-### ~~Request Modification Not Implemented~~
-
-**Status: SOLVED (2026-02-15)**
-
-`MitmConfig.modify_requests` is now `true` by default. Used for:
-
-- Tool/function call injection into LS requests (Gemini `functionDeclarations`)
-- Tool result injection as `functionResponse` parts
-- LS bypass when custom tools are active (response captured directly from MITM)
-
-### ~~Cascade Correlation Is Heuristic~~
-
-**Status: SOLVED (2026-02-15)**
-
-Previously, MITM usage was keyed under `_latest` because `extract_cascade_hint()`
-couldn't parse the chunked-encoded Google SSE request body.
-
-**Fix:** API handlers now call `mitm_store.set_active_cascade(cascade_id)` before
-sending messages. `record_usage()` falls back to this active cascade ID when the
-heuristic hint is absent, properly correlating usage to cascades.
-
-### ~~Progressive Thinking Streaming~~
-
-**Status: SOLVED (2026-02-15)**
-
-Thinking text now streams progressively as delta events. The implementation:
-
-1. **LS cascade steps** — `plannerResponse.thinking` (field 3) grows progressively
-   as the LS receives data. For Opus 4.6, thinking text builds up word-by-word
-   over ~1-2s. For Gemini Flash, thinking arrives in 1-2 larger chunks.
-2. **Delta tracking** — `last_thinking_len` tracks the previously emitted length.
-   Each poll compares current thinking length and emits only the new characters
-   as `response.reasoning_summary_text.delta` events.
-3. **Lifecycle** — Structure events (`output_item.added`, `summary_part.added`)
-   emit on first thinking appearance. `done` events emit when response text
-   first appears (indicating thinking phase completed).
-
-**Verified with Opus 4.6:** (2026-02-15 13:22 UTC)
-
-```
-delta_len=24  "The user is asking about"
-delta_len=61  " the Collatz conjecture..."
-delta_len=5   " This"
-delta_len=10  " is a pure"
-... (11 progressive deltas over ~850ms)
-```
-
----
-
-## 🟢 Low
-
-### 1. MITM Integration Tests
-
-Unit tests cover protobuf decoding and intercept parsing (18 tests pass).
-Integration tests for the full MITM pipeline (TLS interception, response
-parsing, usage recording) would be valuable now that interception works.
-
-### 2. MITM for Main Antigravity Session
-
-The current MITM only works for the standalone LS (default mode).
-Intercepting the main Antigravity session's LS is harder because:
-
-- The main LS is managed by the Antigravity app, not by us
-- UID-scoped iptables can't target it without affecting all user traffic
-- The `mitm-wrapper.sh` approach sets env vars but the LLM client ignores
-  `HTTPS_PROXY` unless `detect_and_use_proxy` is ENABLED via init metadata
-
-**Workaround:** Use standalone mode (default) for all proxy traffic.