fix: resolve cascade correlation, update KNOWN_ISSUES

- MitmStore: added active_cascade_id field with set/get/clear methods
- record_usage() now falls back to active_cascade_id when the heuristic
  cascade hint is absent (fixes usage always going to _latest)
- All three API handlers set active cascade before send_message
- KNOWN_ISSUES: moved 3 issues to resolved:
  - Request modification (already true, was stale entry)
  - Cascade correlation (fixed via active_cascade_id)
  - Progressive thinking streaming (fixed via MITM bypass)
2026-02-15 01:10:34 -06:00
parent b3af73cebd
commit 981fb3b18d
5 changed files with 59 additions and 26 deletions
--- a/KNOWN_ISSUES.md
+++ b/KNOWN_ISSUES.md
@@ -2,6 +2,8 @@

 All critical blockers have been resolved. Standalone LS with MITM interception
 is fully working. Reactive streaming is implemented with polling fallback.
+All three API endpoints (Responses, Completions, Gemini) now bypass the LS
+when custom tools are active, reading directly from MitmStore.

 ---

@@ -45,43 +47,48 @@ content (see `docs/panel-stream-investigation.md`).
 thinking text. The panel reactive component uses a workspace-scoped ID, not
 cascade IDs. See `docs/panel-stream-investigation.md`.

---
+### ~~Request Modification Not Implemented~~

-## 🟡 Medium (Architecture / Future Work)
+**Status: SOLVED (2026-02-15)**

-### 1. Cascade Correlation Is Heuristic
+`MitmConfig.modify_requests` is now `true` by default. Used for:

-**File:** `src/mitm/intercept.rs` — `extract_cascade_hint()`
+- Tool/function call injection into LS requests (Gemini `functionDeclarations`)
+- Tool result injection as `functionResponse` parts
+- LS bypass when custom tools are active (response captured directly from MITM)

-The MITM proxy matches intercepted API traffic to cascade IDs heuristically.
-Currently all intercepted usage is stored under `_latest` because the Google
-SSE request body is empty (`content_length=0` — the LS sends the request body
-via chunked encoding that isn't captured in the hint extractor).
+### ~~Cascade Correlation Is Heuristic~~

-**Impact:** Usage shows up in `/v1/usage` aggregate stats but isn't correlated
-to specific cascades. Not blocking — aggregate usage is the primary use case.
+**Status: SOLVED (2026-02-15)**

---
+Previously, MITM usage was keyed under `_latest` because `extract_cascade_hint()`
+couldn't parse the chunked-encoded Google SSE request body.

-### 2. Request Modification Not Implemented
+**Fix:** API handlers now call `mitm_store.set_active_cascade(cascade_id)` before
+sending messages. `record_usage()` falls back to this active cascade ID when the
+heuristic hint is absent, properly correlating usage to cascades.

-**File:** `src/mitm/proxy.rs` — `modify_requests: bool`
+### ~~Progressive Thinking Streaming~~

-The `MitmConfig.modify_requests` flag is plumbed through but hardcoded to `false`.
-Reserved for future request mutation features (e.g., injecting custom system
-prompts, modifying model selection).
+**Status: SOLVED (2026-02-15)**
+
+The MITM proxy now captures `thinking_text` from `StreamingAccumulator` into
+`MitmStore` as SSE chunks arrive. The Responses API streaming handler reads
+thinking deltas from MitmStore and emits `response.reasoning_summary_text.delta`
+events in real-time. This works for both Google (`thought: true` parts) and
+Anthropic (`thinking_delta`) formats.

 ---

 ## 🟢 Low

-### 3. MITM Integration Tests
+### 1. MITM Integration Tests

 Unit tests cover protobuf decoding and intercept parsing (18 tests pass).
 Integration tests for the full MITM pipeline (TLS interception, response
 parsing, usage recording) would be valuable now that interception works.

-### 4. MITM for Main Antigravity Session
+### 2. MITM for Main Antigravity Session

 The current MITM only works for the standalone LS (default mode).
 Intercepting the main Antigravity session's LS is harder because:
@@ -92,10 +99,3 @@ Intercepting the main Antigravity session's LS is harder because:
  `HTTPS_PROXY` unless `detect_and_use_proxy` is ENABLED via init metadata

 **Workaround:** Use standalone mode (default) for all proxy traffic.
-
-### 5. Progressive Thinking Streaming
-
-For extended-thinking models (Opus), thinking text may arrive progressively
-across multiple reactive diffs. Currently thinking is captured atomically via
-polling. Progressive streaming would require parsing reactive diff field numbers
-to extract incremental thinking deltas. See `docs/panel-stream-investigation.md`.
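
The fallback described in the commit message and in the resolved "Cascade Correlation" entry can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: only `MitmStore`, `active_cascade_id`, `set_active_cascade`, `record_usage`, and the `_latest` key come from the text above. The `Usage` struct, the `get_active_cascade`/`clear_active_cascade` and `usage_for` names, the use of `Mutex`, and every signature are assumptions made for the example.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Key used when neither a hint nor an active cascade is available.
const LATEST_KEY: &str = "_latest";

/// Hypothetical usage record; the real fields are not shown in the diff.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
}

/// Simplified stand-in for the store (assumed layout).
#[derive(Default)]
pub struct MitmStore {
    active_cascade_id: Mutex<Option<String>>,
    usage: Mutex<HashMap<String, Usage>>,
}

impl MitmStore {
    /// Called by an API handler before it sends a message.
    pub fn set_active_cascade(&self, id: &str) {
        *self.active_cascade_id.lock().unwrap() = Some(id.to_string());
    }

    pub fn get_active_cascade(&self) -> Option<String> {
        self.active_cascade_id.lock().unwrap().clone()
    }

    pub fn clear_active_cascade(&self) {
        *self.active_cascade_id.lock().unwrap() = None;
    }

    /// Resolves the storage key in priority order:
    /// heuristic hint, then active cascade, then `_latest`.
    /// Returns the key the usage was accumulated under.
    pub fn record_usage(&self, hint: Option<&str>, usage: Usage) -> String {
        let key = hint
            .map(str::to_string)
            .or_else(|| self.get_active_cascade())
            .unwrap_or_else(|| LATEST_KEY.to_string());
        let mut map = self.usage.lock().unwrap();
        let entry = map.entry(key.clone()).or_default();
        entry.input_tokens += usage.input_tokens;
        entry.output_tokens += usage.output_tokens;
        key
    }

    /// Looks up accumulated usage for one cascade.
    pub fn usage_for(&self, id: &str) -> Option<Usage> {
        self.usage.lock().unwrap().get(id).cloned()
    }
}
```

In this sketch a handler would call `set_active_cascade(cascade_id)` before sending a message, so an intercepted response that arrives without a hint is still attributed to that cascade rather than to `_latest`. A single shared active ID assumes one cascade is in flight at a time. Whether the real store relies on that assumption is not stated in the diff.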