fix: resolve cascade correlation, update KNOWN_ISSUES

- MitmStore: added active_cascade_id field with set/get/clear methods
- record_usage() now falls back to active_cascade_id when the heuristic
  cascade hint is absent (fixes usage always going to _latest)
- All three API handlers set active cascade before send_message
- KNOWN_ISSUES: moved 3 issues to resolved:
  - Request modification (already true, was stale entry)
  - Cascade correlation (fixed via active_cascade_id)
  - Progressive thinking streaming (fixed via MITM bypass)
2026-02-15 01:10:34 -06:00
parent b3af73cebd
commit 981fb3b18d
5 changed files with 59 additions and 26 deletions
--- a/KNOWN_ISSUES.md
+++ b/KNOWN_ISSUES.md
@@ -2,6 +2,8 @@
 All critical blockers have been resolved. Standalone LS with MITM interception
 is fully working. Reactive streaming is implemented with polling fallback.
+All three API endpoints (Responses, Completions, Gemini) now bypass the LS
+when custom tools are active, reading directly from MitmStore.
 ---
@@ -45,43 +47,48 @@ content (see `docs/panel-stream-investigation.md`).
 thinking text. The panel reactive component uses a workspace-scoped ID, not
 cascade IDs. See `docs/panel-stream-investigation.md`.
----
-## 🟡 Medium (Architecture / Future Work)
-### 1. Cascade Correlation Is Heuristic
-**File:** `src/mitm/intercept.rs` — `extract_cascade_hint()`
-The MITM proxy matches intercepted API traffic to cascade IDs heuristically.
-Currently all intercepted usage is stored under `_latest` because the Google
-SSE request body is empty (`content_length=0` — the LS sends the request body
-via chunked encoding that isn't captured in the hint extractor).
-**Impact:** Usage shows up in `/v1/usage` aggregate stats but isn't correlated
-to specific cascades. Not blocking — aggregate usage is the primary use case.
----
-### 2. Request Modification Not Implemented
-**File:** `src/mitm/proxy.rs` — `modify_requests: bool`
-The `MitmConfig.modify_requests` flag is plumbed through but hardcoded to `false`.
-Reserved for future request mutation features (e.g., injecting custom system
-prompts, modifying model selection).
+### ~~Request Modification Not Implemented~~
+**Status: SOLVED (2026-02-15)**
+`MitmConfig.modify_requests` is now `true` by default. Used for:
+- Tool/function call injection into LS requests (Gemini `functionDeclarations`)
+- Tool result injection as `functionResponse` parts
+- LS bypass when custom tools are active (response captured directly from MITM)
+### ~~Cascade Correlation Is Heuristic~~
+**Status: SOLVED (2026-02-15)**
+Previously, MITM usage was keyed under `_latest` because `extract_cascade_hint()`
+couldn't parse the chunked-encoded Google SSE request body.
+**Fix:** API handlers now call `mitm_store.set_active_cascade(cascade_id)` before
+sending messages. `record_usage()` falls back to this active cascade ID when the
+heuristic hint is absent, properly correlating usage to cascades.
+### ~~Progressive Thinking Streaming~~
+**Status: SOLVED (2026-02-15)**
+
+The MITM proxy now captures `thinking_text` from `StreamingAccumulator` into
+`MitmStore` as SSE chunks arrive. The Responses API streaming handler reads
+thinking deltas from MitmStore and emits `response.reasoning_summary_text.delta`
+events in real-time. This works for both Google (`thought: true` parts) and
+Anthropic (`thinking_delta`) formats.
 ---
 ## 🟢 Low
-### 3. MITM Integration Tests
+### 1. MITM Integration Tests
 Unit tests cover protobuf decoding and intercept parsing (18 tests pass).
 Integration tests for the full MITM pipeline (TLS interception, response
 parsing, usage recording) would be valuable now that interception works.
-### 4. MITM for Main Antigravity Session
+### 2. MITM for Main Antigravity Session
 The current MITM only works for the standalone LS (default mode).
 Intercepting the main Antigravity session's LS is harder because:
@@ -92,10 +99,3 @@ Intercepting the main Antigravity session's LS is harder because:
  `HTTPS_PROXY` unless `detect_and_use_proxy` is ENABLED via init metadata
 **Workaround:** Use standalone mode (default) for all proxy traffic.
-### 5. Progressive Thinking Streaming
-For extended-thinking models (Opus), thinking text may arrive progressively
-across multiple reactive diffs. Currently thinking is captured atomically via
-polling. Progressive streaming would require parsing reactive diff field numbers
-to extract incremental thinking deltas. See `docs/panel-stream-investigation.md`.
--- a/src/api/completions.rs
+++ b/src/api/completions.rs
@@ -207,6 +207,7 @@ pub(crate) async fn handle_completions(
    };
    // Send message
+    state.mitm_store.set_active_cascade(&cascade_id).await;
    match state
        .backend
        .send_message(&cascade_id, &user_text, model.model_enum)
--- a/src/api/gemini.rs
+++ b/src/api/gemini.rs
@@ -155,6 +155,7 @@ pub(crate) async fn handle_gemini(
    };
    // Send message
+    state.mitm_store.set_active_cascade(&cascade_id).await;
    match state
        .backend
        .send_message(&cascade_id, &user_text, model.model_enum)
--- a/src/api/responses.rs
+++ b/src/api/responses.rs
@@ -278,6 +278,7 @@ pub(crate) async fn handle_responses(
    };
    // Send message
+    state.mitm_store.set_active_cascade(&cascade_id).await;
    match state
        .backend
        .send_message(&cascade_id, &user_text, model.model_enum)
--- a/src/mitm/store.rs
+++ b/src/mitm/store.rs
@@ -89,6 +89,11 @@ pub struct MitmStore {
    /// Last captured function calls (for conversation history rewriting).
    last_function_calls: Arc<RwLock<Vec<CapturedFunctionCall>>>,
+    // ── Cascade correlation ──────────────────────────────────────────────
+    /// Active cascade ID set by the API layer before sending a message.
+    /// Used by the MITM proxy to correlate intercepted traffic to cascades.
+    active_cascade_id: Arc<RwLock<Option<String>>>,
    // ── Direct response capture (bypasses LS) ────────────────────────────
    /// Captured response text from MITM when custom tools are active.
    /// The completions/responses handler reads this instead of polling LS steps.
@@ -135,6 +140,7 @@ impl MitmStore {
            pending_tool_results: Arc::new(RwLock::new(Vec::new())),
            call_id_to_name: Arc::new(RwLock::new(HashMap::new())),
            last_function_calls: Arc::new(RwLock::new(Vec::new())),
+            active_cascade_id: Arc::new(RwLock::new(None)),
            captured_response_text: Arc::new(RwLock::new(None)),
            captured_thinking_text: Arc::new(RwLock::new(None)),
            response_complete: Arc::new(AtomicBool::new(false)),
@@ -186,7 +192,13 @@ impl MitmStore {
        //   Call 2: thinking summary text (thinking_output_tokens == 0, response_text has the summary)
        //
        // When Call 2 arrives, we merge its response_text as thinking_text into Call 1's usage.
-        let key = cascade_id.map(|s| s.to_string()).unwrap_or_else(|| "_latest".to_string());
+        let key = if let Some(cid) = cascade_id {
+            cid.to_string()
+        } else if let Some(active) = self.active_cascade_id.read().await.as_ref() {
+            active.clone()
+        } else {
+            "_latest".to_string()
+        };
        let mut latest = self.latest_usage.write().await;
        if let Some(existing) = latest.get_mut(&key) {
@@ -436,4 +448,22 @@ impl MitmStore {
    pub async fn take_thinking_text(&self) -> Option<String> {
        self.captured_thinking_text.write().await.take()
    }
+    // ── Cascade correlation ──────────────────────────────────────────────
+    /// Set the active cascade ID (called by API handlers before sending a message).
+    /// The MITM proxy will use this to correlate intercepted traffic.
+    pub async fn set_active_cascade(&self, cascade_id: &str) {
+        *self.active_cascade_id.write().await = Some(cascade_id.to_string());
+    }
+    /// Get the active cascade ID.
+    pub async fn get_active_cascade(&self) -> Option<String> {
+        self.active_cascade_id.read().await.clone()
+    }
+    /// Clear the active cascade ID (called after response is complete).
+    pub async fn clear_active_cascade(&self) {
+        *self.active_cascade_id.write().await = None;
+    }
 }
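
The key resolution that `record_usage()` performs after this change can be sketched in isolation. This is a minimal, synchronous illustration and not the project's code: `CascadeSlot` and `resolve_key` are hypothetical names, and `std::sync::RwLock` stands in for the async lock used by `MitmStore`. Only the precedence (explicit hint, then active cascade ID, then `_latest`) is taken from the diff above.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the cascade-correlation part of `MitmStore`.
struct CascadeSlot {
    active_cascade_id: RwLock<Option<String>>,
}

impl CascadeSlot {
    fn new() -> Self {
        Self { active_cascade_id: RwLock::new(None) }
    }

    /// Mirrors `set_active_cascade`: overwrites whatever ID was stored before.
    fn set_active_cascade(&self, cascade_id: &str) {
        *self.active_cascade_id.write().unwrap() = Some(cascade_id.to_string());
    }

    /// Mirrors `clear_active_cascade`.
    fn clear_active_cascade(&self) {
        *self.active_cascade_id.write().unwrap() = None;
    }

    /// Hypothetical helper with the same precedence as the patched
    /// `record_usage()`: explicit hint, then active ID, then "_latest".
    fn resolve_key(&self, hint: Option<&str>) -> String {
        if let Some(cid) = hint {
            return cid.to_string();
        }
        self.active_cascade_id
            .read()
            .unwrap()
            .clone()
            .unwrap_or_else(|| "_latest".to_string())
    }
}

fn main() {
    let slot = CascadeSlot::new();

    // No hint and no active cascade: usage is keyed under "_latest".
    assert_eq!(slot.resolve_key(None), "_latest");

    // A handler sets the active cascade before sending a message.
    slot.set_active_cascade("cascade-a");
    assert_eq!(slot.resolve_key(None), "cascade-a");

    // An explicit hint still takes priority over the active cascade.
    assert_eq!(slot.resolve_key(Some("cascade-b")), "cascade-b");

    // After clearing, resolution falls back to "_latest" again.
    slot.clear_active_cascade();
    assert_eq!(slot.resolve_key(None), "_latest");
}
```

Because the slot holds a single ID, the most recent `set_active_cascade` call wins. If handlers can run concurrently against the same store, usage recorded without a hint would presumably be attributed to whichever cascade was set last.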