From f3fd203a53cca90ffe0376246712b33110d8afc7 Mon Sep 17 00:00:00 2001
From: Nikketryhard
Date: Sat, 14 Feb 2026 16:02:01 -0600
Subject: [PATCH] chore: rewrite KNOWN_ISSUES with investigation verdicts and confidence levels

---
 KNOWN_ISSUES.md | 176 ++++++++++++++++++++++++++++--------------------
 1 file changed, 104 insertions(+), 72 deletions(-)

diff --git a/KNOWN_ISSUES.md b/KNOWN_ISSUES.md
index b91c9e6..a4b6875 100644
--- a/KNOWN_ISSUES.md
+++ b/KNOWN_ISSUES.md
@@ -1,97 +1,129 @@
 # Known Issues & Future Work
 
----
-
-## Medium
-
-### 1. Cascade Correlation Is Heuristic
-
-**File:** `src/mitm/intercept.rs` — `extract_cascade_hint()`
-
-The MITM proxy matches intercepted API traffic to cascade IDs by scanning for `metadata.user_id` or `workspace_id` in the request body. If neither is found, it stores under `_latest`. Since `take_usage()` no longer falls back to `_latest`, unidentified requests will have **no MITM usage data at all**.
-
-**Fix:** Investigate the actual request body format the LS sends for better correlation keys. Alternatively, use timing-based correlation (match MITM capture timestamp to cascade polling window).
+All fixable issues from the original report have been resolved. The remaining
+items require either architectural changes, new features, or deep investigation
+of the Go language server binary.
 
 ---
 
-### 2. Request Modification Not Implemented
+## 🔴 Blockers (Require Deep Investigation)
 
-**File:** `src/mitm/proxy.rs` — `modify_requests: false`
-
-The `MitmConfig.modify_requests` flag exists and is plumbed through, but no actual modification logic is implemented. The flag is hardcoded to `false`.
-
-**Fix:** When needed, implement request body mutation in `handle_http_over_tls()` — parse JSON, modify, reserialize, update `Content-Length`.
-
----
-
-### 3. Polling-Based Cascade Updates vs Streaming RPC
-
-**File:** `src/api/polling.rs`
-
-We poll `GetCascadeTrajectorySteps` on a timer to check for new cascade output. The LS has a `StreamCascadeReactiveUpdates` streaming gRPC method that pushes updates in real-time. Our polling approach works but adds latency and unnecessary requests.
-
-**Impact:** Functional but suboptimal. The streaming approach would give lower latency and less LS load, but requires maintaining a long-lived gRPC stream and handling reconnection.
-
-**See:** `docs/ls-binary-analysis.md` → gRPC Services → LanguageServerService
-
----
-
-### 4. No BYOK Model Routing
-
-**File:** `src/api/models.rs`
-
-The LS supports BYOK (Bring Your Own Key) variants for Claude and OpenAI models (e.g., `MODEL_CLAUDE_4_SONNET_BYOK`, `MODEL_OPENAI_COMPATIBLE`). Our proxy only exposes the 5 built-in placeholder models. Users with BYOK keys can't use them through the proxy.
-
-**Fix:** Add a mechanism to register BYOK models at runtime (e.g., via a config file or API endpoint). The BYOK model IDs and their proto enum numbers are documented in `docs/ls-binary-analysis.md`.
-
----
-
-## 🟢 Low
-
-### 5. No Integration Tests for MITM Module
-
-The MITM module has unit tests for protobuf decoding and intercept parsing, but no integration tests that verify:
-
--- TLS interception end-to-end with the generated CA
--- Full HTTP/1.1 request/response cycle through the proxy
--- gRPC (HTTP/2) request/response cycle through `h2_handler`
--- Store recording and retrieval under concurrency
--- Wrapper script install/uninstall lifecycle
-
----
-
-## Blockers
-
-### 6. LS Go LLM Client Ignores System TLS Trust Store
+### 1. LS Go LLM Client Ignores System TLS Trust Store
 
 **File:** `docs/mitm-interception-status.md`
 
-The LS binary is a Go program whose HTTP client for LLM API calls uses a custom `tls.Config` that does **not** trust system CAs or honor `SSL_CERT_FILE`. This means our MITM proxy's generated CA cert is rejected even when properly installed system-wide.
+The LS binary's Go HTTP client for LLM API calls uses a custom `tls.Config` that
+does **not** trust system CAs or honor `SSL_CERT_FILE`. Our MITM proxy can route
+traffic but not decrypt it.
 
-The extension patch (`detectAndUseProxy=1`) only makes the LS honor `HTTPS_PROXY` for routing — it doesn't fix CA trust. Without this, the MITM proxy can route but not decrypt LLM traffic.
+**Investigation status:** All practical approaches have been tried and failed:
 
-**Potential fixes:**
+- iptables REDIRECT → redirect loop + broke all HTTPS traffic
+- DNS redirect → same TLS trust failure
+- LD_PRELOAD → Go doesn't use libc for syscalls
+- SSLKEYLOGFILE → Go doesn't support it
 
--- Binary patching the Go TLS verification (hard, breaks on updates)
--- Full standalone LS control (in progress, see issue #7)
--- Network namespace + iptables redirect (eliminates HTTPS_PROXY need but doesn't fix TLS trust)
--- eBPF/ptrace to inject certs at runtime (complex)
+**Remaining options (untried):**
+
+- Binary patching Go TLS verification (fragile, breaks on updates)
+- Full standalone LS control (see issue #2)
+- eBPF/ptrace syscall interception (complex)
+- Network namespace isolation (complex setup)
+
+**Confidence: <30%** — all easy paths exhausted. Requires reverse engineering the Go binary's TLS setup.
 
 **See:** `docs/mitm-interception-status.md` for full analysis
 
 ---
 
-### 7. Standalone LS Cascades Silently Fail
+### 2. Standalone LS Cascades Silently Fail
 
 **File:** `docs/standalone-ls-todo.md`
 
-When running a standalone LS instance (outside of Antigravity), cascades start but produce no output. The LS accepts `StartCascade` RPCs without error, but the cascade never progresses.
+Standalone LS (outside Antigravity) accepts `StartCascade` RPCs without error
+but cascade never progresses. No output.
 
 **Suspected blockers:**
 
-- Missing auth context (OAuth token not properly propagated)
-- Unleash feature flags differ between main and standalone instances (`GetUnleashData` returns different flags)
-- `LoadCodeAssist` / `OnboardUser` initialization steps may be required
-- Extension server callbacks (`WriteCascadeEdit`, `ExecuteCommand`, etc.) have no handler
+- Missing auth context (OAuth token propagation)
+- Different Unleash feature flags between main and standalone instances
+- Missing initialization steps (`LoadCodeAssist`, `OnboardUser`)
+- Missing extension server callbacks (`WriteCascadeEdit`, `ExecuteCommand`)
+
+**Confidence: <30%** — too many unknowns. Needs systematic debugging with the standalone LS.
 
 **See:** `docs/standalone-ls-todo.md` for investigation plan
+
+---
+
+## Medium (Architecture / Future Work)
+
+### 3. Cascade Correlation Is Heuristic
+
+**File:** `src/mitm/intercept.rs` — `extract_cascade_hint()`
+
+The MITM proxy matches intercepted API traffic to cascade IDs heuristically:
+
+- HTTP/1.1 path: scans JSON body for `metadata.user_id` or `workspace_id`
+- gRPC/H2 path: recursively searches proto fields for UUID strings
+
+If neither method finds a match, usage is stored under `_latest` but never
+consumed (since `take_usage()` requires exact cascade ID match).
+
+**Confidence: <50%** — can't test without working MITM interception (blocked by issue #1). The heuristic is reasonable but unverified against real traffic.
+
+---
+
+### 4. Request Modification Not Implemented
+
+**File:** `src/mitm/proxy.rs` — `modify_requests: bool`
+
+The `MitmConfig.modify_requests` flag is plumbed through the entire call chain
+but hardcoded to `false`. No modification logic exists. This is intentional
+scaffolding for future use.
+
+**Status:** Not a bug — reserved for potential request mutation features.
+
+---
+
+### 5. Polling-Based Cascade Updates vs Streaming RPC
+
+**File:** `src/api/polling.rs`
+
+We poll `GetCascadeTrajectorySteps` on a timer. The LS has a
+`StreamCascadeReactiveUpdates` streaming gRPC method that pushes updates
+in real-time. Polling works but adds latency.
+
+**Status:** Functional but suboptimal. Switching to streaming requires
+implementing a gRPC streaming client with reconnection handling. Not blocking.
+
+---
+
+### 6. No BYOK Model Routing
+
+**File:** `src/api/models.rs`
+
+The LS supports BYOK (Bring Your Own Key) models (e.g., `MODEL_CLAUDE_4_SONNET_BYOK`,
+`MODEL_OPENAI_COMPATIBLE`). Our proxy only exposes the 5 built-in placeholder
+models.
+
+**Status:** Feature request. Would need a runtime model registration mechanism.
+Proto enum numbers are documented in `docs/ls-binary-analysis.md`.
+
+---
+
+## 🟢 Low
+
+### 7. No Integration Tests for MITM Module
+
+Unit tests cover protobuf decoding and intercept parsing (17 tests pass), but
+no integration tests for:
+
+- TLS interception end-to-end with the generated CA
+- Full HTTP/1.1 request/response cycle through the proxy
+- gRPC (HTTP/2) request/response cycle through `h2_handler`
+- Store recording and retrieval under concurrency
+
+**Status:** The MITM can't intercept real traffic anyway (blocked by issue #1),
+so integration tests would be somewhat hypothetical. Worth adding when the TLS
+blocker is resolved.