chore: rewrite KNOWN_ISSUES with investigation verdicts and confidence levels
KNOWN_ISSUES.md
@@ -1,97 +1,129 @@
# Known Issues & Future Work

---

## Medium

### 1. Cascade Correlation Is Heuristic

**File:** `src/mitm/intercept.rs` (`extract_cascade_hint()`)

The MITM proxy matches intercepted API traffic to cascade IDs by scanning the request body for `metadata.user_id` or `workspace_id`. If neither is found, it stores the usage under `_latest`. Since `take_usage()` no longer falls back to `_latest`, unidentified requests will have **no MITM usage data at all**.

**Fix:** Investigate the actual request body format the LS sends to find better correlation keys. Alternatively, use timing-based correlation (match the MITM capture timestamp to the cascade polling window).
All fixable issues from the original report have been resolved. The remaining
items require either architectural changes, new features, or deep investigation
of the Go language server binary.

---

### 2. Request Modification Not Implemented
## 🔴 Blockers (Require Deep Investigation)

**File:** `src/mitm/proxy.rs` (`modify_requests: false`)

The `MitmConfig.modify_requests` flag exists and is plumbed through, but no actual modification logic is implemented. The flag is hardcoded to `false`.

**Fix:** When needed, implement request body mutation in `handle_http_over_tls()`: parse the JSON, modify it, reserialize it, and update `Content-Length`.
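
The last step of that fix, keeping `Content-Length` consistent after the body changes, is easy to get wrong. The sketch below is a minimal illustration in standard-library Rust and is not the project's code. `replace_body` is a hypothetical helper; it assumes a complete HTTP/1.1 request held in one buffer with no chunked transfer encoding, and it takes the already-modified body as bytes, since the JSON parse and reserialize steps would need a JSON crate.

```rust
/// Hypothetical helper: swaps the body of a raw HTTP/1.1 request and rewrites
/// (or adds) the Content-Length header so it matches the new body.
/// Assumes the whole request is in `raw` and is not chunk-encoded.
fn replace_body(raw: &[u8], new_body: &[u8]) -> Option<Vec<u8>> {
    // Locate the blank line that terminates the header block.
    let split = raw.windows(4).position(|w| w == b"\r\n\r\n")?;
    let head = std::str::from_utf8(&raw[..split]).ok()?;

    let mut out = Vec::with_capacity(raw.len() + new_body.len());
    let mut saw_length = false;
    for line in head.split("\r\n") {
        // Header names are case-insensitive.
        let is_length = line
            .split_once(':')
            .map_or(false, |(name, _)| name.trim().eq_ignore_ascii_case("content-length"));
        if is_length {
            saw_length = true;
            out.extend_from_slice(format!("Content-Length: {}", new_body.len()).as_bytes());
        } else {
            out.extend_from_slice(line.as_bytes());
        }
        out.extend_from_slice(b"\r\n");
    }
    if !saw_length {
        // No length header in the original request: add one.
        out.extend_from_slice(format!("Content-Length: {}\r\n", new_body.len()).as_bytes());
    }
    out.extend_from_slice(b"\r\n"); // end of headers
    out.extend_from_slice(new_body);
    Some(out)
}
```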

---

### 3. Polling-Based Cascade Updates vs Streaming RPC

**File:** `src/api/polling.rs`

We poll `GetCascadeTrajectorySteps` on a timer to check for new cascade output. The LS has a `StreamCascadeReactiveUpdates` streaming gRPC method that pushes updates in real time. Our polling approach works but adds latency and unnecessary requests.

**Impact:** Functional but suboptimal. The streaming approach would give lower latency and less LS load, but requires maintaining a long-lived gRPC stream and handling reconnection.

**See:** `docs/ls-binary-analysis.md` → gRPC Services → LanguageServerService

---

### 4. No BYOK Model Routing

**File:** `src/api/models.rs`

The LS supports BYOK (Bring Your Own Key) variants for Claude and OpenAI models (e.g., `MODEL_CLAUDE_4_SONNET_BYOK`, `MODEL_OPENAI_COMPATIBLE`). Our proxy only exposes the 5 built-in placeholder models, so users with BYOK keys can't use them through the proxy.

**Fix:** Add a mechanism to register BYOK models at runtime (e.g., via a config file or API endpoint). The BYOK model IDs and their proto enum numbers are documented in `docs/ls-binary-analysis.md`.

---

## 🟢 Low

### 5. No Integration Tests for MITM Module

The MITM module has unit tests for protobuf decoding and intercept parsing, but no integration tests that verify:

- TLS interception end-to-end with the generated CA
- Full HTTP/1.1 request/response cycle through the proxy
- gRPC (HTTP/2) request/response cycle through `h2_handler`
- Store recording and retrieval under concurrency
- Wrapper script install/uninstall lifecycle

---

## Blockers

### 6. LS Go LLM Client Ignores System TLS Trust Store
### 1. LS Go LLM Client Ignores System TLS Trust Store

**File:** `docs/mitm-interception-status.md`

The LS binary is a Go program whose HTTP client for LLM API calls uses a custom `tls.Config` that does **not** trust system CAs or honor `SSL_CERT_FILE`. This means our MITM proxy's generated CA cert is rejected even when properly installed system-wide.
The LS binary's Go HTTP client for LLM API calls uses a custom `tls.Config` that
does **not** trust system CAs or honor `SSL_CERT_FILE`. Our MITM proxy can route
traffic but not decrypt it.

The extension patch (`detectAndUseProxy=1`) only makes the LS honor `HTTPS_PROXY` for routing; it does not fix CA trust. Without a trusted CA, the MITM proxy can route but not decrypt LLM traffic.
**Investigation status:** All practical approaches have been tried and failed:

**Potential fixes:**
- iptables REDIRECT → caused a redirect loop and broke all HTTPS traffic
- DNS redirect → same TLS trust failure
- LD_PRELOAD → ineffective, because Go does not go through libc for syscalls
- SSLKEYLOGFILE → Go ignores the environment variable; key logging requires `tls.Config.KeyLogWriter` to be set in code

- Binary patching the Go TLS verification (hard, breaks on updates)
- Full standalone LS control (in progress, see issue #7)
- Network namespace + iptables redirect (eliminates HTTPS_PROXY need but doesn't fix TLS trust)
- eBPF/ptrace to inject certs at runtime (complex)
**Remaining options (untried):**

- Binary patching Go TLS verification (fragile, breaks on updates)
- Full standalone LS control (see issue #2)
- eBPF/ptrace syscall interception (complex)
- Network namespace isolation (complex setup)

**Confidence: <30%.** All easy paths are exhausted. Requires reverse engineering the Go binary's TLS setup.

**See:** `docs/mitm-interception-status.md` for full analysis

---

### 7. Standalone LS Cascades Silently Fail
### 2. Standalone LS Cascades Silently Fail

**File:** `docs/standalone-ls-todo.md`

When running a standalone LS instance (outside of Antigravity), cascades start but produce no output. The LS accepts `StartCascade` RPCs without error, but the cascade never progresses.
Standalone LS (outside Antigravity) accepts `StartCascade` RPCs without error,
but the cascade never progresses and produces no output.

**Suspected blockers:**

- Missing auth context (OAuth token not properly propagated)
- Unleash feature flags differ between main and standalone instances (`GetUnleashData` returns different flags)
- `LoadCodeAssist` / `OnboardUser` initialization steps may be required
- Extension server callbacks (`WriteCascadeEdit`, `ExecuteCommand`, etc.) have no handler
- Missing auth context (OAuth token propagation)
- Different Unleash feature flags between main and standalone instances
- Missing initialization steps (`LoadCodeAssist`, `OnboardUser`)
- Missing extension server callbacks (`WriteCascadeEdit`, `ExecuteCommand`)

**Confidence: <30%.** Too many unknowns. Needs systematic debugging with the standalone LS.

**See:** `docs/standalone-ls-todo.md` for investigation plan

---

## Medium (Architecture / Future Work)

### 3. Cascade Correlation Is Heuristic

**File:** `src/mitm/intercept.rs` (`extract_cascade_hint()`)

The MITM proxy matches intercepted API traffic to cascade IDs heuristically:

- HTTP/1.1 path: scans JSON body for `metadata.user_id` or `workspace_id`
- gRPC/H2 path: recursively searches proto fields for UUID strings

If neither method finds a match, usage is stored under `_latest` but never
consumed (since `take_usage()` requires an exact cascade ID match).
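
To make the failure mode concrete, here is a minimal sketch of the lookup behavior described above, under stated assumptions. `UsageStore`, `record`, `find_uuid`, and the single token counter are hypothetical stand-ins, and the UUID scan over raw text simplifies both the JSON key lookup and the recursive proto search. Only `take_usage()` and the `_latest` key come from the description. Usage recorded without a recognizable ID lands under `_latest`, and an exact-match `take_usage()` never returns it.

```rust
use std::collections::HashMap;

/// Returns true if `s` has the canonical 8-4-4-4-12 hex UUID shape.
fn looks_like_uuid(s: &str) -> bool {
    let b = s.as_bytes();
    if b.len() != 36 {
        return false;
    }
    b.iter().enumerate().all(|(i, c)| match i {
        8 | 13 | 18 | 23 => *c == b'-',
        _ => c.is_ascii_hexdigit(),
    })
}

/// Scans `text` for the first UUID-shaped substring (simplified heuristic).
fn find_uuid(text: &str) -> Option<&str> {
    if text.len() < 36 {
        return None;
    }
    (0..=text.len() - 36)
        // Only slice on character boundaries so non-ASCII input cannot panic.
        .filter(|&i| text.is_char_boundary(i) && text.is_char_boundary(i + 36))
        .map(|i| &text[i..i + 36])
        .find(|cand| looks_like_uuid(cand))
}

/// Hypothetical stand-in for the usage store: cascade ID -> token count.
#[derive(Default)]
struct UsageStore {
    by_cascade: HashMap<String, u64>,
}

impl UsageStore {
    /// Files usage under the extracted ID, or under `_latest` if none is found.
    fn record(&mut self, body: &str, tokens: u64) {
        let key = find_uuid(body).unwrap_or("_latest").to_string();
        *self.by_cascade.entry(key).or_insert(0) += tokens;
    }

    /// Exact-match lookup: there is no fallback to `_latest`.
    fn take_usage(&mut self, cascade_id: &str) -> Option<u64> {
        self.by_cascade.remove(cascade_id)
    }
}
```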

**Confidence: <50%.** This cannot be tested without working MITM interception (blocked by issue #1). The heuristic is reasonable but unverified against real traffic.

---

### 4. Request Modification Not Implemented

**File:** `src/mitm/proxy.rs` (`modify_requests: bool`)

The `MitmConfig.modify_requests` flag is plumbed through the entire call chain
but hardcoded to `false`. No modification logic exists. This is intentional
scaffolding for future use.

**Status:** Not a bug; reserved for potential request mutation features.

---

### 5. Polling-Based Cascade Updates vs Streaming RPC

**File:** `src/api/polling.rs`

We poll `GetCascadeTrajectorySteps` on a timer. The LS has a
`StreamCascadeReactiveUpdates` streaming gRPC method that pushes updates
in real time. Polling works but adds latency.

**Status:** Functional but suboptimal. Switching to streaming requires
implementing a gRPC streaming client with reconnection handling. Not blocking.
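
For comparison, a timer-driven poll loop can be sketched as follows. This is an illustration only: `poll_cascade`, `PollResult`, and the cursor passed to `fetch` are hypothetical, and `fetch` stands in for a `GetCascadeTrajectorySteps` call. Each new step waits up to one `interval` before it is seen, and polls that return nothing are wasted requests, which is the cost a streaming client would remove.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Result of one poll: any new steps, plus whether the cascade has finished.
struct PollResult {
    new_steps: Vec<String>,
    done: bool,
}

/// Calls `fetch` until it reports completion or `max_polls` is reached.
/// `fetch` receives the number of steps already seen (a hypothetical cursor).
/// Returns the collected steps and the number of polls issued.
fn poll_cascade<F>(mut fetch: F, interval: Duration, max_polls: usize) -> (Vec<String>, usize)
where
    F: FnMut(usize) -> PollResult,
{
    let mut steps = Vec::new();
    let mut polls = 0;
    while polls < max_polls {
        polls += 1;
        let result = fetch(steps.len());
        steps.extend(result.new_steps);
        if result.done {
            break;
        }
        // Anything produced during this sleep is only seen on the next poll.
        sleep(interval);
    }
    (steps, polls)
}
```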

---

### 6. No BYOK Model Routing

**File:** `src/api/models.rs`

The LS supports BYOK (Bring Your Own Key) models (e.g., `MODEL_CLAUDE_4_SONNET_BYOK`,
`MODEL_OPENAI_COMPATIBLE`). Our proxy only exposes the 5 built-in placeholder
models.

**Status:** Feature request. Would need a runtime model registration mechanism.
Proto enum numbers are documented in `docs/ls-binary-analysis.md`.

---

## 🟢 Low

### 7. No Integration Tests for MITM Module

Unit tests cover protobuf decoding and intercept parsing (17 tests pass), but
there are no integration tests for:

- TLS interception end-to-end with the generated CA
- Full HTTP/1.1 request/response cycle through the proxy
- gRPC (HTTP/2) request/response cycle through `h2_handler`
- Store recording and retrieval under concurrency

**Status:** The MITM can't intercept real traffic anyway (blocked by issue #1),
so integration tests would be somewhat hypothetical. Worth adding when the TLS
blocker is resolved.