Commit Graph

8 Commits

Author SHA1 Message Date
Nikketryhard
48674f65da refactor: decompose large functions and remove dead code
- Decompose modify_request() into 7 single-responsibility helpers
- Decompose handle_http_over_tls(): extract read_full_request, dispatch_stream_events
- Promote connect_upstream/resolve_upstream to module-level functions
- Split standalone.rs (1238 lines) into 4 submodules:
  standalone/mod.rs, spawn.rs, discovery.rs, stub.rs
- Extract proto wire primitives into proto/wire.rs
- Remove 6 dead MitmStore methods
- Remove dead SessionResult, DEFAULT_SESSION, get_or_create
- Remove dead decode_varint_at, extract_conversation_id
- Clean all unused imports across 10 files
- Suppress structural dead_code warnings on deserialization fields

Warnings: 20 -> 0. All 43 tests pass.
2026-02-17 22:27:26 -06:00
Nikketryhard
3fdd0368a0 fix: block ALL LS follow-up requests across connections
Move the in-flight blocking check to the top of the LLM request flow,
BEFORE request modification. This catches follow-ups on ALL connections
(the LS opens multiple parallel TLS connections). Only the very first
modified request reaches Google — all others get fake STOP responses.

Previously, each new connection independently allowed one request
through before blocking, letting 4-5 requests leak per turn.
2026-02-16 00:57:33 -06:00
Nikketryhard
34b9553484 feat: capture thinking text via MITM dual-call merge
The LS makes TWO separate Google API calls for thinking models:
  Call 1: response + thinking token count (no thinking text)
  Call 2: thinking summary text (no thinking tokens)

Each hits a different StreamingAccumulator, so we:
1. Capture response_text in StreamingAccumulator (non-thinking parts)
2. In MitmStore::record_usage, detect when Call 2 arrives for a
   cascade that already has thinking tokens from Call 1
3. Merge Call 2's response_text as thinking_text on Call 1's usage

Also injects includeThoughts into Google API requests via MITM
modify to ensure thinking text is available in SSE responses.
2026-02-14 19:49:15 -06:00
Nikketryhard
905d55beb5 feat: capture thinking text from MITM-intercepted API responses
The LS strips thinking/reasoning text from plannerResponse steps —
only the thinkingSignature (opaque verification blob) is preserved.
The actual thinking text flows through the MITM proxy in the raw
Google SSE response (parts with thought: true) and Anthropic SSE
(thinking_delta content blocks).

Changes:
- StreamingAccumulator now accumulates thinking text from SSE events
- ApiUsage gains thinking_text: Option<String>
- usage_from_poll returns (Usage, Option<thinking_text>)
- Thinking text priority: MITM-captured > LS-extracted (fallback)
- Reasoning output item now populated from real API data
- Removed debug dump code
2026-02-14 19:30:09 -06:00
Nikketryhard
2ccc4b46f8 fix(#4): remove dead total_cost_usd field; map model enums to readable names 2026-02-14 15:54:03 -06:00
Nikketryhard
edad784bcd refactor: extract GrpcUsage::into_api_usage to DRY up h2_handler 2026-02-14 04:13:46 -06:00
Nikketryhard
901cd3d2e3 fix: resolve clippy warnings (matches!, map_or, redundant guard, unnecessary allocations) 2026-02-14 04:06:18 -06:00
Nikketryhard
d5e7f09225 feat: initial commit — antigravity proxy with MITM, standalone LS, and snapshot tooling 2026-02-14 02:24:35 -06:00