zerogravity

Author	SHA1	Message	Date
Nikketryhard	22177a28a1	chore: fix all clippy warnings and add Cargo.toml metadata	2026-02-18 02:50:47 -06:00
Nikketryhard	ad0aa1556c	feat: Add LICENSE file and refactor MITM response handling and tracing.	2026-02-18 02:43:05 -06:00
Nikketryhard	28d3296c87	fix: gemini route, usage capture, search timeout, and trace finalization - Add missing /v1/gemini POST route and handler - Capture MitmEvent::Usage in gemini sync/streaming handlers - Add retry counter (max 3) to search handler to prevent hang - Add trace finalization at all gemini_sync channel exit points - Fix UpstreamError trace outcome label - Add timeout trace with error recording - Dispatch Usage before ResponseComplete in SSE flush	2026-02-18 01:31:18 -06:00
Nikketryhard	48674f65da	refactor: decompose large functions and remove dead code - Decompose modify_request() into 7 single-responsibility helpers - Decompose handle_http_over_tls(): extract read_full_request, dispatch_stream_events - Promote connect_upstream/resolve_upstream to module-level functions - Split standalone.rs (1238 lines) into 4 submodules: standalone/mod.rs, spawn.rs, discovery.rs, stub.rs - Extract proto wire primitives into proto/wire.rs - Remove 6 dead MitmStore methods - Remove dead SessionResult, DEFAULT_SESSION, get_or_create - Remove dead decode_varint_at, extract_conversation_id - Clean all unused imports across 10 files - Suppress structural dead_code warnings on deserialization fields. Warnings: 20 -> 0. All 43 tests pass.	2026-02-17 22:27:26 -06:00
Nikketryhard	637fbc0e54	refactor: endpoint parity and proxy improvements. Mixed changes from recent sessions: endpoint feature parity improvements, proxy bug fixes, and store cleanup.	2026-02-16 21:47:00 -06:00
Nikketryhard	931e1cc5a1	chore: remove unused push_tool_round_calls and attach_tool_round_results	2026-02-16 19:22:09 -06:00
Nikketryhard	ba96534ead	fix: prevent tool_rounds cross-cascade contamination causing hangs. Root cause: proxy.rs eagerly pushed tool rounds via push_tool_round_calls when intercepting Google's functionCall response. These stale rounds leaked into LS follow-up requests, producing malformed history that Google timed out on (60s 'no upstream response'). Changes: - Remove push_tool_round_calls from proxy.rs response interception - proxy.rs: use get_tool_rounds (non-destructive) instead of take_tool_rounds so accumulated rounds persist across multiple LS requests per cascade - responses.rs/gemini.rs: build rounds via take+push+set pattern, where each handler accumulates its own rounds from get_last_function_calls + results - completions.rs: unchanged (set_tool_rounds replaces from messages) - clear_tools: also clears tool_rounds to prevent stale data between sessions - store.rs: add get_tool_rounds (non-destructive clone) method	2026-02-16 19:21:03 -06:00
Nikketryhard	32f02d6456	fix: extend multi-round tool history to responses and gemini endpoints - proxy.rs: push_tool_round_calls alongside set_last_function_calls when Google responds with functionCall — accumulates rounds - responses.rs: attach_tool_round_results to pair tool results with the correct round instead of flat add_tool_result - gemini.rs: same attach_tool_round_results integration - store.rs: add push_tool_round_calls and attach_tool_round_results methods for cross-request round accumulation - Legacy add_tool_result kept for backward compat alongside new path	2026-02-16 19:11:38 -06:00
Nikketryhard	39381a4dfe	fix: multi-round tool history rewrite and finishReason handling - Add ToolRound struct to pair function calls with results per-round - Replace single-match history rewrite (broke after first round) with multi-round loop that rewrites ALL placeholder model turns - Fix tool result name fallback: use positional index instead of always picking the first call - Set is_complete for any finishReason (FUNCTION_CALL, MAX_TOKENS, etc.) not just STOP — prevents response_complete flag from never being set - Legacy fallback: responses.rs path (single-round via last_calls + pending_results) still works when tool_rounds is empty - Add tests: multi-round rewrite, single-round legacy, no-op, and FUNCTION_CALL/MAX_TOKENS finishReason handling	2026-02-16 19:05:37 -06:00
Nikketryhard	6bda2ecafa	fix: tool call race conditions and missing completions tool result extraction - store.rs: record_function_call now falls back to active_cascade_id (matching record_usage behavior) instead of blind _latest fallback - store.rs: add cascade-aware take_function_calls(cascade_id) method with priority: exact match → active cascade → _latest → any key - completions.rs: extract tool_calls from assistant messages and tool results from tool messages, storing them for MITM injection. This was the ROOT CAUSE — the completions handler stored tool definitions but never extracted tool results, so modify_request couldn't rewrite the LS conversation history with proper functionCall/functionResponse - responses.rs: use cascade-aware take_function_calls for consistency	2026-02-16 18:43:16 -06:00
Nikketryhard	38b4130c55	feat: Implement request generation counter and state management to prevent stale data and unblock Language Server for follow-up requests.	2026-02-16 16:21:52 -06:00
Nikketryhard	3fdd0368a0	fix: block ALL LS follow-up requests across connections. Move the in-flight blocking check to the top of the LLM request flow, BEFORE request modification. This catches follow-ups on ALL connections (the LS opens multiple parallel TLS connections). Only the very first modified request reaches Google; all others get fake STOP responses. Previously, each new connection independently allowed one request through before blocking, letting 4-5 requests leak per turn.	2026-02-16 00:57:33 -06:00
Nikketryhard	a8f3c8915f	fix: block ALL LS follow-up requests, deduplicate function calls - Add request_in_flight flag to MitmStore, set immediately when first LLM request is forwarded with custom tools active - Block ALL subsequent LS requests (agentic loop + internal flash-lite) with fake SSE responses instead of waiting for response_complete - Fix function call deduplication: drain() accumulator after storing to prevent 3x duplicate tool calls across SSE chunks - Clear all stale state (response, thinking, function calls, errors) at the start of each streaming request - Handle response_complete with no content (thoughtSignature-only) gracefully with timeout instead of infinite hang	2026-02-16 00:51:56 -06:00
Nikketryhard	4e4d8e9474	chore: code cleanup and documentation overhaul - Remove debug header dump from MITM proxy (was temp debugging code) - Suppress dead_code warnings for intentional OpenAI compat fields - Rewrite README with styled mermaid architecture diagrams, full feature listing, usage examples, and CLI reference - Update endpoint-gap-analysis: images implemented, audio only stretch - Update mitm-interception-status: add request modification and error capture components - Update standalone-ls-todo: add new endpoints to test results - Zero compiler warnings	2026-02-15 18:27:53 -06:00
Nikketryhard	2882f7cce2	feat: propagate Google upstream errors to client. When Google returns an error (400, 429, 500, etc.), the MITM proxy now captures it and the API handlers return it immediately instead of hanging until timeout. - UpstreamError struct stored in MitmStore - MITM proxy parses Google error JSON (message + status) - Polling handler checks for upstream errors each cycle - Streaming handlers emit response.failed / SSE error events - Error status mapped to OpenAI-style types (invalid_request_error, rate_limit_error, authentication_error, server_error, etc.) - All handlers clear stale errors at request start	2026-02-15 18:19:38 -06:00
Nikketryhard	89bea030cc	feat: inject images via MITM layer instead of relying on LS. The LS silently ignores the 'images' field from our SendUserCascadeMessageRequest proto; it never forwards image data to Google's API. New approach: store the image in MitmStore, then the MITM request modifier injects it as 'inlineData' directly into the last user message's parts array in the Google API JSON request. Flow: Client → Proxy (decode base64) → MitmStore.set_pending_image(); LS → Google API → MITM intercepts → inject inlineData part → Google receives image + text together. This works for all three API endpoints (responses, completions, gemini).	2026-02-15 17:57:32 -06:00
Nikketryhard	afa96b88a5	chore: remove broken googleSearch grounding and /v1/search endpoint	2026-02-15 17:08:46 -06:00
Nikketryhard	b1bd57ab5e	feat: forward generation params via MITM + add usageMetadata to Gemini - Add GenerationParams struct to MitmStore for temperature, top_p, top_k, max_output_tokens, stop_sequences, frequency/presence_penalty - MITM modify_request injects params into request.generationConfig - All 3 endpoints (Completions, Responses, Gemini) store client params - Add usageMetadata to Gemini sync responses (promptTokenCount, candidatesTokenCount, totalTokenCount, thoughtsTokenCount) - Add generation param fields to GeminiRequest (temperature, topP, etc.) - Completions stream_options.include_usage emits final usage chunk - Completions reasoning_tokens in completion_tokens_details - Update endpoint gap analysis doc (all high-priority gaps resolved)	2026-02-15 14:23:05 -06:00
Nikketryhard	735c3e357d	chore: clean up dead code, fix broken test - Remove unused methods: append_response_text, clear_response, has_pending_function_calls, take_function_calls - Add #[allow(dead_code)] for intentionally kept future-use methods and response modification helpers - Remove unused now_unix import from gemini.rs - Fix test_modify_strips_all_tools: tools key is removed entirely when no custom tools provided, not left as empty array - Zero warnings, 32 tests passing	2026-02-15 01:14:51 -06:00
Nikketryhard	981fb3b18d	fix: resolve cascade correlation, update KNOWN_ISSUES - MitmStore: added active_cascade_id field with set/get/clear methods - record_usage() now falls back to active_cascade_id when the heuristic cascade hint is absent (fixes usage always going to _latest) - All three API handlers set active cascade before send_message - KNOWN_ISSUES: moved 3 issues to resolved: - Request modification (already true, was stale entry) - Cascade correlation (fixed via active_cascade_id) - Progressive thinking streaming (fixed via MITM bypass)	2026-02-15 01:10:34 -06:00
Nikketryhard	b3af73cebd	feat: sync all endpoints with MITM LS bypass + real-time thinking streaming - Responses API (streaming): MITM bypass path polls MitmStore directly when custom tools are active, skipping LS step polling entirely. Streams thinking text deltas in real-time as they arrive from the MITM. Handles function calls, text response, and thinking/reasoning events. - Responses API (sync): Same MITM bypass for non-streaming responses. Polls MitmStore for function calls or completed text before falling back to LS path. - Gemini endpoint: MITM bypass polls MitmStore directly for tool call responses, eliminating LS overhead. - MitmStore: Added captured_thinking_text field with set/peek/take methods for real-time thinking text capture from MITM SSE. - MITM proxy: Now captures both thinking_text and response_text from StreamingAccumulator into MitmStore when bypass mode is active.	2026-02-15 01:03:39 -06:00
Nikketryhard	50b53097bc	fix: bypass LS entirely when custom tools are active. When custom tools are set, don't forward ANY response from Google to the LS. Instead, capture text and function calls directly into MitmStore. The completions handler reads from MitmStore. This eliminates the LS multi-turn loop (5 requests, 30+ seconds) that occurred because the LS kept processing responses internally. Tool calls now return in ~1.3s instead of timing out.	2026-02-15 00:54:40 -06:00
Nikketryhard	7e16a7b892	fix: clear stale tool state in completions handler to prevent hang. Tool definitions stored in MitmStore from /v1/responses requests were persisting and getting injected into /v1/chat/completions requests. This caused Gemini to return functionCalls instead of text, and since the completions handler has no function call handling logic, it would poll forever waiting for text that never came. Fix: clear active_tools, active_tool_config, and has_active_function_call at the start of handle_completions. Also add clear_active_function_call() method to MitmStore.	2026-02-14 23:10:45 -06:00
Nikketryhard	786987116b	feat: full tool call support (OpenAI + Gemini endpoints) - store.rs: Add tool context storage (active tools, tool config, pending tool results, call_id mapping, last function calls for history rewrite) - types.rs: Add tools/tool_choice fields to ResponsesRequest, add build_function_call_output helper for OpenAI function_call output items - modify.rs: Replace hardcoded get_weather with dynamic ToolContext injection. Add openai_tools_to_gemini and openai_tool_choice_to_gemini converters. Add conversation history rewriting for tool result turns (replaces fake 'Tool call completed' model turn with real functionCall, injects functionResponse before last user turn) - proxy.rs: Build ToolContext from MitmStore before calling modify_request. Save last_function_calls for history rewriting on subsequent turns - responses.rs: Store client tools in MitmStore before LS call. Detect function_call_output in input array for tool result submission. Return captured functionCalls as OpenAI function_call output items with generated call_ids and stringified arguments - gemini.rs: New Gemini-native endpoint (POST /v1/gemini) with zero format translation. Accepts functionDeclarations directly, returns functionCall in Gemini format directly - mod.rs: Wire /v1/gemini route, bump version to 3.3.0	2026-02-14 22:56:44 -06:00
Nikketryhard	8455aa674f	feat: capture function calls from Google + block follow-up quota waste. When MITM strips LS tools and injects custom tools: - Google returns functionCall → captured in MitmStore - Follow-up LS requests are blocked with fake SSE response - Proxy consumes captured calls and clears the flag - Result: 1 real Google API call instead of 5+ per tool call. Flow: Client → Proxy → LS → MITM (inject tool) → Google; Google returns functionCall → MITM captures it; LS tries follow-up → MITM blocks (fake response); Proxy reads captured functionCall → returns to client.	2026-02-14 22:37:28 -06:00
Nikketryhard	5c1f4c77d9	fix: add retry logic for MITM thinking text merge race condition. The LS makes two Google API calls for thinking models. Call 2 (thinking summary) may not have arrived by the time usage_from_poll runs after Call 1 (response). Now we peek first, and if thinking tokens exist but text is missing, wait up to 1s for the merge to happen. Also adds peek_usage method to MitmStore for non-consuming reads.	2026-02-14 19:54:37 -06:00
Nikketryhard	34b9553484	feat: capture thinking text via MITM dual-call merge. The LS makes TWO separate Google API calls for thinking models: Call 1: response + thinking token count (no thinking text); Call 2: thinking summary text (no thinking tokens). Each hits a different StreamingAccumulator, so we: 1. Capture response_text in StreamingAccumulator (non-thinking parts) 2. In MitmStore::record_usage, detect when Call 2 arrives for a cascade that already has thinking tokens from Call 1 3. Merge Call 2's response_text as thinking_text on Call 1's usage. Also injects includeThoughts into Google API requests via MITM modify to ensure thinking text is available in SSE responses.	2026-02-14 19:49:15 -06:00
Nikketryhard	905d55beb5	feat: capture thinking text from MITM-intercepted API responses. The LS strips thinking/reasoning text from plannerResponse steps; only the thinkingSignature (opaque verification blob) is preserved. The actual thinking text flows through the MITM proxy in the raw Google SSE response (parts with thought: true) and Anthropic SSE (thinking_delta content blocks). Changes: - StreamingAccumulator now accumulates thinking text from SSE events - ApiUsage gains thinking_text: Option<String> - usage_from_poll returns (Usage, Option<thinking_text>) - Thinking text priority: MITM-captured > LS-extracted (fallback) - Reasoning output item now populated from real API data - Removed debug dump code	2026-02-14 19:30:09 -06:00
Nikketryhard	6842bfeaa5	chore: clean up code — remove dead code, stale allows, eprintln→tracing, remove volatile data from docs	2026-02-14 16:11:34 -06:00
Nikketryhard	2ccc4b46f8	fix(#4): remove dead total_cost_usd field; map model enums to readable names	2026-02-14 15:54:03 -06:00
Nikketryhard	d5e7f09225	feat: initial commit — antigravity proxy with MITM, standalone LS, and snapshot tooling	2026-02-14 02:24:35 -06:00

31 Commits
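
Commit 6bda2ecafa describes a cascade-aware take_function_calls(cascade_id) lookup with the priority exact match → active cascade → _latest → any key. The following is a minimal sketch of that fallback order only, not the repository's code. It assumes the captured calls live in a HashMap keyed by cascade id; the struct name `CallStore`, its field layout, and the use of plain `String` values for calls are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the part of MitmStore that holds captured calls.
#[derive(Default)]
struct CallStore {
    /// Captured function calls keyed by cascade id ("_latest" is the catch-all key).
    calls: HashMap<String, Vec<String>>,
    /// Cascade that an API handler marked as active before sending a message.
    active_cascade_id: Option<String>,
}

impl CallStore {
    /// Removes and returns stored calls, trying keys in the order given in the
    /// commit message: exact match, active cascade, "_latest", then any key.
    fn take_function_calls(&mut self, cascade_id: Option<&str>) -> Option<Vec<String>> {
        // 1. Exact match on the requested cascade id.
        if let Some(id) = cascade_id {
            if let Some(found) = self.calls.remove(id) {
                return Some(found);
            }
        }
        // 2. Fall back to the active cascade, if one is set.
        if let Some(active) = self.active_cascade_id.clone() {
            if let Some(found) = self.calls.remove(&active) {
                return Some(found);
            }
        }
        // 3. Fall back to the "_latest" bucket.
        if let Some(found) = self.calls.remove("_latest") {
            return Some(found);
        }
        // 4. Last resort: take whatever key is still present (arbitrary order).
        let any_key = self.calls.keys().next().cloned()?;
        self.calls.remove(&any_key)
    }
}

fn main() {
    let mut store = CallStore::default();
    store.calls.insert("_latest".into(), vec!["get_weather".into()]);
    store.calls.insert("cascade-a".into(), vec!["search".into()]);
    store.active_cascade_id = Some("cascade-a".into());

    // "cascade-b" has no entry, so the active cascade is tried before "_latest".
    let taken = store.take_function_calls(Some("cascade-b"));
    println!("{:?}", taken);
}
```

Because each step removes the entry it returns, a second call with the same id moves on to the next fallback instead of returning the same calls again.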