zerogravity

Author	SHA1	Message	Date
Nikketryhard	22177a28a1	chore: fix all clippy warnings and add Cargo.toml metadata	2026-02-18 02:50:47 -06:00
Nikketryhard	ad0aa1556c	feat: Add LICENSE file and refactor MITM response handling and tracing.	2026-02-18 02:43:05 -06:00
Nikketryhard	00587fcce8	feat: rebrand to ZeroGravity, replace proxyctl with zg Rust binary. Phase 1 - Rename: Crate: antigravity-proxy -> zerogravity; Env: ANTIGRAVITY_OAUTH_TOKEN -> ZEROGRAVITY_TOKEN; Paths: ~/.config/antigravity-proxy -> ~/.config/zerogravity; Paths: /tmp/antigravity-* -> /tmp/zerogravity-*; User: antigravity-ls -> zerogravity-ls; Service: antigravity-proxy -> zerogravity. Phase 2 - zg daemon manager: New Rust binary src/bin/zg.rs replaces scripts/proxyctl bash; Commands: start, stop, restart, rebuild, status, logs, test, health; Auto-resolves project dir from binary location; All commands exit immediately (safe for agent fast-bash).	2026-02-18 01:54:54 -06:00
Nikketryhard	28d3296c87	fix: gemini route, usage capture, search timeout, and trace finalization - Add missing /v1/gemini POST route and handler - Capture MitmEvent::Usage in gemini sync/streaming handlers - Add retry counter (max 3) to search handler to prevent hang - Add trace finalization at all gemini_sync channel exit points - Fix UpstreamError trace outcome label - Add timeout trace with error recording - Dispatch Usage before ResponseComplete in SSE flush	2026-02-18 01:31:18 -06:00
Nikketryhard	48674f65da	refactor: decompose large functions and remove dead code - Decompose modify_request() into 7 single-responsibility helpers - Decompose handle_http_over_tls(): extract read_full_request, dispatch_stream_events - Promote connect_upstream/resolve_upstream to module-level functions - Split standalone.rs (1238 lines) into 4 submodules: standalone/mod.rs, spawn.rs, discovery.rs, stub.rs - Extract proto wire primitives into proto/wire.rs - Remove 6 dead MitmStore methods - Remove dead SessionResult, DEFAULT_SESSION, get_or_create - Remove dead decode_varint_at, extract_conversation_id - Clean all unused imports across 10 files - Suppress structural dead_code warnings on deserialization fields Warnings: 20 -> 0. All 43 tests pass.	2026-02-17 22:27:26 -06:00
Nikketryhard	637fbc0e54	refactor: endpoint parity and proxy improvements Mixed changes from recent sessions: endpoint feature parity improvements, proxy bug fixes, and store cleanup.	2026-02-16 21:47:00 -06:00
Nikketryhard	a47c572e48	fix: forward Google's exact error messages to client. Root cause: errors from Google were being swallowed, replaced with placeholders like 'Google API returned HTTP 400' or '[Timeout waiting for response]', or silently converted to fake 'incomplete' responses. Changes across all endpoints (/v1/chat/completions, /v1/responses, /v1/gemini, /v1/search). Error message fidelity: UpstreamError message now includes Google's status prefix: [STATUS] msg; Falls back to raw body if JSON parsing fails (protobuf, HTML, etc.); ErrorDetail gains optional code and param fields. Timeout handling: poll_for_response returns UpstreamError(504, DEADLINE_EXCEEDED) on timeout instead of '[Timeout waiting for AI response]' placeholder text; Streaming timeouts emit proper error events, not fake content; Sync bypass timeouts return 504 Gateway Timeout, not 200 incomplete. Missing error checks added: responses.rs sync bypass: added upstream_error check in polling loop; gemini.rs sync bypass: added upstream_error check in polling loop; gemini.rs streaming: added upstream_error check in polling loop (was completely missing, errors only handled in sync path). DRY helpers: upstream_error_message(): shared exact message extraction; upstream_error_type(): shared Google→OpenAI error type mapping; All streaming handlers use these instead of inline formatting.	2026-02-16 19:30:32 -06:00
Nikketryhard	39381a4dfe	fix: multi-round tool history rewrite and finishReason handling - Add ToolRound struct to pair function calls with results per-round - Replace single-match history rewrite (broke after first round) with multi-round loop that rewrites ALL placeholder model turns - Fix tool result name fallback: use positional index instead of always picking the first call - Set is_complete for any finishReason (FUNCTION_CALL, MAX_TOKENS, etc.) not just STOP — prevents response_complete flag from never being set - Legacy fallback: responses.rs path (single-round via last_calls + pending_results) still works when tool_rounds is empty - Add tests: multi-round rewrite, single-round legacy, no-op, and FUNCTION_CALL/MAX_TOKENS finishReason handling	2026-02-16 19:05:37 -06:00
Nikketryhard	6bda2ecafa	fix: tool call race conditions and missing completions tool result extraction - store.rs: record_function_call now falls back to active_cascade_id (matching record_usage behavior) instead of blind _latest fallback - store.rs: add cascade-aware take_function_calls(cascade_id) method with priority: exact match → active cascade → _latest → any key - completions.rs: extract tool_calls from assistant messages and tool results from tool messages, storing them for MITM injection. This was the ROOT CAUSE — the completions handler stored tool definitions but never extracted tool results, so modify_request couldn't rewrite the LS conversation history with proper functionCall/functionResponse - responses.rs: use cascade-aware take_function_calls for consistency	2026-02-16 18:43:16 -06:00
Nikketryhard	38b4130c55	feat: Implement request generation counter and state management to prevent stale data and unblock Language Server for follow-up requests.	2026-02-16 16:21:52 -06:00
Nikketryhard	e6a339d92e	fix: clear request_in_flight when stream ends Without this, request_in_flight stayed true after tool call streaming, blocking all subsequent turns until the next completions handler happened to clear it first.	2026-02-16 01:02:09 -06:00
Nikketryhard	3fdd0368a0	fix: block ALL LS follow-up requests across connections Move the in-flight blocking check to the top of the LLM request flow, BEFORE request modification. This catches follow-ups on ALL connections (the LS opens multiple parallel TLS connections). Only the very first modified request reaches Google — all others get fake STOP responses. Previously, each new connection independently allowed one request through before blocking, letting 4-5 requests leak per turn.	2026-02-16 00:57:33 -06:00
Nikketryhard	a8f3c8915f	fix: block ALL LS follow-up requests, deduplicate function calls - Add request_in_flight flag to MitmStore, set immediately when first LLM request is forwarded with custom tools active - Block ALL subsequent LS requests (agentic loop + internal flash-lite) with fake SSE responses instead of waiting for response_complete - Fix function call deduplication: drain() accumulator after storing to prevent 3x duplicate tool calls across SSE chunks - Clear all stale state (response, thinking, function calls, errors) at the start of each streaming request - Handle response_complete with no content (thoughtSignature-only) gracefully with timeout instead of infinite hang	2026-02-16 00:51:56 -06:00
Nikketryhard	2882f7cce2	feat: propagate Google upstream errors to client When Google returns an error (400, 429, 500, etc.), the MITM proxy now captures it and the API handlers return it immediately instead of hanging until timeout. - UpstreamError struct stored in MitmStore - MITM proxy parses Google error JSON (message + status) - Polling handler checks for upstream errors each cycle - Streaming handlers emit response.failed / SSE error events - Error status mapped to OpenAI-style types (invalid_request_error, rate_limit_error, authentication_error, server_error, etc.) - All handlers clear stale errors at request start	2026-02-15 18:19:38 -06:00
Nikketryhard	89bea030cc	feat: inject images via MITM layer instead of relying on LS The LS silently ignores the 'images' field from our SendUserCascadeMessageRequest proto — it never forwards image data to Google's API. New approach: store the image in MitmStore, then the MITM request modifier injects it as 'inlineData' directly into the last user message's parts array in the Google API JSON request. Flow: Client → Proxy (decode base64) → MitmStore.set_pending_image() LS → Google API → MITM intercepts → inject inlineData part → Google receives image + text together This works for all three API endpoints (responses, completions, gemini).	2026-02-15 17:57:32 -06:00
Nikketryhard	976c44fdd4	feat: add image support across all endpoints (responses, completions, gemini)	2026-02-15 17:25:33 -06:00
Nikketryhard	ca9f808ee3	feat: completions API improvements, gemini endpoint, response types	2026-02-15 17:08:53 -06:00
Nikketryhard	b1bd57ab5e	feat: forward generation params via MITM + add usageMetadata to Gemini - Add GenerationParams struct to MitmStore for temperature, top_p, top_k, max_output_tokens, stop_sequences, frequency/presence_penalty - MITM modify_request injects params into request.generationConfig - All 3 endpoints (Completions, Responses, Gemini) store client params - Add usageMetadata to Gemini sync responses (promptTokenCount, candidatesTokenCount, totalTokenCount, thoughtsTokenCount) - Add generation param fields to GeminiRequest (temperature, topP, etc.) - Completions stream_options.include_usage emits final usage chunk - Completions reasoning_tokens in completion_tokens_details - Update endpoint gap analysis doc (all high-priority gaps resolved)	2026-02-15 14:23:05 -06:00
Nikketryhard	981fb3b18d	fix: resolve cascade correlation, update KNOWN_ISSUES - MitmStore: added active_cascade_id field with set/get/clear methods - record_usage() now falls back to active_cascade_id when the heuristic cascade hint is absent (fixes usage always going to _latest) - All three API handlers set active cascade before send_message - KNOWN_ISSUES: moved 3 issues to resolved: - Request modification (already true, was stale entry) - Cascade correlation (fixed via active_cascade_id) - Progressive thinking streaming (fixed via MITM bypass)	2026-02-15 01:10:34 -06:00
Nikketryhard	50b53097bc	fix: bypass LS entirely when custom tools are active When custom tools are set, don't forward ANY response from Google to the LS. Instead, capture text and function calls directly into MitmStore. The completions handler reads from MitmStore. This eliminates the LS multi-turn loop (5 requests, 30+ seconds) that occurred because the LS kept processing responses internally. Tool calls now return in ~1.3s instead of timing out.	2026-02-15 00:54:40 -06:00
Nikketryhard	ec1c0c700d	fix: decouple function call detection from LS step polling Move MitmStore function call check outside get_steps() block so tool calls are detected immediately when captured by MITM, regardless of LS processing state. Also reduce poll interval to 300ms. The LS can take 20-30s for its internal multi-turn loop. Previously, function call checks were nested inside the steps block and required LS to have produced steps. Now the MITM capture is picked up within 300ms of detection.	2026-02-15 00:48:14 -06:00
Nikketryhard	4f08b994c7	fix: include tool results in conversation context When OpenCode sends follow-up messages with tool results, include the full conversation (user message, assistant tool calls, and tool results) in the text sent to the model. Previously only the user message was extracted, causing the model to never see tool results and call the same tool repeatedly in an infinite loop. Also add tool_calls and tool_call_id fields to CompletionMessage.	2026-02-15 00:42:43 -06:00
Nikketryhard	5d4125fa0d	fix: suppress dummy text from tool call responses Check for MITM-captured function calls BEFORE emitting text in the streaming handler. This prevents the dummy 'Tool call completed' placeholder (sent to the LS) from leaking to OpenCode, which was confusing it into infinite loops. Also removes duplicate function call storage at end of response loop since they're now stored immediately when detected.	2026-02-15 00:37:39 -06:00
Nikketryhard	3303ce38de	feat: add tool call support to chat completions endpoint - Accept tools and tool_choice fields in CompletionRequest - Convert OpenAI tools to Gemini format and store in MitmStore - Detect MITM-captured function calls in streaming poll loop - Emit tool_calls delta chunks in OpenAI streaming format - Finish with 'tool_calls' reason instead of 'stop' when tools used - Only clear tools when request has none (prevents stale state leak)	2026-02-14 23:47:23 -06:00
Nikketryhard	7e16a7b892	fix: clear stale tool state in completions handler to prevent hang Tool definitions stored in MitmStore from /v1/responses requests were persisting and getting injected into /v1/chat/completions requests. This caused Gemini to return functionCalls instead of text, and since the completions handler has no function call handling logic, it would poll forever waiting for text that never came. Fix: clear active_tools, active_tool_config, and has_active_function_call at the start of handle_completions. Also add clear_active_function_call() method to MitmStore.	2026-02-14 23:10:45 -06:00
Nikketryhard	061b08fc8f	fix: cascade correlation — fallback to _latest MITM usage When the MITM can't extract a cascade ID from the intercepted request (Content-Length: 0 / chunked encoding), usage is stored under '_latest'. Now usage_from_poll and completions try the exact cascade_id first, then fall back to '_latest' so MITM-captured tokens are actually used.	2026-02-14 18:10:04 -06:00
Nikketryhard	901cd3d2e3	fix: resolve clippy warnings (matches!, map_or, redundant guard, unnecessary allocations)	2026-02-14 04:06:18 -06:00
Nikketryhard	d5e7f09225	feat: initial commit — antigravity proxy with MITM, standalone LS, and snapshot tooling	2026-02-14 02:24:35 -06:00
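
Commits a47c572e48 and 2882f7cce2 describe two shared helpers, upstream_error_message() and upstream_error_type(), that format Google's error as "[STATUS] msg" (falling back to the raw body) and map it to an OpenAI-style error type. The repository source is not shown in this log, so the following is only a minimal sketch of how such helpers might look. The struct fields, the use of Option to model "JSON parsing failed", and the exact status-to-type table are assumptions for illustration.

```rust
/// Hypothetical stand-in for the UpstreamError mentioned in the log.
/// Field names are assumptions; the real struct is not shown.
#[derive(Debug, Clone, PartialEq)]
pub struct UpstreamError {
    /// HTTP status code returned by Google (e.g. 400, 429, 504).
    pub http_status: u16,
    /// Google's canonical status string, if the error JSON parsed.
    pub status: Option<String>,
    /// Google's error message, if the error JSON parsed.
    pub message: Option<String>,
    /// Raw response body, used when parsing failed (protobuf, HTML, etc.).
    pub raw_body: String,
}

/// Build the client-facing message: "[STATUS] msg" when both parts are
/// known, the bare message when only that parsed, else the raw body.
pub fn upstream_error_message(err: &UpstreamError) -> String {
    match (&err.status, &err.message) {
        (Some(status), Some(msg)) => format!("[{status}] {msg}"),
        (None, Some(msg)) => msg.clone(),
        _ => err.raw_body.clone(),
    }
}

/// Map a Google error to an OpenAI-style error type.
/// The status string is preferred; the HTTP code is the fallback.
/// This table is an assumed mapping, not the project's actual one.
pub fn upstream_error_type(err: &UpstreamError) -> &'static str {
    match err.status.as_deref() {
        Some("INVALID_ARGUMENT") => "invalid_request_error",
        Some("RESOURCE_EXHAUSTED") => "rate_limit_error",
        Some("UNAUTHENTICATED") => "authentication_error",
        _ => match err.http_status {
            400 => "invalid_request_error",
            429 => "rate_limit_error",
            401 => "authentication_error",
            _ => "server_error",
        },
    }
}
```

Keeping both helpers as pure functions over one struct is what lets every streaming and sync handler share them instead of formatting errors inline.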

28 Commits
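
Commit 6bda2ecafa describes a cascade-aware take_function_calls(cascade_id) with the lookup priority exact match → active cascade → _latest → any key, and a record_function_call that falls back to active_cascade_id. A rough sketch of that lookup order is below. CallStore, FunctionCall, and their fields are hypothetical names (the real MitmStore is not shown here), and the final "_latest" fallback on the recording side is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical captured function call; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionCall {
    pub name: String,
    pub args_json: String,
}

/// Hypothetical, simplified stand-in for the MitmStore in the log.
#[derive(Default)]
pub struct CallStore {
    calls: HashMap<String, Vec<FunctionCall>>,
    active_cascade_id: Option<String>,
}

impl CallStore {
    pub fn set_active_cascade(&mut self, id: &str) {
        self.active_cascade_id = Some(id.to_string());
    }

    /// Store a call under the given cascade, else the active cascade,
    /// else "_latest" (this last fallback is an assumption).
    pub fn record_function_call(&mut self, cascade_id: Option<&str>, call: FunctionCall) {
        let key = cascade_id
            .map(str::to_string)
            .or_else(|| self.active_cascade_id.clone())
            .unwrap_or_else(|| "_latest".to_string());
        self.calls.entry(key).or_default().push(call);
    }

    /// Remove and return calls using the priority from the commit:
    /// exact match, then active cascade, then "_latest", then any key.
    pub fn take_function_calls(&mut self, cascade_id: &str) -> Option<Vec<FunctionCall>> {
        if let Some(found) = self.calls.remove(cascade_id) {
            return Some(found);
        }
        if let Some(active) = self.active_cascade_id.clone() {
            if let Some(found) = self.calls.remove(&active) {
                return Some(found);
            }
        }
        if let Some(found) = self.calls.remove("_latest") {
            return Some(found);
        }
        // Last resort: take whichever entry exists, if any.
        let any_key = self.calls.keys().next().cloned()?;
        self.calls.remove(&any_key)
    }
}
```

Because take_function_calls removes the entry it returns, a second call for the same turn yields nothing, which fits the log's concern about duplicate tool calls leaking across requests.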