zerogravity

Author	SHA1	Message	Date
Nikketryhard	0a33c1b706	fix: send images as top-level ImageData field, not ChatMessage blob SendUserCascadeMessageRequest proto field layout (from JS bundle analysis): - Field 6 is 'images' (repeated ImageData) at the REQUEST level - NOT a Blob sub-message inside ChatMessage (field 2) ImageData proto uses base64_data (field 1) + mime_type (field 2), not raw bytes. The LS was silently ignoring our ChatMessage blob because the field structure didn't match. Also protect MITM modifier from stripping messages containing inlineData (image parts in Google API JSON).	2026-02-15 17:46:41 -06:00
Nikketryhard	2ac2016ed4	fix: resolve symlink in proxyctl before deriving PROJECT_DIR	2026-02-15 17:36:45 -06:00
Nikketryhard	976c44fdd4	feat: add image support across all endpoints (responses, completions, gemini)	2026-02-15 17:25:33 -06:00
Nikketryhard	ca9f808ee3	feat: completions API improvements, gemini endpoint, response types	2026-02-15 17:08:53 -06:00
Nikketryhard	afa96b88a5	chore: remove broken googleSearch grounding and /v1/search endpoint	2026-02-15 17:08:46 -06:00
Nikketryhard	cc5f48967a	fix: LS cleanup uses sudo -u for same-UID kill, prevent double kill	2026-02-15 17:08:43 -06:00
Nikketryhard	b1bd57ab5e	feat: forward generation params via MITM + add usageMetadata to Gemini - Add GenerationParams struct to MitmStore for temperature, top_p, top_k, max_output_tokens, stop_sequences, frequency/presence_penalty - MITM modify_request injects params into request.generationConfig - All 3 endpoints (Completions, Responses, Gemini) store client params - Add usageMetadata to Gemini sync responses (promptTokenCount, candidatesTokenCount, totalTokenCount, thoughtsTokenCount) - Add generation param fields to GeminiRequest (temperature, topP, etc.) - Completions stream_options.include_usage emits final usage chunk - Completions reasoning_tokens in completion_tokens_details - Update endpoint gap analysis doc (all high-priority gaps resolved)	2026-02-15 14:23:05 -06:00
Nikketryhard	735c3e357d	chore: clean up dead code, fix broken test - Remove unused methods: append_response_text, clear_response, has_pending_function_calls, take_function_calls - Add #[allow(dead_code)] for intentionally kept future-use methods and response modification helpers - Remove unused now_unix import from gemini.rs - Fix test_modify_strips_all_tools: tools key is removed entirely when no custom tools provided, not left as empty array - Zero warnings, 32 tests passing	2026-02-15 01:14:51 -06:00
Nikketryhard	981fb3b18d	fix: resolve cascade correlation, update KNOWN_ISSUES - MitmStore: added active_cascade_id field with set/get/clear methods - record_usage() now falls back to active_cascade_id when the heuristic cascade hint is absent (fixes usage always going to _latest) - All three API handlers set active cascade before send_message - KNOWN_ISSUES: moved 3 issues to resolved: - Request modification (already true, was stale entry) - Cascade correlation (fixed via active_cascade_id) - Progressive thinking streaming (fixed via MITM bypass)	2026-02-15 01:10:34 -06:00
Nikketryhard	b3af73cebd	feat: sync all endpoints with MITM LS bypass + real-time thinking streaming - Responses API (streaming): MITM bypass path polls MitmStore directly when custom tools are active, skipping LS step polling entirely. Streams thinking text deltas in real-time as they arrive from the MITM. Handles function calls, text response, and thinking/reasoning events. - Responses API (sync): Same MITM bypass for non-streaming responses. Polls MitmStore for function calls or completed text before falling back to LS path. - Gemini endpoint: MITM bypass polls MitmStore directly for tool call responses, eliminating LS overhead. - MitmStore: Added captured_thinking_text field with set/peek/take methods for real-time thinking text capture from MITM SSE. - MITM proxy: Now captures both thinking_text and response_text from StreamingAccumulator into MitmStore when bypass mode is active.	2026-02-15 01:03:39 -06:00
Nikketryhard	50b53097bc	fix: bypass LS entirely when custom tools are active When custom tools are set, don't forward ANY response from Google to the LS. Instead, capture text and function calls directly into MitmStore. The completions handler reads from MitmStore. This eliminates the LS multi-turn loop (5 requests, 30+ seconds) that occurred because the LS kept processing responses internally. Tool calls now return in ~1.3s instead of timing out.	2026-02-15 00:54:40 -06:00
Nikketryhard	ec1c0c700d	fix: decouple function call detection from LS step polling Move MitmStore function call check outside get_steps() block so tool calls are detected immediately when captured by MITM, regardless of LS processing state. Also reduce poll interval to 300ms. The LS can take 20-30s for its internal multi-turn loop. Previously, function call checks were nested inside the steps block and required LS to have produced steps. Now the MITM capture is picked up within 300ms of detection.	2026-02-15 00:48:14 -06:00
Nikketryhard	4f08b994c7	fix: include tool results in conversation context When OpenCode sends follow-up messages with tool results, include the full conversation (user message, assistant tool calls, and tool results) in the text sent to the model. Previously only the user message was extracted, causing the model to never see tool results and call the same tool repeatedly in an infinite loop. Also add tool_calls and tool_call_id fields to CompletionMessage.	2026-02-15 00:42:43 -06:00
Nikketryhard	5d4125fa0d	fix: suppress dummy text from tool call responses Check for MITM-captured function calls BEFORE emitting text in the streaming handler. This prevents the dummy 'Tool call completed' placeholder (sent to the LS) from leaking to OpenCode, which was confusing it into infinite loops. Also removes duplicate function call storage at end of response loop since they're now stored immediately when detected.	2026-02-15 00:37:39 -06:00
Nikketryhard	502318acec	fix: store function calls in MitmStore immediately on detection Previously, captured function calls were only stored in MitmStore after the response loop ended. The completions handler polls take_any_function_calls() during streaming, creating a race condition where the MitmStore was empty. Now function calls are stored immediately when parse_streaming_chunk detects them, in both the initial body and body chunk paths.	2026-02-15 00:28:40 -06:00
Nikketryhard	40c6379ca1	fix: strip $schema and unsupported JSON Schema fields from tool params Google's Gemini API rejects $schema, additionalProperties, $ref, $defs, default, examples, and title in tool parameter schemas. OpenCode/MCP tools include these standard JSON Schema fields. Now recursively stripped during OpenAI→Gemini tool conversion.	2026-02-15 00:18:32 -06:00
Nikketryhard	7c44729ace	fix: forge dummy STOP response to LS on functionCall capture When the MITM detects a functionCall in Google's response AND custom tools are active, send a forged clean text response to the LS instead of the real one. This prevents the LS from seeing function calls for tools it doesn't manage, eliminating the retry loop entirely. The real function call data is captured in MitmStore and returned to the client (OpenCode) through the completions handler. Also removes the complex chunked-encoding response rewriting approach in favor of this simpler forge-and-break strategy.	2026-02-15 00:15:00 -06:00
Nikketryhard	19ff784cae	fix: always strip old functionCall/functionResponse from LS history The function call stripping was only happening when no custom tools were present. But even with custom tools injected, the LS history contains functionCall/functionResponse parts for LS-internal tools that we stripped, causing MALFORMED_FUNCTION_CALL. Now always strip regardless of custom tools presence.	2026-02-14 23:59:13 -06:00
Nikketryhard	3303ce38de	feat: add tool call support to chat completions endpoint - Accept tools and tool_choice fields in CompletionRequest - Convert OpenAI tools to Gemini format and store in MitmStore - Detect MITM-captured function calls in streaming poll loop - Emit tool_calls delta chunks in OpenAI streaming format - Finish with 'tool_calls' reason instead of 'stop' when tools used - Only clear tools when request has none (prevents stale state leak)	2026-02-14 23:47:23 -06:00
Nikketryhard	19090b79f0	fix: prevent MALFORMED_FUNCTION_CALL infinite retry loop Root cause: after stripping LS tool definitions, two things remained: 1. toolConfig with mode=VALIDATED (forces function calling even with empty tools array) 2. Model's training/identity context causing it to attempt function calls in text Fix: - Remove empty tools array and toolConfig when no custom tools injected - Strip functionCall/functionResponse parts from conversation history - Append explicit 'no tools available' instruction to system prompt - Remove debug dump code	2026-02-14 23:31:26 -06:00
Nikketryhard	a52d1bf475	fix: strip functionCall/functionResponse from history when no tools When LS tools are stripped from the request but the conversation history still contains functionCall/functionResponse parts referencing those tools, Google returns MALFORMED_FUNCTION_CALL and the LS retries in an infinite loop, causing the request to hang forever. Now after stripping LS tools and confirming no custom tools are injected, we also strip all functionCall/functionResponse parts from the history and remove any messages that become empty as a result.	2026-02-14 23:19:28 -06:00
Nikketryhard	7e16a7b892	fix: clear stale tool state in completions handler to prevent hang Tool definitions stored in MitmStore from /v1/responses requests were persisting and getting injected into /v1/chat/completions requests. This caused Gemini to return functionCalls instead of text, and since the completions handler has no function call handling logic, it would poll forever waiting for text that never came. Fix: clear active_tools, active_tool_config, and has_active_function_call at the start of handle_completions. Also add clear_active_function_call() method to MitmStore.	2026-02-14 23:10:45 -06:00
Nikketryhard	786987116b	feat: full tool call support (OpenAI + Gemini endpoints) - store.rs: Add tool context storage (active tools, tool config, pending tool results, call_id mapping, last function calls for history rewrite) - types.rs: Add tools/tool_choice fields to ResponsesRequest, add build_function_call_output helper for OpenAI function_call output items - modify.rs: Replace hardcoded get_weather with dynamic ToolContext injection. Add openai_tools_to_gemini and openai_tool_choice_to_gemini converters. Add conversation history rewriting for tool result turns (replaces fake 'Tool call completed' model turn with real functionCall, injects functionResponse before last user turn) - proxy.rs: Build ToolContext from MitmStore before calling modify_request. Save last_function_calls for history rewriting on subsequent turns - responses.rs: Store client tools in MitmStore before LS call. Detect function_call_output in input array for tool result submission. Return captured functionCalls as OpenAI function_call output items with generated call_ids and stringified arguments - gemini.rs: New Gemini-native endpoint (POST /v1/gemini) with zero format translation. Accepts functionDeclarations directly, returns functionCall in Gemini format directly - mod.rs: Wire /v1/gemini route, bump version to 3.3.0	2026-02-14 22:56:44 -06:00
Nikketryhard	8455aa674f	feat: capture function calls from Google + block follow-up quota waste When MITM strips LS tools and injects custom tools: - Google returns functionCall → captured in MitmStore - Follow-up LS requests are blocked with fake SSE response - Proxy consumes captured calls and clears the flag - Result: 1 real Google API call instead of 5+ per tool call Flow: Client → Proxy → LS → MITM(inject tool) → Google Google returns functionCall → MITM captures it LS tries follow-up → MITM blocks (fake response) Proxy reads captured functionCall → returns to client	2026-02-14 22:37:28 -06:00
Nikketryhard	146be139a2	fix: re-enable tool stripping after testing With tools present, LS enters full agentic mode doing multi-turn tool calls (file searches, terminal commands, etc.). A simple weather question caused 40+ Google API calls in 120s before timeout. Tool stripping is required to maintain single-turn behavior.	2026-02-14 22:18:02 -06:00
Nikketryhard	3e3af85798	feat: add proxyctl daemon manager, fix standalone LS cleanup - Add proxyctl CLI script for systemd service management - Add systemd user service file for background operation - Fix standalone LS kill: properly track real LS PID via pgrep and use sudo kill for cross-user cleanup on shutdown - Remove deprecated scripts (dns-redirect, iptables-redirect, mitm-wrapper, standalone-ls, parse-snapshot) - Disable tool stripping in MITM for tool call investigation - Update GEMINI.md with CLI tools documentation	2026-02-14 22:14:00 -06:00
Nikketryhard	f64f007421	fix: reduce GetCascadeTrajectory log spam from debug to trace	2026-02-14 21:43:36 -06:00
Nikketryhard	940786c57f	docs: update standalone LS, MITM, and panel stream investigation - Add panel-stream-investigation.md documenting dead end - Update KNOWN_ISSUES: move polling and panel stream to resolved - Update GEMINI.md with standalone LS section and new MITM setup - Fix standalone-ls-todo to reflect default mode	2026-02-14 21:40:35 -06:00
Nikketryhard	b965be3f60	feat: add reactive streaming and remove dead panel stream code - Subscribe to StreamCascadeReactiveUpdates for real-time cascade state diffs - Fall back to timer-based polling if streaming RPC unavailable - Remove StreamCascadePanelReactiveUpdates code (dead end, only has plan_status/user_settings) - Remove debug diff file-saving code - Add stream_reactive_rpc() helper to backend	2026-02-14 21:39:04 -06:00
Nikketryhard	3d7a7f492b	fix: reduce poll intervals for smoother streaming Streaming poll: 800-1200ms → 150-250ms (5x faster) Sync poll: 1000-1800ms → 200-400ms (4x faster) Verified via STEP_DUMP instrumentation that the LS updates plannerResponse.response incrementally during GENERATING status, so faster polling yields smoother progressive text delivery. Also restructured streaming to emit reasoning events first when thinking content is detected in LS steps before response text.	2026-02-14 20:34:37 -06:00
Nikketryhard	b1a089d21d	feat: emit streaming reasoning events per OpenAI spec Adds proper streaming SSE events for reasoning content: - response.output_item.added (reasoning) - response.reasoning_summary_part.added - response.reasoning_summary_text.delta - response.reasoning_summary_text.done - response.reasoning_summary_part.done - response.output_item.done (reasoning) These are emitted before the message events, matching the format that OpenAI-compatible clients expect for displaying thinking content.	2026-02-14 19:57:52 -06:00
Nikketryhard	5c1f4c77d9	fix: add retry logic for MITM thinking text merge race condition The LS makes two Google API calls for thinking models. Call 2 (thinking summary) may not have arrived by the time usage_from_poll runs after Call 1 (response). Now we peek first, and if thinking tokens exist but text is missing, wait up to 1s for the merge to happen. Also adds peek_usage method to MitmStore for non-consuming reads.	2026-02-14 19:54:37 -06:00
Nikketryhard	34b9553484	feat: capture thinking text via MITM dual-call merge The LS makes TWO separate Google API calls for thinking models: Call 1: response + thinking token count (no thinking text) Call 2: thinking summary text (no thinking tokens) Each hits a different StreamingAccumulator, so we: 1. Capture response_text in StreamingAccumulator (non-thinking parts) 2. In MitmStore::record_usage, detect when Call 2 arrives for a cascade that already has thinking tokens from Call 1 3. Merge Call 2's response_text as thinking_text on Call 1's usage Also injects includeThoughts into Google API requests via MITM modify to ensure thinking text is available in SSE responses.	2026-02-14 19:49:15 -06:00
Nikketryhard	905d55beb5	feat: capture thinking text from MITM-intercepted API responses The LS strips thinking/reasoning text from plannerResponse steps — only the thinkingSignature (opaque verification blob) is preserved. The actual thinking text flows through the MITM proxy in the raw Google SSE response (parts with thought: true) and Anthropic SSE (thinking_delta content blocks). Changes: - StreamingAccumulator now accumulates thinking text from SSE events - ApiUsage gains thinking_text: Option<String> - usage_from_poll returns (Usage, Option<thinking_text>) - Thinking text priority: MITM-captured > LS-extracted (fallback) - Reasoning output item now populated from real API data - Removed debug dump code	2026-02-14 19:30:09 -06:00
Nikketryhard	19dc920872	fix: return thinking as reasoning output item per OpenAI spec Thinking content was previously returned as non-standard top-level fields (thinking, thinking_duration). Now follows the official OpenAI Responses API format: - Reasoning appears as a 'type: reasoning' item in the output array with summary[].text containing the thinking content - Message item follows after the reasoning item - thinking_signature kept as proxy extension (internal multi-turn data) - Removed ResponseOutput/OutputContent structs in favor of serde_json::Value for polymorphic output items	2026-02-14 19:16:12 -06:00
Nikketryhard	7c4e781900	feat: aggressive request stripping — keep only identity + conversation Strip everything from intercepted LLM requests except: - <identity> section in system instruction - Actual conversation turns (user messages + model responses) Removed: tool_calling, web_app_dev, knowledge_discovery, persistent_context, skills, ephemeral_message, communication_style, user_information, user_rules, MEMORY, workflows, mcp_servers, conversation_summaries, ADDITIONAL_METADATA, Step Id prefixes. Expected reduction: ~92% (63KB → ~5KB for simple requests).	2026-02-14 19:05:49 -06:00
Nikketryhard	1a7c81e5f9	feat: strip ALL tools from intercepted requests by default Tools are only needed by the Antigravity webview for tool-call UI. Our proxy doesn't need them — the model generates text responses fine without tool definitions. Stripping all 20 tools saves ~15KB per request.	2026-02-14 18:53:38 -06:00
Nikketryhard	89a8422291	fix: suppress profile picture warn, ensure release binary rebuilds	2026-02-14 18:50:37 -06:00
Nikketryhard	e678ec655b	fix: standalone MITM — remove HTTPS_PROXY with iptables, fix is_agent detection - Only set HTTPS_PROXY/HTTP_PROXY when iptables UID isolation is NOT available. With iptables, double-proxying caused profile picture fetches to fail with 'lookup http' DNS errors. - Fix is_agent detection: handle JSON with spaces after colons ("requestType": "agent" vs "requestType":"agent") - Suppress wrapper-not-installed warning in standalone mode - Show 'iptables (standalone)' in banner instead of 'not installed'	2026-02-14 18:47:38 -06:00
Nikketryhard	f0c2574c88	feat: MITM request modification — strip bloat from LLM API requests Intercepts streamGenerateContent requests and trims: - System instruction: strips web_application_development, knowledge_discovery, persistent_context, skills sections (~18KB saved) - Content messages: strips empty user_rules, workflows boilerplate, conversation summaries (~4.5KB saved) - Tools: keeps 12 essential coding tools, strips 8 non-essential (browser_subagent, generate_image, search_web, etc. ~6KB saved) Total: ~55% reduction in request size while keeping identity, user info, and all coding-relevant tools intact. Only modifies 'agent' type requests, checkpoint requests pass through unmodified. Also: - Standalone mode is now the default (use --no-standalone to attach to existing LS) - Enable request modification by default - Add mold linker, sccache, nextest config (8 thread cap) - Add .cargo/config.toml and .config/nextest.toml	2026-02-14 18:35:07 -06:00
Nikketryhard	061b08fc8f	fix: cascade correlation — fallback to _latest MITM usage When the MITM can't extract a cascade ID from the intercepted request (Content-Length: 0 / chunked encoding), usage is stored under '_latest'. Now usage_from_poll and completions try the exact cascade_id first, then fall back to '_latest' so MITM-captured tokens are actually used.	2026-02-14 18:10:04 -06:00
Nikketryhard	ca36ab0631	chore: clean up MITM logs and add Google SSE tests - Demote non-LLM request logs to debug (only streamGenerateContent at info) - Demote non-streaming response headers to debug - Add 5 Google SSE parser tests (single event, multi-event accumulation, chunked framing, completion detection, no-thinking-tokens) - Fix unused variable warning in proxy.rs	2026-02-14 17:55:17 -06:00
Nikketryhard	d4de436856	feat: MITM interception for standalone LS with UID isolation - Spawn standalone LS as dedicated 'antigravity-ls' user via sudo - UID-scoped iptables redirect (port 443 → MITM proxy) via mitm-redirect.sh - Combined CA bundle (system CAs + MITM CA) for Go TLS trust - Transparent TLS interception with chunked response detection - Google SSE parser for streamGenerateContent usage extraction - Timeouts on all MITM operations (TLS handshake, upstream, idle) - Forward response data immediately (no buffering) - Per-model token usage capture (input, output, thinking) - Update docs and known issues to reflect resolved TLS blocker	2026-02-14 17:50:12 -06:00
Nikketryhard	6842bfeaa5	chore: clean up code — remove dead code, stale allows, eprintln→tracing, remove volatile data from docs	2026-02-14 16:11:34 -06:00
Nikketryhard	2e2d90bdb9	chore: remove BYOK issue — out of scope	2026-02-14 16:07:00 -06:00
Nikketryhard	f3fd203a53	chore: rewrite KNOWN_ISSUES with investigation verdicts and confidence levels	2026-02-14 16:02:01 -06:00
Nikketryhard	05ae6b8652	chore: clean up KNOWN_ISSUES — remove fixed items, renumber	2026-02-14 15:58:52 -06:00
Nikketryhard	2f53485821	fix(#4,#5,#7): remove dead cost field, fix stale fallback paths, mark quota as implemented	2026-02-14 15:55:11 -06:00
Nikketryhard	2ccc4b46f8	fix(#4): remove dead total_cost_usd field; map model enums to readable names	2026-02-14 15:54:03 -06:00
Nikketryhard	dd7b12a97d	fix(#2): cap domain cert cache at 64 entries	2026-02-14 15:49:39 -06:00
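
Commit 40c6379ca1 above describes recursively stripping JSON Schema fields that the Gemini API rejects from tool parameter schemas. The following is a minimal sketch of that idea, not the repository's code: the `Json` enum is a hypothetical std-only stand-in for a real JSON value type such as `serde_json::Value`, and the names `strip_unsupported`, `UNSUPPORTED_KEYS`, and `obj` are illustrative assumptions. Only the list of stripped keys comes from the commit message.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal JSON value (stand-in for a real JSON library type).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Keys listed in commit 40c6379ca1 as rejected by the Gemini API.
const UNSUPPORTED_KEYS: [&str; 7] = [
    "$schema",
    "additionalProperties",
    "$ref",
    "$defs",
    "default",
    "examples",
    "title",
];

/// Remove unsupported keys from every object in the tree, depth-first.
fn strip_unsupported(value: &mut Json) {
    match value {
        Json::Obj(map) => {
            // Drop the unsupported keys at this level first...
            map.retain(|key, _| !UNSUPPORTED_KEYS.contains(&key.as_str()));
            // ...then recurse into whatever remains.
            for child in map.values_mut() {
                strip_unsupported(child);
            }
        }
        Json::Arr(items) => {
            for item in items.iter_mut() {
                strip_unsupported(item);
            }
        }
        // Scalars carry no keys, so there is nothing to strip.
        _ => {}
    }
}

/// Illustrative helper for building object values from key/value pairs.
#[allow(dead_code)]
fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

Note that this sketch matches purely on key names, so a tool parameter that is itself named `title` or `default` under `properties` would also be dropped. Whether the actual implementation guards against that case is not stated in the log.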

67 Commits