zerogravity

Author	SHA1	Message	Date
Nikketryhard	b1bd57ab5e	feat: forward generation params via MITM + add usageMetadata to Gemini - Add GenerationParams struct to MitmStore for temperature, top_p, top_k, max_output_tokens, stop_sequences, frequency/presence_penalty - MITM modify_request injects params into request.generationConfig - All 3 endpoints (Completions, Responses, Gemini) store client params - Add usageMetadata to Gemini sync responses (promptTokenCount, candidatesTokenCount, totalTokenCount, thoughtsTokenCount) - Add generation param fields to GeminiRequest (temperature, topP, etc.) - Completions stream_options.include_usage emits final usage chunk - Completions reasoning_tokens in completion_tokens_details - Update endpoint gap analysis doc (all high-priority gaps resolved)	2026-02-15 14:23:05 -06:00
Nikketryhard	735c3e357d	chore: clean up dead code, fix broken test - Remove unused methods: append_response_text, clear_response, has_pending_function_calls, take_function_calls - Add #[allow(dead_code)] for intentionally kept future-use methods and response modification helpers - Remove unused now_unix import from gemini.rs - Fix test_modify_strips_all_tools: tools key is removed entirely when no custom tools provided, not left as empty array - Zero warnings, 32 tests passing	2026-02-15 01:14:51 -06:00
Nikketryhard	40c6379ca1	fix: strip $schema and unsupported JSON Schema fields from tool params Google's Gemini API rejects $schema, additionalProperties, $ref, $defs, default, examples, and title in tool parameter schemas. OpenCode/MCP tools include these standard JSON Schema fields. Now recursively stripped during OpenAI→Gemini tool conversion.	2026-02-15 00:18:32 -06:00
Nikketryhard	7c44729ace	fix: forge dummy STOP response to LS on functionCall capture When the MITM detects a functionCall in Google's response AND custom tools are active, send a forged clean text response to the LS instead of the real one. This prevents the LS from seeing function calls for tools it doesn't manage, eliminating the retry loop entirely. The real function call data is captured in MitmStore and returned to the client (OpenCode) through the completions handler. Also removes the complex chunked-encoding response rewriting approach in favor of this simpler forge-and-break strategy.	2026-02-15 00:15:00 -06:00
Nikketryhard	19ff784cae	fix: always strip old functionCall/functionResponse from LS history The function call stripping was only happening when no custom tools were present. But even with custom tools injected, the LS history contains functionCall/functionResponse parts for LS-internal tools that we stripped, causing MALFORMED_FUNCTION_CALL. Now always strip regardless of custom tools presence.	2026-02-14 23:59:13 -06:00
Nikketryhard	19090b79f0	fix: prevent MALFORMED_FUNCTION_CALL infinite retry loop Root cause: after stripping LS tool definitions, two things remained: 1. toolConfig with mode=VALIDATED (forces function calling even with empty tools array) 2. Model's training/identity context causing it to attempt function calls in text Fix: - Remove empty tools array and toolConfig when no custom tools injected - Strip functionCall/functionResponse parts from conversation history - Append explicit 'no tools available' instruction to system prompt - Remove debug dump code	2026-02-14 23:31:26 -06:00
Nikketryhard	a52d1bf475	fix: strip functionCall/functionResponse from history when no tools When LS tools are stripped from the request but the conversation history still contains functionCall/functionResponse parts referencing those tools, Google returns MALFORMED_FUNCTION_CALL and the LS retries in an infinite loop, causing the request to hang forever. Now after stripping LS tools and confirming no custom tools are injected, we also strip all functionCall/functionResponse parts from the history and remove any messages that become empty as a result.	2026-02-14 23:19:28 -06:00
Nikketryhard	786987116b	feat: full tool call support (OpenAI + Gemini endpoints) - store.rs: Add tool context storage (active tools, tool config, pending tool results, call_id mapping, last function calls for history rewrite) - types.rs: Add tools/tool_choice fields to ResponsesRequest, add build_function_call_output helper for OpenAI function_call output items - modify.rs: Replace hardcoded get_weather with dynamic ToolContext injection. Add openai_tools_to_gemini and openai_tool_choice_to_gemini converters. Add conversation history rewriting for tool result turns (replaces fake 'Tool call completed' model turn with real functionCall, injects functionResponse before last user turn) - proxy.rs: Build ToolContext from MitmStore before calling modify_request. Save last_function_calls for history rewriting on subsequent turns - responses.rs: Store client tools in MitmStore before LS call. Detect function_call_output in input array for tool result submission. Return captured functionCalls as OpenAI function_call output items with generated call_ids and stringified arguments - gemini.rs: New Gemini-native endpoint (POST /v1/gemini) with zero format translation. Accepts functionDeclarations directly, returns functionCall in Gemini format directly - mod.rs: Wire /v1/gemini route, bump version to 3.3.0	2026-02-14 22:56:44 -06:00
Nikketryhard	8455aa674f	feat: capture function calls from Google + block follow-up quota waste When MITM strips LS tools and injects custom tools: - Google returns functionCall → captured in MitmStore - Follow-up LS requests are blocked with fake SSE response - Proxy consumes captured calls and clears the flag - Result: 1 real Google API call instead of 5+ per tool call Flow: (1) Client → Proxy → LS → MITM (inject tool) → Google; (2) Google returns functionCall → MITM captures it; (3) LS tries follow-up → MITM blocks (fake response); (4) Proxy reads captured functionCall → returns to client	2026-02-14 22:37:28 -06:00
Nikketryhard	146be139a2	fix: re-enable tool stripping after testing With tools present, LS enters full agentic mode doing multi-turn tool calls (file searches, terminal commands, etc.). A simple weather question caused 40+ Google API calls in 120s before timeout. Tool stripping is required to maintain single-turn behavior.	2026-02-14 22:18:02 -06:00
Nikketryhard	3e3af85798	feat: add proxyctl daemon manager, fix standalone LS cleanup - Add proxyctl CLI script for systemd service management - Add systemd user service file for background operation - Fix standalone LS kill: properly track real LS PID via pgrep and use sudo kill for cross-user cleanup on shutdown - Remove deprecated scripts (dns-redirect, iptables-redirect, mitm-wrapper, standalone-ls, parse-snapshot) - Disable tool stripping in MITM for tool call investigation - Update GEMINI.md with CLI tools documentation	2026-02-14 22:14:00 -06:00
Nikketryhard	34b9553484	feat: capture thinking text via MITM dual-call merge The LS makes TWO separate Google API calls for thinking models: Call 1: response + thinking token count (no thinking text) Call 2: thinking summary text (no thinking tokens) Each hits a different StreamingAccumulator, so we: 1. Capture response_text in StreamingAccumulator (non-thinking parts) 2. In MitmStore::record_usage, detect when Call 2 arrives for a cascade that already has thinking tokens from Call 1 3. Merge Call 2's response_text as thinking_text on Call 1's usage Also injects includeThoughts into Google API requests via MITM modify to ensure thinking text is available in SSE responses.	2026-02-14 19:49:15 -06:00
Nikketryhard	7c4e781900	feat: aggressive request stripping — keep only identity + conversation Strip everything from intercepted LLM requests except: - <identity> section in system instruction - Actual conversation turns (user messages + model responses) Removed: tool_calling, web_app_dev, knowledge_discovery, persistent_context, skills, ephemeral_message, communication_style, user_information, user_rules, MEMORY, workflows, mcp_servers, conversation_summaries, ADDITIONAL_METADATA, Step Id prefixes. Expected reduction: ~92% (63KB → ~5KB for simple requests).	2026-02-14 19:05:49 -06:00
Nikketryhard	1a7c81e5f9	feat: strip ALL tools from intercepted requests by default Tools are only needed by the Antigravity webview for tool-call UI. Our proxy doesn't need them — the model generates text responses fine without tool definitions. Stripping all 20 tools saves ~15KB per request.	2026-02-14 18:53:38 -06:00
Nikketryhard	f0c2574c88	feat: MITM request modification — strip bloat from LLM API requests Intercepts streamGenerateContent requests and trims: - System instruction: strips web_application_development, knowledge_discovery, persistent_context, skills sections (~18KB saved) - Content messages: strips empty user_rules, workflows boilerplate, conversation summaries (~4.5KB saved) - Tools: keeps 12 essential coding tools, strips 8 non-essential (browser_subagent, generate_image, search_web, etc. ~6KB saved) Total: ~55% reduction in request size while keeping identity, user info, and all coding-relevant tools intact. Only modifies 'agent' type requests, checkpoint requests pass through unmodified. Also: - Standalone mode is now the default (use --no-standalone to attach to existing LS) - Enable request modification by default - Add mold linker, sccache, nextest config (8 thread cap) - Add .cargo/config.toml and .config/nextest.toml	2026-02-14 18:35:07 -06:00
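
The schema cleanup described in commit 40c6379ca1 (recursively stripping `$schema`, `additionalProperties`, `$ref`, `$defs`, `default`, `examples`, and `title` from tool parameter schemas) might look roughly like the sketch below. This is not the repository's code: the `Json` enum is a hypothetical std-only stand-in for a real JSON value type, `strip_unsupported` is an illustrative name, and the special handling of `properties` (so that a parameter that happens to be named `title` or `default` survives) is an assumption that the commit message does not confirm.

```rust
use std::collections::BTreeMap;

/// Minimal JSON value. Hypothetical stand-in for a real JSON value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Schema keywords that the commit message lists as rejected by the Gemini API.
const UNSUPPORTED: [&str; 7] = [
    "$schema",
    "additionalProperties",
    "$ref",
    "$defs",
    "default",
    "examples",
    "title",
];

/// Recursively remove unsupported keywords from a tool parameter schema.
pub fn strip_unsupported(schema: &mut Json) {
    match schema {
        Json::Obj(map) => {
            // Drop the unsupported keywords at this level.
            map.retain(|key, _| !UNSUPPORTED.contains(&key.as_str()));
            for (key, value) in map.iter_mut() {
                if key.as_str() == "properties" {
                    // Assumption: keys under `properties` are parameter names,
                    // not keywords, so keep every name and clean each sub-schema.
                    if let Json::Obj(props) = value {
                        for sub in props.values_mut() {
                            strip_unsupported(sub);
                        }
                    }
                } else {
                    strip_unsupported(value);
                }
            }
        }
        Json::Arr(items) => {
            for item in items.iter_mut() {
                strip_unsupported(item);
            }
        }
        // Scalars carry no keywords.
        _ => {}
    }
}
```

Calling `strip_unsupported(&mut schema)` once on each tool's parameter schema during the OpenAI→Gemini conversion would be enough, since the function walks nested objects and arrays itself.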

15 Commits
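
The history cleanup described in commits a52d1bf475 and 19ff784cae (strip every `functionCall`/`functionResponse` part, then drop any message left empty) can be illustrated with a minimal sketch. The `Part` and `Content` types and the `strip_function_parts` name are hypothetical simplifications for illustration; the actual code presumably operates on the raw request JSON and is not shown in the log.

```rust
/// One part of a conversation turn. Hypothetical, simplified model of an
/// entry in a Gemini `parts` array.
#[derive(Debug, Clone, PartialEq)]
pub enum Part {
    Text(String),
    FunctionCall { name: String },
    FunctionResponse { name: String },
}

/// One message in the conversation history (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Content {
    pub role: String,
    pub parts: Vec<Part>,
}

/// Remove all function call and function response parts from the history,
/// then drop messages that have no parts left.
/// Returns the number of parts that were removed.
pub fn strip_function_parts(history: &mut Vec<Content>) -> usize {
    let mut removed = 0;
    for content in history.iter_mut() {
        let before = content.parts.len();
        // Keep only plain text parts.
        content.parts.retain(|part| matches!(part, Part::Text(_)));
        removed += before - content.parts.len();
    }
    // A message with zero parts would be meaningless to send upstream.
    history.retain(|content| !content.parts.is_empty());
    removed
}
```

Returning the count of removed parts is a convenience for logging, not something the commits describe.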