Google's Gemini API rejects $schema, additionalProperties, $ref,
$defs, default, examples, and title in tool parameter schemas.
OpenCode/MCP tools include these standard JSON Schema fields, so they
are now stripped recursively during OpenAI→Gemini tool conversion.
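
A minimal sketch of the recursive stripping described above, not the
actual converter. The Json enum is a hypothetical std-only stand-in for
whatever JSON type the project uses, and strip_unsupported is an
illustrative name. One assumption worth flagging: keys directly under
"properties" are treated as parameter names and kept, so a parameter
that happens to be called "title" survives.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON value; the real code may use a JSON crate.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Schema keywords the commit message lists as rejected by Gemini.
const UNSUPPORTED: [&str; 7] = [
    "$schema",
    "additionalProperties",
    "$ref",
    "$defs",
    "default",
    "examples",
    "title",
];

/// Recursively remove unsupported keywords from a schema node.
pub fn strip_unsupported(node: &mut Json) {
    match node {
        Json::Obj(map) => {
            // Drop unsupported keywords at this level.
            map.retain(|key, _| !UNSUPPORTED.contains(&key.as_str()));
            for (key, child) in map.iter_mut() {
                if key == "properties" {
                    if let Json::Obj(props) = child {
                        // Keys here are parameter names, not keywords:
                        // keep the names, clean each parameter's schema.
                        props.values_mut().for_each(strip_unsupported);
                        continue;
                    }
                }
                strip_unsupported(child);
            }
        }
        // Arrays (e.g. anyOf lists) may hold nested schemas.
        Json::Arr(items) => items.iter_mut().for_each(strip_unsupported),
        _ => {}
    }
}
```

Filtering at every object level, rather than only at the root, is what
covers nested parameter schemas and array items.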
When the MITM detects a functionCall in Google's response AND custom
tools are active, send a forged clean text response to the LS instead
of the real one. This prevents the LS from seeing function calls for
tools it doesn't manage, eliminating the retry loop entirely.
The real function call data is captured in MitmStore and returned to
the client (OpenCode) through the completions handler.
Also removes the complex chunked-encoding response rewriting approach
in favor of this simpler forge-and-break strategy.
Function call stripping only happened when no custom tools were
present. But even with custom tools injected, the LS history still
contains functionCall/functionResponse parts for the LS-internal tools
that we stripped, which causes MALFORMED_FUNCTION_CALL. Now always
strip them, regardless of whether custom tools are present.
Root cause: after stripping LS tool definitions, two things remained:
1. toolConfig with mode=VALIDATED (forces function calling even with
   an empty tools array)
2. The model's training/identity context, which causes it to attempt
   function calls in text
Fix:
- Remove empty tools array and toolConfig when no custom tools injected
- Strip functionCall/functionResponse parts from conversation history
- Append explicit 'no tools available' instruction to system prompt
- Remove debug dump code
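
A rough sketch of the first and third fix items, under stated
assumptions: GeminiRequest is a simplified, hypothetical view of the
request (tool names only, toolConfig reduced to its mode), and the
wording of NO_TOOLS_NOTE is invented for illustration since the commit
does not quote the real instruction text.

```rust
/// Hypothetical, simplified view of the outgoing request.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct GeminiRequest {
    pub tools: Option<Vec<String>>,       // tool names only, for brevity
    pub tool_config_mode: Option<String>, // e.g. "VALIDATED"
    pub system_instruction: String,
}

/// Illustrative wording only; the real instruction text is not in the commit.
const NO_TOOLS_NOTE: &str = "No tools are available. Answer in plain text only.";

/// When no custom tools are injected, drop the empty tools array and the
/// toolConfig, and tell the model explicitly that it has no tools.
pub fn disable_function_calling(req: &mut GeminiRequest) {
    let no_tools = req.tools.as_ref().map_or(true, |tools| tools.is_empty());
    if !no_tools {
        return; // custom tools are present: leave the request alone
    }
    req.tools = None;
    req.tool_config_mode = None;
    // Append the note once, even if this runs on every request.
    if !req.system_instruction.contains(NO_TOOLS_NOTE) {
        if !req.system_instruction.is_empty() {
            req.system_instruction.push('\n');
        }
        req.system_instruction.push_str(NO_TOOLS_NOTE);
    }
}
```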
When LS tools are stripped from the request but the conversation history
still contains functionCall/functionResponse parts referencing those
tools, Google returns MALFORMED_FUNCTION_CALL and the LS retries in an
infinite loop, causing the request to hang forever.
Now after stripping LS tools and confirming no custom tools are injected,
we also strip all functionCall/functionResponse parts from the history
and remove any messages that become empty as a result.
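
The history cleanup described above could look roughly like this. Part
and Content are hypothetical simplifications of the Gemini message
types (the payloads of function parts are reduced to a name), and
strip_function_parts is an illustrative name, not the project's.

```rust
/// Hypothetical, simplified message part.
#[derive(Debug, Clone, PartialEq)]
pub enum Part {
    Text(String),
    FunctionCall { name: String },
    FunctionResponse { name: String },
}

/// Hypothetical, simplified conversation turn.
#[derive(Debug, Clone, PartialEq)]
pub struct Content {
    pub role: String,
    pub parts: Vec<Part>,
}

/// Remove every functionCall/functionResponse part, then drop any
/// message that has no parts left.
pub fn strip_function_parts(history: &mut Vec<Content>) {
    for message in history.iter_mut() {
        message.parts.retain(|part| matches!(part, Part::Text(_)));
    }
    history.retain(|message| !message.parts.is_empty());
}
```

Dropping emptied messages can leave two turns with the same role next
to each other; whether those need merging is not covered by this sketch.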
- store.rs: Add tool context storage (active tools, tool config, pending
  tool results, call_id mapping, last function calls for history rewrite)
- types.rs: Add tools/tool_choice fields to ResponsesRequest, add
  build_function_call_output helper for OpenAI function_call output items
- modify.rs: Replace hardcoded get_weather with dynamic ToolContext
  injection. Add openai_tools_to_gemini and openai_tool_choice_to_gemini
  converters. Add conversation history rewriting for tool result turns
  (replaces fake 'Tool call completed' model turn with real functionCall,
  injects functionResponse before last user turn)
- proxy.rs: Build ToolContext from MitmStore before calling modify_request.
  Save last_function_calls for history rewriting on subsequent turns
- responses.rs: Store client tools in MitmStore before LS call. Detect
  function_call_output in input array for tool result submission. Return
  captured functionCalls as OpenAI function_call output items with
  generated call_ids and stringified arguments
- gemini.rs: New Gemini-native endpoint (POST /v1/gemini) with zero
  format translation. Accepts functionDeclarations directly, returns
  functionCall in Gemini format directly
- mod.rs: Wire /v1/gemini route, bump version to 3.3.0
When MITM strips LS tools and injects custom tools:
- Google returns functionCall → captured in MitmStore
- Follow-up LS requests are blocked with fake SSE response
- Proxy consumes captured calls and clears the flag
- Result: 1 real Google API call instead of 5+ per tool call
Flow: Client → Proxy → LS → MITM(inject tool) → Google
Google returns functionCall → MITM captures it
LS tries follow-up → MITM blocks (fake response)
Proxy reads captured functionCall → returns to client
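
The capture-and-block flow above can be modeled as a small state
machine. This is a sketch only: CaptureStore, MitmAction, and the method
names are hypothetical and do not mirror the real MitmStore API.

```rust
/// Hypothetical captured call; arguments are kept as a raw JSON string.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionCall {
    pub name: String,
    pub args_json: String,
}

/// What the MITM does with the next LS request.
#[derive(Debug, PartialEq)]
pub enum MitmAction {
    ForwardToGoogle,
    BlockWithFakeResponse,
}

/// Hypothetical stand-in for the relevant slice of MitmStore.
#[derive(Default)]
pub struct CaptureStore {
    captured: Vec<FunctionCall>,
    awaiting_pickup: bool,
}

impl CaptureStore {
    /// Google returned a functionCall: capture it and raise the flag.
    pub fn on_google_function_call(&mut self, call: FunctionCall) {
        self.captured.push(call);
        self.awaiting_pickup = true;
    }

    /// LS follow-ups are blocked while captured calls wait for the proxy.
    pub fn on_ls_request(&self) -> MitmAction {
        if self.awaiting_pickup {
            MitmAction::BlockWithFakeResponse
        } else {
            MitmAction::ForwardToGoogle
        }
    }

    /// The proxy consumes the captured calls and clears the flag.
    pub fn take_captured(&mut self) -> Vec<FunctionCall> {
        self.awaiting_pickup = false;
        std::mem::take(&mut self.captured)
    }
}
```

Keeping the flag separate from the captured list lets the proxy clear
the block in the same step that hands the calls to the client.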
With tools present, LS enters full agentic mode doing multi-turn
tool calls (file searches, terminal commands, etc.). A simple
weather question caused 40+ Google API calls in 120s before timeout.
Tool stripping is required to maintain single-turn behavior.
- Add proxyctl CLI script for systemd service management
- Add systemd user service file for background operation
- Fix standalone LS kill: properly track real LS PID via pgrep
  and use sudo kill for cross-user cleanup on shutdown
- Remove deprecated scripts (dns-redirect, iptables-redirect,
  mitm-wrapper, standalone-ls, parse-snapshot)
- Disable tool stripping in MITM for tool call investigation
- Update GEMINI.md with CLI tools documentation
The LS makes TWO separate Google API calls for thinking models:
Call 1: response + thinking token count (no thinking text)
Call 2: thinking summary text (no thinking tokens)
Each hits a different StreamingAccumulator, so we:
1. Capture response_text in StreamingAccumulator (non-thinking parts)
2. In MitmStore::record_usage, detect when Call 2 arrives for a
   cascade that already has thinking tokens from Call 1
3. Merge Call 2's response_text as thinking_text on Call 1's usage
Also injects includeThoughts into Google API requests via MITM
modify to ensure thinking text is available in SSE responses.
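
A sketch of the merge in steps 2 and 3, assuming usage records are
keyed by a cascade id. Usage and UsageStore are hypothetical
simplifications, and the detection rule (existing record has thinking
tokens but no thinking text, incoming record has no thinking tokens) is
one plausible reading of the description, not the confirmed logic.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified usage record for one Google API call.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Usage {
    pub thinking_tokens: u64,
    pub response_text: String,
    pub thinking_text: Option<String>,
}

/// Hypothetical stand-in for the usage part of MitmStore.
#[derive(Default)]
pub struct UsageStore {
    by_cascade: HashMap<String, Usage>,
}

impl UsageStore {
    /// Record usage for a cascade. If Call 1 (with thinking tokens) is
    /// already stored and the incoming call has none, treat the incoming
    /// response_text as the thinking summary and merge it into Call 1.
    pub fn record_usage(&mut self, cascade_id: &str, incoming: Usage) {
        if let Some(existing) = self.by_cascade.get_mut(cascade_id) {
            let is_thinking_summary = existing.thinking_tokens > 0
                && existing.thinking_text.is_none()
                && incoming.thinking_tokens == 0;
            if is_thinking_summary {
                existing.thinking_text = Some(incoming.response_text);
                return;
            }
        }
        // First call for this cascade, or not a summary: store as-is.
        self.by_cascade.insert(cascade_id.to_string(), incoming);
    }

    pub fn get(&self, cascade_id: &str) -> Option<&Usage> {
        self.by_cascade.get(cascade_id)
    }
}
```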
Tools are only needed by the Antigravity webview for tool-call UI.
Our proxy doesn't need them: the model generates text responses fine
without tool definitions. Stripping all 20 tools saves ~15KB per request.