- `store.rs`: Add tool context storage (active tools, tool config, pending tool results, `call_id` mapping, last function calls for history rewrite).
- `types.rs`: Add `tools`/`tool_choice` fields to `ResponsesRequest`; add `build_function_call_output` helper for OpenAI `function_call` output items.
- `modify.rs`: Replace hardcoded `get_weather` with dynamic `ToolContext` injection. Add `openai_tools_to_gemini` and `openai_tool_choice_to_gemini` converters. Add conversation history rewriting for tool result turns (replaces fake 'Tool call completed' model turn with real `functionCall`, injects `functionResponse` before last user turn).
- `proxy.rs`: Build `ToolContext` from `MitmStore` before calling `modify_request`. Save `last_function_calls` for history rewriting on subsequent turns.
- `responses.rs`: Store client tools in `MitmStore` before LS call. Detect `function_call_output` in input array for tool result submission. Return captured `functionCall`s as OpenAI `function_call` output items with generated `call_id`s and stringified arguments.
- `gemini.rs`: New Gemini-native endpoint (`POST /v1/gemini`) with zero format translation. Accepts `functionDeclarations` directly, returns `functionCall` in Gemini format directly.
- `mod.rs`: Wire `/v1/gemini` route, bump version to 3.3.0.
# Tool Call Implementation Plan

## Overview

Add full tool call support to the Antigravity proxy. The primary endpoint is the OpenAI Responses API (`/v1/responses`), with a Gemini-native backup endpoint (`/v1/gemini`). Tools are stored per session, and all `tool_choice` modes and parallel tool calls are supported.

## Data Flow

```
┌─────────┐      ┌───────────┐      ┌────┐      ┌──────┐      ┌────────┐
│ Client  │─────▶│   Proxy   │─────▶│ LS │─────▶│ MITM │─────▶│ Google │
│ (openai)│      │  (axum)   │      │    │      │      │      │        │
│         │◀─────│           │◀─────│    │◀─────│      │◀─────│        │
└─────────┘      └───────────┘      └────┘      └──────┘      └────────┘
     │                │                              │               │
     │  tools (OAI)   │ store tools (Gemini fmt)     │    inject     │
     │───────────────▶│────────────▶ MitmStore ─────▶│    tools      │
     │                │                              │──────────────▶│
     │                │                              │               │
     │                │                              │ functionCall  │
     │                │◀──── capture ────────────────│◀──────────────│
     │  tool_calls    │                              │ block follow  │
     │◀───────────────│                              │ ups           │
     │                │                              │               │
     │  tool result   │ store result                 │    inject     │
     │───────────────▶│────────────▶ MitmStore ─────▶│ fn response   │
     │                │                              │──────────────▶│
     │  final text    │                              │               │
     │◀───────────────│◀─────────────────────────────│◀──────────────│
```

## Format Differences

### Tool Definitions

| Aspect       | OpenAI                                 | Gemini                             |
| ------------ | -------------------------------------- | ---------------------------------- |
| Wrapper      | `{"type":"function","function":{...}}` | `{"functionDeclarations":[{...}]}` |
| Type strings | lowercase: `"object"`, `"string"`      | UPPERCASE: `"OBJECT"`, `"STRING"`  |
| Parameters   | JSON Schema subset                     | Same schema, uppercase types       |

### Tool Choice

| OpenAI                                        | Gemini toolConfig                                                       |
| --------------------------------------------- | ----------------------------------------------------------------------- |
| `"auto"`                                      | `{"functionCallingConfig":{"mode":"AUTO"}}`                             |
| `"required"`                                  | `{"functionCallingConfig":{"mode":"ANY"}}`                              |
| `"none"`                                      | `{"functionCallingConfig":{"mode":"NONE"}}`                             |
| `{"type":"function","function":{"name":"X"}}` | `{"functionCallingConfig":{"mode":"ANY","allowedFunctionNames":["X"]}}` |

### Tool Call Response

| OpenAI (what we return)                                                                            | Gemini (what Google returns)                                    |
| -------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- |
| `output: [{"type":"function_call","call_id":"call_xxx","name":"get_weather","arguments":"{...}"}]` | `parts: [{"functionCall":{"name":"get_weather","args":{...}}}]` |

### Tool Result Submission

| OpenAI (what client sends)                                                       | Gemini (what we inject into Google request)                                                                                  |
| -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `input: [{"type":"function_call_output","call_id":"call_xxx","output":"{...}"}]` | `contents: [{role:"model",parts:[{functionCall:...}]},{role:"user",parts:[{functionResponse:{name:"...",response:{...}}}]}]` |

---

## Implementation Phases

### Phase 1: Store Infrastructure (`store.rs`)

Add to `MitmStore`:

```rust
/// Active tool definitions (Gemini format) for MITM injection.
active_tools: Arc<RwLock<Option<Vec<Value>>>>,
/// Active tool config (Gemini toolConfig format).
active_tool_config: Arc<RwLock<Option<Value>>>,
/// Pending tool results for MITM to inject as functionResponse.
pending_tool_results: Arc<RwLock<Vec<PendingToolResult>>>,
/// Mapping call_id → function name for tool result routing.
call_id_to_name: Arc<RwLock<HashMap<String, String>>>,
/// Last captured function calls (for conversation history rewriting).
last_function_calls: Arc<RwLock<Vec<CapturedFunctionCall>>>,
```

New types:

```rust
/// A client-submitted tool result waiting to be injected as a functionResponse.
#[derive(Debug, Clone)]
pub struct PendingToolResult {
    /// Function name, resolved from the client's call_id.
    pub name: String,
    /// Tool output as sent by the client.
    pub result: serde_json::Value,
}
```

New methods:

- `set_tools(tools)` / `get_tools()` / `clear_tools()`
- `set_tool_config(config)` / `get_tool_config()`
- `add_tool_result(result)` / `take_tool_results()`
- `register_call_id(call_id, name)` / `lookup_call_id(call_id)`
- `set_last_function_calls(calls)` / `get_last_function_calls()`

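
The take/lookup semantics of these methods can be sketched with the standard library alone. This is a minimal illustration, not the actual `MitmStore`: the `ToolStore` name is hypothetical, only two of the five fields are modelled, `result` is a plain JSON string instead of `serde_json::Value`, and `std::sync::RwLock` is assumed (the real store may use an async lock).

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// A tool result waiting to be injected as a Gemini `functionResponse`.
/// `result` is JSON text here; the plan stores a `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingToolResult {
    pub name: String,
    pub result: String,
}

/// Hypothetical, reduced stand-in for `MitmStore`.
#[derive(Clone, Default)]
pub struct ToolStore {
    pending_tool_results: Arc<RwLock<Vec<PendingToolResult>>>,
    call_id_to_name: Arc<RwLock<HashMap<String, String>>>,
}

impl ToolStore {
    /// Queue a result for the next outgoing request.
    pub fn add_tool_result(&self, result: PendingToolResult) {
        self.pending_tool_results.write().unwrap().push(result);
    }

    /// Drain the queue, so each result is handed out at most once.
    pub fn take_tool_results(&self) -> Vec<PendingToolResult> {
        std::mem::take(&mut *self.pending_tool_results.write().unwrap())
    }

    /// Remember which function a generated call_id belongs to.
    pub fn register_call_id(&self, call_id: &str, name: &str) {
        self.call_id_to_name
            .write()
            .unwrap()
            .insert(call_id.to_string(), name.to_string());
    }

    /// Resolve a client-supplied call_id back to its function name.
    pub fn lookup_call_id(&self, call_id: &str) -> Option<String> {
        self.call_id_to_name.read().unwrap().get(call_id).cloned()
    }
}
```

Draining in `take_tool_results()` rather than cloning means a result cannot be injected twice if the MITM layer sees more than one request for the same turn.
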
### Phase 2: Request Types (`types.rs`)

Add to `ResponsesRequest`:

```rust
#[serde(default)]
pub tools: Option<Vec<serde_json::Value>>,
#[serde(default)]
pub tool_choice: Option<serde_json::Value>,
```

New output builder:

```rust
pub fn build_function_call_output(call_id: &str, name: &str, arguments: &str) -> Value
```

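
One detail worth spelling out is that `arguments` is itself a JSON document and must be embedded as a string, not as a nested object. The sketch below shows that shape with the standard library only; it returns JSON text rather than a `Value`, and `escape_json` is a hypothetical helper. With `serde_json` available, the `json!` macro would handle the escaping.

```rust
/// Escape a string for embedding inside a JSON string literal.
/// Handles quotes, backslashes and control characters.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one OpenAI `function_call` output item as JSON text.
/// `arguments` is already serialized JSON and is embedded as a string.
pub fn build_function_call_output(call_id: &str, name: &str, arguments: &str) -> String {
    format!(
        r#"{{"type":"function_call","call_id":"{}","name":"{}","arguments":"{}"}}"#,
        escape_json(call_id),
        escape_json(name),
        escape_json(arguments)
    )
}
```
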
### Phase 3: Format Conversion + Dynamic Injection (`modify.rs`)

New public struct:

```rust
pub struct ToolContext {
    pub tools: Option<Vec<Value>>,               // Gemini functionDeclarations
    pub tool_config: Option<Value>,              // Gemini toolConfig
    pub pending_results: Vec<PendingToolResult>, // Tool results to inject
    pub last_calls: Vec<CapturedFunctionCall>,   // For history rewriting
}
```

New conversion functions:

```rust
pub fn openai_tools_to_gemini(tools: &[Value]) -> Vec<Value>  // OAI → Gemini format
pub fn openai_tool_choice_to_gemini(choice: &Value) -> Value  // OAI → Gemini toolConfig
fn uppercase_types(val: Value) -> Value                       // Recursive type case fix
```

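
To make the two conversions concrete, here is a dependency-free sketch. It is an illustration under assumptions, not the planned implementation: `Json` is a small stand-in for `serde_json::Value`, `ToolChoice` is a hypothetical pre-parsed form of the `tool_choice` field, and `tool_choice_to_gemini` returns JSON text and assumes the function name needs no escaping.

```rust
/// Minimal JSON tree standing in for `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(Vec<(String, Json)>),
}

/// Recursively uppercase every string-valued `"type"` key, so that
/// `"object"` becomes `"OBJECT"`. All other strings are left untouched.
pub fn uppercase_types(val: Json) -> Json {
    match val {
        Json::Obj(fields) => Json::Obj(
            fields
                .into_iter()
                .map(|(key, value)| {
                    let value = match (key.as_str(), value) {
                        ("type", Json::Str(s)) => Json::Str(s.to_uppercase()),
                        (_, other) => uppercase_types(other),
                    };
                    (key, value)
                })
                .collect(),
        ),
        Json::Arr(items) => Json::Arr(items.into_iter().map(uppercase_types).collect()),
        other => other,
    }
}

/// The four `tool_choice` shapes from the Tool Choice table.
pub enum ToolChoice {
    Auto,
    Required,
    None,
    Function(String),
}

/// Map an OpenAI `tool_choice` to Gemini `toolConfig` JSON text.
pub fn tool_choice_to_gemini(choice: &ToolChoice) -> String {
    match choice {
        ToolChoice::Auto => r#"{"functionCallingConfig":{"mode":"AUTO"}}"#.to_string(),
        ToolChoice::Required => r#"{"functionCallingConfig":{"mode":"ANY"}}"#.to_string(),
        ToolChoice::None => r#"{"functionCallingConfig":{"mode":"NONE"}}"#.to_string(),
        ToolChoice::Function(name) => format!(
            r#"{{"functionCallingConfig":{{"mode":"ANY","allowedFunctionNames":["{}"]}}}}"#,
            name
        ),
    }
}
```

Only string values under a `"type"` key are changed, so a tool parameter that happens to be named `type` (whose value is a schema object) is recursed into rather than uppercased.
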
Change `modify_request` signature:

```rust
pub fn modify_request(body: &[u8], tool_ctx: Option<&ToolContext>) -> Option<Vec<u8>>
```

Tool injection logic:

1. Strip all LS tools (existing)
2. If `tool_ctx.tools` provided → inject as Gemini `functionDeclarations`
3. If `tool_ctx.tool_config` provided → inject as `toolConfig`
4. If `tool_ctx.pending_results` not empty → rewrite conversation history:
   - Find model turn with "Tool call completed" → replace with `functionCall` parts
   - Find last user turn → prepend `functionResponse` part

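
Step 4 is the riskiest part of the plan, so a small model of it may help. The sketch below works on simplified `Turn`/`Part` types rather than the raw Gemini JSON; those types, the `rewrite_history` name, and the `(name, json)` tuples are assumptions made for illustration. It leaves the history untouched when there is nothing to inject or when either turn cannot be found.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Part {
    Text(String),
    FunctionCall { name: String, args: String },
    FunctionResponse { name: String, response: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Turn {
    pub role: String, // "user" or "model"
    pub parts: Vec<Part>,
}

/// Rewrite `contents` in place. Returns false and changes nothing when there
/// are no pending results, no captured calls, or a required turn is missing.
pub fn rewrite_history(
    contents: &mut Vec<Turn>,
    last_calls: &[(String, String)],      // (function name, args JSON)
    pending_results: &[(String, String)], // (function name, result JSON)
) -> bool {
    if pending_results.is_empty() || last_calls.is_empty() {
        return false;
    }
    // Most recent model turn that carries the placeholder text.
    let model_idx = contents.iter().rposition(|t| {
        t.role == "model"
            && t.parts
                .iter()
                .any(|p| matches!(p, Part::Text(s) if s.contains("Tool call completed")))
    });
    // Last user turn, which receives the functionResponse parts.
    let user_idx = contents.iter().rposition(|t| t.role == "user");
    let (Some(m), Some(u)) = (model_idx, user_idx) else {
        return false;
    };

    // Replace the placeholder with the real function calls.
    contents[m].parts = last_calls
        .iter()
        .map(|(name, args)| Part::FunctionCall { name: name.clone(), args: args.clone() })
        .collect();

    // Prepend the responses so they come before any user text.
    let mut new_parts: Vec<Part> = pending_results
        .iter()
        .map(|(name, response)| Part::FunctionResponse {
            name: name.clone(),
            response: response.clone(),
        })
        .collect();
    new_parts.append(&mut contents[u].parts);
    contents[u].parts = new_parts;
    true
}
```

Both indices are resolved before anything is mutated, which keeps the "original as fallback" property: a failed lookup returns early with `contents` unchanged.
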
### Phase 4: MITM Plumbing (`proxy.rs`)

In `handle_http_over_tls`, before calling `modify_request`:

1. Read `get_tools()`, `get_tool_config()`, `take_tool_results()`, `get_last_function_calls()` from store
2. Build `ToolContext`
3. Pass to `modify_request(body, tool_ctx)`

After response capture:

1. Save captured function calls as `last_function_calls` (for future history rewriting)

### Phase 5: API Handler (`responses.rs`)

#### Request handling (in `handle_responses`):

1. If `body.tools` provided:
   - Convert OpenAI → Gemini format via `openai_tools_to_gemini()`
   - Store in `MitmStore` via `set_tools()`
2. If `body.tool_choice` provided:
   - Convert via `openai_tool_choice_to_gemini()`
   - Store in `MitmStore` via `set_tool_config()`
3. Check `body.input` for `function_call_output` items:
   - If found: look up `call_id` → function name via `lookup_call_id()`
   - Store as `PendingToolResult` via `add_tool_result()`
   - Extract any accompanying text (or use placeholder)

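
Step 3 can be sketched as a pure function over the parsed input. Everything named here is hypothetical (`InputItem`, `RoutedInput`, `route_input`), and the plan does not say what happens to a `call_id` that was never registered; this sketch simply collects such ids so the caller can decide. A `text` of `None` means the caller should fall back to its placeholder.

```rust
use std::collections::HashMap;

/// Reduced view of one element of the request's `input` array.
pub enum InputItem {
    Text(String),
    FunctionCallOutput { call_id: String, output: String },
}

#[derive(Debug, PartialEq)]
pub struct PendingToolResult {
    pub name: String,
    pub result: String, // JSON text; the plan stores a serde_json::Value
}

#[derive(Debug, Default, PartialEq)]
pub struct RoutedInput {
    /// Results whose call_id resolved to a function name.
    pub results: Vec<PendingToolResult>,
    /// call_ids with no registered mapping (handling is left to the caller).
    pub unknown_call_ids: Vec<String>,
    /// Accompanying text, if any; the last text item wins.
    pub text: Option<String>,
}

/// Split the input into tool results and accompanying text.
pub fn route_input(items: &[InputItem], call_id_to_name: &HashMap<String, String>) -> RoutedInput {
    let mut routed = RoutedInput::default();
    for item in items {
        match item {
            InputItem::Text(text) => routed.text = Some(text.clone()),
            InputItem::FunctionCallOutput { call_id, output } => {
                match call_id_to_name.get(call_id) {
                    Some(name) => routed.results.push(PendingToolResult {
                        name: name.clone(),
                        result: output.clone(),
                    }),
                    None => routed.unknown_call_ids.push(call_id.clone()),
                }
            }
        }
    }
    routed
}
```
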
#### Response handling (in `handle_responses_sync` / `handle_responses_stream`):

After polling completes:

1. Check `take_any_function_calls()` for captured tool calls
2. If captured:
   - Generate `call_id` for each (e.g., `"call_" + random`)
   - Register `call_id → name` mapping via `register_call_id()`
   - Build `function_call` output items via `build_function_call_output()`
   - Return these INSTEAD of the text message output
3. If no tool calls: existing text response behavior

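
The branch between tool-call output and text output can be sketched as follows. The fields of `CapturedFunctionCall`, the `OutputItem` enum, and `build_output` are assumptions for illustration. Because the standard library has no random number generator, this sketch uses a process-wide counter for `call_id`s; the plan's `"call_" + random` scheme would replace `new_call_id`.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_CALL: AtomicU64 = AtomicU64::new(1);

/// Generate a process-unique call_id (counter-based stand-in for random ids).
fn new_call_id() -> String {
    format!("call_{}", NEXT_CALL.fetch_add(1, Ordering::Relaxed))
}

/// Assumed shape of a captured Gemini functionCall.
pub struct CapturedFunctionCall {
    pub name: String,
    pub args: String, // arguments as JSON text
}

#[derive(Debug, PartialEq)]
pub enum OutputItem {
    FunctionCall { call_id: String, name: String, arguments: String },
    Message { text: String },
}

/// Return function_call items when any call was captured, otherwise the text.
/// Every generated call_id is registered so a later tool result can be routed.
pub fn build_output(
    captured: Vec<CapturedFunctionCall>,
    text: String,
    call_id_to_name: &mut HashMap<String, String>,
) -> Vec<OutputItem> {
    if captured.is_empty() {
        return vec![OutputItem::Message { text }];
    }
    captured
        .into_iter()
        .map(|call| {
            let call_id = new_call_id();
            call_id_to_name.insert(call_id.clone(), call.name.clone());
            OutputItem::FunctionCall { call_id, name: call.name, arguments: call.args }
        })
        .collect()
}
```

Parallel tool calls fall out naturally: each captured call gets its own `call_id` and its own entry in the mapping.
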
### Phase 6: Gemini-Native Endpoint (`gemini.rs` + `mod.rs`)

New file `src/api/gemini.rs` with handler `handle_gemini`:

- Accepts tools in Gemini `functionDeclarations` format directly (no conversion)
- Accepts `toolConfig` directly
- Returns `functionCall` in Gemini format directly
- Same cascade/session management as `responses.rs`
- Much simpler: no format translation

Route: `POST /v1/gemini` in `mod.rs`

---

## File Change Summary

| File                   | Changes                                                                 | Complexity |
| ---------------------- | ----------------------------------------------------------------------- | ---------- |
| `src/mitm/store.rs`    | Add tool context storage (5 new fields, ~10 methods)                    | Medium     |
| `src/api/types.rs`     | Add `tools`/`tool_choice` to request, add output builder                | Low        |
| `src/mitm/modify.rs`   | `ToolContext`, format conversion, dynamic injection, history rewrite    | High       |
| `src/mitm/proxy.rs`    | Read store → build ToolContext → pass to modify                         | Low        |
| `src/api/responses.rs` | Store tools, detect tool results in input, return function_call outputs | High       |
| `src/api/gemini.rs`    | New file: Gemini-native endpoint (passthrough)                          | Medium     |
| `src/api/mod.rs`       | Add route + module declaration                                          | Low        |

## Implementation Order

1. `store.rs`: foundation, no dependencies
2. `types.rs`: request/response types
3. `modify.rs`: format conversion + injection (depends on store types)
4. `proxy.rs`: plumbing (depends on modify signature)
5. Build + verify compilation
6. `responses.rs`: handler changes (depends on all above)
7. Build + test with `get_weather` request
8. `gemini.rs` + `mod.rs`: Gemini endpoint
9. Build + test with Gemini format
10. Tool result flow test (multi-turn)

## Testing Strategy

### Test 1: Basic tool call (sync)

```bash
curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": "What is the weather in Tokyo?",
  "tools": [{"type":"function","function":{"name":"get_weather","description":"Get weather","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}],
  "tool_choice": "auto",
  "conversation": "tool-test",
  "stream": false
}'
# Expected: output contains function_call with name=get_weather, arguments={"city":"Tokyo"}
```

### Test 2: Tool result submission (multi-turn)

```bash
# Replace call_xxx with the call_id returned by Test 1 (same "conversation").
curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": [{"type":"function_call_output","call_id":"call_xxx","output":"{\"temp\":72,\"unit\":\"F\"}"}],
  "conversation": "tool-test",
  "stream": false
}'
# Expected: output contains text response using the tool result
```

### Test 3: Gemini-native endpoint

```bash
curl -s http://localhost:8741/v1/gemini -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": "What is the weather in Tokyo?",
  "tools": [{"functionDeclarations":[{"name":"get_weather","description":"Get weather","parameters":{"type":"OBJECT","properties":{"city":{"type":"STRING"}},"required":["city"]}}]}],
  "conversation": "gemini-tool-test",
  "stream": false
}'
# Expected: response contains functionCall in Gemini format
```

### Test 4: No tools (regression)

```bash
curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": "What is 2+2?",
  "stream": false
}'
# Expected: normal text response, no tool call behavior
```

## Risks & Mitigations

| Risk                                                             | Impact | Mitigation                                                               |
| ---------------------------------------------------------------- | ------ | ------------------------------------------------------------------------ |
| History rewriting breaks conversation                            | High   | Only rewrite when pending_results non-empty; keep original as fallback   |
| LS times out waiting for Google response during tool result turn | Medium | Increase timeout for tool result turns                                   |
| Multiple parallel tool calls create race conditions              | Medium | AtomicBool + sequential processing already handles this                  |
| `modify_request` test breakage                                   | Low    | Update existing tests for new signature                                  |
| Global tool storage conflicts across concurrent requests         | Medium | Not an issue: LS processes one request at a time (single cascade active) |