zerogravity/.gemini/plans/tool-calls-implementation.md

# Tool Call Implementation Plan

## Overview

Add full tool call support to the Antigravity proxy. The primary endpoint is the OpenAI Responses API (`/v1/responses`), with a Gemini-native backup endpoint (`/v1/gemini`). Tools are stored per session, all `tool_choice` modes are supported, and parallel tool calls are supported.

## Data Flow

```
┌─────────┐      ┌───────────┐      ┌────┐      ┌──────┐      ┌────────┐
│  Client  │─────▶│  Proxy    │─────▶│ LS │─────▶│ MITM │─────▶│ Google │
│ (openai) │      │ (axum)    │      │    │      │      │      │        │
│          │◀─────│           │◀─────│    │◀─────│      │◀─────│        │
└─────────┘      └───────────┘      └────┘      └──────┘      └────────┘
     │                │                             │              │
     │  tools (OAI)   │  store tools (Gemini fmt)   │  inject      │
     │───────────────▶│────────────▶ MitmStore ─────▶│  tools       │
     │                │                             │──────────────▶│
     │                │                             │              │
     │                │                             │ functionCall  │
     │                │◀──── capture ───────────────│◀──────────────│
     │  tool_calls    │                             │ block follow  │
     │◀───────────────│                             │  ups          │
     │                │                             │              │
     │  tool result   │  store result               │  inject      │
     │───────────────▶│────────────▶ MitmStore ─────▶│ fn response  │
     │                │                             │──────────────▶│
     │  final text    │                             │              │
     │◀───────────────│◀────────────────────────────│◀──────────────│
```

## Format Differences

### Tool Definitions

| Aspect       | OpenAI                                 | Gemini                             |
| ------------ | -------------------------------------- | ---------------------------------- |
| Wrapper      | `{"type":"function","function":{...}}` | `{"functionDeclarations":[{...}]}` |
| Type strings | lowercase: `"object"`, `"string"`      | UPPERCASE: `"OBJECT"`, `"STRING"`  |
| Parameters   | JSON Schema subset                     | Same schema, uppercase types       |

Note: the `function` wrapper above is the Chat Completions shape. The Responses API itself sends function tools flattened, as `{"type":"function","name":"...","parameters":{...}}`, and a forced `tool_choice` as `{"type":"function","name":"X"}`, so the conversion functions likely need to accept both shapes.

### Tool Choice

| OpenAI                                        | Gemini toolConfig                                                       |
| --------------------------------------------- | ----------------------------------------------------------------------- |
| `"auto"`                                      | `{"functionCallingConfig":{"mode":"AUTO"}}`                             |
| `"required"`                                  | `{"functionCallingConfig":{"mode":"ANY"}}`                              |
| `"none"`                                      | `{"functionCallingConfig":{"mode":"NONE"}}`                             |
| `{"type":"function","function":{"name":"X"}}` | `{"functionCallingConfig":{"mode":"ANY","allowedFunctionNames":["X"]}}` |
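
The mapping in this table is small enough to sketch directly. The version below is a minimal, standard-library-only illustration, not the planned implementation: `ToolChoice` is a hypothetical typed stand-in for the parsed OpenAI value, and `tool_choice_to_gemini_json` returns the Gemini `toolConfig` as JSON text, whereas the planned `openai_tool_choice_to_gemini` operates on `serde_json::Value`.

```rust
/// Hypothetical typed view of the OpenAI `tool_choice` field (not part of the plan).
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    Auto,
    Required,
    None,
    /// Force one specific function by name.
    Function(String),
}

/// Render the Gemini `toolConfig` JSON text for a given choice.
/// Assumes function names are identifier-like and need no JSON escaping.
pub fn tool_choice_to_gemini_json(choice: &ToolChoice) -> String {
    let inner = match choice {
        ToolChoice::Auto => r#""mode":"AUTO""#.to_string(),
        // OpenAI "required" means "call some tool", which Gemini spells ANY.
        ToolChoice::Required => r#""mode":"ANY""#.to_string(),
        ToolChoice::None => r#""mode":"NONE""#.to_string(),
        // A forced function is ANY restricted to a single allowed name.
        ToolChoice::Function(name) => {
            format!(r#""mode":"ANY","allowedFunctionNames":["{}"]"#, name)
        }
    };
    format!(r#"{{"functionCallingConfig":{{{}}}}}"#, inner)
}
```

Note that `"required"` and a forced function both map to `ANY`; only the presence of `allowedFunctionNames` distinguishes them.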

### Tool Call Response

| OpenAI (what we return)                                                                            | Gemini (what Google returns)                                    |
| -------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- |
| `output: [{"type":"function_call","call_id":"call_xxx","name":"get_weather","arguments":"{...}"}]` | `parts: [{"functionCall":{"name":"get_weather","args":{...}}}]` |

### Tool Result Submission

| OpenAI (what client sends)                                                       | Gemini (what we inject into Google request)                                                                                  |
| -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `input: [{"type":"function_call_output","call_id":"call_xxx","output":"{...}"}]` | `contents: [{role:"model",parts:[{functionCall:...}]},{role:"user",parts:[{functionResponse:{name:"...",response:{...}}}]}]` |

---

## Implementation Phases

### Phase 1: Store Infrastructure (`store.rs`)

Add to `MitmStore`:

```rust
/// Active tool definitions (Gemini format) for MITM injection.
active_tools: Arc<RwLock<Option<Vec<Value>>>>,
/// Active tool config (Gemini toolConfig format).
active_tool_config: Arc<RwLock<Option<Value>>>,
/// Pending tool results for MITM to inject as functionResponse.
pending_tool_results: Arc<RwLock<Vec<PendingToolResult>>>,
/// Mapping call_id → function name for tool result routing.
call_id_to_name: Arc<RwLock<HashMap<String, String>>>,
/// Last captured function calls (for conversation history rewriting).
last_function_calls: Arc<RwLock<Vec<CapturedFunctionCall>>>,
```

New types:

```rust
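/// A tool result submitted by the client, waiting to be injected as a Gemini `functionResponse`.
#[derive(Debug, Clone)]
pub struct PendingToolResult {
    /// Function name, resolved from the client's `call_id`.
    pub name: String,
    /// Tool output, injected as `functionResponse.response`.
    pub result: serde_json::Value,
}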
```

New methods:

- `set_tools(tools)` / `get_tools()` / `clear_tools()`
- `set_tool_config(config)` / `get_tool_config()`
- `add_tool_result(result)` / `take_tool_results()`
- `register_call_id(call_id, name)` / `lookup_call_id(call_id)`
- `set_last_function_calls(calls)` / `get_last_function_calls()`
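
A reduced sketch of two of these method pairs is shown below, mainly to illustrate the difference between `take_*` (drains the queue so results are injected once) and `lookup_*` (non-destructive read). It is an assumption-laden illustration: it uses `std::sync::RwLock` and stores the result as raw JSON text so that it compiles without dependencies, while the real `MitmStore` holds `serde_json::Value` and may use an async lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for `PendingToolResult`; `result` is raw JSON text here
/// instead of `serde_json::Value` so the sketch needs only the standard library.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingToolResult {
    pub name: String,
    pub result: String,
}

/// Reduced `MitmStore` holding only the tool-result and call-id state.
#[derive(Clone, Default)]
pub struct MitmStore {
    pending_tool_results: Arc<RwLock<Vec<PendingToolResult>>>,
    call_id_to_name: Arc<RwLock<HashMap<String, String>>>,
}

impl MitmStore {
    /// Queue a tool result for the next MITM-modified request.
    pub fn add_tool_result(&self, result: PendingToolResult) {
        self.pending_tool_results.write().unwrap().push(result);
    }

    /// Drain all queued results, leaving the queue empty.
    pub fn take_tool_results(&self) -> Vec<PendingToolResult> {
        std::mem::take(&mut *self.pending_tool_results.write().unwrap())
    }

    /// Remember which function a generated `call_id` refers to.
    pub fn register_call_id(&self, call_id: &str, name: &str) {
        self.call_id_to_name
            .write()
            .unwrap()
            .insert(call_id.to_string(), name.to_string());
    }

    /// Resolve a `call_id` back to its function name, if known.
    pub fn lookup_call_id(&self, call_id: &str) -> Option<String> {
        self.call_id_to_name.read().unwrap().get(call_id).cloned()
    }
}
```

Because every field sits behind an `Arc`, cloning the store is cheap and all clones share the same state, which is what lets the API handler and the MITM proxy see each other's writes.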

### Phase 2: Request Types (`types.rs`)

Add to `ResponsesRequest`:

```rust
#[serde(default)]
pub tools: Option<Vec<serde_json::Value>>,
#[serde(default)]
pub tool_choice: Option<serde_json::Value>,
```

New output builder:

```rust
pub fn build_function_call_output(call_id: &str, name: &str, arguments: &str) -> Value
```

### Phase 3: Format Conversion + Dynamic Injection (`modify.rs`)

New public struct:

```rust
pub struct ToolContext {
    pub tools: Option<Vec<Value>>,          // Gemini functionDeclarations
    pub tool_config: Option<Value>,         // Gemini toolConfig
    pub pending_results: Vec<PendingToolResult>,  // Tool results to inject
    pub last_calls: Vec<CapturedFunctionCall>,    // For history rewriting
}
```

New conversion functions:

```rust
pub fn openai_tools_to_gemini(tools: &[Value]) -> Vec<Value>     // OAI → Gemini format
pub fn openai_tool_choice_to_gemini(choice: &Value) -> Value     // OAI → Gemini toolConfig
fn uppercase_types(val: Value) -> Value                          // Recursive type case fix
```
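
As an illustration of the recursive case fix, here is a minimal sketch of `uppercase_types`. It assumes a small hand-rolled `Json` enum as a stand-in for `serde_json::Value` so that it compiles with the standard library alone; the traversal would look the same over `Value::Object` and `Value::Array`.

```rust
use std::collections::BTreeMap;

/// Minimal JSON tree, a stand-in for `serde_json::Value` in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Recursively uppercase every string-valued `"type"` key.
/// A property that happens to be *named* `type` maps to a schema object,
/// not a string, so it is recursed into rather than rewritten.
pub fn uppercase_types(val: Json) -> Json {
    match val {
        Json::Obj(map) => Json::Obj(
            map.into_iter()
                .map(|(key, child)| {
                    let converted = match child {
                        Json::Str(s) if key == "type" => Json::Str(s.to_uppercase()),
                        other => uppercase_types(other),
                    };
                    (key, converted)
                })
                .collect(),
        ),
        Json::Arr(items) => Json::Arr(items.into_iter().map(uppercase_types).collect()),
        // Scalars are returned unchanged.
        other => other,
    }
}
```

One limitation of this sketch: a JSON Schema `type` given as an array (for example `["string","null"]`) is left untouched, since only string values under a `type` key are rewritten.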

Change `modify_request` signature:

```rust
pub fn modify_request(body: &[u8], tool_ctx: Option<&ToolContext>) -> Option<Vec<u8>>
```

Tool injection logic:

1. Strip all LS tools (existing)
2. If `tool_ctx.tools` provided → inject as Gemini `functionDeclarations`
3. If `tool_ctx.tool_config` provided → inject as `toolConfig`
4. If `tool_ctx.pending_results` not empty → rewrite conversation history:
   - Find model turn with "Tool call completed" → replace with `functionCall` parts
   - Find last user turn → prepend `functionResponse` part

### Phase 4: MITM Plumbing (`proxy.rs`)

In `handle_http_over_tls`, before calling `modify_request`:

1. Read `get_tools()`, `get_tool_config()`, `take_tool_results()`, `get_last_function_calls()` from store
2. Build `ToolContext`
3. Pass to `modify_request(body, tool_ctx)`

After response capture:

1. Save captured function calls as `last_function_calls` (for future history rewriting)

### Phase 5: API Handler (`responses.rs`)

#### Request handling (in `handle_responses`):

1. If `body.tools` provided:
   - Convert OpenAI → Gemini format via `openai_tools_to_gemini()`
   - Store in `MitmStore` via `set_tools()`
2. If `body.tool_choice` provided:
   - Convert via `openai_tool_choice_to_gemini()`
   - Store in `MitmStore` via `set_tool_config()`
3. Check `body.input` for `function_call_output` items:
   - If found: look up `call_id` → function name via `lookup_call_id()`
   - Store as `PendingToolResult` via `add_tool_result()`
   - Extract any accompanying text to use as the turn's input (or use a placeholder when there is none)

#### Response handling (in `handle_responses_sync` / `handle_responses_stream`):

After polling completes:

1. Check `take_any_function_calls()` for captured tool calls
2. If captured:
   - Generate `call_id` for each (e.g., `"call_" + random`)
   - Register `call_id → name` mapping via `register_call_id()`
   - Build `function_call` output items via `build_function_call_output()`
   - Return these INSTEAD of the text message output
3. If no tool calls: existing text response behavior
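
For the `call_id` generation in step 2, a dependency-free sketch might look like the following. It is only one possible scheme: the plan says `"call_" + random`, while this version derives uniqueness from a process-wide counter plus a timestamp so that it needs no random-number crate. `next_call_id` and `assign_call_ids` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Process-wide sequence number; guarantees distinct ids within one run.
static CALL_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Generate an id of the form `call_<millis>_<seq>` (both in hex).
/// Not cryptographically random; uniqueness comes from the counter.
pub fn next_call_id() -> String {
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis())
        .unwrap_or(0);
    let seq = CALL_COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("call_{:x}_{:x}", millis, seq)
}

/// Pair each captured function name with a fresh call id, preserving order.
/// Each pair is what would be passed to `register_call_id` and then used
/// to build one `function_call` output item.
pub fn assign_call_ids(names: &[&str]) -> Vec<(String, String)> {
    names
        .iter()
        .map(|name| (next_call_id(), name.to_string()))
        .collect()
}
```

Assigning one id per captured call, rather than one per response, is what lets parallel calls to the same function be told apart when their results come back.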

### Phase 6: Gemini-Native Endpoint (`gemini.rs` + `mod.rs`)

New file `src/api/gemini.rs` with handler `handle_gemini`:

- Accepts tools in Gemini `functionDeclarations` format directly (no conversion)
- Accepts `toolConfig` directly
- Returns `functionCall` in Gemini format directly
- Same cascade/session management as responses.rs
- Much simpler — no format translation

Route: `POST /v1/gemini` in `mod.rs`

---

## File Change Summary

| File                   | Changes                                                                 | Complexity |
| ---------------------- | ----------------------------------------------------------------------- | ---------- |
| `src/mitm/store.rs`    | Add tool context storage (5 new fields, ~10 methods)                    | Medium     |
| `src/api/types.rs`     | Add `tools`/`tool_choice` to request, add output builder                | Low        |
| `src/mitm/modify.rs`   | `ToolContext`, format conversion, dynamic injection, history rewrite    | High       |
| `src/mitm/proxy.rs`    | Read store → build ToolContext → pass to modify                         | Low        |
| `src/api/responses.rs` | Store tools, detect tool results in input, return function_call outputs | High       |
| `src/api/gemini.rs`    | New file — Gemini-native endpoint (passthrough)                         | Medium     |
| `src/api/mod.rs`       | Add route + module declaration                                          | Low        |

## Implementation Order

1. `store.rs` — foundation, no dependencies
2. `types.rs` — request/response types
3. `modify.rs` — format conversion + injection (depends on store types)
4. `proxy.rs` — plumbing (depends on modify signature)
5. Build + verify compilation
6. `responses.rs` — handler changes (depends on all above)
7. Build + test with `get_weather` request
8. `gemini.rs` + `mod.rs` — Gemini endpoint
9. Build + test with Gemini format
10. Tool result flow test (multi-turn)

## Testing Strategy

### Test 1: Basic tool call (sync)

```bash
curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": "What is the weather in Tokyo?",
  "tools": [{"type":"function","function":{"name":"get_weather","description":"Get weather","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}],
  "tool_choice": "auto",
  "conversation": "tool-test",
  "stream": false
}'
# Expected: output contains function_call with name=get_weather, arguments={"city":"Tokyo"}
```

### Test 2: Tool result submission (multi-turn)

```bash
# Replace call_xxx with the call_id returned by Test 1.
curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": [{"type":"function_call_output","call_id":"call_xxx","output":"{\"temp\":72,\"unit\":\"F\"}"}],
  "conversation": "tool-test",
  "stream": false
}'
# Expected: output contains text response using the tool result
```

### Test 3: Gemini-native endpoint

```bash
curl -s http://localhost:8741/v1/gemini -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": "What is the weather in Tokyo?",
  "tools": [{"functionDeclarations":[{"name":"get_weather","description":"Get weather","parameters":{"type":"OBJECT","properties":{"city":{"type":"STRING"}},"required":["city"]}}]}],
  "conversation": "gemini-tool-test",
  "stream": false
}'
# Expected: response contains functionCall in Gemini format
```

### Test 4: No tools (regression)

```bash
curl -s http://localhost:8741/v1/responses -H "Content-Type: application/json" -d '{
  "model": "gemini-3-flash",
  "input": "What is 2+2?",
  "stream": false
}'
# Expected: normal text response, no tool call behavior
```

## Risks & Mitigations

| Risk                                                             | Impact | Mitigation                                                                |
| ---------------------------------------------------------------- | ------ | ------------------------------------------------------------------------- |
| History rewriting breaks conversation                            | High   | Only rewrite when pending_results non-empty; keep original as fallback    |
| LS times out waiting for Google response during tool result turn | Medium | Increase timeout for tool result turns                                    |
| Multiple parallel tool calls create race conditions              | Medium | AtomicBool + sequential processing already handles this                   |
| `modify_request` test breakage                                   | Low    | Update existing tests for new signature                                   |
| Global tool storage conflicts across concurrent requests         | Medium | Not an issue — LS processes one request at a time (single cascade active) |