feat: completions API improvements, gemini endpoint, response types

2026-02-15 17:08:53 -06:00
parent afa96b88a5
commit ca9f808ee3
8 changed files with 1031 additions and 742 deletions
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -47,17 +47,18 @@ sudo ./scripts/mitm-redirect.sh status     # check current state

 ## Endpoints

-| Method   | Path                   | Description                                                 |
-| -------- | ---------------------- | ----------------------------------------------------------- |
-| `POST`   | `/v1/responses`        | **Responses API** (primary) — supports `stream: true/false` |
-| `POST`   | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim)                   |
-| `GET`    | `/v1/models`           | List available models                                       |
-| `GET`    | `/v1/sessions`         | List active sessions                                        |
-| `DELETE` | `/v1/sessions/:id`     | Delete a session                                            |
-| `POST`   | `/v1/token`            | Set OAuth token at runtime                                  |
-| `GET`    | `/v1/usage`            | MITM-intercepted token usage stats                          |
-| `GET`    | `/v1/quota`            | LS quota — credits, per-model rate limits, reset timers     |
-| `GET`    | `/health`              | Health check                                                |
+| Method     | Path                   | Description                                                 |
+| ---------- | ---------------------- | ----------------------------------------------------------- |
+| `POST`     | `/v1/responses`        | **Responses API** (primary) — supports `stream: true/false` |
+| `POST`     | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim)                   |
+| `GET/POST` | `/v1/search`           | **Web Search** — Google Search grounding, returns results   |
+| `GET`      | `/v1/models`           | List available models                                       |
+| `GET`      | `/v1/sessions`         | List active sessions                                        |
+| `DELETE`   | `/v1/sessions/:id`     | Delete a session                                            |
+| `POST`     | `/v1/token`            | Set OAuth token at runtime                                  |
+| `GET`      | `/v1/usage`            | MITM-intercepted token usage stats                          |
+| `GET`      | `/v1/quota`            | LS quota — credits, per-model rate limits, reset timers     |
+| `GET`      | `/health`              | Health check                                                |

 ## Available Models

@@ -116,8 +117,8 @@ curl -s http://localhost:8741/v1/responses \
  }' | jq .

 # Follow-up in same cascade:
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
+curl -s http://localhost:8741/v1/responses \\
+  -H "Content-Type: application/json" \\
  -d '{
    "model": "gemini-3-flash",
    "input": "Now multiply that by 10",
@@ -126,6 +127,64 @@ curl -s http://localhost:8741/v1/responses \
  }' | jq .
 ```

+## Web Search
+
+The proxy supports Google Search grounding in two ways:
+
+### 1. Dedicated Search Endpoint (`/v1/search`)
+
+Returns structured search results with citations:
+
+```bash
+# Quick GET search
+curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | jq .
+
+# Full POST search with options
+curl -s http://localhost:8741/v1/search \
+  -H "Content-Type: application/json" \
+  -d '{
+    "query": "latest Rust programming news",
+    "model": "gemini-3-flash",
+    "timeout": 30
+  }' | jq .
+```
+
+The response includes `summary`, `results[]` (title and URL), `citations[]`, and the raw `grounding_metadata`.
+
+### 2. Inline Grounding (on any endpoint)
+
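+Enable Google Search grounding on regular requests. Each API uses its own switch: `web_search` for Chat Completions, a `web_search_preview` tool for Responses, and `google_search` for the Gemini endpoint: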
+
+```bash
+# Chat Completions API
+curl -s http://localhost:8741/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gemini-3-flash",
+    "messages": [{"role": "user", "content": "What happened in tech today?"}],
+    "web_search": true
+  }' | jq .
+
+# Responses API (OpenAI-style tool)
+curl -s http://localhost:8741/v1/responses \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gemini-3-flash",
+    "input": "What happened in tech today?",
+    "tools": [{"type": "web_search_preview"}],
+    "stream": false
+  }' | jq .
+
+# Gemini API
+curl -s http://localhost:8741/v1/gemini \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gemini-3-flash",
+    "message": "What happened in tech today?",
+    "google_search": true
+  }' | jq .
+```
+
 ## Authentication

 The proxy needs an OAuth token. Three ways to provide it: