feat: completions API improvements, gemini endpoint, response types

Nikketryhard
2026-02-15 17:08:53 -06:00
parent afa96b88a5
commit ca9f808ee3
8 changed files with 1031 additions and 742 deletions


@@ -47,17 +47,18 @@ sudo ./scripts/mitm-redirect.sh status # check current state
## Endpoints
| Method | Path | Description |
| -------- | ---------------------- | ----------------------------------------------------------- |
| `POST` | `/v1/responses` | **Responses API** (primary) — supports `stream: true/false` |
| `POST` | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim) |
| `GET` | `/v1/models` | List available models |
| `GET` | `/v1/sessions` | List active sessions |
| `DELETE` | `/v1/sessions/:id` | Delete a session |
| `POST` | `/v1/token` | Set OAuth token at runtime |
| `GET` | `/v1/usage` | MITM-intercepted token usage stats |
| `GET` | `/v1/quota` | LS quota — credits, per-model rate limits, reset timers |
| `GET` | `/health` | Health check |
| Method | Path | Description |
| ---------- | ---------------------- | ----------------------------------------------------------- |
| `POST` | `/v1/responses` | **Responses API** (primary) — supports `stream: true/false` |
| `POST` | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim) |
| `GET/POST` | `/v1/search` | **Web Search** — Google Search grounding, returns results |
| `GET` | `/v1/models` | List available models |
| `GET` | `/v1/sessions` | List active sessions |
| `DELETE` | `/v1/sessions/:id` | Delete a session |
| `POST` | `/v1/token` | Set OAuth token at runtime |
| `GET` | `/v1/usage` | MITM-intercepted token usage stats |
| `GET` | `/v1/quota` | LS quota — credits, per-model rate limits, reset timers |
| `GET` | `/health` | Health check |
## Available Models
@@ -116,8 +117,8 @@ curl -s http://localhost:8741/v1/responses \
}' | jq .
# Follow-up in same cascade:
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "Now multiply that by 10",
@@ -126,6 +127,64 @@ curl -s http://localhost:8741/v1/responses \
}' | jq .
```
## Web Search
The proxy supports Google Search grounding in two ways:
### 1. Dedicated Search Endpoint (`/v1/search`)
Returns structured search results with citations:
```bash
# Quick GET search
curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | jq .
# Full POST search with options
curl -s http://localhost:8741/v1/search \
-H "Content-Type: application/json" \
-d '{
"query": "latest Rust programming news",
"model": "gemini-3-flash",
"timeout": 30
}' | jq .
```
The response includes `summary`, `results[]` (title and URL), `citations[]`, and the raw `grounding_metadata`.
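To pull just the result links out of that response, something like the following may work. It is only a sketch: it assumes each entry in `results[]` exposes `title` and `url` keys (the exact key names are not confirmed here) and that `jq` is installed. `print_results` is a hypothetical helper name, not part of the proxy.

```bash
# Hypothetical helper: read a /v1/search response on stdin and print
# one "title<TAB>url" line per result.
# Assumption: results[] entries use the keys "title" and "url".
print_results() {
  # `[]?` yields nothing (instead of an error) when "results" is absent.
  jq -r '.results[]? | "\(.title)\t\(.url)"'
}

# Usage against a running proxy:
#   curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | print_results
```

For queries with spaces or punctuation, curl's `-G --data-urlencode 'q=...'` can build the encoded query string instead of writing the `+` signs by hand.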
### 2. Inline Grounding (on any endpoint)
Enable Google Search grounding on regular requests:
```bash
# Completions API
curl -s http://localhost:8741/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"messages": [{"role": "user", "content": "What happened in tech today?"}],
"web_search": true
}' | jq .
# Responses API (OpenAI-style tool)
curl -s http://localhost:8741/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"input": "What happened in tech today?",
"tools": [{"type": "web_search_preview"}],
"stream": false
}' | jq .
# Gemini API
curl -s http://localhost:8741/v1/gemini \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash",
"message": "What happened in tech today?",
"google_search": true
}' | jq .
```
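Each endpoint above spells the grounding switch differently, which is easy to mix up in scripts. As a quick reference, the mapping can be sketched as a small shell function. `grounding_fragment` is a hypothetical name, and it only covers the three endpoints shown in these examples.

```bash
# Hypothetical helper: print the JSON fragment that enables Google Search
# grounding for a given endpoint path, mirroring the three requests above.
grounding_fragment() {
  case "$1" in
    /v1/chat/completions) printf '%s\n' '"web_search": true' ;;
    /v1/responses)        printf '%s\n' '"tools": [{"type": "web_search_preview"}]' ;;
    /v1/gemini)           printf '%s\n' '"google_search": true' ;;
    *)
      # Unknown path: report on stderr and signal failure to the caller.
      echo "no grounding flag known for $1" >&2
      return 1
      ;;
  esac
}

# Usage:
#   grounding_fragment /v1/gemini
```

Returning a non-zero status for unknown paths lets a calling script stop early instead of sending a request without grounding.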
## Authentication
The proxy needs an OAuth token. Three ways to provide it: