feat: completions API improvements, gemini endpoint, response types
85
GEMINI.md
@@ -47,17 +47,18 @@ sudo ./scripts/mitm-redirect.sh status # check current state
 
 ## Endpoints
 
-| Method   | Path                   | Description                                                 |
-| -------- | ---------------------- | ----------------------------------------------------------- |
-| `POST`   | `/v1/responses`        | **Responses API** (primary) — supports `stream: true/false` |
-| `POST`   | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim)                   |
-| `GET`    | `/v1/models`           | List available models                                       |
-| `GET`    | `/v1/sessions`         | List active sessions                                        |
-| `DELETE` | `/v1/sessions/:id`     | Delete a session                                            |
-| `POST`   | `/v1/token`            | Set OAuth token at runtime                                  |
-| `GET`    | `/v1/usage`            | MITM-intercepted token usage stats                          |
-| `GET`    | `/v1/quota`            | LS quota — credits, per-model rate limits, reset timers     |
-| `GET`    | `/health`              | Health check                                                |
+| Method     | Path                   | Description                                                 |
+| ---------- | ---------------------- | ----------------------------------------------------------- |
+| `POST`     | `/v1/responses`        | **Responses API** (primary) — supports `stream: true/false` |
+| `POST`     | `/v1/chat/completions` | Chat Completions API (OpenAI compat shim)                   |
+| `GET/POST` | `/v1/search`           | **Web Search** — Google Search grounding, returns results   |
+| `GET`      | `/v1/models`           | List available models                                       |
+| `GET`      | `/v1/sessions`         | List active sessions                                        |
+| `DELETE`   | `/v1/sessions/:id`     | Delete a session                                            |
+| `POST`     | `/v1/token`            | Set OAuth token at runtime                                  |
+| `GET`      | `/v1/usage`            | MITM-intercepted token usage stats                          |
+| `GET`      | `/v1/quota`            | LS quota — credits, per-model rate limits, reset timers     |
+| `GET`      | `/health`              | Health check                                                |
 
 ## Available Models
 
@@ -116,8 +117,8 @@ curl -s http://localhost:8741/v1/responses \
   }' | jq .
 
 # Follow-up in same cascade:
-curl -s http://localhost:8741/v1/responses \
-  -H "Content-Type: application/json" \
+curl -s http://localhost:8741/v1/responses \\
+  -H "Content-Type: application/json" \\
   -d '{
     "model": "gemini-3-flash",
     "input": "Now multiply that by 10",
@@ -126,6 +127,64 @@ curl -s http://localhost:8741/v1/responses \
   }' | jq .
 ```
 
+## Web Search
+
+The proxy supports Google Search grounding in two ways:
+
+### 1. Dedicated Search Endpoint (`/v1/search`)
+
+Returns structured search results with citations:
+
+```bash
+# Quick GET search
+curl -s 'http://localhost:8741/v1/search?q=latest+rust+news' | jq .
+
+# Full POST search with options
+curl -s http://localhost:8741/v1/search \\
+  -H "Content-Type: application/json" \\
+  -d '{
+    "query": "latest Rust programming news",
+    "model": "gemini-3-flash",
+    "timeout": 30
+  }' | jq .
+```
+
+Response includes `summary`, `results[]` (title + URL), `citations[]`, and raw `grounding_metadata`.
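Note on the GET form: the `q` parameter in the quick search example is already form-encoded (`latest+rust+news`). For arbitrary queries, something like the sketch below could build that value in plain bash. `urlencode_query` is a hypothetical helper, not part of the proxy, and the sketch assumes ASCII input and that the proxy decodes `+` as a space, as the example URL suggests.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the proxy): form-encode a string so it
# can be used as the value of ?q= in a GET /v1/search request.
urlencode_query() {
  local LC_ALL=C                 # walk the string byte by byte
  local s="$1" out="" c hex i
  for ((i = 0; i < ${#s}; i++)); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9._~-]) out+="$c" ;;          # unreserved characters pass through
      ' ')             out+='+' ;;           # form encoding: space becomes '+'
      *) printf -v hex '%%%02X' "'$c"        # everything else becomes %XX
         out+="$hex" ;;
    esac
  done
  printf '%s\n' "$out"
}

q="$(urlencode_query 'latest rust news')"
echo "http://localhost:8741/v1/search?q=${q}"
# With the proxy running, the request itself would be:
#   curl -s "http://localhost:8741/v1/search?q=${q}" | jq .
```

When curl is the client, its own `-G` and `--data-urlencode` options can do the encoding instead of a hand-written helper.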
+
+### 2. Inline Grounding (on any endpoint)
+
+Enable Google Search grounding on regular requests:
+
+```bash
+# Completions API
+curl -s http://localhost:8741/v1/chat/completions \\
+  -H "Content-Type: application/json" \\
+  -d '{
+    "model": "gemini-3-flash",
+    "messages": [{"role": "user", "content": "What happened in tech today?"}],
+    "web_search": true
+  }' | jq .
+
+# Responses API (OpenAI-style tool)
+curl -s http://localhost:8741/v1/responses \\
+  -H "Content-Type: application/json" \\
+  -d '{
+    "model": "gemini-3-flash",
+    "input": "What happened in tech today?",
+    "tools": [{"type": "web_search_preview"}],
+    "stream": false
+  }' | jq .
+
+# Gemini API
+curl -s http://localhost:8741/v1/gemini \\
+  -H "Content-Type: application/json" \\
+  -d '{
+    "model": "gemini-3-flash",
+    "message": "What happened in tech today?",
+    "google_search": true
+  }' | jq .
+```
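Note on the three inline-grounding requests: each endpoint switches grounding on with a differently named field. A small lookup like the one below keeps that mapping in one place when scripting against the proxy. `grounding_field` is a hypothetical helper; the field names are taken from the three requests in this hunk, and nothing else about the proxy's behavior is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the proxy): print the JSON fragment that
# enables Google Search grounding for a given endpoint path.
grounding_field() {
  case "$1" in
    /v1/chat/completions) printf '%s\n' '"web_search": true' ;;
    /v1/responses)        printf '%s\n' '"tools": [{"type": "web_search_preview"}]' ;;
    /v1/gemini)           printf '%s\n' '"google_search": true' ;;
    *) echo "no grounding field known for $1" >&2; return 1 ;;
  esac
}

# Print the endpoint-to-field mapping as a quick reference.
for ep in /v1/chat/completions /v1/responses /v1/gemini; do
  printf '%-22s %s\n' "$ep" "$(grounding_field "$ep")"
done
```

Unknown paths return a non-zero status, so a calling script can fail early instead of sending a request without grounding.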
+
 ## Authentication
 
 The proxy needs an OAuth token. Three ways to provide it: