102 Commits

Author SHA1 Message Date
Nikketryhard
9ae6fa6eaf docs: improve README presentation by centering text and updating mermaid diagram colors. 2026-02-18 02:29:46 -06:00
Nikketryhard
f9fa9c6f22 docs: add project logo to README and adjust table formatting. 2026-02-18 02:28:04 -06:00
Nikketryhard
c7231e5590 feat: add Windows setup script (scheduled task) 2026-02-18 02:14:33 -06:00
Nikketryhard
8a9662edea feat: add cross-platform support via platform detection module
Introduces src/platform.rs with OS detection and env var overrides.
All hardcoded Linux paths replaced with Platform::detect() across
8 source files. Key changes:

- New Platform struct with 11 fields (all overridable via env vars)
- /proc/ access gated to Linux (#[cfg(target_os = "linux")])
- pgrep/pkill patterns broadened for cross-platform LS discovery
- sec-ch-ua-platform header now dynamic per OS
- Token, traces, config, CA cert paths use platform module
- LD_PRELOAD DNS redirect gated to Linux only
- Setup scripts for Linux (systemd) and macOS (launchd)
- find_ls_binary_path has cross-platform stubs

All 46 tests pass, cargo check clean.
2026-02-18 02:13:23 -06:00
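
The detect-then-override pattern described in 8a9662edea can be sketched roughly as follows. This is a minimal illustration, not the contents of src/platform.rs: only two of the 11 fields are shown, the variable name ZEROGRAVITY_CONFIG_DIR and the helper from_lookup are hypothetical, and per-OS default paths are omitted.

```rust
use std::env;
use std::path::PathBuf;

/// Operating systems the sketch distinguishes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Os {
    Linux,
    MacOs,
    Windows,
    Other,
}

/// Two illustrative fields; the commit message mentions 11.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Platform {
    pub os: Os,
    pub config_dir: PathBuf,
}

impl Platform {
    /// Detect using the real process environment.
    pub fn detect() -> Self {
        Self::from_lookup(|key| env::var(key).ok())
    }

    /// Same logic with an injectable lookup (hypothetical helper), so the
    /// override behaviour can be exercised without mutating the environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let os = match env::consts::OS {
            "linux" => Os::Linux,
            "macos" => Os::MacOs,
            "windows" => Os::Windows,
            _ => Os::Other,
        };
        // Hypothetical override variable; the real names are not in the log.
        let config_dir = lookup("ZEROGRAVITY_CONFIG_DIR")
            .map(PathBuf::from)
            .unwrap_or_else(|| {
                // Fall back to ~/.config/zerogravity, the path named in 00587fcce8.
                let home = lookup("HOME").unwrap_or_else(|| ".".to_string());
                PathBuf::from(home).join(".config").join("zerogravity")
            });
        Platform { os, config_dir }
    }
}
```

Calling Platform::detect() once at startup and passing the result down keeps OS checks out of the other source files, which is the effect the commit describes.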
Nikketryhard
7136c0e53c docs: mark /v1/search as WIP in endpoint tables 2026-02-18 02:00:57 -06:00
Nikketryhard
1a5075dd20 refactor: remove /v1/gemini endpoint, replaced by /v1beta routes
- Delete handle_gemini handler (identical to handle_gemini_v1beta)
- Remove /v1/gemini route from router
- Update root handler service name to zerogravity
- Clean all doc references
2026-02-18 01:59:22 -06:00
Nikketryhard
59ed872ed3 chore: fix remaining Antigravity Proxy refs, add systemd unit
- Rename CA org to ZeroGravity
- Fix lib.rs docstring
- Fix mitm-redirect.sh comment
- Fix README title
2026-02-18 01:56:43 -06:00
Nikketryhard
00587fcce8 feat: rebrand to ZeroGravity, replace proxyctl with zg Rust binary
Phase 1 - Rename:
- Crate: antigravity-proxy -> zerogravity
- Env: ANTIGRAVITY_OAUTH_TOKEN -> ZEROGRAVITY_TOKEN
- Paths: ~/.config/antigravity-proxy -> ~/.config/zerogravity
- Paths: /tmp/antigravity-* -> /tmp/zerogravity-*
- User: antigravity-ls -> zerogravity-ls
- Service: antigravity-proxy -> zerogravity

Phase 2 - zg daemon manager:
- New Rust binary src/bin/zg.rs replaces scripts/proxyctl bash
- Commands: start, stop, restart, rebuild, status, logs, test, health
- Auto-resolves project dir from binary location
- All commands exit immediately (safe for agent fast-bash)
2026-02-18 01:54:54 -06:00
Nikketryhard
409ee97405 fix: replace \\n with <br/> in mermaid node labels 2026-02-18 01:35:12 -06:00
Nikketryhard
3d87c04d20 docs: overhaul docs, add architecture and traces, update README/GEMINI
- Add docs/architecture.md with 4 mermaid diagrams
- Add docs/mitm.md with 3 mermaid diagrams (replaces mitm-interception-status)
- Add docs/traces.md documenting per-call trace system
- Rewrite README.md to be concise with mermaid and doc refs
- Rewrite GEMINI.md for core philosophy and agent usage
- Clean extension-server-analysis.md (remove stale debug sections)
- Delete temp docs: standalone-ls-todo, panel-stream-investigation,
  endpoint-gap-analysis, request-comparison
2026-02-18 01:31:18 -06:00
Nikketryhard
28d3296c87 fix: gemini route, usage capture, search timeout, and trace finalization
- Add missing /v1/gemini POST route and handler
- Capture MitmEvent::Usage in gemini sync/streaming handlers
- Add retry counter (max 3) to search handler to prevent hang
- Add trace finalization at all gemini_sync channel exit points
- Fix UpstreamError trace outcome label
- Add timeout trace with error recording
- Dispatch Usage before ResponseComplete in SSE flush
2026-02-18 01:31:18 -06:00
Nikketryhard
48674f65da refactor: decompose large functions and remove dead code
- Decompose modify_request() into 7 single-responsibility helpers
- Decompose handle_http_over_tls(): extract read_full_request, dispatch_stream_events
- Promote connect_upstream/resolve_upstream to module-level functions
- Split standalone.rs (1238 lines) into 4 submodules:
  standalone/mod.rs, spawn.rs, discovery.rs, stub.rs
- Extract proto wire primitives into proto/wire.rs
- Remove 6 dead MitmStore methods
- Remove dead SessionResult, DEFAULT_SESSION, get_or_create
- Remove dead decode_varint_at, extract_conversation_id
- Clean all unused imports across 10 files
- Suppress structural dead_code warnings on deserialization fields

Warnings: 20 -> 0. All 43 tests pass.
2026-02-17 22:27:26 -06:00
Nikketryhard
637fbc0e54 refactor: endpoint parity and proxy improvements
Mixed changes from recent sessions: endpoint feature parity
improvements, proxy bug fixes, and store cleanup.
2026-02-16 21:47:00 -06:00
Nikketryhard
86675fd960 docs: add real request comparison (proxy vs CLIProxyAPI)
Captured actual MITM headers and body from a live request.
CLIProxyAPI side reconstructed from source code.
2026-02-16 21:46:52 -06:00
Nikketryhard
34799fa2a9 feat: add official Gemini v1beta API routes
Replace /v1/gemini with proper Gemini API paths:
- POST /v1beta/models/{model}:generateContent (sync)
- POST /v1beta/models/{model}:streamGenerateContent (streaming)

Model is extracted from URL path. Uses axum wildcard
catch-all since colons in path segments are not supported.
2026-02-16 21:46:52 -06:00
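
Because the model and the action share one path segment separated by a colon, the handler behind the catch-all in 34799fa2a9 has to split them itself. A minimal sketch of that split, assuming the wildcard yields the remainder after /v1beta/models/ (the function and enum names here are hypothetical, and the axum wiring is left out):

```rust
/// The two actions named in the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum GeminiAction {
    GenerateContent,
    StreamGenerateContent,
}

/// Split "<model>:<action>" into its parts.
/// Returns None for anything that does not match one of the two routes.
pub fn parse_model_action(rest: &str) -> Option<(&str, GeminiAction)> {
    let rest = rest.trim_start_matches('/');
    // Split on the last colon so the action is always the final component.
    let (model, action) = rest.rsplit_once(':')?;
    if model.is_empty() || model.contains('/') {
        return None;
    }
    let action = match action {
        "generateContent" => GeminiAction::GenerateContent,
        "streamGenerateContent" => GeminiAction::StreamGenerateContent,
        _ => return None,
    };
    Some((model, action))
}
```

Returning None lets the caller answer with a 404 for unknown actions instead of guessing.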
Nikketryhard
eb4c846b24 feat: match CLIProxyAPI system instruction pattern
Replace custom IGNORE/no-tools messages with CLIProxyAPI-style
multi-part system instruction: part[0] = identity text,
part[1] = Please ignore following [ignore]...[/ignore].
2026-02-16 21:46:52 -06:00
Nikketryhard
cac30067ef feat: add structured output to Gemini endpoint
Gemini endpoint now accepts responseMimeType and responseSchema
fields, injected into Google's generationConfig via MITM. Supports
both snake_case and camelCase aliases.
2026-02-16 19:57:18 -06:00
Nikketryhard
135cd47f8f fix: proxyctl logs no longer hangs
logs command was using journalctl -f (follow) which blocks forever.
Split into three commands:
- logs [N]: show last N lines and exit (default 30)
- logs-follow [N]: tail + follow (old behavior)
- logs-all: full dump
2026-02-16 19:44:23 -06:00
Nikketryhard
a47c572e48 fix: forward Google's exact error messages to client
Root cause: errors from Google were being swallowed, replaced with
placeholders like 'Google API returned HTTP 400' or '[Timeout waiting
for response]', or silently converted to fake 'incomplete' responses.

Changes across all endpoints (/v1/chat/completions, /v1/responses,
/v1/gemini, /v1/search):

Error message fidelity:
- UpstreamError message now includes Google's status prefix: [STATUS] msg
- Falls back to raw body if JSON parsing fails (protobuf, HTML, etc.)
- ErrorDetail gains optional code and param fields

Timeout handling:
- poll_for_response returns UpstreamError(504, DEADLINE_EXCEEDED) on timeout
  instead of '[Timeout waiting for AI response]' placeholder text
- Streaming timeouts emit proper error events, not fake content
- Sync bypass timeouts return 504 Gateway Timeout, not 200 incomplete

Missing error checks added:
- responses.rs sync bypass: added upstream_error check in polling loop
- gemini.rs sync bypass: added upstream_error check in polling loop
- gemini.rs streaming: added upstream_error check in polling loop
  (was completely missing — errors only handled in sync path)

DRY helpers:
- upstream_error_message(): shared exact message extraction
- upstream_error_type(): shared Google→OpenAI error type mapping
- All streaming handlers use these instead of inline formatting
2026-02-16 19:30:32 -06:00
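
The two shared helpers named in a47c572e48 might look roughly like this. The function names come from the commit message, but the signatures and the exact status-to-type table are assumptions made for illustration; only the "[STATUS] msg" prefix, the raw-body fallback, and the OpenAI-style type names appear in the log.

```rust
/// Build the client-facing message: "[STATUS] msg" when Google's JSON error
/// was parsed, otherwise the raw body (protobuf, HTML, etc.).
pub fn upstream_error_message(
    status: Option<&str>,
    message: Option<&str>,
    raw_body: &str,
) -> String {
    match (status, message) {
        (Some(s), Some(m)) => format!("[{}] {}", s, m),
        (None, Some(m)) => m.to_string(),
        _ => raw_body.to_string(),
    }
}

/// Map an upstream HTTP status to an OpenAI-style error type.
/// The table below is an assumed mapping, not taken from the source.
pub fn upstream_error_type(http_status: u16) -> &'static str {
    match http_status {
        400 => "invalid_request_error",
        401 | 403 => "authentication_error",
        429 => "rate_limit_error",
        _ => "server_error",
    }
}
```

Keeping both in one place means every streaming and sync handler reports the same text for the same upstream failure.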
Nikketryhard
931e1cc5a1 chore: remove unused push_tool_round_calls and attach_tool_round_results 2026-02-16 19:22:09 -06:00
Nikketryhard
ba96534ead fix: prevent tool_rounds cross-cascade contamination causing hangs
Root cause: proxy.rs eagerly pushed tool rounds via push_tool_round_calls
when intercepting Google's functionCall response. These stale rounds leaked
into LS follow-up requests, producing malformed history that Google timed
out on (60s 'no upstream response').

Changes:
- Remove push_tool_round_calls from proxy.rs response interception
- proxy.rs: use get_tool_rounds (non-destructive) instead of take_tool_rounds
  so accumulated rounds persist across multiple LS requests per cascade
- responses.rs/gemini.rs: build rounds via take+push+set pattern — each
  handler accumulates its own rounds from get_last_function_calls + results
- completions.rs: unchanged (set_tool_rounds replaces from messages)
- clear_tools: also clears tool_rounds to prevent stale data between sessions
- store.rs: add get_tool_rounds (non-destructive clone) method
2026-02-16 19:21:03 -06:00
Nikketryhard
32f02d6456 fix: extend multi-round tool history to responses and gemini endpoints
- proxy.rs: push_tool_round_calls alongside set_last_function_calls
  when Google responds with functionCall — accumulates rounds
- responses.rs: attach_tool_round_results to pair tool results with
  the correct round instead of flat add_tool_result
- gemini.rs: same attach_tool_round_results integration
- store.rs: add push_tool_round_calls and attach_tool_round_results
  methods for cross-request round accumulation
- Legacy add_tool_result kept for backward compat alongside new path
2026-02-16 19:11:38 -06:00
Nikketryhard
39381a4dfe fix: multi-round tool history rewrite and finishReason handling
- Add ToolRound struct to pair function calls with results per-round
- Replace single-match history rewrite (broke after first round) with
  multi-round loop that rewrites ALL placeholder model turns
- Fix tool result name fallback: use positional index instead of always
  picking the first call
- Set is_complete for any finishReason (FUNCTION_CALL, MAX_TOKENS, etc.)
  not just STOP — prevents response_complete flag from never being set
- Legacy fallback: responses.rs path (single-round via last_calls +
  pending_results) still works when tool_rounds is empty
- Add tests: multi-round rewrite, single-round legacy, no-op, and
  FUNCTION_CALL/MAX_TOKENS finishReason handling
2026-02-16 19:05:37 -06:00
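
The per-round pairing and the positional name fallback from 39381a4dfe can be illustrated with a small sketch. ToolRound is named in the commit; the field layout, the FunctionCall and ToolResult types, and the result_name helper are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionCall {
    pub name: String,
    pub args_json: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    /// Clients do not always echo the function name back.
    pub name: Option<String>,
    pub output: String,
}

/// One round: the calls the model made and the results the client returned.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToolRound {
    pub calls: Vec<FunctionCall>,
    pub results: Vec<ToolResult>,
}

impl ToolRound {
    /// Name for the result at `index`: the explicit name if present,
    /// otherwise the call at the same position (not always the first call).
    pub fn result_name(&self, index: usize) -> Option<&str> {
        let result = self.results.get(index)?;
        result
            .name
            .as_deref()
            .or_else(|| self.calls.get(index).map(|c| c.name.as_str()))
    }
}
```

With rounds kept in a Vec<ToolRound>, the history rewrite can walk every placeholder model turn in order rather than stopping after the first match.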
Nikketryhard
6bda2ecafa fix: tool call race conditions and missing completions tool result extraction
- store.rs: record_function_call now falls back to active_cascade_id
  (matching record_usage behavior) instead of blind _latest fallback
- store.rs: add cascade-aware take_function_calls(cascade_id) method
  with priority: exact match → active cascade → _latest → any key
- completions.rs: extract tool_calls from assistant messages and tool
  results from tool messages, storing them for MITM injection. This was
  the ROOT CAUSE — the completions handler stored tool definitions but
  never extracted tool results, so modify_request couldn't rewrite the
  LS conversation history with proper functionCall/functionResponse
- responses.rs: use cascade-aware take_function_calls for consistency
2026-02-16 18:43:16 -06:00
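
The lookup priority listed in 6bda2ecafa (exact match, then active cascade, then _latest, then any key) can be sketched as below. The CallStore type and its fields are hypothetical stand-ins for MitmStore, and function calls are reduced to strings to keep the example short.

```rust
use std::collections::HashMap;

pub const LATEST_KEY: &str = "_latest";

/// Hypothetical stand-in for the relevant part of MitmStore.
#[derive(Default)]
pub struct CallStore {
    pub active_cascade_id: Option<String>,
    pub calls: HashMap<String, Vec<String>>,
}

impl CallStore {
    /// Remove and return stored calls using the documented priority.
    pub fn take_function_calls(&mut self, cascade_id: Option<&str>) -> Option<Vec<String>> {
        // 1. Exact match on the requested cascade.
        if let Some(id) = cascade_id {
            if let Some(found) = self.calls.remove(id) {
                return Some(found);
            }
        }
        // 2. The cascade currently marked active.
        if let Some(id) = self.active_cascade_id.clone() {
            if let Some(found) = self.calls.remove(&id) {
                return Some(found);
            }
        }
        // 3. The catch-all "_latest" bucket.
        if let Some(found) = self.calls.remove(LATEST_KEY) {
            return Some(found);
        }
        // 4. Any remaining key (HashMap order is unspecified).
        let key = self.calls.keys().next().cloned()?;
        self.calls.remove(&key)
    }
}
```

Each step removes the entry it returns, so a second call with the same id falls through to the next tier instead of returning stale data.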
Nikketryhard
38b4130c55 feat: Implement request generation counter and state management to prevent stale data and unblock Language Server for follow-up requests. 2026-02-16 16:21:52 -06:00
Nikketryhard
e6a339d92e fix: clear request_in_flight when stream ends
Without this, request_in_flight stayed true after tool call streaming,
blocking all subsequent turns until the next completions handler
happened to clear it first.
2026-02-16 01:02:09 -06:00
Nikketryhard
3fdd0368a0 fix: block ALL LS follow-up requests across connections
Move the in-flight blocking check to the top of the LLM request flow,
BEFORE request modification. This catches follow-ups on ALL connections
(the LS opens multiple parallel TLS connections). Only the very first
modified request reaches Google — all others get fake STOP responses.

Previously, each new connection independently allowed one request
through before blocking, letting 4-5 requests leak per turn.
2026-02-16 00:57:33 -06:00
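
Since the LS opens several parallel TLS connections, the check in 3fdd0368a0 has to be a single shared decision rather than a per-connection one. One way to express that, assuming an atomic flag (the source does not say how request_in_flight is stored, and InFlightGate is a hypothetical name):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Shared gate: only the first caller may forward a request upstream.
#[derive(Default)]
pub struct InFlightGate {
    request_in_flight: AtomicBool,
}

impl InFlightGate {
    /// Returns true for exactly one caller until `clear` is called.
    /// Every other caller should be answered with a fake STOP response.
    pub fn try_claim(&self) -> bool {
        self.request_in_flight
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Reset when the stream ends (the fix in e6a339d92e).
    pub fn clear(&self) {
        self.request_in_flight.store(false, Ordering::Release);
    }
}
```

Using compare_exchange makes the test and the set one atomic step, so two connections racing at the same instant cannot both pass.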
Nikketryhard
a8f3c8915f fix: block ALL LS follow-up requests, deduplicate function calls
- Add request_in_flight flag to MitmStore, set immediately when first
  LLM request is forwarded with custom tools active
- Block ALL subsequent LS requests (agentic loop + internal flash-lite)
  with fake SSE responses instead of waiting for response_complete
- Fix function call deduplication: drain() accumulator after storing
  to prevent 3x duplicate tool calls across SSE chunks
- Clear all stale state (response, thinking, function calls, errors)
  at the start of each streaming request
- Handle response_complete with no content (thoughtSignature-only)
  gracefully with timeout instead of infinite hang
2026-02-16 00:51:56 -06:00
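
The deduplication fix in a8f3c8915f comes down to moving calls out of the accumulator instead of copying them on every chunk. A minimal sketch, with a hypothetical CallBuffer type and calls reduced to strings:

```rust
/// Hypothetical buffer: `accumulator` collects calls parsed from SSE chunks,
/// `stored` is what the API handler later reads.
#[derive(Default)]
pub struct CallBuffer {
    accumulator: Vec<String>,
    stored: Vec<String>,
}

impl CallBuffer {
    /// Handle one SSE chunk that may or may not carry new function calls.
    pub fn on_chunk(&mut self, new_calls: &[&str]) {
        self.accumulator.extend(new_calls.iter().map(|c| c.to_string()));
        // drain(..) empties the accumulator, so a call seen in chunk 1 is not
        // stored again when chunks 2 and 3 arrive.
        self.stored.extend(self.accumulator.drain(..));
    }

    pub fn stored(&self) -> &[String] {
        &self.stored
    }
}
```

Had the second line cloned the accumulator instead, a single call followed by two empty chunks would be stored three times, which matches the 3x duplication the commit describes.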
Nikketryhard
5f40385c8d feat: sudoless MITM via LD_PRELOAD DNS redirect
Hook getaddrinfo() via LD_PRELOAD to redirect Google API domain
resolution to 127.0.0.1, combined with a port-modified endpoint URL.
This makes the LS connect directly to the local MITM proxy for ALL
API calls - even the CodeAssistClient which has Proxy:nil hardcoded.

Architecture:
  LS → DNS: googleapis.com → 127.0.0.1 (hooked via getaddrinfo)
     → Connect: 127.0.0.1:MITM_PORT (from -cloud_code_endpoint)
     → MITM proxy intercepts transparent TLS via SNI
     → Forward to real Google API

Key findings from investigation:
- Go uses raw syscalls for connect() (NOT hookable via LD_PRELOAD)
- Go uses libc getaddrinfo() for DNS (hookable via CGO path)
- dns_redirect.so is compiled from embedded C source on first run
- No iptables, no sudo, no CAP_NET_BIND_SERVICE needed
2026-02-15 23:24:43 -06:00
Nikketryhard
6a07786c4e feat: implement headless LS authentication via state sync
Reverse-engineered the UnifiedStateSyncUpdate protocol:
- initial_state field is bytes (not string), contains serialized Topic proto
- Map key for OAuth is 'oauthTokenInfoSentinelKey'
- Row.value is base64-encoded OAuthTokenInfo protobuf
- OAuthTokenInfo includes access_token, token_type, expiry (Timestamp)
- Set far-future expiry (2099) to prevent token expiry errors

Also fixed:
- PushUnifiedStateSyncUpdate returns proper empty proto response
- Stream keep-alive avoids sending empty envelopes (LS rejects nil updates)
- uss-enterprisePreferences topic handled (empty initial state)
2026-02-15 21:40:35 -06:00
Nikketryhard
4e4d8e9474 chore: code cleanup and documentation overhaul
- Remove debug header dump from MITM proxy (was temp debugging code)
- Suppress dead_code warnings for intentional OpenAI compat fields
- Rewrite README with styled mermaid architecture diagrams, full
  feature listing, usage examples, and CLI reference
- Update endpoint-gap-analysis: images implemented, audio only stretch
- Update mitm-interception-status: add request modification and error
  capture components
- Update standalone-ls-todo: add new endpoints to test results
- Zero compiler warnings
2026-02-15 18:27:53 -06:00
Nikketryhard
2882f7cce2 feat: propagate Google upstream errors to client
When Google returns an error (400, 429, 500, etc.), the MITM proxy now
captures it and the API handlers return it immediately instead of
hanging until timeout.

- UpstreamError struct stored in MitmStore
- MITM proxy parses Google error JSON (message + status)
- Polling handler checks for upstream errors each cycle
- Streaming handlers emit response.failed / SSE error events
- Error status mapped to OpenAI-style types (invalid_request_error,
  rate_limit_error, authentication_error, server_error, etc.)
- All handlers clear stale errors at request start
2026-02-15 18:19:38 -06:00
Nikketryhard
371c57bab0 fix: parse flat content arrays in Responses API input
When input is [{type: 'input_image', ...}, {type: 'input_text', text: '...'}],
the code was looking for items with role: 'user' which don't exist in flat
content arrays. Now extracts text from input_text items directly first,
falling back to role-based messages only if no flat text found.

Also adds debug header dump for MITM request forwarding.
2026-02-15 18:10:03 -06:00
Nikketryhard
1a6bfa5b53 fix: update Content-Length header when MITM modifies request body
The MITM modifier kept original HTTP headers (including Content-Length)
when replacing the body. When injecting a ~200KB image into a ~66KB
request, Google would only read Content-Length bytes, then hang waiting
for a new request that never comes.

Now we regex-replace the Content-Length header value to match the actual
rechunked body size after modification.
2026-02-15 18:02:13 -06:00
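
The header rewrite from 1a6bfa5b53 can be sketched without a regex by scanning header lines. Note that the commit itself uses a regex replace; this line-based variant is a substitute chosen to stay within the standard library, and fix_content_length is a hypothetical name.

```rust
/// Rewrite the Content-Length value in a raw HTTP header block so it matches
/// the size of the modified body. Other lines are passed through unchanged.
pub fn fix_content_length(headers: &str, body_len: usize) -> String {
    headers
        .split("\r\n")
        .map(|line| match line.split_once(':') {
            // Header names are case-insensitive.
            Some((name, _)) if name.trim().eq_ignore_ascii_case("content-length") => {
                format!("{}: {}", name, body_len)
            }
            _ => line.to_string(),
        })
        .collect::<Vec<_>>()
        .join("\r\n")
}
```

The length passed in must be the byte length of the final body after any image injection, otherwise the upstream server reads too little or waits for bytes that never arrive.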
Nikketryhard
89bea030cc feat: inject images via MITM layer instead of relying on LS
The LS silently ignores the 'images' field from our
SendUserCascadeMessageRequest proto — it never forwards image data
to Google's API.

New approach: store the image in MitmStore, then the MITM request
modifier injects it as 'inlineData' directly into the last user
message's parts array in the Google API JSON request.

Flow:
  Client → Proxy (decode base64) → MitmStore.set_pending_image()
  LS → Google API → MITM intercepts → inject inlineData part
  → Google receives image + text together

This works for all three API endpoints (responses, completions,
gemini).
2026-02-15 17:57:32 -06:00
Nikketryhard
0a33c1b706 fix: send images as top-level ImageData field, not ChatMessage blob
SendUserCascadeMessageRequest proto field layout (from JS bundle analysis):
- Field 6 is 'images' (repeated ImageData) at the REQUEST level
- NOT a Blob sub-message inside ChatMessage (field 2)

ImageData proto uses base64_data (field 1) + mime_type (field 2),
not raw bytes. The LS was silently ignoring our ChatMessage blob
because the field structure didn't match.

Also protect MITM modifier from stripping messages containing
inlineData (image parts in Google API JSON).
2026-02-15 17:46:41 -06:00
Nikketryhard
2ac2016ed4 fix: resolve symlink in proxyctl before deriving PROJECT_DIR 2026-02-15 17:36:45 -06:00
Nikketryhard
976c44fdd4 feat: add image support across all endpoints (responses, completions, gemini) 2026-02-15 17:25:33 -06:00
Nikketryhard
ca9f808ee3 feat: completions API improvements, gemini endpoint, response types 2026-02-15 17:08:53 -06:00
Nikketryhard
afa96b88a5 chore: remove broken googleSearch grounding and /v1/search endpoint 2026-02-15 17:08:46 -06:00
Nikketryhard
cc5f48967a fix: LS cleanup uses sudo -u for same-UID kill, prevent double kill 2026-02-15 17:08:43 -06:00
Nikketryhard
b1bd57ab5e feat: forward generation params via MITM + add usageMetadata to Gemini
- Add GenerationParams struct to MitmStore for temperature, top_p,
  top_k, max_output_tokens, stop_sequences, frequency/presence_penalty
- MITM modify_request injects params into request.generationConfig
- All 3 endpoints (Completions, Responses, Gemini) store client params
- Add usageMetadata to Gemini sync responses (promptTokenCount,
  candidatesTokenCount, totalTokenCount, thoughtsTokenCount)
- Add generation param fields to GeminiRequest (temperature, topP, etc.)
- Completions stream_options.include_usage emits final usage chunk
- Completions reasoning_tokens in completion_tokens_details
- Update endpoint gap analysis doc (all high-priority gaps resolved)
2026-02-15 14:23:05 -06:00
Nikketryhard
735c3e357d chore: clean up dead code, fix broken test
- Remove unused methods: append_response_text, clear_response,
  has_pending_function_calls, take_function_calls
- Add #[allow(dead_code)] for intentionally kept future-use methods
  and response modification helpers
- Remove unused now_unix import from gemini.rs
- Fix test_modify_strips_all_tools: tools key is removed entirely
  when no custom tools provided, not left as empty array
- Zero warnings, 32 tests passing
2026-02-15 01:14:51 -06:00
Nikketryhard
981fb3b18d fix: resolve cascade correlation, update KNOWN_ISSUES
- MitmStore: added active_cascade_id field with set/get/clear methods
- record_usage() now falls back to active_cascade_id when the heuristic
  cascade hint is absent (fixes usage always going to _latest)
- All three API handlers set active cascade before send_message
- KNOWN_ISSUES: moved 3 issues to resolved:
  - Request modification (already true, was stale entry)
  - Cascade correlation (fixed via active_cascade_id)
  - Progressive thinking streaming (fixed via MITM bypass)
2026-02-15 01:10:34 -06:00
Nikketryhard
b3af73cebd feat: sync all endpoints with MITM LS bypass + real-time thinking streaming
- Responses API (streaming): MITM bypass path polls MitmStore directly
  when custom tools are active, skipping LS step polling entirely.
  Streams thinking text deltas in real-time as they arrive from the MITM.
  Handles function calls, text response, and thinking/reasoning events.

- Responses API (sync): Same MITM bypass for non-streaming responses.
  Polls MitmStore for function calls or completed text before falling
  back to LS path.

- Gemini endpoint: MITM bypass polls MitmStore directly for tool call
  responses, eliminating LS overhead.

- MitmStore: Added captured_thinking_text field with set/peek/take methods
  for real-time thinking text capture from MITM SSE.

- MITM proxy: Now captures both thinking_text and response_text from
  StreamingAccumulator into MitmStore when bypass mode is active.
2026-02-15 01:03:39 -06:00
Nikketryhard
50b53097bc fix: bypass LS entirely when custom tools are active
When custom tools are set, don't forward ANY response from Google
to the LS. Instead, capture text and function calls directly into
MitmStore. The completions handler reads from MitmStore.

This eliminates the LS multi-turn loop (5 requests, 30+ seconds)
that occurred because the LS kept processing responses internally.
Tool calls now return in ~1.3s instead of timing out.
2026-02-15 00:54:40 -06:00
Nikketryhard
ec1c0c700d fix: decouple function call detection from LS step polling
Move MitmStore function call check outside get_steps() block so tool
calls are detected immediately when captured by MITM, regardless of
LS processing state. Also reduce poll interval to 300ms.

The LS can take 20-30s for its internal multi-turn loop. Previously,
function call checks were nested inside the steps block and required
LS to have produced steps. Now the MITM capture is picked up within
300ms of detection.
2026-02-15 00:48:14 -06:00
Nikketryhard
4f08b994c7 fix: include tool results in conversation context
When OpenCode sends follow-up messages with tool results, include
the full conversation (user message, assistant tool calls, and tool
results) in the text sent to the model. Previously only the user
message was extracted, causing the model to never see tool results
and call the same tool repeatedly in an infinite loop.

Also add tool_calls and tool_call_id fields to CompletionMessage.
2026-02-15 00:42:43 -06:00
Nikketryhard
5d4125fa0d fix: suppress dummy text from tool call responses
Check for MITM-captured function calls BEFORE emitting text in the
streaming handler. This prevents the dummy 'Tool call completed'
placeholder (sent to the LS) from leaking to OpenCode, which was
confusing it into infinite loops.

Also removes duplicate function call storage at end of response loop
since they're now stored immediately when detected.
2026-02-15 00:37:39 -06:00
Nikketryhard
502318acec fix: store function calls in MitmStore immediately on detection
Previously, captured function calls were only stored in MitmStore
after the response loop ended. The completions handler polls
take_any_function_calls() during streaming, creating a race condition
where the MitmStore was empty.

Now function calls are stored immediately when parse_streaming_chunk
detects them, in both the initial body and body chunk paths.
2026-02-15 00:28:40 -06:00