feat: sync all endpoints with MITM LS bypass + real-time thinking streaming

- Responses API (streaming): MITM bypass path polls MitmStore directly
  when custom tools are active, skipping LS step polling entirely.
  Streams thinking-text deltas in real time as they arrive from the MITM proxy.
  Handles function calls, text responses, and thinking/reasoning events.

- Responses API (sync): Same MITM bypass for non-streaming responses.
  Polls MitmStore for function calls or completed text before falling
  back to LS path.

- Gemini endpoint: MITM bypass polls MitmStore directly for tool call
  responses, eliminating LS overhead.

- MitmStore: Added captured_thinking_text field with set/peek/take methods
  for real-time thinking text capture from MITM SSE.

- MITM proxy: Now captures both thinking_text and response_text from
  StreamingAccumulator into MitmStore when bypass mode is active.
Nikketryhard
2026-02-15 01:03:39 -06:00
parent 50b53097bc
commit b3af73cebd
5 changed files with 564 additions and 14 deletions


@@ -91,8 +91,10 @@ pub struct MitmStore {
     // ── Direct response capture (bypasses LS) ────────────────────────────
     /// Captured response text from MITM when custom tools are active.
-    /// The completions handler reads this instead of polling LS steps.
+    /// The completions/responses handler reads this instead of polling LS steps.
     captured_response_text: Arc<RwLock<Option<String>>>,
+    /// Captured thinking/reasoning text from MITM (for real-time streaming).
+    captured_thinking_text: Arc<RwLock<Option<String>>>,
     /// Whether the captured response is complete (finishReason received).
     response_complete: Arc<AtomicBool>,
 }
@@ -134,6 +136,7 @@ impl MitmStore {
             call_id_to_name: Arc::new(RwLock::new(HashMap::new())),
             last_function_calls: Arc::new(RwLock::new(Vec::new())),
             captured_response_text: Arc::new(RwLock::new(None)),
+            captured_thinking_text: Arc::new(RwLock::new(None)),
             response_complete: Arc::new(AtomicBool::new(false)),
         }
     }
@@ -414,5 +417,23 @@ impl MitmStore {
     pub async fn clear_response_async(&self) {
         self.response_complete.store(false, Ordering::SeqCst);
         *self.captured_response_text.write().await = None;
+        *self.captured_thinking_text.write().await = None;
     }
+    // ── Thinking text capture ────────────────────────────────────────────
+    /// Set (replace) the captured thinking text.
+    pub async fn set_thinking_text(&self, text: &str) {
+        *self.captured_thinking_text.write().await = Some(text.to_string());
+    }
+    /// Peek at the captured thinking text without consuming it.
+    pub async fn peek_thinking_text(&self) -> Option<String> {
+        self.captured_thinking_text.read().await.clone()
+    }
+    /// Take the captured thinking text (consumes it).
+    pub async fn take_thinking_text(&self) -> Option<String> {
+        self.captured_thinking_text.write().await.take()
+    }
 }
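
A minimal sketch of how a streaming handler might turn the set/peek/take accessors above into text deltas. The handler side is not shown in this diff, so this assumes the proxy writes the full accumulated thinking text on each update and the handler forwards only the unseen suffix. `ThinkingSlot`, `next_delta`, and `demo` are hypothetical names, and `std::sync::RwLock` stands in for the async lock used by `MitmStore` so the example needs no runtime.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical synchronous stand-in for `captured_thinking_text`.
/// The real field sits behind an async RwLock; this is an assumption-laden sketch.
#[derive(Default)]
struct ThinkingSlot {
    text: Arc<RwLock<Option<String>>>,
}

impl ThinkingSlot {
    /// Replace the stored text (mirrors `set_thinking_text`).
    fn set(&self, text: &str) {
        *self.text.write().unwrap() = Some(text.to_string());
    }

    /// Clone the stored text without consuming it (mirrors `peek_thinking_text`).
    fn peek(&self) -> Option<String> {
        self.text.read().unwrap().clone()
    }

    /// Move the stored text out, leaving `None` (mirrors `take_thinking_text`).
    fn take(&self) -> Option<String> {
        self.text.write().unwrap().take()
    }
}

/// Hypothetical helper: return the part of `current` that has not been sent yet
/// and advance `sent` (a byte offset) to the end of `current`.
fn next_delta(current: &str, sent: &mut usize) -> Option<String> {
    // If the text was cleared or replaced by something shorter, start over.
    if current.len() < *sent {
        *sent = 0;
    }
    // `get` returns None instead of panicking if `sent` is not a char boundary,
    // which can only happen when the text was replaced rather than appended to.
    let delta = current.get(*sent..)?;
    if delta.is_empty() {
        return None;
    }
    *sent = current.len();
    Some(delta.to_string())
}

/// Simulate three proxy updates and collect the deltas a handler would emit.
fn demo() -> Vec<String> {
    let slot = ThinkingSlot::default();
    let mut sent = 0usize;
    let mut deltas = Vec::new();
    for snapshot in ["Let me", "Let me", "Let me think"] {
        slot.set(snapshot); // proxy side: store the accumulated text
        if let Some(text) = slot.peek() {
            // handler side: forward only what is new since the last poll
            if let Some(delta) = next_delta(&text, &mut sent) {
                deltas.push(delta);
            }
        }
    }
    deltas
}
```

Under these assumptions, `peek` suits the polling loop because the accumulated text must remain available for the next comparison, while `take` fits a final read that should also clear the slot.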