fix: add retry logic for MITM thinking text merge race condition

The LS makes two Google API calls for thinking models. Call 2 (thinking
summary) may not have arrived by the time usage_from_poll runs after
Call 1 (response). usage_from_poll now peeks first and, if thinking
tokens exist but the thinking text is missing, waits up to 1s for the
merge to happen.

Also adds a peek_usage method to MitmStore for non-consuming reads.
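
The peek-then-wait behaviour described above can be sketched as follows. This is a minimal, synchronous illustration and not the code from this commit. It uses std::sync::RwLock and thread::sleep in place of the async lock. The ApiUsage fields (thinking_tokens, thinking_text), the put_usage helper, the usage_with_retry name, and the poll interval are all assumptions made for the example.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the real ApiUsage; the field names are assumptions.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ApiUsage {
    pub thinking_tokens: u64,
    pub thinking_text: Option<String>,
}

/// Simplified, synchronous stand-in for MitmStore.
#[derive(Default)]
pub struct MitmStore {
    latest_usage: RwLock<HashMap<String, ApiUsage>>,
}

impl MitmStore {
    /// Hypothetical helper: record (or overwrite) usage for a cascade.
    pub fn put_usage(&self, cascade_id: &str, usage: ApiUsage) {
        self.latest_usage
            .write()
            .unwrap()
            .insert(cascade_id.to_string(), usage);
    }

    /// Non-consuming read: clone the entry and leave it in the map.
    pub fn peek_usage(&self, cascade_id: &str) -> Option<ApiUsage> {
        self.latest_usage.read().unwrap().get(cascade_id).cloned()
    }

    /// One-shot read: remove the entry and return it.
    pub fn take_usage(&self, cascade_id: &str) -> Option<ApiUsage> {
        self.latest_usage.write().unwrap().remove(cascade_id)
    }
}

/// Peek until the thinking text has been merged or `max_wait` has elapsed,
/// then consume whatever is stored (possibly still without the text).
pub fn usage_with_retry(
    store: &MitmStore,
    cascade_id: &str,
    max_wait: Duration,
    poll: Duration,
) -> Option<ApiUsage> {
    let deadline = Instant::now() + max_wait;
    loop {
        // A merge is pending when thinking tokens were counted but the
        // thinking text has not arrived yet.
        let merge_pending = match store.peek_usage(cascade_id) {
            Some(u) => u.thinking_tokens > 0 && u.thinking_text.is_none(),
            None => false,
        };
        if !merge_pending || Instant::now() >= deadline {
            return store.take_usage(cascade_id);
        }
        thread::sleep(poll);
    }
}
```

Under these assumptions, a caller would use something like usage_with_retry(&store, id, Duration::from_secs(1), Duration::from_millis(50)). Peeking keeps the entry in the map while waiting, so the second call can still merge into it before the one-shot take.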
Nikketryhard
2026-02-14 19:54:37 -06:00
parent 34b9553484
commit 5c1f4c77d9
2 changed files with 34 additions and 5 deletions


@@ -172,6 +172,13 @@ impl MitmStore {
     /// Get the latest usage for a cascade, consuming it (one-shot read).
     ///
+    /// Peek at usage data for a cascade without consuming it.
+    /// Used to check if thinking text has been merged before taking.
+    pub async fn peek_usage(&self, cascade_id: &str) -> Option<ApiUsage> {
+        let latest = self.latest_usage.read().await;
+        latest.get(cascade_id).cloned()
+    }
     /// Only returns exact cascade_id matches — no cross-cascade fallback.
     /// The `_latest` key is only consumed when the caller explicitly requests it
     /// (i.e., when the MITM couldn't identify the cascade).