I’ve been working on a Chrome extension that acts as a client-side privacy layer for LLM usage.
The idea:
Detect likely secrets in the prompt before it’s sent, replace them with local placeholders (e.g. [PWM_1]), and ensure only redacted data leaves the browser.
What’s currently working:
- deterministic mapping (same secret → same placeholder)
- idempotent behavior (already-redacted input stays unchanged)
- mixed input handling (raw + placeholder in same prompt)
- detection of common patterns (API keys, tokens, JWTs, connection strings, etc.)
- verified via DevTools that outbound payloads contain only placeholders
This is not meant to be “perfect security,” but a safety layer to reduce accidental leakage during day-to-day LLM usage.
What I’m looking for:
- where would you try to break this?
- what edge cases am I missing?
- how would you approach unknown secret detection (entropy vs context)?
Repo: you can find it on GitHub under the name petritbahtiri123/LeakGuard
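
A sketch of the properties listed in the post (deterministic mapping, idempotence, mixed raw and placeholder input), with one way to combine context and entropy for unknown secrets. This is an added example, not LeakGuard's code; the regexes, the 4.0 bits/char threshold and the function names are assumptions made for illustration.

```javascript
// Added example, not LeakGuard's code. Patterns and threshold are illustrative assumptions.
const PLACEHOLDER = /\[PWM_\d+\]/g;
const KNOWN = [
  /\beyJ[\w-]+\.[\w-]+\.[\w-]+/g,                      // JWT-shaped token
  /\b[a-z][a-z0-9+]*:\/\/[^\s:@\/]+:[^\s@\/]+@\S+/g,   // URL with embedded credentials
];
// Context rule: a value following a secret-looking key name. Brackets are
// excluded so a placeholder inserted by an earlier rule is never re-matched.
const CONTEXT = /\b(?:api[_-]?key|secret|token|password)\s*[:=]\s*["']?([^\s"'\[\]]{8,})/gi;
// Entropy rule: long token-like runs, redacted only if they look random.
const CANDIDATE = /[A-Za-z0-9+\/_-]{20,}/g;

function shannonEntropy(s) {
  const chars = [...s];
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let h = 0;
  for (const n of counts.values()) h -= (n / chars.length) * Math.log2(n / chars.length);
  return h; // bits per character
}

function createRedactor(threshold = 4.0) {
  const bySecret = new Map(); // secret -> placeholder, held only in local memory
  const ph = (secret) => {
    if (!bySecret.has(secret)) bySecret.set(secret, `[PWM_${bySecret.size + 1}]`);
    return bySecret.get(secret); // same secret always yields the same placeholder
  };
  return function redact(text) {
    // Split on existing placeholders so they are never scanned again (idempotence).
    const kept = text.match(PLACEHOLDER) || [];
    return text.split(PLACEHOLDER).map((part, i) => {
      let out = part;
      for (const re of KNOWN) out = out.replace(re, (m) => ph(m));
      // The captured value is the tail of the match; keep the key, swap the value.
      out = out.replace(CONTEXT, (m, v) => m.slice(0, m.length - v.length) + ph(v));
      out = out.replace(CANDIDATE, (m) => (shannonEntropy(m) >= threshold ? ph(m) : m));
      return out + (kept[i] || "");
    }).join("");
  };
}

// Constructed sample input (not real credentials).
const demo = createRedactor();
console.log(demo("password=hunter2hunter2 and key q7Zx2LmP9vTk4WbN8sRj3HdY"));
console.log(demo("resend q7Zx2LmP9vTk4WbN8sRj3HdY next to [PWM_1]"));
```

Context rules run before the entropy rule so that short, low-entropy values next to a key name are still caught, while the entropy rule is left to handle bare random-looking tokens.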