Using retrieval and JSON mode together

I posted some insights in another topic with the same question, just an hour before it was asked again here.

We can only infer how JSON mode actually works. By producing a fixed response and incrementing max_tokens one token at a time, you can see when this token-run-based method takes over and overrides the logits for the last few tokens with closing brackets, plus linefeed and tab characters that would be very unlikely otherwise (the model should have been trained or biased toward JSON without whitespace).