It seems to help a lot if you provide output formatting directives before stating the problem.
In particular, I was having difficulty getting gpt-3.5-turbo to respond in the format I wanted. When I moved the ‘respond using format…’ directive ahead of the actual problem statement, suddenly, problem solved!
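A minimal sketch of the reordering, just to make the trick concrete. The directive wording, the example problem, and the model call are all illustrative, not from my actual prompt:

```python
# Illustrative only: directive text, problem text, and model name are made up.
FORMAT_DIRECTIVE = (
    "Respond using this format: a JSON object with keys 'answer' and 'steps'."
)
PROBLEM = "A train leaves at 3pm going 60 mph. How far has it traveled by 5pm?"

# What I was doing originally: problem first, format directive at the end.
prompt_directive_last = f"{PROBLEM}\n\n{FORMAT_DIRECTIVE}"

# What fixed it: state the required output format *before* the problem.
prompt_directive_first = f"{FORMAT_DIRECTIVE}\n\n{PROBLEM}"

# The fixed prompt then goes into the chat messages as usual, e.g.:
messages = [{"role": "user", "content": prompt_directive_first}]
# client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
```

Same content either way; only the ordering changes, and that was enough to get consistently formatted output.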
The hypothesis (thanks to @N4U) is that standard position-encoded LLMs have some trouble looking ahead in token strings, so once the complete problem has been stated, they start reasoning out the solution without being aware of the required format. (Apologies for the very crude restatement of @N4U's hypothesis and the gross anthropomorphism in my paraphrase. And yes, I know standard positional-encoding schemes use sinusoidal modulation of token position; maybe the actual issue is attention-head positioning, or …)