Hi,
I’m using gpt-4.1 for a conversational chatbot (gpt-5 is horrible for this purpose as i’ve tested up to now).
i’ve been using the Instructions parameter up to now in the responses API.
then i’ve seen that when the conversation log is long (40 messages for example), it’s sometimes impossible to give proper instructions to the model, it completely ignores them and continues the flow of the conversation as it sees fit based on the conversation up to now.
that sent me on a research regarding the ‘developer’ message as a replacement for using ‘instructions’.
from doing tests:
if i’m replacing the instructions with a ‘developer’ message at the beginning of the conversation, the results are slightly better but far from perfect.
and if i’m placing the ‘developer’ message at the END of the conversation, after the user’s last message, i get really good results.
i’m really ok with doing that constantly there’s no problem in the code to simply put it in the end, the issue is - caching.
if it comes at the end - it can’t get cached because the caching will end somewhere towards it.
and that’s a big cost for me - because my prompt is pretty long.
any ideas? anyone?
if i could put the developer message at the beginning and also get good results , that would be great because it will be cached properly.
or any other trick to help the model listen to instructions properly in long conversations, while maintaining as much caching as possible.
Thanks!