Hey there, first post so please bear with me.
I have a very long & complex prompt that involves aggregating multiple text sources and generating new writing based on the aggregation. Loosely, the prompt specifies a format, tone instructions, source content, and an outline. It is one step in a multi-step process. For this step, the system and user instructions comprise around 5,000 tokens.
gpt-4-1106-preview handles it just fine, but gpt-4-0125-preview responds with “I’m sorry, I can’t assist with that request.”
It’s a bit ironic, since 0125 is supposed to combat laziness. 1106 usually gives me exactly what I need, with responses around 1,500 tokens. Has anyone been seeing similar issues?
NOTE: I can’t post the exact prompt here due to privacy concerns at my company. I’m trying to reproduce the refusal with a similar prompt, but haven’t been able to yet.
Adjusting temperature doesn’t seem to help; my usual temperature for this prompt is 0.1, if that’s relevant.
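In case it helps anyone attempt a repro, here’s roughly the shape of my test harness, sanitized. The function names (`build_request`, `compare_models`) and the placeholder system/user strings are illustrative, not my real code; it assumes the official `openai` Python SDK (v1.x), where everything except `model` is held identical between the two calls:

```python
MODELS = ["gpt-4-1106-preview", "gpt-4-0125-preview"]

def build_request(model: str, system: str, user: str) -> dict:
    """Identical request payload for each model; only `model` varies."""
    return {
        "model": model,
        "temperature": 0.1,  # my usual setting for this prompt
        "messages": [
            {"role": "system", "content": system},  # format + tone instructions (~5k tokens total)
            {"role": "user", "content": user},      # source content + outline
        ],
    }

def compare_models(system: str, user: str) -> dict:
    """Send the same prompt to both models and collect the responses."""
    from openai import OpenAI  # official SDK, openai>=1.0
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    results = {}
    for model in MODELS:
        resp = client.chat.completions.create(**build_request(model, system, user))
        results[model] = resp.choices[0].message.content
    return results
```

With my real prompt, 1106 returns the full draft and 0125 returns the refusal string, so diffing `results` side by side makes the difference obvious.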