New gpt-4-turbo-preview saying it can't help on complex prompt

Hey there, first post so please bear with me.

I have a very long & complex prompt that involves aggregating multiple text sources and generating new writing based on the aggregation. Loosely, the prompt specifies a format, tone instructions, source content, and an outline. It is one step in a multi-step process. For this step, the system and user instructions comprise around 5,000 tokens.
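For context, the call is shaped roughly like this — I'm assuming the Python SDK purely for illustration, and everything below is a placeholder sketch, not the real prompt, since I can't share that:

```python
from openai import OpenAI

client = OpenAI()

# Placeholder structure only; the real system + user content is ~5,000 tokens.
response = client.chat.completions.create(
    model="gpt-4-0125-preview",  # the model that now refuses
    temperature=0.1,
    messages=[
        {"role": "system", "content": "Tone and format instructions go here..."},
        {"role": "user", "content": "FORMAT:\n...\n\nTONE:\n...\n\nSOURCES:\n...\n\nOUTLINE:\n..."},
    ],
)
print(response.choices[0].message.content)
```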

gpt-4-1106-preview handles it just fine, but gpt-4-0125-preview responds with “I’m sorry, I can’t assist with that request.”

It’s a bit ironic since I think 0125 is supposed to combat laziness. 1106 usually gives me just what I need, at around 1500 tokens in response. Has anyone been seeing similar issues?

NOTE: I can’t post the exact prompt here due to privacy issues with my company, but I’m trying to reproduce the problem with a similar prompt and haven’t been able to yet.

Temperature doesn’t seem to help; my usual temperature for this prompt is 0.1, if that’s relevant.


It seems to me that what you’re doing is spinning articles. Are they news?

It could be that the newer model has been trained not to accept whatever type of article content you’re attempting to spin.


I agree, that is ironic re: 0125’s objectives!

I have a pretty strong suspicion about the cause, and I could offer a tweak or two that might fix this.

I just recently posted the following as a tip for another user. Would it be possible for you to do this, so that you can share a rough example:

If you ever need to provide examples, the best route is taking the most conventional/standard examples you have, feeding them to an LLM and requesting it give you an analog. In your prompt, specify the topic and meta details so that the compute time / prediction focuses solely on the most analogous, direct parallel.

Remove all of the long content so that you can fit it in 1500 tokens or so. Ideally, preserve the meta/instruction side and trim the variable content between.

Again, I’m not requesting anything private or anything directly from your prompt, just the methodology/formatting in a similar example that I can test, as I have a few suspicions.

In addition, you mentioned temperature. For testing/reproduction purposes, you’re reducing top_p as well, correct? Temperature at 0.1 and top_p at 0, in this instance?
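As a concrete sketch of what I mean for the reproduction runs (the client and messages here are placeholders for your own):

```python
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "your existing system/user prompt here"}]  # placeholder

# Pin the sampling down so a refusal isn't just sampling noise.
response = client.chat.completions.create(
    model="gpt-4-0125-preview",
    temperature=0.1,  # what you're already using
    top_p=0,
    messages=messages,
)
```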

Also, I’m sure you’ve already done this, but it can’t hurt to mention: have you tried trimming the variable content and retrying, adding it back iteratively to identify the point where it begins to fail?
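Sketched out, something along these lines (the chunking, refusal check, and prompt assembly are stand-ins for however you build yours):

```python
from openai import OpenAI

client = OpenAI()
chunks = ["source 1 text", "source 2 text", "source 3 text"]  # stand-ins for your sources

def build_messages(sources):
    # Stand-in for however you assemble format/tone/outline plus the variable content.
    return [
        {"role": "system", "content": "Format and tone instructions..."},
        {"role": "user", "content": "Summarize these sources neutrally:\n\n" + "\n\n".join(sources)},
    ]

# Add the variable content back one chunk at a time; stop where the refusal first appears.
for n in range(1, len(chunks) + 1):
    reply = client.chat.completions.create(
        model="gpt-4-0125-preview",
        temperature=0.1,
        messages=build_messages(chunks[:n]),
    ).choices[0].message.content
    if "sorry" in reply[:40].lower():  # crude refusal check
        print(f"Refusal first appears once chunk {n} is included")
        break
```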

Similarly, have you also provided GPT with a highly simplified, straightforward overview in the first sentence (with markup) to ensure that all the context that follows is transformed more accurately?

I know these are rudimentary, but it never hurts to be certain 🙂


Hey, this is a good theory. The strange thing is that it’s happening regardless of the source content, and the task is purely neutral aggregation of text (some news, some other content like scientific documents, etc.).

Also, if I send “can you explain why not?” or “why?”, it usually apologizes and then does the task (albeit a worse job). I would guess it would say something about the TOS, etc., if that were the issue. The task is generally “Summarize in this format from these sources as neutrally as possible” (and as far as I know it was also cleared back when use cases had to go through approval).


Never too rudimentary! I feel like a lot of things that are assumed in some people’s workflows are different in others. I’ve tried most of these but haven’t dug too deep yet since I just noticed this today. I will try to get an analog as well.

Thanks for the thoughts!


I don’t have much experience with 0125 yet, but typically if it apologizes, you need to get rid of the apology and reframe the prompt (or just retry).

Sometimes a justification of why it’s OK or necessary at the end of the prompt might help the model get started.

My prompts are pretty complex as you can see here: API Prompt for gpt-3.5-turbo-16k - #12 by SomebodySysop

I haven’t tried the new gpt-4-turbo-preview, but I’ve not had problems with these prompts on gpt-4-1106-preview. With gpt-3.5-turbo-16k, though, I had your exact problem. I didn’t totally fix it, but I improved the responses by sending the prompts in XML format. Maybe try that with your prompts to the new turbo preview and see if it makes a difference?
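By XML format I just mean wrapping each section of the prompt in tags, roughly like this (the tag names and sections are made up, not from anyone’s actual prompt):

```python
# Rough idea only; tag names and section contents are placeholders.
def build_prompt(format_spec: str, tone: str, sources: str, outline: str) -> str:
    return (
        "<instructions>Summarize the sources as neutrally as possible, "
        "following the format, tone, and outline below.</instructions>\n"
        f"<format>{format_spec}</format>\n"
        f"<tone>{tone}</tone>\n"
        f"<sources>{sources}</sources>\n"
        f"<outline>{outline}</outline>"
    )
```

The clear boundaries seemed to help the model keep track of which text is instruction and which is source material.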


Next step: prompt-injecting malicious prompts into websites 🤔