What are your impressions of gpt-4.1?

So what interesting things did you notice in the latest gpt-4.1 models?

If you have any interesting tips or cool prompts, please share them here.

Also, don’t forget to check the updated prompting guides:


One point from the guide worth testing more thoroughly: it says the model performed poorly with JSON as a delimiter for examples in large contexts, but did better with XML and another custom format.
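For concreteness, here is a minimal sketch of the two delimiter styles that claim contrasts. The document contents and field names are illustrative, not from the guide itself:

```python
import json

# Illustrative documents to pack into a long context.
documents = [
    {"id": 1, "title": "Fox", "content": "The quick brown fox..."},
    {"id": 2, "title": "Dog", "content": "The lazy dog..."},
]

# JSON as the delimiter: one JSON object per line.
# The guide reportedly saw weaker performance with this in large contexts.
json_block = "\n".join(json.dumps(d) for d in documents)

# XML-style delimiters: each document wrapped in its own tag.
# Reportedly worked better at large context sizes.
xml_block = "\n".join(
    f'<doc id="{d["id"]}" title="{d["title"]}">{d["content"]}</doc>'
    for d in documents
)
```

Both blocks carry the same information; only the delimiting format differs, which is exactly the variable the guide says matters.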

A few tests, with 4.1-nano, have shown to me that it is a bit … lazy.

Not wrong, just lazy.

From the logs (details trimmed for brevity); the whole input is ~10 lines.


Sending query: 
{'model': 'gpt-4.1-nano',
 'messages': [{'role': 'user',
               'content': '```yaml\n'
                          'source: "<some text>"\n'
                          ...
                          '```'}],
 'temperature': 0.7,


With response_model:
{'properties': {'source': {'default': '', 'type': 'string'},
				...
  'required': [ ...
				],
 'type': 'object'}

Response: 
{'source': '',
 ...
 }

It could have returned source: "<some text>", but it chose to return just "" because the schema permitted it.

I know I could change the response_model to not specify a default value, but that is not the point.

It happened a few times already.
4o-mini did not settle for the bare minimum the way 4.1-nano seems to do very frequently.


Not happy with nano or mini.

I find it speaks back to the user in the third person. 4.1 does not have this issue.

So I say hello and nano will respond like:

“merefield said hello earlier”

“Earlier” was just now!!

This is not good conversational style.

The bot should return my greeting, not report the greeting!

4o-mini does not have this issue.

Unusable as Chatbot unless I find a prompt hack.

Ok I have a solution for that which seems to help.

If I put the username into the message’s “name” attribute instead of prepending it to the prompt (“merefield said hello”), things work as expected.
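The two message shapes being compared look something like this (a sketch assuming the Chat Completions message format, where `name` is an optional per-message field; the username is from the post):

```python
# Prepending the username into the content reportedly pushes nano
# into third-person narration ("merefield said hello earlier"):
narrated = [{"role": "user", "content": "merefield said hello"}]

# Using the per-message "name" field keeps the content a plain
# first-person greeting, which nano reportedly answers normally:
direct = [{"role": "user", "name": "merefield", "content": "hello"}]
```

The model still sees who is speaking, but the content itself no longer reads like reported speech, which is plausibly why the narration disappears.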


Loving nano and mini. Still trying to hash them out.

Both are wonderful in agentic applications. I found gpt-4o-mini to struggle with simple instructions.

So, for a smarter model that’s cheaper than gpt-4o-mini – I am extremely happy.


To me, it is useless. The reason lies in the rate limits.

Here are the limits for tier 1:

gpt-4.1: 30,000 TPM, 500 RPM, 900,000 TPD

This means that when I have a task which requires, say, 500,000 tokens, it is borderline impossible. I would need to split the context into 30,000-token chunks and feed them in over roughly 17 minutes just to make one query. Given that it is an API, and that you need to keep context over the course of a session, this means I can do at most one query per hour, and a grand total of one such query per day.
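The arithmetic behind that claim, using the tier-1 limits quoted above and the poster's hypothetical 500,000-token task:

```python
import math

TPM = 30_000        # tokens per minute (tier 1, gpt-4.1, per the post)
TPD = 900_000       # tokens per day
task_tokens = 500_000

# Minimum wall-clock minutes just to push the tokens through the TPM cap:
minutes_needed = math.ceil(task_tokens / TPM)

# How many such full-context queries the daily token cap allows:
queries_per_day = TPD // task_tokens
```

So even ignoring request limits entirely, the token caps alone stretch a single large query across about a quarter of an hour and cap the day at one such query.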

Who thought up these limits? The ONLY way you could use it productively is at tier 4, but that requires $250 paid, and even then you’d be stuck with 2 queries per minute (I don’t know the total per day there). And to add insult to injury, the monthly fee for your ChatGPT subscription doesn’t even count toward the tier.

So I’m really in a bind here. I’d love to experiment with it so I can recommend it to my company for a corporate account, but there is no feasible way to do so, since I don’t qualify for tier 4 even though I’ve been paying $200/month for the ChatGPT Pro subscription.

Since Gemini has no such restrictions and is pretty impressive, it is winning by default.

Ah well, c’est la vie.

If you want to get ahold of me to discuss this, you can reach me at [email redacted], but really I’m pretty flabbergasted at how short-sighted and counterproductive this rate limit is. If you want new customers, you should give them the ability to at least test the model at its full ability.