New reasoning models: OpenAI o1-preview and o1-mini

I just don’t see how anything coming out of this model could be useful for fine-tuning. It barely follows instructions or guidance, it completely ignores reference code and documentation that could bring the output in line, and the system message that might rein it in can’t be placed at all.

“Here’s your revised snippet from your 2000-line project, with the exception hierarchy around deleting variables that I couldn’t understand stripped out, and all the function calls replaced with whatever I remember from pre-training. You like ‘gpt-3.5’ and everything made non-working, right? Oh, and for the libraries you already made extensive use of, here’s how you can pip install them.”

I think there’s simply too much context junk inserted after your input for the gpt-4o base inside to still pay attention, so it reverts to what this small model (and the fine-tuned model) is really powered by, instead of the emergent intelligence that scales: post-training.
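On the system-message point: a minimal sketch of the usual workaround, assuming the official `openai` Python SDK and that the o1 models reject the `system` role, is to fold the guidance you would normally put in a system message into the user message itself.

```python
# Sketch only: assumes the `openai` Python SDK (v1+) and that o1-preview /
# o1-mini refuse a {"role": "system"} message, so guidance is prepended to
# the user turn instead.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

instructions = (
    "Refactor only what is asked. Keep the existing exception hierarchy, "
    "keep the existing libraries, and return only the revised snippet."
)
snippet = "..."  # the code you actually want revised

response = client.chat.completions.create(
    model="o1-preview",  # or "o1-mini"
    messages=[
        # no system message here -- the instructions ride along with the user turn
        {"role": "user", "content": f"{instructions}\n\n{snippet}"},
    ],
)
print(response.choices[0].message.content)
```

It’s not a substitute for a real system prompt, but if the system role really can’t be placed, the user turn is the only place left for that guidance.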

Got it working great for me in my experiments. It follows instructions, understands examples, and responds exactly in the desired output format.


I’ve written with all of them. Am I the only one who has this?

Question 1 to God A absolutely does not resolve this, because it isn’t known whether God A is truthful or lying, and the question does nothing to establish that. It only establishes that if the god is honest, it will tell the truth and ascribe yes to ‘da’, and if it is lying, it will ascribe no to ‘da’, assuming that is even correct.

Right. I think the solution needs an iff (if and only if) clause in the question. The model isn’t able to figure that out from the thinking it presents; I tried a few times. For the original puzzle, see p. 62, The Harvard Review of Philosophy, Spring 1996.
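To see why the biconditional does the work, here’s a quick brute-force check (my own sketch of the standard ‘da’/‘ja’ trick, not the exact wording anyone used in the thread): enumerate every combination of whether the god tells the truth, what ‘da’ means, and whether the fact X holds, and confirm the answer is ‘da’ exactly when X is true.

```python
# Brute-force check of the embedded-iff question from Boolos's puzzle.
# Question put to a god: "Does 'da' mean yes, if and only if (you tell the
# truth iff X)?"  Claim: the spoken answer is 'da' exactly when X is true,
# regardless of the god's honesty and of what 'da' means.
from itertools import product

def iff(a: bool, b: bool) -> bool:
    return a == b

def spoken_answer(truthful: bool, da_means_yes: bool, x: bool) -> str:
    question = iff(da_means_yes, iff(truthful, x))
    # A truthful god asserts the question's truth value; a liar asserts its negation.
    asserted = question if truthful else not question
    # The god voices that assertion in its own words.
    if asserted:
        return "da" if da_means_yes else "ja"
    return "ja" if da_means_yes else "da"

for truthful, da_means_yes, x in product([True, False], repeat=3):
    answer = spoken_answer(truthful, da_means_yes, x)
    assert (answer == "da") == x
    print(f"truthful={truthful!s:<5} da_means_yes={da_means_yes!s:<5} X={x!s:<5} -> {answer}")

print("In all eight cases, 'da' is heard iff X is true.")
```

The iff clause is what cancels both unknowns: the god’s honesty flips the assertion once, the unknown meaning of ‘da’ flips it once more, and only the truth of X survives.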