For synthetic data generation, which of o3-mini, o1, or 4o generally fares better?

4o is the more creative writer. But if you are feeding in another text and generating summaries from it, the output quality will depend on the input length (ref: Reasoning Degradation in LLMs with Long Context Windows: New Benchmarks).

In that summarization case, o3-mini is the better choice for very long texts.
