For synthetic data generation, which generally fares better: o3-mini, o1, or 4o?

I’ve found 4o-mini to be the best so far, but I’m still waiting on API access for the o3 models. I’ve mostly skipped o1 because I’m getting great results with 4o-mini. The full 4o model was a little too verbose and tended to drift into adjacent topics. For text generation, though, 4o with proper guidance has been hands-down great.