When GPT-4 first came out, I put a lot of work into trying to make this happen.
I think the problem is that GPT-4 simply can't hold a large idea in its head in an effective way.
There might be a way: fine-tuning GPT-4 to build some specific thing, i.e., you fine-tune the idea into the model. Something to try, roughly along the lines of the sketch below.
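A minimal sketch of what I mean, assuming the current OpenAI Python SDK and access to a fine-tunable model; the file name and model string are placeholders, not a recommendation:

```python
# Sketch: "fine-tune the idea into the model" by training on examples that
# repeatedly walk through the large design you want it to hold.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a JSONL file of chat-format examples encoding the idea/design.
# Each line: {"messages": [{"role": "system", ...}, {"role": "user", ...},
#                          {"role": "assistant", ...}]}
training_file = client.files.create(
    file=open("idea_examples.jsonl", "rb"),  # placeholder file name
    purpose="fine-tune",
)

# Kick off the fine-tuning job; swap in whichever fine-tunable model you
# actually have access to (GPT-4 fine-tuning access is limited).
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # placeholder model string
)
print(job.id, job.status)
```

Whether enough of a large architecture can actually be baked in this way is exactly the open question.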
As for how far away that stage is, I don't know. Breakthroughs are possible, and certainly a lot of people are focused on this. With what we have now, however, it doesn't seem doable.
What do you think of SWE-bench? Swe-bench: Very exciting eval, looking for SOTA - #5 by N2U