Training GPT-3 or a similar model on my own data

After some experimentation, I currently think GPT-3 is very good at answering short, simple informational questions with no context, provided the question more or less has a "correct answer" (i.e. is not too open-ended). The key limitation is what it does and does not know; in other words, what it was trained on.

This naturally leads me to consider applying similar technology to my own training data.

Has anyone created their own GPT, or is it too computationally expensive to train? The architecture is public knowledge, isn't it?
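To give a sense of the scale involved, here is some rough back-of-envelope arithmetic (the 175B parameter count is GPT-3's published size; everything else is an assumption about fp16 storage, before optimizer state or activations):

```python
# Rough estimate of memory needed just to hold GPT-3's weights.
params = 175e9            # GPT-3 has ~175 billion parameters
bytes_per_param = 2       # assuming fp16 (2 bytes per parameter)

weights_gb = params * bytes_per_param / 1e9
print(weights_gb)  # 350.0 GB for the weights alone
```

Training from scratch multiplies this several times over (gradients, optimizer state, activations), which is why most people fine-tune a smaller pretrained model instead of training a GPT-3-scale model themselves.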

Is there a good second choice, such as BERT?
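For what it's worth, the usual practical route is not training from scratch but fine-tuning a pretrained checkpoint on your own text. A minimal sketch of one fine-tuning step with the Hugging Face transformers library might look like this (the `sshleifer/tiny-gpt2` checkpoint is a tiny demo model used here only to keep the example small; the training text is obviously a placeholder):

```python
# Sketch: one fine-tuning step of a small GPT-2-style model on custom text,
# using the Hugging Face transformers library (assumed installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

text = "Your domain-specific training text goes here."
inputs = tokenizer(text, return_tensors="pt")

# For causal language modeling, the labels are the input ids themselves.
outputs = model(**inputs, labels=inputs["input_ids"])
loss = outputs.loss   # cross-entropy language-modeling loss
loss.backward()       # gradients for one optimizer step
```

In a real run you would loop over a dataset with an optimizer (or use the library's `Trainer`), but the core idea is just this: feed your own text in as both input and label, and minimize the language-modeling loss.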

Thanks very much.