While I am (most of the time) thoroughly blown away by what comes out of Codex, I keep wondering what the costs (both money and energy) of such completions are. With GPT-3, I can get a reasonable idea of the underlying costs since it's a paid service. But from what I understand, Codex is considerably larger (and thus more expensive to run)?
Also, from my (limited) understanding of Transformer models, the computational complexity is quadratic in sequence length. Does that mean that doing "back and forths" with Codex (write a prompt, get an answer, then "answer" with additional refinements) - which I think is one of the things the model excels at - causes the cost to increase in a non-linear way?
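To make my worry concrete, here's a toy back-of-the-envelope sketch (purely my assumption, not anything official: it takes quadratic attention cost as a proxy for total compute, and assumes each turn re-sends the full conversation history; the function names are just illustrative):

```python
def attention_cost(tokens: int) -> int:
    """Toy proxy for compute: quadratic in sequence length."""
    return tokens * tokens

def conversation_cost(turn_tokens: list[int]) -> int:
    """Total compute when every turn reprocesses the whole history so far."""
    total = 0
    history = 0
    for t in turn_tokens:
        history += t              # the prompt grows by this turn's tokens
        total += attention_cost(history)
    return total

# Three back-and-forth turns of 100 tokens each:
#   100^2 + 200^2 + 300^2 = 140,000
# versus a single 300-token prompt:
#   300^2 = 90,000
print(conversation_cost([100, 100, 100]))  # → 140000
print(attention_cost(300))                 # → 90000
```

Under that (very crude) model, a multi-turn conversation would indeed cost more than one long prompt of the same total length - though I don't know how caching or other optimizations change this in practice.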
TL;DR: How much does a completion cost? And what about a back-and-forth with the model?
Fun fact: the above tl;dr was generated by GPT3.