My question is basically the title. The Playground has a token counter specific to Codex, which can be simulated here: OpenAI API
I managed to build something in Node.js that returns the same token result, but not for Codex. Is the Codex code available? That website provides the encode/decode source code, but not for Codex.
EDIT: I see that you want the source code of the tokenizer for Codex. Let me check whether it's on their GitHub. Even better, let's get someone from OpenAI (@staff) who knows more about this.
I think there just weren't enough people talking about it. I have also implemented a Python version in one of our open-source projects on GitHub. If there is a need, I can make it a package.
But if you want a more robust tokenizer for Python, look into HuggingFace's tokenizers package. It will be much faster than this original pure-Python implementation.
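For anyone who wants to try that route, here is a minimal sketch of counting tokens with the tokenizers package. The helper names `load_gpt2_tokenizer` and `count_tokens` are made up for illustration, and loading the `"gpt2"` vocabulary from the Hub is an assumption on my part: the Codex counter in the Playground may use a different vocabulary, so the numbers may not match it exactly.

```python
def load_gpt2_tokenizer():
    """Load the GPT-2 byte-level BPE tokenizer from the Hugging Face Hub.

    Assumption: the "gpt2" vocabulary is only an approximation of whatever
    the Codex counter uses. Requires `pip install tokenizers` and network
    access the first time it runs.
    """
    # Imported lazily so count_tokens() stays usable without the package.
    from tokenizers import Tokenizer

    return Tokenizer.from_pretrained("gpt2")


def count_tokens(text, tokenizer):
    """Return how many token ids `tokenizer` produces for `text`.

    `tokenizer` can be any object whose encode(text) result has an `.ids`
    list, which is what tokenizers.Tokenizer.encode returns.
    """
    return len(tokenizer.encode(text).ids)


# Example usage (hypothetical input string):
#   tok = load_gpt2_tokenizer()
#   print(count_tokens("def add(a, b):\n    return a + b", tok))
```

Keeping the loading step separate from the counting step makes it easy to swap in another vocabulary later if the Codex one turns out to be published.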