After some searching, I found that @simonl from the community has already built a tokenizer that works for both GPT-3 and Codex.
Here’s the GitHub
Here’s the original post: Codex Tokenizer Logic - #2 by simonl