I’ve been using Codex a lot lately, and I’ve noticed it has several issues with Unicode characters, emojis (which it seems to hate quite a lot), and non-ASCII Latin characters such as accented letters.
Sometimes it re-saves files as UTF-8 with a BOM; other times, even when a page or script is 100% correct, it reports many invalid characters.
In some cases, it even corrupted parts of the code, costing me a lot of tokens and time debugging because I wasn’t aware of this behavior.
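
For anyone who wants to check their own files before and after a Codex run, here is a minimal Python sketch that tells you whether a file starts with a UTF-8 BOM and whether its bytes decode cleanly as UTF-8. The function names (`check_encoding`, `strip_bom`, `check_file`) are my own hypothetical helpers, not anything Codex provides, and this only detects the symptoms described above. It does not explain why they happen.

```python
import codecs
from pathlib import Path


def check_encoding(data: bytes) -> dict:
    """Report whether raw bytes start with a UTF-8 BOM and decode as UTF-8."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")  # strict decoding raises on any invalid sequence
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"has_bom": has_bom, "valid_utf8": valid_utf8}


def strip_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def check_file(path: str) -> dict:
    """Hypothetical convenience wrapper: run check_encoding on a file on disk."""
    return check_encoding(Path(path).read_bytes())
```

Running `check_file` on a file before and after an edit should make it obvious whether a BOM was added or whether valid text was turned into invalid bytes, which would at least save some of the debugging time.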