You never see the actual tokens that the message containers place into the model's context, nor the final prompt that marks where the assistant is to write its response. That internal formatting passes only between the chat completions API backend and the model.
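For a feel of what's hidden, here is a rough sketch of the ChatML-style serialization OpenAI once documented, assuming the backend still follows that general shape; the `<|im_start|>` / `<|im_end|>` markers stand in for single special tokens in the model's vocabulary, which you cannot send as text:

```python
# Rough reconstruction of the hidden container format (assumption:
# ChatML-like, per OpenAI's early documentation -- not verifiable).
def render_chatml(messages: list[dict]) -> str:
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # The backend then appends the prompt where the assistant writes:
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

print(render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]))
```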
OpenAI likewise doesn't let you see the raw text the AI emits to call functions; you can only place such assistant messages back into history through the same structured abstraction.
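For illustration, replaying a function call in history looks like this (`get_weather` and its arguments are hypothetical; the point is that the call goes back in as a structured field, never as the raw text the model originally wrote):

```python
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        # The assistant's turn is reconstructed from the API's
        # function_call field, not from its actual output tokens.
        "role": "assistant",
        "content": None,
        "function_call": {
            "name": "get_weather",
            "arguments": '{"location": "Paris"}',
        },
    },
    {"role": "function", "name": "get_weather", "content": '{"temp_c": 18}'},
]
```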
With instructions focused solely on jailbreaking it, the AI can repeat back much of its context, but it cannot reproduce the special tokens themselves and is reluctant to replay the rest of the formatted context verbatim.
This tokenizer has a GPT-4 template that is mostly correct for all models except the original gpt-3.5-turbo-0301.
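For counting purposes, the OpenAI cookbook publishes the per-message overheads that template implies; a condensed sketch (assuming plain string contents, no function calls):

```python
import tiktoken

def num_tokens_from_messages(messages, model="gpt-4"):
    """Estimate prompt tokens including the hidden template's overhead."""
    enc = tiktoken.get_encoding("cl100k_base")
    if model == "gpt-3.5-turbo-0301":
        tokens_per_message = 4   # <|start|>{role/name}\n{content}<|end|>\n
        tokens_per_name = -1     # a name replaces the role
    else:
        tokens_per_message = 3
        tokens_per_name = 1
    total = 0
    for message in messages:
        total += tokens_per_message
        for key, value in message.items():
            total += len(enc.encode(value))
            if key == "name":
                total += tokens_per_name
    return total + 3  # every reply is primed with <|start|>assistant<|message|>
```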