Hi, for certain inputs, when I try to get surprisal values from gpt-3.5-turbo-instruct using Python’s surprisal package (specifically surprisal.OpenAIModel()), I get the following error:
in surprise
len(tokens) == len(tokenized[b]) + use_bos_token
AssertionError: Length mismatch in tokenization by GPT2 tokenizer `Encoding(num_tokens=
The input I am trying to run is: “What it is not free to do is to covertly manipulate messages that are purportedly being created by the independent creative communities”
Am I missing something? Any input would be highly appreciated!