Measuring perplexity using the API

Hi! As a researcher, I would like to compute perplexity for responses to a given prompt: both the perplexity of the response the system actually returns, and the perplexity of a specific word or phrase that I want to score the model against.

For example: given the prompt “what’s an animal with four legs?”, what is the perplexity of “cat” as a response? And what is the perplexity of the system’s own response (say it’s “dog”)?
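
To make the quantity concrete, here is a minimal sketch of the calculation I have in mind. It assumes the API can return a natural-log probability for each token of a response; whether it does is exactly what I'm asking. The `perplexity` helper is a hypothetical name, and the log-probability values are made up for illustration, not real model output.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a token sequence, given natural-log probabilities per token.

    Computed as exp of the average negative log-probability.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_neg_logprob = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_neg_logprob)

# Made-up values: pretend the model assigns p=0.25 to "cat" and p=0.5 to "dog"
# as single-token responses to the prompt. Real values would come from the API.
cat_logprobs = [math.log(0.25)]
dog_logprobs = [math.log(0.5)]

print("cat:", perplexity(cat_logprobs))
print("dog:", perplexity(dog_logprobs))
```

Lower perplexity means the model finds the response more likely, so with these invented numbers “dog” would score better than “cat”. For multi-token responses the same function averages over all tokens of the response.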

Thanks!

Hi! Thank you for posting this question. I have the same need. Did you figure out a way to do it, or does anyone else have an idea?