I’m using a pre-trained GPT model for my use case, so I need ground truth to evaluate its results and compute metrics.
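
To make the goal concrete, here is a minimal sketch of what evaluating against ground truth could look like. It assumes the model outputs and the ground-truth answers are plain strings paired one to one, and it uses exact match and token-level F1 purely as illustrative metrics; the function names (`exact_match`, `token_f1`, `evaluate`) are hypothetical, and the right metric depends on the actual task.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> bool:
    """True if the strings are equal after trimming whitespace and lowercasing."""
    return prediction.strip().lower() == reference.strip().lower()


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a prediction and a reference (whitespace tokens)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions: list[str], references: list[str]) -> dict[str, float]:
    """Average both metrics over paired model outputs and ground-truth answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("need at least one example to evaluate")
    n = len(predictions)
    pairs = list(zip(predictions, references))
    return {
        "exact_match": sum(exact_match(p, r) for p, r in pairs) / n,
        "token_f1": sum(token_f1(p, r) for p, r in pairs) / n,
    }
```

Calling `evaluate(model_outputs, ground_truth)` with two equal-length lists of strings returns the averaged scores. For free-form generation, string-overlap metrics like these are often too strict, so they are better treated as a starting point than as a final choice.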