Text completion task with gpt-3.5-turbo

Hi, I am using the gpt-3.5-turbo chat completion model for generating text. The response from the OpenAI API is plain text. I want to get the response in table format, so I mentioned that in the messages. But the response JSON has something like

\n\n| Film/TV Show | Year | Role |\n|--------------|------|------|\n| Thor | 2011 | Loki |\n| Midnight in Paris | 2011 | F. Scott Fitzgerald |\n| War Horse | 2011 | Captain James Nicholls |\n| The Avengers | 2012 | Loki |\n| Only Lovers Left Alive | 2013 | Adam |\n| Crimson Peak | 2015 | Sir Thomas Sharpe |\n| High Rise | 2015 | Dr. Robert Laing |\n| I Saw The Light | 2015 | Hank Williams |\n

How do I format the text from the response, at least for new lines, tables, bullet points, headings, and so on?

It seems to be answering in Markdown format. ChatGPT (the website) will format it for you, but you may have to pass the string into something else that can display Markdown.
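For example, a minimal sketch in Python, assuming the python-markdown package (my choice of renderer for illustration, not something the API requires); its built-in "tables" extension handles the pipe tables shown above:

```python
import markdown  # pip install markdown

# Raw text as it arrives in the API response; the literal "\n" escapes in
# the JSON become real newlines once the response body is parsed.
response_text = (
    "\n\n| Film/TV Show | Year | Role |\n"
    "|--------------|------|------|\n"
    "| Thor | 2011 | Loki |\n"
    "| The Avengers | 2012 | Loki |\n"
)

# Convert the Markdown to HTML; the "tables" extension is what turns the
# pipe syntax into an actual <table> element.
html = markdown.markdown(response_text, extensions=["tables"])
print(html)
```

Any Markdown renderer in your own stack (e.g. a JS library in the browser) would do the same job; the key point is that the model's output is plain Markdown text until something renders it.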

Hi, I have used a GPT-3 encoder for counting the message tokens for gpt-3.5-turbo before making an API call, which helps to set the max_tokens value on the request without getting a length-exceeded error. But my token count differs from the count mentioned in the API response (prompt_tokens under usage), so the length-exceeded issue still occurred for some of my prompt inputs.

I have used PHP instead of Python. Is there any way to get the same prompt token count as in the API response from PHP, or is using Python's tiktoken the only solution? Please guide me on this. Thanks in advance.

Hi, personally I only know of tiktoken for Python or JS for getting an accurate token count. I have created a web-based tiktoken front end, and it's trivial to do, so presumably you could do something similar. That would allow you to make a call from PHP to your own custom endpoint.
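As a rough sketch of what such an endpoint could look like (Flask, the /count-tokens route name, and the request shape are all my own assumptions for illustration):

```python
# pip install flask tiktoken
from flask import Flask, jsonify, request
import tiktoken

app = Flask(__name__)

@app.route("/count-tokens", methods=["POST"])  # hypothetical route name
def count_tokens():
    payload = request.get_json()
    # Pick the tokenizer that matches the target model.
    enc = tiktoken.encoding_for_model(payload.get("model", "gpt-3.5-turbo"))
    return jsonify({"tokens": len(enc.encode(payload["text"]))})

if __name__ == "__main__":
    app.run(port=5000)
```

Your PHP code would then just POST something like {"text": "...", "model": "gpt-3.5-turbo"} to that endpoint with curl and read the count back.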

Note that the API adds some overhead for each record in the messages struct, so the tiktoken count of a particular statement/text isn't exactly what the API library will issue to the back-end, and what the back-end will actually format and use.
E.g., framing along the lines of "these are system instructions" or "previous agent responses were" is added by the inference API to the data it sends to the model, and that counts as tokens. I haven't measured exactly how many tokens this adds; it's greater than 0.
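A sketch of how to account for that overhead in the count, in Python (the per-message constants here follow OpenAI's published cookbook example for early gpt-3.5-turbo snapshots and may differ for other model versions, so treat the result as an estimate):

```python
import tiktoken

def num_tokens_from_messages(messages, model="gpt-3.5-turbo"):
    """Estimate prompt tokens including per-message framing overhead.

    The constants match OpenAI's cookbook example for early
    gpt-3.5-turbo snapshots; other snapshots may use different values.
    """
    enc = tiktoken.encoding_for_model(model)
    num_tokens = 0
    for message in messages:
        num_tokens += 4  # per-message framing around role and content
        for value in message.values():
            num_tokens += len(enc.encode(value))
    num_tokens += 3  # the reply is primed with the assistant header
    return num_tokens

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "List Tom Hiddleston's films as a table."},
]
print(num_tokens_from_messages(messages))
```

Counting the role and content values plus a fixed per-message constant is what gets a client-side estimate to line up with prompt_tokens in the usage block.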