What is the final message (string input) generated from chat.completions?

In the current syntax for generating a model response, we use the chat.completions.create method and pass our input through messages, functions, and other arguments in a structured format.

My understanding is that all of this structured input gets converted into a single string, which is then passed to the LLM for prediction. How can you access that final string input?
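For reference, here is a minimal sketch of the kind of structured call I mean; the model name and tool schema are just illustrative placeholders:

```python
# Minimal sketch of a structured Chat Completions request.
# Model name and tool schema are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
)

print(response.choices[0].message)
```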

Hey there and welcome!

So, if everything goes according to plan, it should be sending a serialized JSON string in the API call.

May I ask why you need the final input? Typically you do your work before serialization and don’t touch the serialized form afterwards; the client calls handle this for you automatically, so the final string is rarely something you need to see.

You cannot access that “final string”: it is assembled by the internals of the API endpoint after you send a request containing a list of messages, each with a role and content, along with the other parameters.

You can get a hint of the special tokens that are used to enclose messages in the actual input to the AI model context by looking at the GPT-4 template here.
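As a rough, unofficial sketch, a ChatML-style rendering of a short conversation looks something like this; the exact special tokens and layout are internal to OpenAI and may differ:

```python
# Rough, unofficial sketch of a ChatML-style rendering.
# The real special tokens and layout are internal to the API and may differ.
chatml_sketch = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "What's the weather in Paris?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
print(chatml_sketch)
```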

1 Like

Welcome to the community @roshan.santhosh

The final input passed to the model before being tokenized is inaccessible for safety and various other reasons.

However, you are correct that it is indeed a single string; under the hood it still works much like the older completion models, just in a different format.

That final string, AFAIK, is in ChatML. OpenAI used to have a GitHub readme about it, but it seems they have stopped maintaining it.

3 Likes

As others have said, you can’t access the final input string, but you can see how many tokens it consumed by looking at the response object.
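For example, assuming the current Python SDK and a placeholder model name:

```python
# The usage block on the response reports how many tokens the
# server-side rendering of your request actually consumed.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.usage.prompt_tokens)      # tokens in the rendered input
print(response.usage.completion_tokens)  # tokens in the model's reply
print(response.usage.total_tokens)
```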

1 Like

This was more for my understanding of how the LLM handles such inputs, specifically the function calls.

I’m assuming the models with function-calling capability were fine-tuned on a very specific prompt template to handle the function calls, and a different prompt structure would probably not give the same level of results.

I don’t know if this information was shared elsewhere, for example in a paper, or if people have been able to fine-tune models to work with function calls.

2 Likes

Thanks. This was pointed out to me recently as well. For chat.completions.create, including additional arguments like functions and tools increases the number of prompt tokens, so these arguments are definitely being converted into a string as part of the prompt.
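A quick way to see this is to send the same messages with and without a tool definition and compare usage.prompt_tokens; the model name and tool schema below are just illustrative:

```python
# Compare prompt_tokens with and without a tool definition to confirm that
# tool schemas are rendered into the prompt. Names are illustrative.
from openai import OpenAI

client = OpenAI()

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

without_tools = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
with_tools = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=[weather_tool]
)

print("prompt_tokens without tools:", without_tools.usage.prompt_tokens)
print("prompt_tokens with tools:   ", with_tools.usage.prompt_tokens)
```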

Early on, someone got the model to dump its internal prompt for functions, but that hole has since been plugged, and the last time I searched for the dumped inner prompt text I couldn’t find it.

It’s not rocket science what they’re doing, though. They basically show the model a trimmed-down version of the JSON schema you passed in.

2 Likes

There is no ‘hole’ to be patched. The AI will be made to do whatever a determined person wants, including repeating anything in context.

Here’s that function language as it is received by the AI, as part of the first system message.
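The block below is a community-reconstructed sketch of that rendering, not an official specification; the exact wording and layout may have changed:

```python
# Community-reconstructed sketch (unofficial, possibly out of date) of how a
# function schema is rendered into the system message as a TypeScript-like
# type declaration.
functions_rendering = """\
# Tools

## functions

namespace functions {

// Get the current weather for a city.
type get_weather = (_: {
city: string,
}) => any;

} // namespace functions
"""
print(functions_rendering)
```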

Parallel tool calls are handled by an additional tool called multi_tool_use, which wastes more tokens and tells the AI to place multiple function calls inside the tool wrapper; later models are trained to use it.
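As an unofficial sketch of the community-reported format, the wrapper call looks roughly like this; in the actual API response the arguments arrive as a JSON-encoded string rather than a Python dict:

```python
# Community-reconstructed sketch (unofficial) of the wrapper the model emits
# for parallel tool calls: a single call to multi_tool_use.parallel whose
# arguments bundle several ordinary function calls.
parallel_call_sketch = {
    "name": "multi_tool_use.parallel",
    "arguments": {
        "tool_uses": [
            {"recipient_name": "functions.get_weather", "parameters": {"city": "Paris"}},
            {"recipient_name": "functions.get_weather", "parameters": {"city": "Tokyo"}},
        ]
    },
}
print(parallel_call_sketch)
```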