I have a long original text of about 10,000 tokens that I want to summarize with the GPT Assistants API according to my guidelines. Could you suggest any example programs that might help with this?
Also, regarding the input for long texts, I understand that the context limit for GPT-4 is 128K tokens, but the limit for each input/output is only 4096 tokens, right?
Are the instructions and role content included in this 4096-token limit?
from openai import OpenAI

client = OpenAI()

# Upload the file the assistant should work with ("data.csv" is a placeholder)
file = client.files.create(file=open("data.csv", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    name="Data visualizer",
    description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
    model="gpt-4-turbo",
    tools=[{"type": "code_interpreter"}],
    tool_resources={
        "code_interpreter": {
            "file_ids": [file.id]
        }
    },
)
One approach I’m considering is to split the long text into segments, have GPT remember each segment, and then produce a summary based on the guidelines at the end.
However, this method might lead to the generation of misleading or disconnected content.
Another method would be to process each segment separately, beginning each input with a prompt to summarize based on the guidelines, gradually building a comprehensive summary.
Which method would be more effective, or is there a better alternative?
Summarization is a frequent topic on this forum and there are a few tried and tested methods for summarizing longer inputs. See the following post by @Diet for an illustrative overview of how to approach it.
In general, the approach depends somewhat on the level of granularity you are looking for in the summary. Technically, if your input text is only 10,000 tokens, you can generate a summary with a single API call. However, the maximum length of your summary is bounded by the completion token limit (see below). If you are looking for a high level of detail, then an approach like the one shared in the post is preferable: first chunk your original text, then prepare summaries for the individual chunks.
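For the single-call case, a minimal sketch could look like the following. Note that this uses the Chat Completions endpoint rather than the Assistants API, and the file name and guideline text are placeholders you would swap for your own:

from openai import OpenAI

client = OpenAI()

guidelines = "..."  # your summarization guidelines (placeholder)
original_text = open("original.txt").read()  # ~10,000 tokens fits comfortably in one request

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": f"Summarize the user's text according to these guidelines:\n{guidelines}"},
        {"role": "user", "content": original_text},
    ],
    max_tokens=2000,  # upper bound on the length of the summary (completion tokens)
)
print(response.choices[0].message.content)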
On your related question:
Essentially, the context limit, i.e. 128,000 tokens in the case of the GPT-4-turbo series, represents the upper limit on tokens for a given request: input tokens and completion tokens combined cannot exceed it. The 4,096 limit applies to completion tokens only, not to input tokens, which include the instructions as well as any additional context you provide, such as the original text for summarization in your case.
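If you want to sanity-check that budget before sending a request, a rough sketch using the tiktoken library could look like this (the instruction text and file name are placeholders):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by the GPT-4 / GPT-4-turbo series

instructions = "Summarize the text according to these guidelines: ..."  # placeholder
original_text = open("original.txt").read()                             # your ~10,000-token source

input_tokens = len(enc.encode(instructions + original_text))  # everything you send counts as input
max_completion_tokens = 4096                                  # cap on the output (completion) tokens

# Input and requested output together must fit inside the 128K context window.
assert input_tokens + max_completion_tokens <= 128_000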
In practice, though, the 4,096 output token limit is rarely reached and you are more likely to get outputs anywhere in the range of 800-2,000 tokens, depending on how you design your prompt. Linking this back to the point above: if you are looking for more detail in your summary, then you need to opt for an approach that involves splitting/chunking the input text and running multiple API calls, the outputs of which you then combine to arrive at the full, detailed summary.
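A rough sketch of that chunk-and-combine approach, again via Chat Completions, might look like this; the chunk size, prompts and final merge pass are assumptions you would tune to your guidelines:

from openai import OpenAI
import tiktoken

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")

def chunk_text(text, max_tokens=2000):
    # Split the text into pieces of roughly max_tokens tokens each.
    # (Splitting on paragraph boundaries instead would preserve context better.)
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + max_tokens]) for i in range(0, len(tokens), max_tokens)]

def summarize(text, guidelines):
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": f"Summarize the text according to these guidelines:\n{guidelines}"},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

guidelines = "..."  # your summarization guidelines (placeholder)
original_text = open("original.txt").read()

# First pass: summarize each chunk separately.
partial_summaries = [summarize(chunk, guidelines) for chunk in chunk_text(original_text)]

# Second pass: combine the partial summaries into one full, detailed summary.
final_summary = summarize("\n\n".join(partial_summaries), guidelines)
print(final_summary)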