Error when uploading a file (for Q&A)

Hello there!

I am trying to upload a document using the OpenAI API, but I get the error: “Invalid file format. Example 1 has a text that is too long.”

Is there a limit on the length of the “text” field for each line of the JSONL document (to be uploaded)?

In my current pipeline I am uploading text that might not always consist of “good” sentences. I have run two edge-case experiments:
1 - I uploaded a JSONL file with one JSON line where the “text” field contains the string "ciao " repeated 1000 times. This gives me no error.

2 - I uploaded a JSONL file with one JSON line where the “text” field contains the string "ciao " repeated 1500 times. This gives me the error “Invalid file format. Example 1 has a text that is too long.”
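
For reference, the two test files can be reproduced with a short script along these lines. This is only a sketch: the helper name `write_repeated_text_jsonl` and the file names are made up for illustration, and the upload step itself is left out.

```python
import json

def write_repeated_text_jsonl(path, word="ciao ", repeats=1000):
    """Write a one-line JSONL file whose "text" field is `word` repeated `repeats` times."""
    record = {"text": word * repeats}
    with open(path, "w", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record

# Experiment 1: 1000 repetitions (uploads without error in my case).
ok_case = write_repeated_text_jsonl("ciao_1000.jsonl", repeats=1000)
# Experiment 2: 1500 repetitions (rejected as "too long" in my case).
failing_case = write_repeated_text_jsonl("ciao_1500.jsonl", repeats=1500)

print(len(ok_case["text"]), len(failing_case["text"]))  # 5000 7500 characters
```

So the limit, whatever it is measured in, falls somewhere between these two inputs.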

Is there a way to know in advance if the text uploaded will be valid or not?
Is it possible to know what kind of checks are done once the document is uploaded, so that I can run the same checks before I upload it?

Many thanks

I have another related question:

If I have a very long text to upload, what would be the best strategy to upload it?
Would it be better to split it into many documents or into as few as possible?
→ “many documents” meaning splitting the text into short documents, e.g. one sentence per document.
→ “few as possible” meaning splitting the text into documents that each come close to 2048 tokens.
After this split I will upload all the documents in the same JSONL file.
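
To make the two options concrete, here is a rough sketch of the split I have in mind. It rests on assumptions: the helper names are hypothetical, the sentence splitter is deliberately naive, and a word count stands in for the token count, so `max_words` is only a stand-in for the 2048-token budget (real token counts will differ).

```python
import json
import re

def split_into_sentences(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def pack_sentences(sentences, max_words):
    """Greedily group sentences into chunks of at most `max_words` words.

    A single sentence longer than `max_words` still becomes its own chunk.
    """
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks

def to_jsonl(chunks):
    """Serialize chunks as JSONL, one {"text": ...} object per line."""
    return "".join(json.dumps({"text": c}) + "\n" for c in chunks)

text = "First sentence here. Second one follows! Is this the third? Yes it is."
sentences = split_into_sentences(text)

many_documents = sentences                        # one sentence per document
few_documents = pack_sentences(sentences, max_words=8)  # as few as the budget allows

print(len(many_documents), len(few_documents))  # 4 2
print(to_jsonl(few_documents), end="")
```

Either list can then be written to a single JSONL file; the open question is which of the two gives better answers.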

The questions that I will ask later on might use the whole initial text.

Many thanks again!