Tricks for prompting when processing a lot of data (long context)?

Hi!

I’m increasingly using the chat completion API to sequentially process larger amounts of data (say, a few tens of thousands of tokens) for tasks that are easy to describe to an LLM with some background knowledge, but would be much harder to actually automate in code - so I’m automating these with the chat completion API and, e.g., gpt-4o-mini. But as the amount of data grows, I’ve noticed several times that the LLM’s output degrades into a loop after a while, presumably because the model loses track, or the model just aborts. Have you tried similar things and found tricks that improve this?

For example, this list of OSGI configurations is generated automatically by my AI code generation pipeline from the @ObjectClassDefinition and @AttributeDefinition annotations in all Java files, like this. Now, it’d be nice if you could just concatenate all Java files with those annotations and run a single prompt that generates those tables. But what often happens for me is that it either aborts or degenerates into a loop.

There are obvious workarounds:

  • Process the files one by one and combine the outputs. That takes more time and raises the token count considerably, since the prompt and background knowledge are repeated for every file. On the bright side, if you create temporary files it’s easy to skip files that have already been processed and only process new / changed files (as aigenpipeline implements).
  • Process, say, 10 files at a time, which is somewhat cumbersome to implement and loses some of the advantages of the first approach (a rough sketch of this follows below).
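
To make the second workaround concrete, here’s a minimal sketch in Python (assuming the openai v1 client; BATCH_SIZE, SYSTEM_PROMPT and the source root are placeholders to tune for your task):

from pathlib import Path
from openai import OpenAI

client = OpenAI()

BATCH_SIZE = 10  # files per request; tune so each request stays well below the context limit
SYSTEM_PROMPT = "..."  # placeholder: the task description / background knowledge

def batches(items, n):
    for i in range(0, len(items), n):
        yield items[i:i + n]

java_files = sorted(Path("src").rglob("*.java"))  # placeholder source root
outputs = []
for batch in batches(java_files, BATCH_SIZE):
    # concatenate the batch, marking file boundaries so the model can keep them apart
    joined = "\n\n".join(f"// FILE: {p}\n{p.read_text()}" for p in batch)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": joined},
        ],
    )
    outputs.append(resp.choices[0].message.content)

print("\n".join(outputs))

Batching like this cuts the prompt repetition by roughly a factor of the batch size, while keeping each individual request small enough that the output hopefully doesn’t drift.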

But I wonder whether there are some prompt-engineering tricks I could pull instead, to process all files in one request. Does anybody have interesting suggestions / experiences there?

Thanks so much!

One “trick” you can use if you just want to tabulate an extremely long list of stuff is a sequence bootstrapping / FSM method.

In your Java code, I’d try inserting a counter (it should be sequential!) in front of every item, such as:

// #DefinitionSerial: 1
// AI Instruction: turn this into JSON_TYPE_ObjectClassDefinition
@ObjectClassDefinition(name = "Composum AI Autotranslate Configuration",
...
// #DefinitionSerial: 2
// AI Instruction: turn this into JSON_TYPE_AttributeDefinition 
@AttributeDefinition(name =  ...
...

// <------------------------------> //

...
// #DefinitionSerial: 992
// AI Instruction: just emit JSON_TYPE_Terminator and then close array.
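
(If you want to automate the numbering itself, here’s a rough Python sketch - it assumes each annotation starts on its own line, which your codebase may or may not satisfy:)

import re

# Per-annotation instruction to inject; mirrors the comments shown above.
INSTRUCTION = {
    "ObjectClassDefinition": "turn this into JSON_TYPE_ObjectClassDefinition",
    "AttributeDefinition": "turn this into JSON_TYPE_AttributeDefinition",
}

def number_definitions(source: str, start: int = 1) -> tuple[str, int]:
    """Insert a sequential #DefinitionSerial comment before each annotation."""
    serial = start
    out = []
    for line in source.splitlines():
        m = re.match(r'\s*@(ObjectClassDefinition|AttributeDefinition)\b', line)
        if m:
            out.append(f"// #DefinitionSerial: {serial}")
            out.append(f"// AI Instruction: {INSTRUCTION[m.group(1)]}")
            serial += 1
        out.append(line)
    # return the numbered source plus the next serial, so you can chain files
    return "\n".join(out), serial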

provide the types:

type JSON_TYPE_ObjectClassDefinition = 
{
    "title": string, // the verbatim title from the code
    "desc": string // short description of what it does.
}

type JSON_TYPE_AttributeDefinition = 
{
    "name": string // the name of the attribute
    ...
}

type JSON_TYPE_Terminator = 
{
    "done": true // verbatim this, this is just a terminator.
}

and then make a schema to fill this pattern:

[
    {
        "definitionSerial": 1,
        "type": "JSON_TYPE_ObjectClassDefinition",
        "data": {
            "title": "Composum AI Autotranslate Configuration",
            "desc": "Configuration of the automatic translation of AEM pages."
        }
    },
    {
        "definitionSerial": 2,
        "type": "JSON_TYPE_AttributeDefinition ",
        "data": {
            "name": // ...
        }
    },
// ------------------------------- //
    {
        "definitionSerial": 992,
        "type": "JSON_TYPE_Terminator",
        "data": {
            "done": true
        }
    } 
]

You can also consider just putting a stop sequence on "definitionSerial": 992 or on JSON_TYPE_Terminator.
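
For instance (a sketch against the openai Python client; prompt.txt stands in for the full instruction, types, schema and the serial-numbered sources from above):

from openai import OpenAI

client = OpenAI()
prompt_with_schema = open("prompt.txt").read()  # placeholder: the full prompt described above

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0,
    stop=['"type": "JSON_TYPE_Terminator"'],  # generation halts before this string
    messages=[{"role": "user", "content": prompt_with_schema}],
)
text = resp.choices[0].message.content
# The stop sequence cuts the output just before the terminator record, so the
# array is left open - trim the dangling tail and append "]" yourself.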

While this is a fun exercise, I still think the cheapest way is to let gpt-4 generate a regex pattern/program and use that instead :laughing:
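
Something like this toy sketch (it assumes name is the first annotation attribute and sits on one line, which real code won’t always satisfy):

import re
from pathlib import Path

# Placeholder input: one concatenated blob of the annotated Java sources.
java_source = Path("AllSources.java").read_text()

# Capture the annotation kind and its double-quoted name attribute.
pattern = re.compile(
    r'@(ObjectClassDefinition|AttributeDefinition)\s*\(\s*name\s*=\s*"([^"]*)"'
)
for kind, name in pattern.findall(java_source):
    print(kind, name)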

Also, don’t forget to keep the temp low!