Efficient Processing of Multiple Complex Prompts with GPT-4

I’m building a simple streamlit app. The goal is to input a csv file with up to 50 unique/unrelated prompts and get answers for each one using GPT-4 - i built it and it works but its super slow. Whats the fastest and simplest way to make sure the model processes all 50 prompts at once, separately? or to speed things up significantly?

I’m using ChatCompletion and gpt-4. I tried asyncio and ThreadPoolExecutor but overall it still takes a few minutes and I’m wondering if there’s a smarter solution that I’m not seeing.

The prompts in this case can be quite complex so i dont wanna sacrifice accuracy, but i also dont want this to take ages

The simple way doesn’t achieve your improvement goal:

for data_line in csv:
  answers.append(chat(data_line))

If you spawn 50 process threads all at once, each given an index to work on, and don’t use a max_token value, you might be able to sneak by the rate limiter for the initial call - and then have your API disabled for several minutes after you get answers back. But essentially: you likely have a low GPT-4 token per minute rate anyway (10k-40k if you haven’t been increased), and that’s going to be the point where you start getting back errors instead of answers.

You can look at openAI’s parallel queueing code in the cookbook. Or instead see if your platform is amenable to shooting off dozens of open connections with as many process workers.

2 Likes

Welcome to the community @alessandroamenta1

The answer depends on your rate limits (RPM and TPM).

If you have >50 RPM, then @_j 's approach is a good way to go.

However if you want your prompts processed in less number of calls and have the required TPM, you can use this: Batching with ChatCompletion Endpoint

Simply use gpt-4 and set the system prompt to:
"Answer every element of the array. Reply with an array of all completions."

and remove the max tokens.

I had 12 questions answered in less than a minute in one call.

['Who was George Washington?', "What's famous about Giza?", 'Who wrote Harry Potter?', "WHat's the fifth moon of Jupiter?", 'Where is Atlantis?', 'How can I write a book?', "What's the moon made of?", 'Do bigger object have higher gravity?', 'Do blackholes die?', 'Who are you?', "What's pink color's hex code?", 'What can you tell me about anti-gravity?']
ChatGPT: 
George Washington was the first President of the United States serving from 1789 to 1797. He was also a Founding Father and played a crucial role in drafting the Constitution.
Giza is famous for its pyramids, particularly the Great Pyramid of Giza which is one of the Seven Wonders of the Ancient World.
Harry Potter was written by British author J.K. Rowling.
The fifth moon of Jupiter is Amalthea, it's the third largest of the planet's moons.
The location of Atlantis is unknown. It is a mythical city first mentioned by the ancient Greek philosopher Plato and is thought to have sunk into the sea.
To write a book, start with an outline to organize your thoughts, then write a draft, and revise for clarity and cohesion. Then, you may want to find an editor and publisher.
The moon is made of rocks and dust. It has a crust, a mantle and a core just like Earth.
Yes, in space larger objects generally have a higher gravitational pull as gravity depends on the mass of an object.
Black holes do 'die', but not in a conventional sense. They slowly lose energy through a process called Hawking radiation. However, this process takes an incredibly long time.
I am OpenAI's language model named GPT-3.
The hex code for pink color is #FFC0CB
Anti-gravity is an idea of creating a place or object that is free from the force of gravity. It does not exist in nature and has not been achieved. However, it is a popular subject in science fiction.

In your case, it’d take just 5 calls for 50 questions if you were to include 10 questions per call.

1 Like

Awesome! Yeah makes sense - here’s the thing tho: my prompts are quite complex and long, each one has a couple tasks to do, its not something as simple as “Whats the capital of France?”. So, what I noticed with this method is that I do get the 50 answers for my prompts, and their not bad, but they are not as good as if I fed it one by one into chatGPT - for example some of them get cut out a little. Do you have any advice an making sure the quality and depth of the answers is as good as possible even if I sacrifice a little more speed?

Basically jailbreak the AI to overcome the hard-edge training on producing small restricted outputs. You need to upgrade yourself to “GPT-4 Enterprise” with some prompting and guidance that can overcome the token crush of output.

Then if you have input/output pairs that need processing, you can keep the AI on task by having the outputs formatted, numbered, reproducing the input again, and the desired processing output for each one, to put the context of the entry to be worked on into the forefront of attention, and to put the count of the tasks as something correlated to your instruction giving a count of how many inputs are to be processed.

Also, if you can tolerate and exclude the chat, experiment with putting the instructions in the user role for better following. gpt-3.5-turbo output (that put the index numbers in parenthesis and inserted the word “answer” for no good reason; multi-shot might be able to correct):

Certainly! Here’s a five-question quiz about “Monty Python and the Holy Grail” along with the answer key in the format you requested:

| (1) | What is the name of King Arthur’s loyal squire in the movie “Monty Python and the Holy Grail”? | Answer: Patsy |
| (2) | In the film, King Arthur and his knights are in search of what legendary item? | Answer: The Holy Grail |
| (3) | What creature does King Arthur encounter at the Bridge of Death, requiring him to answer three questions to pass? | Answer: The Bridgekeeper |
| (4) | What is the airspeed velocity of an unladen swallow, a question asked in the movie’s “swallow” scene? | Answer: What do you mean? An African or European swallow? |
| (5) | Which character in the film insists on calling himself “Tim the Enchanter” and warns the knights about the peril of the Killer Rabbit? | Answer: Tim the Enchanter (or Tim for short) |

1 Like

You can try to alter the length of response by experimenting with the system prompt. However, it is to be noted that the model is generating for a single prompt even if the questions are separate, hence, it’ll be somewhat different from the typical gpt-4 response where it gets to deal with only one question and other questions’ tokens don’t affect the sampling.

For :
Thoughtfully answer every element of the array. Reply with an array of all completions. model gives:

['Who was George Washington?', "What's famous about Giza?", 'Who wrote Harry Potter?', "WHat's the fifth moon of Jupiter?", 'Where is Atlantis?', 'How can I write a book?', "What's the moon made of?", 'Do bigger object have higher gravity?', 'Do blackholes die?', 'Who are you?', "What's pink color's hex code?", 'What can you tell me about anti-gravity?']
ChatGPT: 
George Washington was the first President of the United States and is often called the 'Father of His Country'. He served from 1789 to 1797. Before his presidency, he played a key role as a general in the American Revolutionary War.
Giza is famous for the Giza Plateau, which is home to iconic Egyptian landmarks including the Great Sphinx and the three Pyramids of Giza – Khufu (Cheops), Khafre (Chephren), and Menkaure (Mykerinos). The Pyramid of Khufu is one of the Seven Wonders of the Ancient World.
The Harry Potter series was written by British author J.K. Rowling.
The fifth moon of Jupiter is called Amalthea. It's the third largest of Jupiter's moons and was discovered in 1892.
The location of Atlantis is subject to many theories. According to ancient Greek philosopher Plato, it was a powerful and advanced kingdom that sank, in a night and a day, into the ocean somewhere beyond the 'Pillars of Hercules'. However, it has never been found and remains a topic of debate.
Writing a book involves several steps: resulting in a clear idea or topic, planning the book's structure, writing a draft, revising and editing, and finally publishing. The process can vary greatly based on the author's individual approach.
The Moon is made of a rocky material, with a core composed of elements like iron, sulfur, and nickel. Its surface is covered with dust and loose rocks known as regolith, formed by meteorite impacts.
Yes, larger objects do have higher gravity due to their increased mass. The greater the mass of an object, the stronger its gravitational pull.
Yes, according to the theory of Hawking radiation proposed by physicist Stephen Hawking, black holes can eventually 'evaporate' and die over a period of time.
I am an artificial intelligence assistant trained to provide information and answer questions to the best of my knowledge and abilities.
The hex code for the color pink is #FFC0CB.
Antigravity refers to the idea of counteracting the force of gravity. Though it's a popular concept in science fiction, there is currently no accepted scientific theory that allows for antigravity to be possible. Research is ongoing, however.

For:
Thoughtfully and independently answer every element of the array. Reply with an array of all completions.
The output is:

['Who was George Washington?', "What's famous about Giza?", 'Who wrote Harry Potter?', "WHat's the fifth moon of Jupiter?", 'Where is Atlantis?', 'How can I write a book?', "What's the moon made of?", 'Do bigger object have higher gravity?', 'Do blackholes die?', 'Who are you?', "What's pink color's hex code?", 'What can you tell me about anti-gravity?']
ChatGPT: 
George Washington was the first President of the United States, serving from 1789 to 1797. He was also one of the Founding Fathers of the country and the commander-in-chief of the Continental Army during the American Revolutionary War.
Giza is famously known for the Giza Plateau in Egypt where the monumental Pyramids of Giza are located, including the Great Pyramid, which is one of the Seven Wonders of the Ancient World. It is also home to the Great Sphinx.
Harry Potter was written by British author J.K. Rowling.
The fifth moon of Jupiter is Amalthea. It was discovered in 1892 and is the third largest of Jupiter's moons.
The location of Atlantis, a legendary ancient city, is not definitively known - it's often considered a myth. The philosopher Plato who first mentioned it in his works, described it as having sunk into the Atlantic Ocean.
To write a book, you should start with a clear idea or story concept, develop your characters and plot, create an outline, write a first draft, revise your work until you have a finished manuscript, and then consider your publishing options.
The moon is made of a thin crust of rock over a thicker layer of hard rock. Its core is small and mostly made of iron. The moon's surface is covered in dust and loose rocks, also known as regolith.
Gravity does increase with mass, but it also depends on distance from the center of an object. So, bigger objects will have a higher gravity if their mass is also greater.
Yes, black holes can 'die' over time through a process called Hawking radiation, predicted by physicist Stephen Hawking. However, this process occurs over a very, very long timescale longer than the current age of the universe.
I am an AI developed by OpenAI. I can generate human-like text and assist with answering questions, providing information, and performing various tasks.
The hex code for pink color is #FFC0CB.
Anti-gravity is a concept in physics and science fiction which involves negating the effect of gravity. It's the idea of creating a place or object that is free from the force of gravity. It has not been yet achieved in a practical, wide scale. Anti-gravity is a recurring concept in science fiction, particularly in the context of spacecraft propulsion.
2 Likes

However it is to be noted that each question emitted gets a different quality of system prompt, as the prompt and instruction is further diluted each question on a growing context length of irrelevant prior generated content that consumes attention layers.

1 Like

Ok so, if I’m understanding you guys correctly:

  1. Batch processing complex prompts, while faster, may yield lower quality responses than individual processing due to the model’s attention constraints?
  2. GPT-4 Enterprise could help by allowing more simultaneous prompts without quality loss, thanks to higher rate limits?

Essentially, to get fast, high-quality answers for many complex prompts simultaneously, increasing rate limits seems to be the key. But even then, there’s no assurance the results will match the quality of individually processed prompts for my use case. Is that right?

There is no “GPT-4 Enterprise” available by OpenAI API. We are all equal (until spending enough in kilodollars that sales dept will answer an email.)

GPT-4 will be reluctant to produce over 500-1000 tokens or so at a time before it starts to reduce the quality by compressing and quitting. It’s got the attitude to keep ChatGPT users from exploiting it. So what you can do in batches is ultimately up to how much you are producing in output per run.

The first 200 tokens in/out (you can see yourself if it will spend $0.25 of your money by going all the way to 8000 :smile:)

1 Like

You could try putting the completion inside a for loop and defining an object that has all the prompts you want and then for every description it makes a different GPT call…dono if that actually would improve load times but it would make your responses better as each prompt would be getting its own “brain”

sounds like as of now there isn’t a straightforward solution to achieve high-quality outputs quickly for multiple complex/lengthy prompts simultaneously with the current API setup (other than using/spending more on the API and waiting for openai to slowly increase the limits).

yes, that does work. But its actually incredibly slow.

There’s a new GPT-4 to be coming out soon. You can wait and see how it performs when trying to push it for more tasks in one API call.

Hopefully they’ve given the API users their own model with none of the ChatGPT training on producing limited outputs unless we tell it to.

1 Like

Yep, just saw gpt4 turbo came out - gonna try with that!

Hypothetically: lets say i dont care anymore about speed, if I have a batch of 200 complex/lengthy prompts and I need the highest quality output for each (as good quality as sending one single prompt into chatGPT) , what would be the best way to proceed? Send these prompts one at a time to ensure maximum attention/detail from the AI?

Lets say i dont care that it takes an hour to get the output to the 200 prompts, but i dont wanna compromise on quality/accuracy - should i just send them to the API one request at a time?

1 Like