GPT for long document generation

bruno.vaz · August 5, 2024, 9:57am

Hey everyone!

I have a long document generation use case and I would appreciate it if you could share some ideas on how to go about this problem.
In this use case I’m working on, the main goal is to create requests for proposals (RFPs) based on a template file and several human-written RFPs. All the files are either PDFs or Word documents – I think I can convert the Word files to PDF so that everything is a PDF, though.
My idea is to chunk the files and store the chunks in a vector DB. This way, every time a new RFP has to be written, I can retrieve relevant chunks from past RFPs to help write each section. I think building the document section by section might be best, since otherwise the LLM would have to both receive and output a lot of text in a single call. I also have to ensure the template is followed, but some older RFPs do not have the template's structure.
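A minimal sketch of the section-by-section retrieval idea above, not a definitive implementation. It assumes the PDFs have already been converted to plain text. The bag-of-words `embed` function and the `InMemoryIndex` class are hypothetical stand-ins for a real embedding model and vector DB, and all function names and parameters are illustrative.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=120, overlap=20):
    """Split text into overlapping word windows (a stand-in for section-aware chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector; a real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical in-memory replacement for a vector DB."""

    def __init__(self):
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.items.append((chunk, embed(chunk)))

    def search(self, query, k=3):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_section_prompt(section_title, section_guidance, index, k=3):
    """Assemble the prompt for one template section from retrieved past-RFP excerpts."""
    excerpts = index.search(f"{section_title} {section_guidance}", k=k)
    joined = "\n---\n".join(excerpts)
    return (
        f"Write the '{section_title}' section of a new RFP.\n"
        f"Template guidance: {section_guidance}\n"
        f"Excerpts from past RFPs:\n{joined}\n"
    )
```

Each template section would get its own prompt, which could then be sent to the chosen model one section at a time. Querying with the template's section title and guidance is one way to surface relevant chunks even from older RFPs that do not follow the template's structure.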
I haven’t found much literature on long-document generation; the most relevant paper I came across is "LLM Based Multi-Agent Generation of Semi-structured Documents from Semantic Templates in the Public Administration Domain".

If you have any papers, blog posts, your own experience, etc. related to this kind of use case, I would appreciate you sharing them. By the way, I’m probably going to use GPT-4o for this, but I think I can use any other OpenAI model.

Thanks in advance!
