Enhancing GPT's Memory with Data from a ZIP Archive

I’m building a GPT (in ChatGPT’s GUI GPT builder) to analyze data from my ZIP archive, which is quite large at about 400MB. Initially, my GPT was unable to open it and always froze. To address this, I wrote a custom Python script optimized for opening it quickly. Now the GPT can successfully complete tasks and produce the data I need using my script (it changes hardcoded values in my script based on my prompts and then runs it).

However, each prompt requires the GPT to open the ZIP archive again, which is time-consuming. Is there a better approach to train the GPT once on all the data from the ZIP archive so it retains the information without needing to open the archive every time? I apologize if this is a stupid question; I am new to the field.

Welcome to the Forum!

No such thing as a stupid question.

What you are asking is not possible, though. The GPT essentially performs RAG (retrieval-augmented generation) whenever it processes information, whether you upload that information directly to the GPT’s knowledge base or connect the GPT to an external database.

For example, when you use the knowledge upload functionality within the GPT, the information in those files is accessed for every query via semantic search. When the files are first uploaded, their contents are chunked and converted into vector embeddings. When a query requires knowledge from the files, that query is also converted into an embedding vector in the backend, a search retrieves the most similar vectors along with their associated text, and that text is included as context when formulating the response. In that sense, every query is handled separately in terms of knowledge retrieval.
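To make that flow concrete, here is a minimal sketch of the chunk → embed → retrieve loop in Python. It only illustrates the general technique, not what ChatGPT runs internally; the model name, chunking, and example texts are placeholders, and it assumes the `openai` Python package (v1+) with an API key in the environment.

```python
# Minimal sketch of the chunk -> embed -> retrieve flow described above.
# Not what ChatGPT runs internally; just an illustration of the idea.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    """Return one embedding vector per input string."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. At upload time: split the document into chunks and embed each chunk.
chunks = ["First chunk of the uploaded file...", "Second chunk...", "Third chunk..."]
chunk_vectors = embed(chunks)

# 2. At query time: embed the query and retrieve the most similar chunks
#    by cosine similarity.
query_vector = embed(["What does the file say about X?"])[0]
similarities = chunk_vectors @ query_vector / (
    np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(query_vector)
)
top_chunks = [chunks[i] for i in np.argsort(similarities)[::-1][:2]]

# 3. The retrieved chunks are passed to the model as extra context.
print(top_chunks)
```

This retrieval step runs for each query, which is why there is no way to "train once" on the archive and skip it afterwards.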

The logic is similar for external data, although the nature of information retrieval depends on how your information is stored externally.

Only through the chat itself do you have some temporary retention of the most recent conversation history.

Does that make sense?

Or more clearly:

ZIP files can only be used with the code interpreter: the AI must extract them inside the Python environment and run scripts to return their contents, and what it sees of the output is limited to 32k characters.
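For illustration, this is roughly the kind of extraction step the model has to run in that sandbox every time. The `/mnt/data` path is where uploaded files typically land; the archive name and member files below are placeholders for your own data.

```python
# Rough sketch of extracting an uploaded ZIP inside the code interpreter sandbox.
import zipfile
from pathlib import Path

archive = Path("/mnt/data/archive.zip")    # hypothetical uploaded ZIP
extract_dir = Path("/mnt/data/extracted")

with zipfile.ZipFile(archive) as zf:
    zf.extractall(extract_dir)             # unpack everything
    # keep only actual files, skipping directory entries
    names = [n for n in zf.namelist() if not n.endswith("/")]

# Read one extracted file and keep the printed output small, since only a
# limited amount of text makes it back into the model's context.
sample = (extract_dir / names[0]).read_text(errors="replace")
print(sample[:2000])
```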

Only supported file types (detected by scanning the binary itself), not ZIP archives, are allowed for document extraction and ingestion into the retrieval tool. The documentation link in this forum’s navigation bar will take you to Assistants → Tools, where there is a list of which destinations each file type can go to.

Retrieval doesn’t have an “unzipper” to bypass the file number limit.
