I’m creating a custom GPT and trying to figure out how files are used.
How are the uploaded files in custom GPTs processed? Do I have to hint in the instructions to refer to the files for specific topics, or are they processed in advance and the knowledge already available? Do I even have to mention these files in the prompt or instructions?
Are files uploaded to the GPT’s knowledge different from files uploaded during a conversation?
I couldn’t find any docs around this, and any guidance is appreciated.
You can direct the GPT to its knowledge and tell it to read it, or instruct it with logic to read the files at hello (the first user prompt). You can also have it read them by name of doc, .txt, etc.
Example … “Read example.txt in knowledge and give me a summary.”
Yes, the ones in knowledge are onboard at hello (20 files max), and drag-and-drop adds to them in-session.
So say you needed it as a tabletop RPG aid: you could structure instructions around onboard knowledge.
Generated these as examples.
Here are condensed examples based on your instructions for using onboard files in an RPG session:
1. File Reference and Summary Request: “Read example.txt in knowledge and give me a summary.”
2. Generating a Character: “Generate a sci-fi character using Iloveall.txt and Fluffychaos.txt.”
3. Summarizing Rules: “Summarize the key combat mechanics from Resources.txt.”
4. Retrieving Lore: “Retrieve world-building details from Always have soulfulness in our lives.txt.”
5. Cross-Reference Data: “Always reference all sources and cite sources.txt and Revised Instructions with Trademark and Copyright Laws.txt.”
6. Creative Encounter: “Generate a whimsical encounter using Fluffychaos.txt.”
7. Turn-based Combat Aid: “Use new love all.txt to provide combat tactics for my character.”
8. Rules Interpretation: “Interpret NPC behavior based on Always be kind, smart, focused.txt.”
9. Narrative Flow Assistance: “Provide plot suggestions using Always read this bee kind.txt.”
10. Game World Integration: “Summarize how Fractal Flux affects gameplay from Fractal flux.txt.”
These examples streamline the integration of knowledge files for various RPG tasks.
There is no documentation for the internals of ChatGPT, as it is OpenAI’s proprietary product. However, the mechanism is similar to the API’s Assistants in terms of how files are handled, either as part of an assistant (called a GPT in this case) or as an attachment from a user.
The main difference from Assistants is that the files of a ChatGPT GPT are presented to both Knowledge and Code Interpreter automatically when those are enabled, for supported file types, while an API developer takes more deliberate action and chooses the tool associated with each file.
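For comparison, this is roughly what that deliberate choice looks like on the API side. A minimal sketch assuming the Assistants v2 Python SDK; the exact method paths (e.g. beta.vector_stores) vary between SDK versions, and the file names here are just placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Upload a file once; the developer then decides which tool receives it.
uploaded = client.files.create(
    file=open("rules.pdf", "rb"),
    purpose="assistants",
)

# Put the file into a vector store so file_search (the "Knowledge" analogue) can use it.
store = client.beta.vector_stores.create(
    name="gpt-knowledge",
    file_ids=[uploaded.id],
)

assistant = client.beta.assistants.create(
    model="gpt-4o",
    instructions="Answer questions using the attached rulebook.",
    tools=[{"type": "file_search"}, {"type": "code_interpreter"}],
    tool_resources={
        # file_search gets the vector store; code_interpreter gets the raw file.
        "file_search": {"vector_store_ids": [store.id]},
        "code_interpreter": {"file_ids": [uploaded.id]},
    },
)
```

A GPT built in the ChatGPT builder does the equivalent wiring for you for every supported knowledge file, which is why no per-file tool choice is exposed there.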
The files that are provided to code interpreter are simply placed in the session mount point for use by Python code. A notation is given to the AI of what files are available.
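To make that concrete, code running in the sandbox just sees ordinary files. The mount point is commonly reported as /mnt/data in ChatGPT (not officially documented), and example.txt here is a stand-in for whatever was attached:

```python
# List whatever files the session mount point currently holds.
import os

mount = "/mnt/data"  # commonly observed Code Interpreter mount point (assumption)
for name in os.listdir(mount):
    print(name, os.path.getsize(os.path.join(mount, name)), "bytes")

# Ordinary Python I/O works from there, e.g. reading a knowledge file:
with open(os.path.join(mount, "example.txt"), encoding="utf-8") as f:
    print(f.read()[:200])
```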
The same files, for knowledge, are chunked into parts and embedded into a similarity vector database.
The vector database is presented to the AI as a tool: the AI can emit a search query and get document chunks back as that tool’s return (or the whole document, if it is short enough).
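A rough sketch of that pattern, for intuition only; ChatGPT’s actual chunk sizes, embedding model, and ranking are not documented, so everything below (text-embedding-3-small, 800-character chunks, cosine similarity) is an assumption:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def chunk(text, size=800, overlap=200):
    # Split a document into overlapping character windows.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def embed(texts):
    # Embed a list of strings into vectors.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# The "knowledge" corpus; example.txt stands in for a GPT knowledge file.
docs = {"example.txt": open("example.txt", encoding="utf-8").read()}
chunks = [(name, part) for name, text in docs.items() for part in chunk(text)]
chunk_vecs = embed([part for _, part in chunks])

def search(query, k=3):
    # The "tool": given a search query, return the k most similar chunks.
    q = embed([query])[0]
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(-sims)[:k]]

# The model emits a query and receives chunks back as the tool's return value.
for name, part in search("key combat mechanics"):
    print(name, "->", part[:80])
```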
Both GPT files and user-uploaded files (except images) go to the same places, and the AI cannot easily distinguish them from the way the information is presented to it.
It is fascinating to think about how it functions under the hood, but to use files in the ChatGPT builder and have the GPT read them at hello, you must tell it in the creator instructions what its knowledge is. I use an index doc to organize, but you can just do it as a list, i.e. “read knowledge at first user prompt.” Or you must direct it, i.e. “read knowledge and offer the index,” or offer each doc as a choice. You can organize them in the instructions and in the chat session the same way. If you want it automated, though, you must have logic for it in the instructions (see the sketch below).
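For instance, a hypothetical instruction block along these lines; index.txt and the topic-to-file mapping are made-up names, reusing file names from the examples above:

```text
On the first user message, read index.txt from Knowledge and offer its list
of documents as choices. When the user picks a topic, read only the matching
file (e.g. Resources.txt for rules, Fractal flux.txt for lore) and answer
from it. If the user uploads a new file in-session, ask whether it should
supplement or replace the onboard knowledge.
```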
“How are they processed?” - no, an answer was not provided to that or the other questions.
GPTs are search-happy, so you don’t have to specify files, although doing so can benefit the operation.
Here, for example, there is no indication that my knowledge is anything more than toolmaker.txt, and indeed there is no information related to the question, but a search was (quickly) emitted regardless:
The biggest problem, the same as with Assistants, is that the description of file search given to the AI says “the user has uploaded files”, regardless of the fact that the GPT maker included them. This produces confusing results such as the text “as used in your tools”, even though there is only GPT-provided documentation and the user has no tools.
You can’t block a user from still uploading files, and those uploads can distract from the purpose and knowledge specific to a GPT and become commingled with them. This can get quite problematic, especially given that a more advanced GPT may also make use of external actions that rely on file data.
Another API difference is that any files in shared GPTs basically become OpenAI property, and you cannot revoke their perpetual license to use them as they see fit. Another reason to invest your time in your own API-developed platform.
In the discussion titled “Uploaded files processing,” bibryam seeks clarity on how uploaded files in custom GPTs are processed, wondering if specific instructions are necessary for the files to be referenced or if they are automatically integrated into the GPT’s knowledge. mitchell_d00 suggests that files can be referenced directly, like instructing the GPT to read a particular document, and provides various examples for RPG scenarios. mitchell_d00 also clarifies that per-session file uploads complement onboard files, noting a cap of 20.
Later, j explains that there are no specifics about the internal workings of GPTs, as it is a proprietary product of OpenAI. The handling of files is similar to API Assistants, except that they become accessible through both the Knowledge and Code Interpreter tools in a more automated fashion. Onboard files are split into chunks, integrated into a vector database, and made searchable so the GPT can access document parts dynamically. Files uploaded during a chat go to the same places as GPT-provided files, causing potential mixing of file purposes.
mitchell_d00 mentions organizing instructions, suggesting starting logic that explicitly directs the GPT to read and utilize files at session initiation. Problems arise because GPT creators cannot prevent user-uploaded files from being commingled with the GPT’s intended knowledge. j further critiques the system, noting how GPTs receive and respond to search requests for file information without discerning user-uploaded content from GPT-integrated documentation, potentially leading to misleading interactions.
Finally, it is noted that files shared with GPTs via OpenAI might become OpenAI property with an irrevocable license, posing a risk when compared to maintaining control through one’s own API development platform, as highlighted by j. mitchell_d00 believes that understanding logic and prompt structures might be central to bibryam’s inquiry.
Yes; between the techniques offered and the technical underpinnings that let one develop their own techniques, there should be some answer here.
Yes, it’s fascinating how one affects the other; I’m going to come back and give you likes. I’m high-functioning, so I can read topics wrong, and I was a bit unsure when I read your deep-workings response. APIs are neat too. The ChatGPT builder makes GPTs easier to understand, but automation also means out of sight, out of mind, so transparency is always bar none.
Files becoming OpenAI’s property when shared is not entirely accurate. OpenAI only accesses and analyzes files if the “Advanced Data Analysis” (formerly known as Code Interpreter) tool is enabled. Furthermore, even when files are processed, they are not automatically absorbed into OpenAI’s model unless the user opts into data sharing through consumer services like ChatGPT; business-focused products such as the API or Enterprise versions maintain stricter data privacy standards.
You can trust me…and the particular language I already employed.
service terms, section 5:
(b) Distribution and Promotion of GPTs. By sharing your GPT with others, you grant a nonexclusive, worldwide, irrevocable, royalty-free license: (i) to OpenAI to use, test, store, copy, translate, display, modify, distribute, promote, and otherwise make available to other users all or any part of your GPT (including GPT Content); and (ii) to the extent Output from your GPT includes your GPT Content, to users of your GPT to use, store, copy, display, distribute, prepare derivative works of and otherwise use your GPT Content. You will ensure that all information that you publish about your GPT is, at all times, complete, accurate, and not misleading.
Although the license can be seen as intended to cover serving GPTs to users, the coverage of this clause can extend to most anything. They can “modify”, “prepare derivative works”, “promote”. If you feel you are wronged by mitchellBot becoming OpenAI’s new service, you can test out that clause only in industry-serving arbitration.
Yes, if they share it, but simply using file upload does not enact that…
I’m really sorry, but statements like “OpenAI owns your files if you upload them” seem to be an oversimplification. I am simply trying to add resources to the statement.
“mitchellBot” … thank you, I will take that as a compliment…
The reason they claim those rights on shared GPTs is that once you share one to the store or via a public link, they have to control the platform and test your GPT for public release, so sharing it grants them those rights.
My sharing is enterprise, so I never share much with the platform…