Hi, we took a few screenshots from our web application, added some brief descriptive text about what it does, fed it to regular ChatGPT 4+ and asked it to analyze the screenshots. Then we asked it a few user questions of the “How do I do this?” variety. It did extremely well!
Would this process scale, if we took 50-100 screenshots, stuffed them into files and fed them as the initial file list for an Assistant? The 20 file limit, 512MB per file is huge - I imagine we could fit everything we needed in a single PDF file of 5-10 MB. The user conversations would be short, we are not looking for it to solve complex questions.
Any feedback on the feasibility of this would be appreciated and whether the “stuffing all screenshots and descriptive text into one PDF” is the best way to feed a file to the Assistant.
Edits - Work Plan For Customer Service Assistant
I will update this section as new information learned.
Open issues/notes for myself in italics below
- Generate screenshots of web application
- Host screenshots on a web site that ChatGPT can access (Use GUID parameter in URL for a minimal security layer?)
- Create a single markdown file, in which each screenshot’s link is provided followed by a descriptive annotation for that image.
- Alternately, use API https://platform.openai.com/docs/guides/vision which shows how to send an image URL and associated “what does this image contain?” text.
- The total text will be far less than the million words mentioned by Jay F. But how are screenshot images counted in this mix (words/tokens)? (Answer: cost info provided in vision URL above.)
- Feed the single markdown file to ChatGPT via API
- ChatGPT will read the text of the markdown file, and will also fetch and analyze each linked screenshot, adding it to its knowledge base (Is this correct? Or do the images need to be fed separately, or as a folder - but then how to tie each image to its description?)
- How to update the Assistant, do we update the main file and screenshots and make a new one, or do we keep incrementally giving the same Assistant new information and corrections?
- Create Chatbot that will, for each user, open a new Thread to the Assistant and send Messages to it for that user (Find open source chatbot code to do this)
- This will use private API key so that others cannot use this Assistant (Right?)
- This will prevent one user from seeing another user’s chats (Ideally)
- Can the bot refer the user back to the images it used in its answer?
- Do we log all messages to review if it’s doing a good job?
- Does each new user Thread start with a “fresh clone” of the trained Assistant, or does each new user Thread continue the same assistant - raising risk of memory loss after some time?
- Should each user/customer’s interactions be capped?
- How to prevent the user from just chatting with the support bot about life? (Limit the number of messages in a Thread?)
- Or will all messages fail once our company’s monthly cap is reached?