I have a set of images I want to upload to gpt4 vision, for summarization and then to create an API call based on the text contained within, which would have some variations for every 3-4 pages. There are about 20 pages in total. the max tokens I can get back is 4096. Is there any way to have vision “embeddings” where I can ask different questions of the same image sets? or do I have to run it all individually?