Hope these help spark some inspiration:
- DALL·E: Creating Images from Text
- Image GPT
- MuseNet
- My image completions using GLIDE and my text to image completions using minDALL-E - OpenAI Community
While MuseNet isn’t for images (it works with MIDI files instead), it’s an interesting read on applying generative models like GPT-3 directly to an encoded file format.