For a small university project, I am trying to generate small images from text with GPT-3. The goal is for the completion to be a matrix of 0s and 1s that depicts the object requested in the prompt.
I have been experimenting with very simple 8x8 matrices that each show a digit. With a few-shot prompt and very simple questions, I was able to generate digits that did not appear among the prompt examples. However, the results were not very consistent, and I could not extend the approach further (e.g., to smiley faces). This is the prompt I used:
Q: Make a grid of pixels that form an image of the number eight
A:
00000000
00111100
01100110
00111100
01100110
00110100
00011000
00000000
Q: Show a pixel matrix which represents a white and black image of the number one
A:
00000000
00110000
00011000
00011000
00011000
00011000
00011000
00000000
Q: Create an black and white image showing the number three by outputting a pixel grid of the size 8 by 8
A:
00000000
00111110
00000110
00111100
00001110
00000110
00111100
00000000
Q: Create an 8 by 8 pixel matrix that shows the a black and white image of the number 7
A:
00000000
01111110
00000110
00001100
00011000
00110000
01100000
00000000
Q: Create an black and white image showing the number 5 by generating a pixel grid of the size 8 by 8
A:
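The few-shot prompt above could also be assembled programmatically, which makes it easier to swap examples or question wording and compare results. This is only a minimal sketch: the helper name `build_prompt`, the `QUESTION` template, and the choice of two example grids are illustrative assumptions, not something prescribed by GPT-3.

```python
# Illustrative helper; all names here are hypothetical, not from any library.
QUESTION = (
    "Create an 8 by 8 pixel matrix that shows a black and white "
    "image of the number {label}"
)

# Two of the example grids from the prompt above.
EXAMPLES = [
    ("one", ["00000000", "00110000", "00011000", "00011000",
             "00011000", "00011000", "00011000", "00000000"]),
    ("7", ["00000000", "01111110", "00000110", "00001100",
           "00011000", "00110000", "01100000", "00000000"]),
]


def build_prompt(examples, target, question=QUESTION):
    """Join Q/A example pairs and end with an open 'A:' for the model to fill in."""
    lines = []
    for label, rows in examples:
        lines.append("Q: " + question.format(label=label))
        lines.append("A:")
        lines.extend(rows)
    lines.append("Q: " + question.format(label=target))
    lines.append("A:")
    return "\n".join(lines) + "\n"


prompt = build_prompt(EXAMPLES, "5")
print(prompt)
```

Whether identical question wording across examples helps or hurts here is something I have not verified; the template just makes it cheap to test both.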
I am completely new to GPT-3. I also tried fine-tuning it on 500 examples of 28x28-pixel black-and-white digits. However, after reading the tips by @daveshapautomator, I think I am not ready for fine-tuning yet and that my task could also be solved with better prompts.
So I have been refining the prompt and making it more specific. However, it still does not produce the results I want. This is the current version:
The task is to generating a black and white image by creating a pixel matrix
which consists of 0s and 1s. The matrix should be of the size 8 by 8.
Each entry represents a pixel of a black and white image. That means
the image has a display capable of 8 pixels in width and 8 pixels in height.
Here is an example of an 8 by 8 pixel matrix showing the number 3
00000000
00111110
00000110
00111100
00001110
00000110
00111100
00000000
And here is an example of a grid of pixels that form an image of the number eight
00000000
00111100
01100110
00111100
01100110
00110100
00011000
00000000
Now create a pixel matrix of size 8 by 8 that represents an black and white image of the number 4.
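Since the completions are not always consistent, it may help to check each one automatically before using it. The sketch below assumes the completion arrives as plain text with one row per line; `parse_grid` and `render` are hypothetical helper names introduced for illustration.

```python
def parse_grid(completion, size=8):
    """Return a size x size list of ints, or None if the text is not a valid grid."""
    rows = [line.strip() for line in completion.strip().splitlines() if line.strip()]
    if len(rows) != size:
        return None
    for row in rows:
        # Every row must have exactly `size` characters, all of them 0 or 1.
        if len(row) != size or set(row) - {"0", "1"}:
            return None
    return [[int(ch) for ch in row] for row in rows]


def render(grid):
    """Draw the grid with '#' for 1 and '.' for 0 so the shape is easier to see."""
    return "\n".join("".join("#" if px else "." for px in row) for row in grid)


# The "number 3" example from the prompt, standing in for a model completion.
sample = "\n".join([
    "00000000", "00111110", "00000110", "00111100",
    "00001110", "00000110", "00111100", "00000000",
])
grid = parse_grid(sample)
if grid is not None:
    print(render(grid))
```

Invalid completions (wrong row count, wrong width, stray characters) return `None`, so they can be counted or retried instead of being inspected by hand.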
I was also thinking about restricting the completion vocabulary to 0s and 1s by using the logit_bias parameter.
Any tips on how to approach this task and how to design the prompt would be highly appreciated.
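A rough sketch of what the logit bias idea could look like is below. It only builds the request parameters and does not call the API. The token IDs are an assumption: they are what I believe "0", "1", and the newline map to in the GPT-2/GPT-3 BPE vocabulary, and they should be verified with a tokenizer for the model actually used. The newline is included because the rows need a separator. The function name `grid_request_params` is hypothetical.

```python
# ASSUMPTION: token IDs for "0", "1" and "\n" in the GPT-2/GPT-3 BPE vocabulary.
# Verify these with a tokenizer for your model before relying on them.
TOKEN_IDS = {"0": 15, "1": 16, "\n": 198}


def grid_request_params(size=8, bias=100):
    """Build logit_bias and max_tokens for a size x size grid of single-digit tokens."""
    # The API accepts bias values from -100 to 100; a strongly positive bias
    # on a few tokens makes them, in effect, the only candidates.
    logit_bias = {str(token_id): bias for token_id in TOKEN_IDS.values()}
    # One token per pixel plus one newline between rows. With only these
    # tokens allowed, the model cannot emit an end-of-text token, so the
    # length cap is what stops the completion.
    max_tokens = size * size + (size - 1)
    return {"logit_bias": logit_bias, "max_tokens": max_tokens}


params = grid_request_params()
print(params)
```

These values would be passed as the `logit_bias` and `max_tokens` parameters of the completion request. The bias only limits which tokens can appear, not where the newlines fall, so the output would still need to be checked for the 8 by 8 shape.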