Asking follow-up questions with retrieval

Here are the results of a “toy” HyDE example:

                        The ball is blue.
The ball is blue.                  1.0000
The ball is yellow.                0.9384
The ball is purple.                0.9365
The ball is green.                 0.9356
The ball is white.                 0.9324
The ball is pink.                  0.9279
The ball is red.                   0.9269
What color is the ball?            0.9217
The ball is black.                 0.9214
The ball is orange.                0.9183

Now, this is very much a toy example, but here we can see that most of the wrong color “guesses” (six of the eight) score higher than the simple question.
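For anyone who wants to reproduce a table like the one above, here is a minimal sketch of the ranking step. It assumes you already have an embedding vector for each sentence; the 3-dimensional vectors below are made up for illustration and are not real text-embedding-ada-002 output, and `rank_candidates` is just a name I picked.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(target_vec, candidates):
    """Score every candidate text against the target, highest similarity first."""
    scored = [(text, cosine_similarity(vec, target_vec))
              for text, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up vectors standing in for real embeddings (assumption, illustration only).
toy_vectors = {
    "The ball is blue.": [0.9, 0.1, 0.2],
    "The ball is yellow.": [0.8, 0.3, 0.2],
    "What color is the ball?": [0.6, 0.2, 0.6],
}

ranking = rank_candidates(toy_vectors["The ball is blue."], toy_vectors)
for text, sim in ranking:
    print(f"{text:<28}{sim:.4f}")
```

With real embeddings you would swap `toy_vectors` for the vectors returned by your embedding model; the ranking logic stays the same.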

I expanded on this test quite a bit here: https://github.com/elmstedt/Color-HyDE/blob/main/color_HyDE.csv

If we look at how the question performs across a greater array of colors, the picture gets a bit murkier. When we look at the distribution of where the question ranks among all guessed colors for each ground-truth color, we see the question is essentially no better or worse than a random guess: the question embedding beats a randomly guessed color about half the time.

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.08392 0.41958 0.50350 0.49259 0.58042 0.80420
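One plausible way to get a summary like this (a sketch, not necessarily the exact calculation behind the numbers above) is to compute, for each ground-truth color, the fraction of wrong guesses the question beats, and then summarize those fractions. The function names are my own. The example reuses the similarities from the first table, where the ground truth is “The ball is blue.”

```python
import statistics

def question_beat_fraction(question_sim, guess_sims):
    """Fraction of guesses whose similarity is strictly below the question's."""
    beaten = sum(1 for s in guess_sims if s < question_sim)
    return beaten / len(guess_sims)

def summarize(values):
    """Six-number summary; 'inclusive' quartiles interpolate between data points."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(values),
        "q3": q3,
        "max": max(values),
    }

# Similarities to "The ball is blue." taken from the first table in this post.
question_sim = 0.9217
wrong_guess_sims = [0.9384, 0.9365, 0.9356, 0.9324, 0.9279, 0.9269, 0.9214, 0.9183]

fraction = question_beat_fraction(question_sim, wrong_guess_sims)
print(fraction)  # the question beats only black and orange here
```

Repeating this for every ground-truth color and passing the resulting list to `summarize` would give one row of min, quartiles, mean, and max.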

That said, there are a lot of weird colors in there (e.g. papaya whip and bisque).

If we look at the median similarity for each guess across all the ground-truth colors, we can see that the colors we might expect GPT-4 to “guess” tend to do a bit better than the question.

Big Table
HyDE Median Similarity
The ball is light blue. 0.9141
The ball is dark blue. 0.9140
The ball is violet. 0.9102
The ball is pale turquoise. 0.9098
The ball is light sky blue. 0.9088
The ball is dark green. 0.9084
The ball is blue. 0.9083
The ball is dark gray. 0.9081
The ball is dark grey. 0.9081
The ball is dark cyan. 0.9079
The ball is light gray. 0.9078
The ball is dark red. 0.9075
The ball is light green. 0.9073
The ball is grey. 0.9073
The ball is gray. 0.9073
The ball is lavender. 0.9069
The ball is light grey. 0.9065
The ball is sky blue. 0.9065
The ball is dark turquoise. 0.9064
The ball is light yellow. 0.9062
The ball is beige. 0.9062
The ball is pale green. 0.9058
The ball is pink. 0.9050
The ball is cyan. 0.9049
The ball is yellow. 0.9044
The ball is light cyan. 0.9042
The ball is brown. 0.9039
The ball is medium blue. 0.9039
The ball is blue violet. 0.9037
The ball is light pink. 0.9033
The ball is yellow green. 0.9032
The ball is dark orange. 0.9029
The ball is pale violet red. 0.9028
The ball is navy blue. 0.9026
The ball is white. 0.9026
The ball is dim gray. 0.9025
The ball is violet red. 0.9024
The ball is khaki. 0.9022
The ball is green. 0.9022
The ball is dark sea green. 0.9021
The ball is rosy brown. 0.9020
The ball is purple. 0.9019
The ball is dark violet. 0.9018
The ball is sea green. 0.9018
The ball is dim grey. 0.9015
The ball is light sea green. 0.9013
The ball is green yellow. 0.9009
The ball is turquoise. 0.9009
The ball is lime green. 0.9000
The ball is navy. 0.8999
The ball is tan. 0.8994
The ball is black. 0.8993
The ball is midnight blue. 0.8992
The ball is dark salmon. 0.8988
The ball is maroon. 0.8987
The ball is dark slate blue. 0.8984
The ball is deep pink. 0.8982
The ball is royal blue. 0.8980
The ball is powder blue. 0.8977
The ball is light slate blue. 0.8974
The ball is dark olive green. 0.8973
The ball is orange. 0.8972
The ball is light goldenrod. 0.8970
The ball is azure. 0.8969
The ball is red. 0.8967
The ball is magenta. 0.8960
The ball is light steel blue. 0.8960
The ball is medium purple. 0.8958
The ball is slate blue. 0.8954
The ball is steel blue. 0.8950
The ball is sandy brown. 0.8947
The ball is light slate grey. 0.8947
The ball is light goldenrod yellow. 0.8946
The ball is light salmon. 0.8945
The ball is light slate gray. 0.8943
The ball is orange red. 0.8942
What color is the ball? 0.8941
The ball is dark slate gray. 0.8936
The ball is slate grey. 0.8934
The ball is light coral. 0.8934
The ball is medium sea green. 0.8932
The ball is deep sky blue. 0.8932
The ball is slate gray. 0.8929
The ball is hot pink. 0.8923
The ball is dark goldenrod. 0.8922
The ball is pale goldenrod. 0.8918
The ball is dark slate grey. 0.8916
The ball is dark orchid. 0.8916
The ball is cornflower blue. 0.8911
The ball is coral. 0.8908
The ball is dark magenta. 0.8906
The ball is forest green. 0.8900
The ball is chartreuse. 0.8897
The ball is medium violet red. 0.8892
The ball is floral white. 0.8890
The ball is dodger blue. 0.8888
The ball is spring green. 0.8885
The ball is olive drab. 0.8884
The ball is dark khaki. 0.8883
The ball is alice blue. 0.8877
The ball is aquamarine. 0.8872
The ball is lawn green. 0.8869
The ball is gold. 0.8867
The ball is medium spring green. 0.8867
The ball is cadet blue. 0.8862
The ball is misty rose. 0.8855
The ball is lavender blush. 0.8855
The ball is ghost white. 0.8851
The ball is medium slate blue. 0.8851
The ball is sienna. 0.8849
The ball is salmon. 0.8849
The ball is plum. 0.8840
The ball is cornsilk. 0.8839
The ball is medium turquoise. 0.8838
The ball is orchid. 0.8836
The ball is chocolate. 0.8834
The ball is ivory. 0.8824
The ball is antique white. 0.8805
The ball is linen. 0.8793
The ball is goldenrod. 0.8787
The ball is seashell. 0.8780
The ball is tomato. 0.8778
The ball is mint cream. 0.8771
The ball is gainsboro. 0.8765
The ball is saddle brown. 0.8762
The ball is wheat. 0.8758
The ball is moccasin. 0.8751
The ball is white smoke. 0.8751
The ball is peach puff. 0.8742
The ball is medium orchid. 0.8738
The ball is indian red. 0.8737
The ball is thistle. 0.8727
The ball is snow. 0.8724
The ball is blanched almond. 0.8722
The ball is old lace. 0.8721
The ball is lemon chiffon. 0.8662
The ball is burly wood. 0.8644
The ball is medium aquamarine. 0.8642
The ball is firebrick. 0.8636
The ball is navajo white. 0.8588
The ball is honeydew. 0.8578
The ball is bisque. 0.8576
The ball is peru. 0.8576
The ball is papaya whip. 0.8459
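A median column like the one in the table above could be computed along these lines. This is a sketch under assumptions: `similarity` is a hypothetical nested mapping from each guess to its similarity against every ground-truth sentence, and the numbers in it are invented placeholders, not values from the CSV.

```python
import statistics

def median_similarity_by_guess(similarity):
    """Median of each guess's similarities across all ground truths, best first."""
    medians = {guess: statistics.median(row.values())
               for guess, row in similarity.items()}
    return sorted(medians.items(), key=lambda pair: pair[1], reverse=True)

# Invented placeholder numbers (assumption), only to show the data shape.
similarity = {
    "The ball is blue.": {"red": 0.92, "green": 0.93, "peru": 0.85},
    "What color is the ball?": {"red": 0.91, "green": 0.90, "peru": 0.86},
    "The ball is bisque.": {"red": 0.86, "green": 0.87, "peru": 0.88},
}

for guess, med in median_similarity_by_guess(similarity):
    print(f"{guess:<28}{med:.4f}")
```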

Now, again, this is a toy example. To show just how much the structure of an embedded text matters: even something like “The dog is old.” has an average similarity against the ground-truth colors of about 0.78. When the embedded documents are more complex, I would expect a hypothetical document to substantially outperform the base query.

Ultimately though, as is the case with most things, I suspect a combination approach will be the most effective: use some keyword filtering to narrow a HyDE-based similarity search to the documents most likely to contain relevant information; then, perhaps, assess each retrieved document individually and toss any that are not relevant to the question; ingest the remaining documents and generate another response; and finally, as a last resort, iterate this process as necessary.
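To make the combined approach concrete, here is a rough sketch of a single pass through it. Everything model-related is passed in as a callable, and all of those names (`write_hypothetical`, `embed`, `is_relevant`, `generate_answer`) are hypothetical placeholders for whatever LLM and embedding calls you use. The toy stand-ins at the bottom exist only so the sketch runs; the "iterate as a last resort" step is left out.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_and_answer(question, documents, keywords, write_hypothetical,
                        embed, is_relevant, generate_answer, top_k=3):
    # 1. Keyword filter; fall back to all documents if nothing matches.
    pool = [d for d in documents
            if any(k.lower() in d.lower() for k in keywords)] or list(documents)
    # 2. HyDE: embed a hypothetical answer instead of the raw question.
    hypo_vec = embed(write_hypothetical(question))
    ranked = sorted(pool, key=lambda d: cosine(embed(d), hypo_vec),
                    reverse=True)[:top_k]
    # 3. Judge each retrieved document individually and drop irrelevant ones.
    kept = [d for d in ranked if is_relevant(question, d)]
    # 4. Generate the response from what survived.
    return generate_answer(question, kept), kept

# Toy stand-ins (assumptions) so the sketch runs without any API.
VOCAB = ["ball", "blue", "red", "dog", "old"]

def toy_embed(text):
    words = text.lower().replace(".", "").replace("?", "").split()
    return [float(words.count(w)) for w in VOCAB]

docs = ["The ball is blue.", "The ball is red.", "The dog is old."]
answer, kept = retrieve_and_answer(
    "What color is the ball?", docs, keywords=["ball"],
    write_hypothetical=lambda q: "The ball is blue.",
    embed=toy_embed,
    is_relevant=lambda q, d: "ball" in d,
    generate_answer=lambda q, ds: ds[0] if ds else "unknown",
    top_k=1,
)
print(answer)
```

In a real system each lambda would be a model call, and you would likely cache document embeddings rather than recompute them per query as this sketch does.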

One of the other things I am planning to investigate is “synthetic” embeddings, which I wrote about briefly here: Text Pre-processing for text-embedding-ada-002 - #2 by elmstedt

Basically, take whatever text you want to embed and rewrite it a bunch of times: summarize it, expand it, explain it to a five-year-old, etc. The idea is that it will be much easier to find the appropriate information if it is expressed in many different ways in the embedding space.
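Here is a small sketch of how that might look as an index, assuming each source document is stored under several rewritten variants and a query is scored by its best-matching variant. The rewriter and the bag-of-words `toy_embed` are stand-ins I made up so the example runs; in practice the rewrites would come from an LLM and the vectors from an embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_index(documents, rewriters, embed):
    """Embed the original text plus every rewrite, all tagged with the doc id."""
    index = []
    for doc_id, text in documents.items():
        variants = [text] + [rewrite(text) for rewrite in rewriters]
        index.extend((doc_id, embed(v)) for v in variants)
    return index

def search(query, index, embed):
    """Score each document by its best-matching variant, highest first."""
    query_vec = embed(query)
    best = {}
    for doc_id, vec in index:
        score = cosine(query_vec, vec)
        if score > best.get(doc_id, -1.0):
            best[doc_id] = score
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)

# Toy stand-ins (assumptions): a synonym swap plays the role of an LLM rewrite.
VOCAB = ["ball", "blue", "sphere", "azure", "dog", "old", "elderly"]
SYNONYMS = {"ball": "sphere", "blue": "azure", "old": "elderly"}

def toy_embed(text):
    words = text.lower().replace(".", "").split()
    return [float(words.count(w)) for w in VOCAB]

def synonym_rewrite(text):
    words = text.lower().replace(".", "").split()
    return " ".join(SYNONYMS.get(w, w) for w in words)

documents = {"ball": "The ball is blue.", "dog": "The dog is old."}
index = build_index(documents, [synonym_rewrite], toy_embed)
results = search("azure sphere", index, toy_embed)
print(results[0][0])
```

The query shares no words with the original "ball" text, so it only matches through the rewritten variant, which is the effect the synthetic embeddings are meant to produce.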

Again, once I reach the point in my project where I am actually embedding things, I can do a full work-up on the differences I see between using and not using HyDE, along with all the other things I want to try.
