I have embedded about 20k short texts using text-embedding-ada-002 and am trying to visualize the embeddings in 2D with UMAP. However, the results are not what I expected. I tried different values for the n_neighbors and min_dist parameters, and ‘cosine’ for the metric parameter. It looks as if min_dist is not being applied properly by UMAP, as I still see a lot of overlapping samples in the 2D projection. Is there a recommended min_dist / n_neighbors value for visualizing OpenAI embeddings with UMAP?
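For context, here is a minimal sketch of the kind of setup described above, assuming the umap-learn package. The helper names (parameter_grid, project_2d), the seed, and the example values in the comments are illustrative placeholders, not the exact ones used.

```python
from itertools import product


def parameter_grid(n_neighbors_values, min_dist_values):
    """Return every (n_neighbors, min_dist) pair to try (hypothetical helper)."""
    return list(product(n_neighbors_values, min_dist_values))


def project_2d(embeddings, n_neighbors, min_dist, seed=42):
    """Reduce embeddings to 2D with UMAP using the cosine metric.

    `embeddings` is assumed to be an array-like of shape (n_samples, n_dims),
    e.g. about 20k rows of text-embedding-ada-002 vectors.
    """
    import umap  # from the umap-learn package; imported lazily on first use

    reducer = umap.UMAP(
        n_components=2,           # 2D output for plotting
        n_neighbors=n_neighbors,  # size of the local neighborhood
        min_dist=min_dist,        # minimum spacing of points in the projection
        metric="cosine",
        random_state=seed,        # placeholder seed for repeatable layouts
    )
    return reducer.fit_transform(embeddings)


# Example sweep (placeholder values, not recommendations):
# for n_neighbors, min_dist in parameter_grid([5, 15, 50], [0.0, 0.1, 0.5]):
#     coords = project_2d(embeddings, n_neighbors, min_dist)
```

Each call returns an array of shape (n_samples, 2) that can be passed to a scatter plot, so the layouts for different parameter pairs can be compared side by side.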
Any help is appreciated.