I read an article that said the order of the words being embedded doesn't matter, but when I embed two texts with the same words in a different order I get totally different vectors (none of the numbers is the same). Does this mean that word order does matter when making embedding vectors?
Welcome to the forum!
Can you provide a reference so we can read the article?
The answer could be yes or no; you haven't given us enough information.
The short answer for any problem like this is: if what you are trying to do does not work, don't use it.
Thank you. I can't include links in my posts. It was a question ("Is there order among features of word embeddings?") on Quora.
On my website, users have projects, and each project has a topic, title, and tags. I combine all of each user's project data (title, tags, topic) and create embeddings from that combined text. Now I want to understand whether I should order the projects by creation date so that the most recently created project is prioritized, or whether there is no point in doing that.
Can you paste the link as text? Then we can convert it to a link for you.
One of the main reasons for using embeddings is to find related information. If the embeddings are working as needed, there should be no need to order the results, since the most useful result should be the closest match. Again, you have to decide: do you use embeddings with a sort, do you use different data for creating the embedding, do you use a different means of comparing the embeddings, etc.
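For example, here is a minimal sketch of ranking by embedding similarity instead of by date (assuming the OpenAI Python SDK; the model name, project strings, and query are only illustrative placeholders):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    # The model name here is only an example; use whichever embedding model you prefer.
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical project data: title, tags, and topic combined into one string.
projects = [
    "Solar dashboard | tags: energy, charts | topic: renewables",
    "Chess engine | tags: ai, games | topic: search algorithms",
]

query_vec = embed("renewable energy visualization")

# Closest match first; no sort by creation date needed.
ranked = sorted(projects, key=lambda p: cosine(embed(p), query_vec), reverse=True)
print(ranked[0])
```

Whichever project embedding is closest to the query embedding comes first, regardless of when the project was created.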
You might find this post of interest.
For others seeking the Quora article.
FYI
The Quora answer appears to be AI-generated.
Hello! Welcome to the forum!
Yes, the order of the words does matter if you're looking for identical vectors.
What typically happens under the hood of an embedding model is that the words are first converted to tokens, and the learnt embedding is a function not just of the tokens themselves, but also of their relative positions (i.e., context).
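As a quick experiment (again a minimal sketch assuming the OpenAI Python SDK; the model name and sentences are only examples), you can check that reordering the same words changes the individual numbers while the vectors usually stay close:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    # The model name here is only an example.
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

a = embed("the dog chased the cat")
b = embed("the cat chased the dog")  # same words, different order

print(np.allclose(a, b))  # False: the individual numbers differ

cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos)  # typically high, but not exactly 1.0
```

So the vectors won't be identical, but two texts made of the same words will still typically sit close together in the embedding space.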