Get accurate embedding from a complete conversation

Hello,

I’d like to build a natural conversation flow with OpenAI using embedding data from my database. But the longer the conversation gets, the less accurate the matching becomes, so there must be something I’m doing wrong.

To make my case clearer, here are some details:
I embedded a list of apartment descriptions (rooms, city, surface, garden, gym…) in different cities such as Madrid, Paris, New York, Hong Kong…

First problem: if I ask for an apartment in Spain, the matching fails to give me Madrid.

User: Hello, I’m looking for an apartment with 2 rooms and a gym included in Spain
Assistant: Sorry, We have no offer in this area

The sources from the match will be various 2-room apartments with a gym in various cities (Madrid may be one of them).


Second problem: if I ask for an apartment in Madrid, it works, but if I ask for other cities in the same conversation, it fails. I know the reason is that I embed the whole conversation, but I need to do so in case the user gives information spread across multiple messages (surface, city, number of rooms…).

User: Hello, I’m looking for an apartment in Madrid with 2 rooms and a gym
Assistant: We have some apartment that fit your needs, here is a list: [List of apartment that match the user request]
User: And do you have other suggestion in Madrid for a house ?

Here, the matching will still return some 2-room apartments with a gym. And question after question, the accuracy of the sources decreases.

So, two questions from my two examples:
Can the embedding vectors tell that the Madrid entries are the right answer when I ask for a place in Spain?
How do I get good sources for a long conversation?

Thanks


You don’t embed the whole conversation. You want to distill and neutralize the query using GPT, and then embed the final product.

User: Hello, I’m looking for an apartment in Madrid with 2 rooms and a gym
Assistant: We have some apartment that fit your needs, here is a list: [List of apartment that match the user request]
User: And do you have other suggestion in Madrid for a house ?


Distilled: “2-bedroom houses in Madrid, Spain with a gym (?)”
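To make the distillation step concrete, here is a minimal sketch. It assumes you can call a chat model through some `call_llm` callable (the function name, the prompt wording, and the stub below are all illustrative, not an official API); in the example it is stubbed out with a fake model so the flow is visible end to end.

```python
# Sketch: distill a multi-turn conversation into one standalone search
# query, then embed only that query instead of the full transcript.
# `call_llm` is a placeholder for your chat-completion call.

DISTILL_PROMPT = (
    "Rewrite the user's latest request as a single standalone search query. "
    "Carry over constraints from earlier turns (city, rooms, amenities) "
    "unless the latest message overrides them. Reply with the query only."
)

def distill_query(conversation: list, call_llm) -> str:
    """conversation: [{'role': 'user'|'assistant', 'content': str}, ...]"""
    messages = [{"role": "system", "content": DISTILL_PROMPT}] + conversation
    return call_llm(messages).strip()

# Stubbed model for demonstration; a real setup would call the chat API here.
def fake_llm(messages):
    return "2-room house in Madrid, Spain, with a gym"

conversation = [
    {"role": "user", "content": "I'm looking for an apartment in Madrid with 2 rooms and a gym"},
    {"role": "assistant", "content": "Here is a list of matching apartments: [...]"},
    {"role": "user", "content": "And do you have other suggestions in Madrid for a house?"},
]

query = distill_query(conversation, fake_llm)
# Now embed `query` (not the whole transcript) and run the vector search on it.
```

The point is that the vector search only ever sees one clean, self-contained query, so earlier turns can't drag the match back toward old constraints.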

But you may find better results using function calling.

So

“Hello, I’m looking for an apartment in Madrid with 2 rooms and a gym” → find(location=Spain.Madrid, building=Buildings.Apartment, rooms=2, inclusive=[Inclusives.Gym], startAt=0, endAt=50)

“And do you have other suggestion in Madrid for a house ?” → find(location=Spain.Madrid, inclusive=[Inclusives.Gym], building=Buildings.House, rooms=2)
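A minimal sketch of what that `find` tool could look like: the schema is the kind of thing you would register as a tool with the chat API, and the local function is what the assistant's tool call would be dispatched to. The field names, enum values, and toy listings are all illustrative assumptions.

```python
# Sketch: a structured `find` tool instead of pure vector search.
# The schema below is illustrative; adapt field names to your data.

FIND_TOOL = {
    "name": "find",
    "description": "Search property listings by structured filters.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "building": {"type": "string", "enum": ["apartment", "house"]},
            "rooms": {"type": "integer"},
            "inclusive": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["location"],
    },
}

# Toy catalogue standing in for the real database.
LISTINGS = [
    {"location": "Madrid", "building": "apartment", "rooms": 2, "inclusive": ["gym"]},
    {"location": "Madrid", "building": "house", "rooms": 2, "inclusive": ["gym", "garden"]},
    {"location": "Paris", "building": "apartment", "rooms": 3, "inclusive": []},
]

def find(location, building=None, rooms=None, inclusive=()):
    """Local implementation the model's tool call is dispatched to."""
    return [
        listing for listing in LISTINGS
        if listing["location"] == location
        and (building is None or listing["building"] == building)
        and (rooms is None or listing["rooms"] == rooms)
        and all(item in listing["inclusive"] for item in inclusive)
    ]

# "…other suggestions in Madrid for a house?" becomes a structured call:
results = find(location="Madrid", building="house", rooms=2, inclusive=["gym"])
```

Because the model extracts the filters, "Spain" vs. "Madrid" stops being an embedding-similarity problem and becomes an ordinary database lookup you fully control.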

Going even further, though, I would actually use a centroid-based recommendation engine.

I think there’s a funny saying here that needs to be said:

“This damn chatbot could have been a form”.

Even in your example, the fact that you are continuing a conversation makes it very difficult to understand the actual preferences. When the user switched to houses, did they want to retain the gym and two bedrooms?

There’s some background inference that we humans perform.

Okay, this person just asked me for a 2-bedroom apartment; their budget is probably a couple thousand a month. Now they are asking about 2-bedroom houses with built-in gyms? Probably not. I’ll just show them typical houses and ignore the gym part.

GPT does not work well in these situations, and ultimately a chatbot would just be much more complicated than a single search/recommendation engine.
