I’m wondering what other folks think about using OpenAI for object tracking. This would be a phone app that allows you to speak and say things like ‘I put the outdoor holiday plugs in the large box on the top shelf in the garage.’
The app would translate the words into text and then create an embedding representing the sentence ‘I put the outdoor holiday plugs in the large box on the top shelf in the garage.’
So next December I would ask the app ‘Where did I put the outdoor holiday plugs’, and the app would then search the embeddings to come up with several probable answers, including that the plugs are in the large box on the top shelf in the garage.
The app would also use the GPT-3 API to answer questions such as ‘where are all of the items I need to put up my holiday lights?’
While I’m thinking of a personal app, this concept could be extended to inventory on factory floors. A person could walk around and tell the app where different products and quantities are located instead of using a barcode scanner.
Welcome to the community eColumbia78.
When I got an Amazon Echo, what feels like many years ago, I thought about building what I think you are suggesting:
“Hey Alexa, remember that I put such and such here”. And then later: “Hey Alexa, do you have any memory of such and such?” - “Yes, you put it here”.
I think it’s a great idea.
I have not used them, but I think embeddings are what you want. Apart from solving your problem, they are also cheaper. I won’t bother explaining how, as there is lots written on the topic elsewhere.
Well, these embedding vectors are basically one-way functions, so by storing the embedding vector without the actual text ‘I put the outdoor holiday plugs in the large box on the top shelf in the garage.’ you are not gaining anything and you are losing the context.
It is better for you to simply store the text ‘I put the outdoor holiday plugs in the large box on the top shelf in the garage.’ and that’s it. To be perfectly honest, you don’t need an AI for this application, and you don’t need embedding vectors to keep track of these things.
You could build something like this fairly easily using Google Sheets and AppSheet.
- maybe an hour’s time?
- (minus the voice interaction)
You could set this up with a very basic relational database built inside a Google Sheet, with a table for items and a table for locations.
Then you could either:
- Create a third table that joins those two together, in combination with a time stamp and anything else that you’d like to record at the time; or
- simply change the column in the item table denoting what location it’s in
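As a rough illustration of the first option, here is a minimal sketch using Python's built-in `sqlite3` instead of a Google Sheet. The table names (`items`, `locations`, `placements`) and the helper functions are assumptions made up for this example, not anything AppSheet generates.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE locations (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
-- Join table: one row each time an item is put somewhere.
CREATE TABLE placements (
    item_id INTEGER NOT NULL REFERENCES items(id),
    location_id INTEGER NOT NULL REFERENCES locations(id),
    placed_at TEXT NOT NULL
);
""")

def get_or_create(table, name):
    # `table` is always one of our own fixed table names, never user input.
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]

def record_placement(item, location, when=None):
    # Timestamps are stored as ISO 8601 strings so they sort chronologically.
    when = when or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO placements VALUES (?, ?, ?)",
        (get_or_create("items", item), get_or_create("locations", location), when),
    )

def where_is(item):
    # The most recent placement wins; older rows remain as history.
    row = conn.execute(
        """
        SELECT l.name FROM placements p
        JOIN items i ON i.id = p.item_id
        JOIN locations l ON l.id = p.location_id
        WHERE i.name = ?
        ORDER BY p.placed_at DESC, p.rowid DESC
        LIMIT 1
        """,
        (item,),
    ).fetchone()
    return row[0] if row else None
```

Keeping every placement as its own row preserves the history of where something has been, which the second option (overwriting a column on the item) would lose.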
With all that said and done, once you’ve created a system like this to store all the data and manage the connections between things, you can fairly easily implement a call to OpenAI with a text block that you dictated to your phone.
The AI could be instructed to extract out all the relevant details and return them; then you could create a follow-up process to then handle the return.
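A minimal sketch of that extract-then-handle flow might look like the following. The prompt wording, the JSON shape, and the `complete` callable are all assumptions for illustration; `complete` stands in for whatever function sends a prompt to the model and returns its text reply.

```python
import json

# Hypothetical prompt; doubled braces are literal braces after .format().
PROMPT_TEMPLATE = (
    "Extract the item and its location from the note below. "
    'Reply with JSON only, in the form {{"item": "...", "location": "..."}}.\n\n'
    "Note: {note}"
)

def extract_placement(note, complete):
    """Ask the model for structured details and validate what comes back.

    `complete` is any callable taking a prompt string and returning the
    model's reply as a string (an assumption, not a specific API).
    """
    reply = complete(PROMPT_TEMPLATE.format(note=note))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None  # the model did not return valid JSON
    if not isinstance(data, dict):
        return None
    item, location = data.get("item"), data.get("location")
    if not isinstance(item, str) or not isinstance(location, str):
        return None
    if not item.strip() or not location.strip():
        return None
    return {"item": item.strip(), "location": location.strip()}
```

The follow-up process would then take the returned dictionary and write it into the item and location tables; a `None` result is the cue to ask the user to repeat or rephrase.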
I’ve got a couple sample apps on my AppSheet portfolio you could probably adapt into this scenario - if you were so inclined.
I’ll review your AppSheet portfolio. Thanks!
I appreciate your explanation, and I agree this particular scenario can be handled by setting up a relational database as you described. My day job is writing databases.
However, I got to thinking about embeddings and use cases where the complexity or overhead of writing a relational database is not necessary. At a minimum, the data store would just consist of a set of embeddings and the original text, along with a timestamp which represents, in this case, where objects are at a point in time.
At a later time the information is retrieved by making a request and turning that request into an embedding and then finding the stored embeddings along with their text that are related to the request embedding.
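To make the idea concrete, here is a minimal sketch of such a store. The `toy_embed` function is only a stand-in (plain word counts) so the example runs without any API; a real app would call an embedding model instead and would need a dense-vector version of `cosine`. All names here are made up for illustration.

```python
import math
import re
from collections import Counter
from datetime import datetime, timezone

def toy_embed(text):
    # Stand-in for a real embedding model: a sparse word-count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class NoteStore:
    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.notes = []  # each entry: (embedding, original text, timestamp)

    def add(self, text, when=None):
        when = when or datetime.now(timezone.utc)
        self.notes.append((self.embed(text), text, when))

    def search(self, query, k=3):
        # Embed the request, then rank stored notes by similarity to it.
        q = self.embed(query)
        ranked = sorted(self.notes, key=lambda n: cosine(q, n[0]), reverse=True)
        return [(text, when) for _, text, when in ranked[:k]]
```

Because the original text and timestamp are stored next to each embedding, the search can return something readable, and the timestamp lets the app prefer the most recent statement about an object.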
I think embeddings are a good recommendation, especially if you are looking for a flexible solution that can be used for many cases.
It’s easy to store the original text alongside the embedding. Popular vector storage solutions already provide this as a
Lastly, because of how you presented the problem, “to return a year later by asking a question,” when your embedded text is a statement, using the HyDE strategy (I wrote about it here on my blog) would be a wise choice.
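For readers who have not seen HyDE (Hypothetical Document Embeddings): instead of embedding the question itself, you first ask a language model to write a plausible answer, embed that hypothetical answer, and search with it, since a statement tends to sit closer to stored statements than a question does. Below is a minimal sketch; `generate`, `embed`, and `similarity` are hypothetical callables you would supply, not part of any particular library.

```python
def hyde_search(question, indexed_notes, generate, embed, similarity, k=3):
    """Rank stored notes against a hypothetical answer, not the raw question.

    indexed_notes: list of (text, vector) pairs embedded when they were saved.
    generate:      callable, question -> hypothetical answer text (an LLM call).
    embed:         callable, text -> vector (the same model used for the notes).
    similarity:    callable, (vector, vector) -> float, higher means closer.
    """
    # Step 1: turn the question into a statement-shaped guess. The guess may
    # be factually wrong; only its wording and shape matter for retrieval.
    hypothetical = generate(question)
    # Step 2: embed the guess instead of the question.
    query_vec = embed(hypothetical)
    # Step 3: ordinary nearest-neighbour ranking against the stored notes.
    ranked = sorted(
        indexed_notes,
        key=lambda pair: similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

For ‘Where did I put the outdoor holiday plugs’, the generated guess might read ‘I put the outdoor holiday plugs in a box in the basement’, which is wrong about the place but phrased much like the stored note.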
Thanks for the HyDE strategy.
I’ll work on the sample app over the next week and post the results here in case people are interested and can help test the concept.
Hype, trends, and current sentiment bias developers and can lead them to choose suboptimal software tools for the job. If you are looking to learn how to use embeddings, then your idea is fun and interesting. However, as I gently mentioned earlier, you do not need embeddings to accomplish the task you outlined in your original post. All you basically need is a DB, a flat file, or a notepad to save voice prompts as text like ‘Outdoor holiday plugs in the large box on the top shelf in the garage.’
This type of object tracking does not require OpenAI or any network-based AI API, TBH. It’s very easy to keyword search this using standard text indexing and search. Embeddings are not necessary.
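To show what I mean, here is a minimal sketch of a keyword index in plain Python with no network calls. The stopword list and class name are just made up for this example; a real database's full-text search would do the same job with more polish.

```python
import re
from collections import Counter, defaultdict

# Small, hand-picked stopword list for illustration only.
STOPWORDS = {"i", "the", "a", "an", "in", "on", "of", "did", "put",
             "where", "is", "are", "to", "my"}

def tokenize(text):
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]

class KeywordIndex:
    def __init__(self):
        self.notes = []                   # note id -> original text
        self.postings = defaultdict(set)  # word -> ids of notes containing it

    def add(self, text):
        note_id = len(self.notes)
        self.notes.append(text)
        for word in tokenize(text):
            self.postings[word].add(note_id)

    def search(self, query):
        # Count how many distinct query words each note matches.
        hits = Counter()
        for word in set(tokenize(query)):
            for note_id in self.postings.get(word, ()):
                hits[note_id] += 1
        # Best-matching notes first; notes with no matching word are omitted.
        return [self.notes[i] for i, _ in hits.most_common()]
```

Searching for ‘Where did I put the outdoor holiday plugs’ reduces to the keywords outdoor, holiday, and plugs, which is enough to find the note about the garage.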
As mentioned, if you wish to have a fun project to learn OpenAI and in particular, how to apply embeddings to a fun project you have a passion about, then your idea, of course, is a great idea. However, if you really want to build a simple application to track where objects are located, as you have described, you do not need an AI NLP (GPT) tool for this task.
But, as I mentioned, if you just want to learn OpenAI / Embeddings and this project interests you from that perspective, then by all means, go for it! After all, you asked your question in an OpenAI community, so there is a very high likelihood you will get a lot of encouragement.
On the other hand, if you asked the same question in a forum on text-based indexing, search, and retrieval (and left out the part about OpenAI and embeddings), you would get a totally different solution, because databases have been indexing this kind of data for search and retrieval for decades and it works well.