I have a website with a lot of information, and I would like GPT to search it before answering specific questions. Is that possible?
Yes, this can be done very well.
Use this extension and then append `site:domain.com` to your prompt.
Or do the same thing with Perplexity:
https://www.perplexity.ai/?s=u&uuid=dba9a968-22aa-4b05-973a-3addb5826878
But would GPT look for the answer on the website first and then, if it doesn't find it, search elsewhere (in the training data it already has)?
```json
{
  "model": "text-davinci-003",
  "prompt": "site:terra.com.br qual a noticia mais importante hoje?"
}
```
?
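One caveat worth stating up front: `text-davinci-003` cannot browse the web, so a `site:` operator in the prompt is just text to the model; it will not actually search the site. As a sketch only, here is the shape of the request body you would send to the Completions endpoint (the helper name and the `max_tokens`/`temperature` values are my own choices, not anything from the thread):

```python
# Sketch: build a request body for POST https://api.openai.com/v1/completions.
# Note: the model will NOT search terra.com.br; "site:..." is plain prompt text.

def build_completion_payload(prompt: str, model: str = "text-davinci-003") -> dict:
    """Return a Completions API request body (values here are illustrative)."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 256,     # assumed limit for this sketch
        "temperature": 0.2,    # assumed; lower = more deterministic
    }

payload = build_completion_payload(
    "site:terra.com.br qual a noticia mais importante hoje?"
)
```

You would then POST this payload with your API key, but the answer would come only from the model's training data, which is why the embeddings approach below exists.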
I believe you misunderstood. For example, using the OpenAI GPT API: the user sends a question, and I first search a website/PDF. If the answer is on the website/PDF, I answer from there; only if I don't find it there should the model fall back to its standard knowledge base.
You can do this with embeddings, take a look at the API docs.
But I didn't understand. How can I use these embeddings in my model?
You use the embeddings API to index the information you want as your data source, then call it again for the search query. You compute the cosine similarity between the search embedding and each embedding in your data source; the top-scoring results are your search results. These can then be inserted into a prompt for the Completions API to produce a final, elaborated response.
I simplified it a lot, but that's more or less it…
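The workflow above can be sketched in miniature. The tiny 3-dimensional vectors below are made up for illustration; in practice each one would come back from the embeddings API (e.g. a ~1536-dimensional vector per text), and the document texts are hypothetical:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# 1) Index step: each document stored with its (pretend) embedding.
documents = [
    ("Caves are natural underground voids.", [0.9, 0.1, 0.0]),
    ("Our refund policy lasts 30 days.",     [0.0, 0.2, 0.9]),
]

# 2) Query step: the user question is embedded the same way (pretend vector).
query_vector = [0.8, 0.2, 0.1]

# 3) Rank documents by similarity to the query; top hits become context.
ranked = sorted(
    documents,
    key=lambda doc: cosine_similarity(doc[1], query_vector),
    reverse=True,
)
best_text = ranked[0][0]

# 4) The best match is pasted into the final Completions prompt.
prompt = f"Answer using only this context:\n{best_text}\n\nQuestion: What is a cave?"
```

The key point is that the model never sees your whole website; it only sees the few passages that scored highest against the question.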
Damn, thanks for the reply. I still don't understand, though. Do you happen to have an example? From the documentation I couldn't figure out how to index this data and query it whenever the user sends a question in the prompt.
You could ask ChatGPT for instructions/examples in the language you need
OK… here ya go… here is how I do this in my lab for testing and system-engineering eval:

- I have a DB table with the text I want to search, stored with other params in each row. I populate it with completions text, but of course you can populate your DB as you like.
- When I commit the text to the DB, my code calls the embeddings API, gets a vector, and stores the vector in the same table row. The row now holds both the text and its vector (the embedding).
- When I search, I choose the ranking method (normally dot_product, because it equals the cosine similarity function for unit-length vectors) and use the API to get the vector for the search term.
- Then I pull every vector from the DB (along with the id of its row), compute the dot_product of the search-term vector with each text vector, and store the results in an array.
- Finally, I sort the array (based on the correlation method used) and there you have it.
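The dot_product shortcut in those steps can be sketched as follows. If every stored vector is normalized to unit length, the dot product equals the cosine similarity, so ranking needs no extra division per row. The row ids and vectors below are invented for illustration; in a real setup each vector comes from the embeddings API and is read back from the DB table:

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot product == cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Rows pulled from the DB: (row_id, stored unit vector). Values are made up.
rows = [
    (1, normalize([0.9, 0.1, 0.0])),
    (2, normalize([0.0, 0.2, 0.9])),
]
query = normalize([0.8, 0.2, 0.1])  # embedding of the search term

# Score every row against the query, then sort descending, as in the steps.
scores = [(row_id, dot(vec, query)) for row_id, vec in rows]
scores.sort(key=lambda s: s[1], reverse=True)
# scores[0][0] is the id of the best-matching row; fetch its text for the prompt.
```

Normalizing once at insert time is the design choice here: the per-query cost drops to one dot product per row instead of a full cosine computation.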
Example: Set up Search
Example: Top Vector Search Results
HTH
Of course, if I add a completion for:
What is a cave system?
… and run the vector search again, we get:
So, it's easy to understand now, right @regulador261?
Does this work? OpenAI API