I provide a webpage link to chat/completion api and ask targeted questions based on the tags/paragraph headers (which are in bullet format) in the hyperlink. What I observe is that the response inherently misses on 1-2 bullet points from the list. Based on the response it feels like gpt is making the best guess and not actually capturing the actual content. Does anyone have experience in this area. ?
I m using gpt-3.5-turbo.
1 Like
Hi,
Welcome to the forum.
Indeed this is your issue, the llm is not collecting the URL you requested.
Also even supplying it the HTML webpage would potentially also return the wrong results too.
The result is ‘non-deterministic’ meaning that when you send the same question you will often receive a different result.
LLMs make a predictive next step decision based on probability.
To do this task 100% accurately you would do this task in logical code.
1 Like
Welcome to the dev forum @tadpolehop
2 Likes
I ended up setting up a tool with a call to jina.ai reader endpoint, as a quick hack initially… Then it’s kind of ok to run on the minor projects I have, when it doesn’t work, I still can do custom code or brightdata proxy’s.
1 Like