GPT-3.5 16K responds differently from two different servers

Not sure if anyone has faced this: I have the same code, same prompt, and same inference parameters on two servers, and I have implemented citations through prompt tuning. On one server the citation comes back correct; on the other it is wrong most of the time.

There has to be a difference. Create a third, local test environment and try it there to see whether you get good or bad results.

Can you post some example logs from both servers?

Getting completely different answers is normal, but if you’re getting wrong answers, one of the servers is either not configured correctly or you’re accidentally using a different model than you think you are.


Hi Foxabilo, I will not be able to share logs, as they contain sensitive data.

The answer is correct, but the citations after the answer are sometimes wrong. I am asking the LLM to cite the name of the document it used to answer, and sometimes the citation and the content do not match.

Citations produced by the model will not be reliably accurate: the model is generating text that best matches the request given to it. It does not have a deterministic lookup table of facts it can refer back to when asked for sources.

For that kind of result you will need a search method with identifiable sources, such as a vector database or a search engine, used in combination with the AI: provide the model the context from the search, then include a link to that context in the answer yourself.
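A minimal sketch of what that looks like, assuming a hypothetical `vector_search` helper standing in for your vector DB query (the OpenAI call uses the pre-1.0 `openai` Python package syntax that was current for gpt-3.5-turbo-16k). The key point is that the citations come from the retrieval layer, not from the model:

```python
# Sketch only: `vector_search` is a placeholder for your actual
# vector DB / search engine query. Assumes OPENAI_API_KEY is set.
import openai

def vector_search(question: str, top_k: int = 3) -> list[dict]:
    """Return the top_k chunks as {'doc_name': ..., 'text': ...} dicts.
    Placeholder for your real retrieval call."""
    raise NotImplementedError

def answer_with_citations(question: str) -> dict:
    hits = vector_search(question)
    context = "\n\n".join(f"[{h['doc_name']}]\n{h['text']}" for h in hits)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-16k",
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    answer = response["choices"][0]["message"]["content"]
    # Attach sources ourselves, from the search hits we actually used,
    # instead of asking the model to recall them.
    return {"answer": answer, "sources": [h["doc_name"] for h in hits]}
```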


I see, I misunderstood what you meant. Like @Foxabilo said, I think the common approach is to use a vector DB and cosine similarity to get a rank ordering of the documents most likely to match the semantic vector (embedding) of the question.
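Roughly, the ranking step looks like this (a NumPy sketch; `rank_documents` and the variable names are illustrative, and the embeddings would come from an embedding model such as text-embedding-ada-002):

```python
# Cosine similarity between the question embedding and each
# document embedding, highest similarity first.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_documents(question_vec: np.ndarray, doc_vecs: dict) -> list:
    """doc_vecs maps document name -> embedding vector."""
    scores = {name: cosine_similarity(question_vec, vec)
              for name, vec in doc_vecs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```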

Search YouTube for “LangChain” and document search, if you’re not familiar with it. There are several good videos on it.

I am already doing that. But I want to cite only the documents the model actually drew on, out of the similar documents that matched the question.

LLMs don’t have the ability to explain their sources in that way. It’s a big model with billions of learned parameters, and potentially all of them “play into” any given answer.
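A common partial workaround, which reduces mismatches but cannot guarantee faithful attribution for the reasons above: label each retrieved chunk with its document name in the prompt (as in the earlier sketch), ask the model to cite names in brackets, then post-filter the cited names against the set you actually supplied. This is a hedged sketch; the citation format and names are assumptions:

```python
# Keep only citations that refer to documents we actually passed in.
# Assumes the prompt asked the model to cite like [doc_name], and
# `hits` is the retrieval output from the earlier sketch.
import re

def validate_citations(answer: str, hits: list[dict]) -> list[str]:
    supplied = {h["doc_name"] for h in hits}
    cited = set(re.findall(r"\[([^\]]+)\]", answer))
    return sorted(cited & supplied)  # drop hallucinated citations
```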