Text-davinci-003 model giving half-cooked answers

Hi, I am using the text-davinci-003 model, but it is giving me half-cooked answers: the end lines are cut off in the response. With gpt-3.5-turbo I get the full response.

Also, the text-davinci-003 model gives me a lot of 400 errors. If I send the same query 3-4 times it eventually answers, but the first couple of times it errors out on many queries. With gpt-3.5-turbo the 400 errors go away.

Any solution for this?

Welcome to the forum!

Can you post the snippet of code that calls the API, and any setup code that it relies on, please?

import { OpenAI } from 'langchain/llms/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { ConversationalRetrievalQAChain } from 'langchain/chains';

// CONDENSE_PROMPT is in Arabic; in English: "Given the following conversation and a
// follow-up question, rephrase the follow-up question to be a standalone question."
const CONDENSE_PROMPT = `بالنظر إلى المحادثة التالية وسؤال المتابعة ، أعد صياغة سؤال المتابعة ليكون سؤالاً مستقلاً.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:`;

// QA_PROMPT is in Arabic; in English: "You are a helpful AI assistant. Use the following
// pieces of context to answer the question at the end. If you don't know the answer, just
// say that you don't know; do not try to make up an answer. If the question is not related
// to the context, politely respond that you are tuned to answer only questions related to
// the context."
const QA_PROMPT = `أنت مساعد AI مفيد. استخدم أجزاء السياق التالية للإجابة على السؤال في النهاية.
إذا كنت لا تعرف الإجابة ، قل فقط أنك لا تعرف. لا تحاول اختلاق إجابة.
إذا لم يكن السؤال متعلقًا بالسياق ، فأجب بأدب أنك مضبوط للإجابة فقط على الأسئلة المتعلقة بالسياق.

{context}

Question: {question}
Helpful answer in markdown:`;

export const makeChain = (vectorstore: PineconeStore) => {
  const model = new OpenAI({
    temperature: 0.2, // increase temperature to get more creative answers
    modelName: 'text-davinci-003', //'gpt-3.5-turbo-0301', //'text-davinci-003', //change this to gpt-4 if you have access
  });

  const chain = ConversationalRetrievalQAChain.fromLLM(
    model,
    vectorstore.asRetriever(),
    {
      qaTemplate: QA_PROMPT,
      questionGeneratorTemplate: CONDENSE_PROMPT,
      returnSourceDocuments: true, //The number of source documents returned is 4 by default
    },
  );
  return chain;
};

My only recommendation is that you use the gpt-3.5-turbo model; it’s quicker, cheaper, and more powerful. May I ask why you would want davinci-003 in this scenario?

Check the finish_reason and see whether it shows length when the output is cut off.

If yes, this means “Incomplete model output due to max_tokens parameter or token limit”.
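For reference, the finish_reason is returned on each choice when you call the completion endpoint directly rather than through the langchain wrapper. A minimal sketch using the openai Node SDK; the prompt handling and the max_tokens value here are only illustrative:

import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function checkFinishReason(prompt: string): Promise<string> {
  const response = await client.completions.create({
    model: 'text-davinci-003',
    prompt,
    temperature: 0.2,
    max_tokens: 256, // same default the chain reports in the terminal
  });

  const choice = response.choices[0];
  // 'stop' means the model finished on its own; 'length' means the output was
  // cut off by max_tokens or the model's context limit.
  if (choice.finish_reason === 'length') {
    console.warn('Output truncated: raise max_tokens or shorten the prompt.');
  }
  return choice.text;
}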


I have tried different models to check response accuracy. The accuracy of the text-davinci-003 model is very good, as it retrieves the response data from the PDF files I have ingested into Pinecone. I also used gpt-3.5-turbo, but it gives a very general response that is not as accurate or specific.
Is there any known flaw in the text-davinci-003 model regarding the 400 errors and the text limitation?

I have checked in the terminal and it is showing me max-token-limit=256. How do I increase that to get the full response text?
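For what it’s worth, with the langchain OpenAI wrapper used in makeChain above, that 256-token completion default can usually be raised by passing maxTokens to the constructor. A sketch of just the model setup, assuming the same langchain version as the snippet; 1000 is an illustrative value, since text-davinci-003 has roughly a 4k-token context shared between the prompt, the retrieved documents, and the completion:

import { OpenAI } from 'langchain/llms/openai';

// Same model setup as in makeChain, with maxTokens added to override the
// 256-token completion default so long answers are not cut off mid-sentence.
const model = new OpenAI({
  temperature: 0.2,
  modelName: 'text-davinci-003',
  maxTokens: 1000, // illustrative; leave room in the context window for the retrieved context
});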

No. My guess is that you hit a temporary outage. You should always expect any API endpoint to return an error and build your system to handle errors gracefully.

That is to say, expect the worst and handle it in a graceful manner, and then your system will perform well in any given scenario.
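A sketch of what that graceful handling could look like around the chain returned by makeChain above; askWithRetry is a hypothetical helper, not part of langchain, and the retry count and backoff delays are arbitrary:

// Retry transient API failures (e.g. the intermittent 400s described above) with a
// simple exponential backoff instead of surfacing the first error to the user.
async function askWithRetry(
  chain: { call: (values: Record<string, unknown>) => Promise<unknown> },
  question: string,
  chatHistory: string,
  maxAttempts = 3,
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // ConversationalRetrievalQAChain expects `question` and `chat_history` inputs.
      return await chain.call({ question, chat_history: chatHistory });
    } catch (err) {
      if (attempt === maxAttempts) throw err; // out of retries, let the caller handle it
      const delayMs = 1000 * 2 ** (attempt - 1); // 1s, 2s, 4s, ...
      console.warn(`API call failed (attempt ${attempt}), retrying in ${delayMs} ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}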