Gemini live today in Bard

In all fairness to Google Gemini Pro, I finally tried it out today. I asked it how to format the JSON request object for calls to its API, and it failed miserably, hallucinating the format.

However, using the API, I asked it this question with the exact same parameters I used with gpt-3.5-turbo-16k: Gpt-3.5-turbo-16k api not reading context documents

It answered it as gpt-4 always did. It is the only test I’ve performed so far, but it is encouraging.

Addendum: As I continue testing gemini-pro, I find that, at least in my environment and comparing apples to apples, it so far does not even surpass gpt-3.5-turbo-16k, at least in its ability to perform as a RAG LLM. Its task is to read the returned context documents and respond to the question posed.

This is typical of the results I have seen so far: the answer is clearly in the first 2 or 3 documents in every response, yet Gemini Pro consistently fails to either read or comprehend them.

Now, a mitigating factor may be the structure of the Gemini Pro API request body, which is essentially this:

// Define the request data as an associative array
$requestData = [
    'contents' => [
        [
            'parts' => [
                [
                    'text' => $text
                ]
            ]
        ]
    ]
];
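
For reference, here is a minimal sketch of how that body can be sent with PHP cURL. The endpoint URL, the GEMINI_API_KEY environment variable, and the response path to the generated text are my assumptions, so adjust them to your own setup:

// Minimal sketch: endpoint and env var name are assumptions; adjust as needed
$apiKey   = getenv('GEMINI_API_KEY');
$endpoint = 'https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=' . $apiKey;

$ch = curl_init($endpoint);
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
    CURLOPT_POSTFIELDS     => json_encode($requestData),
    CURLOPT_RETURNTRANSFER => true,
]);

$response = curl_exec($ch);
curl_close($ch);

// The generated text appears to come back under candidates[0].content.parts[0].text
$decoded = json_decode($response, true);
echo $decoded['candidates'][0]['content']['parts'][0]['text'] ?? '';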

Compare that to OpenAI’s Chat Completion API:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      },
      {
        "role": "assistant",
        "content": "The Los Angeles Dodgers won the World Series in 2020."
      },
      {
        "role": "user",
        "content": "Where was it played?"
      }
    ]
  }'

OpenAI’s API separates the roles, whereas Google’s API forces you to put everything into the single “text” element, which could get a little confusing if you are sending system instructions, user instructions, and context documents all at once.
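
To make that concrete, here is a rough sketch of what a RAG-style prompt ends up looking like in each case ($contextDocs and $question are just placeholder variables):

// Gemini Pro: system instruction, context documents and question all
// get flattened into the one 'text' part ($contextDocs and $question are placeholders)
$text = "You are a helpful assistant. Answer only from the context below.\n\n"
      . "CONTEXT:\n" . implode("\n---\n", $contextDocs) . "\n\n"
      . "QUESTION:\n" . $question;

$requestData = [
    'contents' => [
        ['parts' => [['text' => $text]]]
    ]
];

// OpenAI: the same pieces can stay in separate role messages
$messages = [
    ['role' => 'system', 'content' => 'You are a helpful assistant. Answer only from the provided context.'],
    ['role' => 'user',   'content' => "CONTEXT:\n" . implode("\n---\n", $contextDocs) . "\n\nQUESTION:\n" . $question],
];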

I am really eager to hear about other people’s experiences with this new API.