[Embeddings.create] Improve InvalidRequestError message: "['', 'a'] is not valid under any of the given schemas - 'input'" for large arrays

roy-pstr · February 5, 2023, 4:42pm

Hi,
The error message returned from embedding.create does not point to the exact element in the input array that is not valid. This is really annoying when you pass a large array and need to debug which element is not valid.

My suggestion here is to point to the specific element in the input array that is not valid instead of returning the whole array.

E.g.
Running this code in python:

from openai.embeddings_utils import get_embeddings
get_embeddings(['','a'])

raises the following error:
ERROR: openai.error.InvalidRequestError: ['', 'a'] is not valid under any of the given schemas - 'input'

A better error message in this case would be:

`ERROR: openai.error.InvalidRequestError: [''] is not valid under any of the given schemas - 'input'`

Thanks,
Roy.

ruby_coder · February 6, 2023, 3:20am

Yes, the API is officially a beta and your feedback is very important. Thank you @roy-pstr

Regarding your embedding prompt, I tried two versions and both worked for me. Here is an incomplete snapshot (not showing the entire 1024 long vector, in the interest of saving space, haha ):

Prompt: ‘’,‘a’

Prompt: [‘’,‘a’]

HTH

roy-pstr · February 6, 2023, 10:30pm

from openai.embeddings_utils import get_embeddings
get_embeddings(['','a'])

raises the following error:

openai.error.InvalidRequestError: ['', 'a'] is not valid under any of the given schemas - 'input'

version:
openai 0.26.4

It is worth mentioning that the default engine is used here: “text-similarity-davinci-001”

Also, make sure the input you are using is a list of prompts, not a single prompt that is the string form of a list.

Anyway, I’m 100% sure that this raises an error, and my suggestion is to point to the element in the array that causes it.
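Until the API message names the offending element, a local pre-check can do it on the client side. The following is only a minimal sketch: `find_invalid_inputs` is a hypothetical helper (not part of the openai package), and it assumes that empty strings and non-string items are the problem cases, which is what this thread reports. The API may reject other inputs as well.

```python
from typing import List, Tuple


def find_invalid_inputs(texts: List[object]) -> List[Tuple[int, str]]:
    """Return (index, reason) pairs for elements likely to be rejected.

    Assumption: only empty strings and non-string items are checked,
    based on the reports in this thread.
    """
    problems = []
    for i, item in enumerate(texts):
        if not isinstance(item, str):
            problems.append((i, f"not a string: {type(item).__name__}"))
        elif item == "":
            problems.append((i, "empty string"))
    return problems


print(find_invalid_inputs(["", "a"]))  # [(0, 'empty string')]
```

Running this before calling get_embeddings gives the index of each suspicious element instead of a dump of the whole array.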

Best,
Roy.

robustness · February 9, 2023, 11:38pm

I encountered the same error.

ahmed.elashry · March 19, 2023, 1:41am

I had the same error and couldn’t figure out what was wrong, as I have a large volume of data. Strangely, when I worked without batching (passing only a list of length 1 each time), it worked.
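With a large volume of data, one way to narrow down a failing batch is to split it in half repeatedly. This is a rough sketch, not a confirmed recipe: `find_failing_indices` and `embed_fn` are hypothetical names, where `embed_fn` stands for whatever function wraps your real embedding call. `fake_embed` below only simulates a rejection so the example runs offline.

```python
from typing import Callable, List, Sequence


def find_failing_indices(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], object],
    offset: int = 0,
) -> List[int]:
    """Split a batch recursively to locate the elements that make embed_fn raise."""
    if not texts:
        return []
    try:
        embed_fn(list(texts))
        return []  # the whole batch was accepted
    except Exception:
        # In real use, narrow this to the API's invalid-request error
        # (openai.error.InvalidRequestError in the tracebacks in this thread),
        # so that rate limits or network errors are not mistaken for bad input.
        pass
    if len(texts) == 1:
        return [offset]
    mid = len(texts) // 2
    left = find_failing_indices(texts[:mid], embed_fn, offset)
    right = find_failing_indices(texts[mid:], embed_fn, offset + mid)
    if not left and not right:
        # The batch fails only as a whole; report every index in it.
        return list(range(offset, offset + len(texts)))
    return left + right


def fake_embed(batch: List[str]) -> List[List[float]]:
    # Stand-in for the real API call: rejects any batch containing "".
    if any(t == "" for t in batch):
        raise ValueError("invalid input")
    return [[0.0] for _ in batch]


print(find_failing_indices(["a", "", "b", ""], fake_embed))  # [1, 3]
```

Each level of splitting costs extra requests, so this is meant for debugging a failing batch once, not for normal operation.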

siddhant.saurabh · May 29, 2023, 7:22am

I am facing this error too:

error_code=None error_message="[] is not valid under any of the given schemas - 'input'" error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False

error_trace

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/llama_index/embeddings/openai.py", line 149, in get_embeddings
    data = openai.Embedding.create(input=list_of_text, model=engine, **kwargs).data
  File "/usr/local/lib/python3.8/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.8/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.8/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.8/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: [] is not valid under any of the given schemas - 'input'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/src/chatbot/query_gpt.py", line 249, in get_answer
    context_answer = self.call_pinecone_index(request)
  File "/app/src/chatbot/query_gpt.py", line 217, in call_pinecone_index
    index_response = custom_index.query(final_query)
  File "/usr/local/lib/python3.8/site-packages/llama_index/indices/query/base.py", line 20, in query
    return self._query(str_or_query_bundle)
  File "/usr/local/lib/python3.8/site-packages/llama_index/query_engine/retriever_query_engine.py", line 145, in _query
    response = self._response_synthesizer.synthesize(
  File "/usr/local/lib/python3.8/site-packages/llama_index/indices/query/response_synthesis.py", line 158, in synthesize
    text = self._optimizer.optimize(query_bundle, text)
  File "/usr/local/lib/python3.8/site-packages/llama_index/optimization/optimizer.py", line 78, in optimize
    text_embeddings = self.embed_model._get_text_embeddings(split_text)
  File "/usr/local/lib/python3.8/site-packages/llama_index/embeddings/openai.py", line 253, in _get_text_embeddings
    return get_embeddings(
  File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 326, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7fd6fc0a5c10 state=finished raised InvalidRequestError>]

Any solution?

mustafa · August 14, 2023, 3:49am

Simply don’t enter any empty string. Then it will work inshaAllah.
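For anyone who wants to apply this workaround without losing track of which embedding belongs to which input, here is a minimal sketch. `embed_skipping_empty` and `embed_fn` are hypothetical names: `embed_fn` stands for your own wrapper around the real embedding call, and `fake_embed` is only a stand-in so the example runs offline. It also avoids sending an empty list, which matches the "[] is not valid" error posted above.

```python
from typing import Callable, List, Optional, Sequence


def embed_skipping_empty(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
) -> List[Optional[List[float]]]:
    """Embed only the non-empty strings; keep None where an input was skipped."""
    keep = [i for i, t in enumerate(texts) if t != ""]
    results: List[Optional[List[float]]] = [None] * len(texts)
    if not keep:
        return results  # nothing to send; avoid requesting an empty list
    vectors = embed_fn([texts[i] for i in keep])
    for i, vec in zip(keep, vectors):
        results[i] = vec  # put each vector back at its original position
    return results


def fake_embed(batch: List[str]) -> List[List[float]]:
    # Stand-in for the real API call: one-dimensional "embedding" = text length.
    return [[float(len(t))] for t in batch]


print(embed_skipping_empty(["", "a", "bc"], fake_embed))  # [None, [1.0], [2.0]]
```

The output list has the same length as the input, so downstream code can decide what to do with the None entries.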

roy-pstr · August 14, 2023, 4:17pm

“Simply don’t write code with bugs then you won’t get errors”…

mustafa · August 14, 2023, 4:49pm

It’s obvious that there is a bug, but I am explaining how to work around it. It’s worth saying because I ran into this error in a situation where it was hard to realize that empty strings were the cause. So some people may find it useful.

Have a good day
