Summary
When the Vector Stores Search API is called with rewrite_query=true, the query rewriting feature mixes Dutch and German, producing nonsensical hybrid queries that break semantic search.
Environment
- API: OpenAI Vector Stores Search API
- Endpoint: POST /v1/vector_stores/{vector_store_id}/search
- Parameter: rewrite_query=true
- Language: Dutch (nl-NL)
Bug Description
The query rewriting feature translates portions of the Dutch text into German while retaining other Dutch words, creating hybrid queries that are valid in neither language.
Expected Behavior
When provided with a Dutch query, the system should either:
- Keep the query in Dutch for optimal semantic search
- Translate the entire query to English if optimization requires it
- At minimum, maintain linguistic consistency within a single language
Actual Behavior
The query rewriting produces a mixture of Dutch and German that is grammatically incorrect and semantically confusing in both languages.
Reproduction
Original Query (Dutch)
Wie fietst of loopt vaker?
Translation: “Who cycles or walks more often?”
Rewritten Query (Incorrect Dutch-German Hybrid)
Wer fietst vaker dan loopt?
Issues:
- “Wer” is German (should be “Wie” in Dutch)
- “fietst” is Dutch (correct)
- “vaker dan” is Dutch (correct)
- “loopt” is Dutch (correct)
- The grammar is broken: a German question word is combined with Dutch verb conjugations
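To make the difference between the two queries easy to see at a glance, here is a minimal sketch that compares them token by token. It does not call the API and uses only the two strings quoted above; the helper names `tokens` and `token_diff` are made up for this illustration.

```python
import re


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def token_diff(original: str, rewritten: str) -> tuple[list[str], list[str]]:
    """Return (introduced, dropped): tokens only in the rewrite, and tokens only in the original."""
    before, after = tokens(original), tokens(rewritten)
    introduced = [t for t in after if t not in before]
    dropped = [t for t in before if t not in after]
    return introduced, dropped


# The two queries from this report (hypothetical helper, real strings).
introduced, dropped = token_diff(
    "Wie fietst of loopt vaker?",
    "Wer fietst vaker dan loopt?",
)
print("Introduced by the rewrite:", introduced)
print("Dropped by the rewrite:", dropped)
```

For this pair, the rewrite introduces “wer” and “dan” and drops “wie” and “of”. Besides the German question word, replacing “of” (“or”) with “dan” (“than”) also appears to shift the meaning toward “Who cycles more often than walks?”, which is not what the original query asks.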
Code to Reproduce
from openai import OpenAI

client = OpenAI()

search_result = client.vector_stores.search(
    vector_store_id="vs_xxx",  # Your vector store ID
    query="Wie fietst of loopt vaker?",
    max_num_results=10,
    rewrite_query=True,
)

print("Original query: Wie fietst of loopt vaker?")
print(f"Rewritten query: {search_result.search_query}")
# Expected: Dutch query or English translation
# Actual: "Wer fietst vaker dan loopt?" (German-Dutch hybrid)
If any more information is needed to reproduce or fix this, let us know.