It’s going to support it, but the question is: how well?
Some example code: Hebrew input, English translations, and a near miss.
from openai import OpenAI
import numpy as np

client = OpenAI()

texts = [
    "אל תלטף את הדורבן.",   # "Don't pet the porcupine."
    "אני אוהב/ת אייפון!",    # "I love iPhone!" (with both gendered verb forms)
    "Don't pet the porcupine.",
    "I love iPhone!",
    "Avoid the platypus",
]

for model in ["text-embedding-3-small", "text-embedding-3-large"]:
    try:
        out = client.embeddings.create(input=texts, model=model)
    except Exception as e:
        print(f"ERROR {e}")
        continue
    print("\n---", model)
    array = np.array([item.embedding for item in out.data])
    for compi, comp in enumerate(texts[:2]):
        print("====", compi, comp, "====")
        for t, vec in zip(texts, array):
            # The API returns unit-length vectors, so the dot product
            # is the cosine similarity.
            print(f"{t}: {np.dot(array[compi], vec):.5f}")
Running this gives us some data points:
--- text-embedding-3-small
==== 0 אל תלטף את הדורבן. ====
אל תלטף את הדורבן.: 1.00000
אני אוהב/ת אייפון!: 0.26534
Don't pet the porcupine.: 0.25681
I love iPhone!: 0.08965
Avoid the platypus: 0.25611
==== 1 אני אוהב/ת אייפון! ====
אל תלטף את הדורבן.: 0.26534
אני אוהב/ת אייפון!: 1.00000
Don't pet the porcupine.: 0.07188
I love iPhone!: 0.62643
Avoid the platypus: 0.02311
--- text-embedding-3-large
==== 0 אל תלטף את הדורבן. ====
אל תלטף את הדורבן.: 1.00000
אני אוהב/ת אייפון!: 0.30880
Don't pet the porcupine.: 0.28773
I love iPhone!: 0.01331
Avoid the platypus: 0.20583
==== 1 אני אוהב/ת אייפון! ====
אל תלטף את הדורבן.: 0.30880
אני אוהב/ת אייפון!: 1.00000
Don't pet the porcupine.: 0.01012
I love iPhone!: 0.58528
Avoid the platypus: 0.04751
Analysis:
3-small barely distinguishes the English porcupine translation (0.25681) from the unrelated platypus sentence (0.25611) when comparing against the Hebrew original.
3-large does much better (0.28773 vs. 0.20583).
Both models score a same-language sentence about a different subject above the direct translation: for 3-large, the Hebrew iPhone sentence gets 0.30880 against the Hebrew porcupine sentence, beating the English translation’s 0.28773. This does not happen when comparing languages that share the Latin script.
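The dot products in the script work as similarity scores only because the API returns unit-length vectors. A minimal sketch of an explicit cosine, which is scale-invariant and safe for arbitrary vectors (pure NumPy, no API call; the toy vectors are my own):

```python
import numpy as np

def cosine(a, b):
    # Explicit cosine similarity: normalize by both vector lengths.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])
print(round(cosine(a, b), 5))  # 0.96

# After unit-normalizing, a plain dot product gives the same number,
# which is why the script above can skip the division.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
print(round(float(np.dot(a_unit, b_unit)), 5))  # 0.96
```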
I skipped an all-Hebrew analysis, since I would not understand the results, and few readers would either. Out of curiosity, drop your own native-language texts into the quick script above; then run your application’s texts through the same embedding step.
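If you try more than a handful of texts, printing pair by pair gets tedious. A sketch of the full pairwise similarity matrix in one matmul, shown here with random unit vectors as a stand-in for the embeddings array so it runs offline:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the (n_texts, n_dims) embeddings array from the script above.
array = rng.normal(size=(5, 8))
array /= np.linalg.norm(array, axis=1, keepdims=True)  # unit-normalize rows

# Rows are unit-length, so array @ array.T is the cosine-similarity matrix:
# sim[i, j] compares text i with text j; the diagonal is all 1.0.
sim = array @ array.T
print(np.round(sim, 5))
```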