I find this very interesting! But it's also a more complex topic, one that requires specialized knowledge of some fairly advanced math to answer properly.
If we go back to the example from your second question, it might be a bit easier to understand:
Here's some Python spaghetti I put together based on your example:
import openai
import unicodedata
import random

openai.api_key = 'your-api-key'

def unicode_case(symbol):
    # Ground truth: classify the symbol with the unicodedata library
    category = unicodedata.category(symbol)
    if category == 'Ll':
        return 'lower'
    elif category == 'Lu':
        return 'upper'
    else:
        return 'unknown'

def unicode_case_openai(symbol):
    # Ask the model for the symbol's Unicode category and parse its answer
    response = openai.Completion.create(
        engine="text-davinci-002",
        prompt=f"What is the Unicode category of the symbol '{symbol}'?",
        max_tokens=100
    )
    # Normalize to lowercase so answers like "Lowercase Letter (Ll)" still match
    response_text = response['choices'][0]['text'].strip().lower()
    if 'lowercase letter' in response_text:
        return 'lower'
    elif 'uppercase letter' in response_text:
        return 'upper'
    else:
        return 'unknown'

def generate_symbols(n):
    # Draw n random characters from the first 10,000 code points
    return [chr(random.randint(0, 10000)) for _ in range(n)]

# Generate a test set of 100 symbols
symbols = generate_symbols(100)

# Compare the library's answer with the model's answer for each symbol
total = len(symbols)
correct = 0
for symbol in symbols:
    expected = unicode_case(symbol)
    actual = unicode_case_openai(symbol)
    if expected == actual:
        correct += 1

print(f'Correct: {correct}/{total} ({correct/total*100:.2f}%)')
This script generates a list of 100 random Unicode characters and, for each symbol, compares the case (upper, lower, or unknown) as determined by Python's unicodedata library with the case determined by an OpenAI API call. The percentage of matches between the two methods is then calculated and printed. (Note: this example is not a real eval, just a script that captures the essence of your request in a more verbose form.)
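If you want to sanity-check the ground-truth side, you can poke at unicodedata.category() directly; the two-letter codes it returns are what the script maps onto 'lower', 'upper', and 'unknown':

import unicodedata

# 'Ll' = lowercase letter, 'Lu' = uppercase letter; everything else maps to 'unknown'
print(unicodedata.category('a'))  # Ll
print(unicodedata.category('Δ'))  # Lu
print(unicodedata.category('1'))  # Nd (decimal digit, so 'unknown' in our scheme)
print(unicodedata.category(' '))  # Zs (space separator, so 'unknown' in our scheme)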
It looks fine on the surface, but there is one issue here:
def generate_symbols(n):
    # Draw n random characters from the first 10,000 code points
    return [chr(random.randint(0, 10000)) for _ in range(n)]

# Generate a test set of 100 symbols
symbols = generate_symbols(100)
The issue is that generate_symbols() draws a different random sample every time it runs, so results from separate runs aren't comparable. We can make the test repeatable by generating the symbols once, outside the script, and hard-coding the resulting list:
# Test set of symbols, generated once and hard-coded
symbols = ['a', 'A', 'b', 'B', 'α', 'Α', '1', '!', ' ', 'ζ', 'Δ']  # ...and so on
When this list replaces the generate_symbols() call in the original script, every run tests exactly the same symbols.
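An alternative, if you'd rather keep generating the symbols inside the script, is to seed the random number generator. This is just a sketch of that approach (the seed value 42 is arbitrary):

import random

random.seed(42)  # fix the seed so every run draws the same "random" symbols

def generate_symbols(n):
    return [chr(random.randint(0, 10000)) for _ in range(n)]

symbols = generate_symbols(100)  # identical list on every run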
I’m leaning towards yes, but you have to be very careful about repeatability.
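Repeatability cuts both ways, by the way: the model's side of the comparison is also stochastic. With the same old Completion API used in the script above, you can at least reduce the sampling randomness by setting temperature to 0; it doesn't make the model fully deterministic, but it helps:

response = openai.Completion.create(
    engine="text-davinci-002",
    prompt=f"What is the Unicode category of the symbol '{symbol}'?",
    max_tokens=100,
    temperature=0  # greedy decoding: reduces run-to-run variation in the answer
)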
I hope this helps answer your question!