UnicodeEncodeError when using fine-tuning

I keep getting this error whenever I run an openai command. It happens with both of these commands: "openai api fine_tunes.list" and "openai api fine_tunes.create".

Please help.

yafengluo@YadeMacBook-Air ~ % openai api fine_tunes.create -t "Desktop/gpt/25night15_prepared.jsonl" -m davinci --suffix "25night"
Traceback (most recent call last):
  File "/usr/local/bin/openai", line 8, in <module>
    sys.exit(main())
  File "/Library/Python/3.7/site-packages/openai/_openai_scripts.py", line 63, in main
    args.func(args)
  File "/Library/Python/3.7/site-packages/openai/cli.py", line 369, in create
    args.training_file, args.check_if_files_exist
  File "/Library/Python/3.7/site-packages/openai/cli.py", line 345, in _get_or_upload
    openai.File.retrieve(file)
  File "/Library/Python/3.7/site-packages/openai/api_resources/abstract/api_resource.py", line 20, in retrieve
    instance.refresh(request_id=request_id, request_timeout=request_timeout)
  File "/Library/Python/3.7/site-packages/openai/api_resources/abstract/api_resource.py", line 36, in refresh
    request_timeout=request_timeout,
  File "/Library/Python/3.7/site-packages/openai/openai_object.py", line 186, in request
    request_timeout=request_timeout,
  File "/Library/Python/3.7/site-packages/openai/api_requestor.py", line 224, in request
    request_timeout=request_timeout,
  File "/Library/Python/3.7/site-packages/openai/api_requestor.py", line 523, in request_raw
    timeout=request_timeout if request_timeout else TIMEOUT_SECS,
  File "/Library/Python/3.7/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Python/3.7/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Python/3.7/site-packages/requests/adapters.py", line 499, in send
    timeout=timeout,
  File "/Library/Python/3.7/site-packages/urllib3/connectionpool.py", line 710, in urlopen
    chunked=chunked,
  File "/Library/Python/3.7/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/Library/Python/3.7/site-packages/urllib3/connection.py", line 239, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/http/client.py", line 1270, in _send_request
    self.putheader(hdr, value)
  File "/Library/Python/3.7/site-packages/urllib3/connection.py", line 224, in putheader
    _HTTPConnection.putheader(self, header, *values)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/http/client.py", line 1202, in putheader
    values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2018' in position 7: ordinal not in range(256)
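
The last frame of the traceback is in putheader, which encodes an HTTP header value as latin-1, so the offending character is in a request header and not in the training file. Here is a minimal sketch that reproduces the failure. It assumes the header involved is the "Authorization: Bearer <key>" header, and the key shown is made up for illustration:

```python
# Hypothetical API key that was pasted with a curly opening quote (U+2018) in front.
api_key = "\u2018sk-example"

# Assumption: the header being sent is the Authorization header, "Bearer <key>".
header_value = f"Bearer {api_key}"

error = None
try:
    # http.client's putheader encodes str header values with latin-1.
    header_value.encode("latin-1")
except UnicodeEncodeError as exc:
    error = exc
    print(exc)
    print("Failing position:", exc.start)
```

"Bearer " is seven characters long, so under this assumption position 7 is the first character of the key itself, which matches the position reported in the error above.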

Sounds like you have a non-ASCII character in your dataset. Are you using UTF-8?

I'm getting the exact same error at position 7, but in my case the character is '\u201c'. I am using UTF-8.

Here is the first test prompt in my file: {"prompt":"appointment time for Wanda ->","completion":" 2:00 pm"}

Very strange.

I had a similar traceback to the one you posted. I solved it by using the correct double-quote character when setting OPENAI_API_KEY: I had been using a curly quote (“) instead of a straight one ("). At first I thought my dataset was the problem, but that was a wrong guess.
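
If you want to check for this before re-running the command, something like the following might help. It is only a sketch: the helper names non_latin1_chars and strip_quotes are made up here and are not part of the openai package, and it assumes the key is read from the OPENAI_API_KEY environment variable.

```python
import os

# Quote characters that can get pasted along with a key (straight and curly).
QUOTE_CHARS = "\"'\u2018\u2019\u201c\u201d"


def non_latin1_chars(value):
    """Return (position, character) pairs that cannot be encoded as latin-1."""
    return [(i, ch) for i, ch in enumerate(value) if ord(ch) > 255]


def strip_quotes(value):
    """Remove whitespace and stray quote characters from both ends of the value."""
    return value.strip().strip(QUOTE_CHARS)


key = os.environ.get("OPENAI_API_KEY", "")
problems = non_latin1_chars(key)
if problems:
    # Only the offending characters are printed, never the key itself.
    print("Characters in OPENAI_API_KEY that cannot go in an HTTP header:", problems)
    print("Re-export the key with straight quotes, or with no quotes at all.")
else:
    print("OPENAI_API_KEY contains only latin-1 characters.")
```

If the check reports curly quotes at the start or end of the key, re-exporting the variable with plain " characters (or using strip_quotes on the value before passing it on) should be enough.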
