UnicodeEncodeError while using load_summarize_chain


I’m new to langchain. I’m trying to summarize a PDF document with langchain’s load_summarize_chain and I’m getting this error: UnicodeEncodeError: 'ascii' codec can't encode character '\u201c' in position 22: ordinal not in range(128). The document is a PDF with text in Brazilian Portuguese.

Direct access to the text through

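from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("document.pdf")  # placeholder path; PyPDFLoader needs a file path
doc = loader.load()  # list of Document objects, one per page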

works fine!

Any help?

Thanks a lot!


This sounds like a langchain/programming issue. Have you tried looking around for langchain UnicodeEncodeError?

Sometimes libs are out of whack.
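
For what it's worth, the error message itself can be reproduced without langchain, which may help narrow things down. The sketch below uses only the Python standard library. The sample string and the helper names (find_non_ascii, ascii_encode_error) are made up for illustration, and it does not show where inside langchain the ASCII encode actually happens.

```python
def find_non_ascii(text):
    """Return (index, character) pairs for every character outside ASCII."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


def ascii_encode_error(text):
    """Return the UnicodeEncodeError raised by an ASCII encode, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


# Made-up sample containing the curly quotes U+201C and U+201D.
sample = "Resumo: \u201cdocumento\u201d"

err = ascii_encode_error(sample)
if err is not None:
    # err.start is the index of the first character that could not be encoded.
    print("ASCII encode failed at position", err.start)

# List every offending character so it can be traced back to its source.
for index, ch in find_non_ascii(sample):
    print(index, hex(ord(ch)))

# The same text encodes without error as UTF-8.
utf8_bytes = sample.encode("utf-8")
```

Running find_non_ascii on whatever string is being passed along (document text, prompt, or even a pasted API key) is a quick way to check whether a curly quote has slipped in somewhere it shouldn't be.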

Yes, it’s about libs and about API credits too.

The langchain version issue was resolved by switching to these imports:
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate

The API credits issue was resolved as always (lol).