Hi, I am currently building a Korean graph RAG.
For efficiency, I am using the Ray framework.
However, the triplet extraction step that builds the graph DB has been stuck for several hours after a 502 error.
I already added code to handle a 502 error when it occurs, so I don't understand why it has been stuck for several hours.
Even if a 502 error appears, shouldn't it resolve itself if I wait a few minutes?
The code below is only the triplet extraction part of my code.
import ray
import requests

@ray.remote
def extract_korean_triplet(text):
    import time
    api_key = ' … '
    chat_gpt_url = "… "
    print('start extracting triplet...')
    SYS_PROMPT = (
        " … "
        # Korean: "The output format contains term pairs and the relations between them, as follows:"
        "출력 형식은 용어 쌍 및 그들 간의 관계를 포함하며 다음과 같습니다: \n"
        "[\n"
        " {\n"
        ' "node_1": "A concept from extracted ontology", it is noun \n'
        ' "node_2": "A related concept from extracted ontology", it is noun \n'
        ' "edge": "Key verb phrase that explains the relationship between node1 and node2 "\n'
        " }, {...}\n"
        "]"
    )
    # The input is one or two sentences
    input_text = f" … : '{text}'"
    #print(input_text)
    messages = [
        {"role": "system", "content": SYS_PROMPT},
        {"role": "user", "content": input_text}
    ]
    headers = {"Authorization": f"Bearer {api_key}"}
    call_data_ = {"model": "gpt-3.5-turbo", "messages": messages}
    print(text)
    # Use chat_gpt_url here; the name `url` was never defined in this function
    data = requests.post(chat_gpt_url, headers=headers, json=call_data_).json()
    print('decode', data)
    #print('type', type(data))
    # data is a Python dict; when a 502 error occurs, it contains an 'error' key
    if 'error' in data:
        # Wait one minute and retry once; a second failure raises KeyError below
        time.sleep(60)
        print(text)
        data_ = requests.post(chat_gpt_url, headers=headers, json=call_data_).json()
        return_triplet = data_['choices'][0]['message']['content']
    else:
        return_triplet = data['choices'][0]['message']['content']
    return return_triplet

ray.init()
triplets_list = ray.get([extract_korean_triplet.remote(doc) for doc in test_set])
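One thing I noticed while writing this up: `requests.post` has no default timeout, so a call without a `timeout` argument can block indefinitely if the server never answers, which might explain a hang of several hours. Below is a minimal sketch of a bounded retry with exponential backoff that I am considering. The names `RetryableError` and `call_with_retry` are hypothetical helpers, not part of Ray or requests, and the wiring to `requests.post` is shown only in comments as an assumption about how it would be used.

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. HTTP 502 or a timeout)."""


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is exhausted.

    send is a zero-argument callable that returns the parsed response
    or raises RetryableError. Delays double after each failed attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except RetryableError:
            if attempt == max_attempts:
                # Give up so the task fails visibly instead of waiting forever
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Assumed wiring inside extract_korean_triplet (not executed here):
#
# def send():
#     try:
#         resp = requests.post(chat_gpt_url, headers=headers,
#                              json=call_data_, timeout=30)
#     except requests.exceptions.Timeout as exc:
#         raise RetryableError(str(exc))
#     if resp.status_code >= 500:
#         raise RetryableError(f"HTTP {resp.status_code}")
#     return resp.json()
#
# data = call_with_retry(send)
```

With this shape, a request that never returns is cut off by `timeout=30`, and a persistent 502 ends in an exception after a fixed number of attempts rather than a silent stall.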
Thanks!