I am trying to write R code that reads in a PDF, uses ChatGPT to make sense of the often messy text, and then outputs the result as a table or data frame.
I know this is possible because if I copy and paste the text from a PDF into the ChatGPT interface and prompt it to “convert to table”, it does so perfectly.
This is my current code:
library(pdftools)
setwd("")
pdf_text <- pdf_text("1pagepdffile.pdf")
pdf_text
pdf_text <- paste(pdf_text, collapse = " ") # Collapse multiple pages into a single string
pdf_text
library(httr)
# API call to GPT-4
response <- POST(
"model url",
add_headers("Authorization" = "Bearer APIKEY", "Content-Type" = "application/json"),
body = list(
prompt = paste("Please format the following data as a table:", pdf_text),
max_tokens = 500 # You can adjust this based on your needs
),
encode = "json"
)
# Parse the response to get the text output
response_content <- content(response, "parsed")
response_text <- response_content$choices[[1]]$text
# Print the response or write to a file
cat(response_text)
Any help would be appreciated.
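For comparison, here is a minimal sketch of how the request might look against a chat-style endpoint. It makes several assumptions that are not in the code above: the OpenAI chat completions URL, the model name "gpt-4", an API key stored in an `OPENAI_API_KEY` environment variable, and a prompt that asks for CSV so the reply can be read with `read.csv`. The helper names `build_chat_body`, `csv_reply_to_df`, and `ask_for_table` are hypothetical. A chat endpoint expects a `model` and a `messages` list rather than a bare `prompt`, and returns the reply under `message$content` rather than `text`.

```r
# Build the JSON body for a chat-style request (hypothetical helper).
# The model name is an assumption; substitute whichever model you have access to.
build_chat_body <- function(text, model = "gpt-4", max_tokens = 500) {
  list(
    model = model,
    messages = list(
      list(role = "system",
           content = "Convert the user's text to a table. Reply with CSV only, including a header row."),
      list(role = "user", content = text)
    ),
    max_tokens = max_tokens
  )
}

# Turn a CSV reply into a data frame (hypothetical helper).
# This only works if the model really did reply with plain CSV.
csv_reply_to_df <- function(reply) {
  read.csv(text = reply, stringsAsFactors = FALSE, strip.white = TRUE)
}

# Send the text and return the model's reply as a string (hypothetical helper).
# The URL and the environment variable name are assumptions.
ask_for_table <- function(text, api_key = Sys.getenv("OPENAI_API_KEY")) {
  response <- httr::POST(
    "https://api.openai.com/v1/chat/completions",
    httr::add_headers(Authorization = paste("Bearer", api_key)),
    body = build_chat_body(text),
    encode = "json"  # httr serialises the list and sets the JSON content type
  )
  httr::stop_for_status(response)  # fail loudly on 4xx/5xx instead of returning NULL later
  parsed <- httr::content(response, "parsed")
  parsed$choices[[1]]$message$content
}

# Usage, with pdf_text being the collapsed string from the code above:
# reply <- ask_for_table(pdf_text)
# df <- csv_reply_to_df(reply)
```

Asking for CSV rather than a "table" is a design choice: it gives R something it can parse directly into a data frame. If the reply is cut off partway through, `max_tokens` is probably too low for the size of the table.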