I am trying to write R code that reads in a PDF, uses ChatGPT to make sense of the often messy text, and then outputs the result as a table or data frame.
I know this is possible because if I copy and paste the text from a PDF into the ChatGPT interface and prompt it to “convert to table”, it does it perfectly.
This is currently my code:
library(pdftools)
library(httr)

setwd("")

# Read the PDF; pdf_text() returns one string per page
pages <- pdf_text("1pagepdffile.pdf")
pages

# Collapse multiple pages into a single string
pdf_string <- paste(pages, collapse = " ")
pdf_string

# API call to GPT-4
response <- POST(
  "model url",
  add_headers("Authorization" = "Bearer APIKEY", "Content-Type" = "application/json"),
  body = list(
    prompt = paste("Please format the following data as a table:", pdf_string),
    max_tokens = 500  # You can adjust this based on your needs
  ),
  encode = "json"
)

# Parse the response to get the text output
response_content <- content(response, "parsed")
# choices is a list, so take its first element with [[1]]
response_text <- response_content$choices[[1]]$text

# Print the response or write to a file
cat(response_text)
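For the last step (getting from the reply text to a data frame), here is a minimal sketch of what I have in mind. It assumes the model replies with a Markdown-style pipe table, which is only my guess at the output format. `parse_pipe_table` is a hypothetical helper name I made up, not a function from any package.

```r
# Hypothetical helper: convert a Markdown-style pipe table (plain text)
# into a data frame. Assumes the first table row is the header.
parse_pipe_table <- function(txt) {
  lines <- trimws(strsplit(txt, "\n", fixed = TRUE)[[1]])
  # Keep only lines that start and end with a pipe
  lines <- lines[grepl("^\\|.*\\|$", lines)]
  # Drop the separator row, e.g. |---|:---:|
  lines <- lines[!grepl("^[|: -]+$", lines)]
  if (length(lines) < 2) stop("no table rows found")

  # Strip the outer pipes, then split each row into trimmed cells
  body <- sub("\\|$", "", sub("^\\|", "", lines))
  cells <- lapply(strsplit(body, "|", fixed = TRUE), trimws)

  header <- cells[[1]]
  # Pad or truncate every data row to the header width
  rows <- lapply(cells[-1], function(r) {
    length(r) <- length(header)
    r
  })

  df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(df) <- header
  df
}

# Made-up reply text, for illustration only
example_reply <- "Here is the table:\n| Name | Qty |\n|------|-----|\n| Apple | 3 |\n| Pear | 10 |"
example_df <- parse_pipe_table(example_reply)
print(example_df)
```

All columns come back as character, so numeric columns would still need something like `as.numeric()` afterwards. In my script I would call it as `parse_pipe_table(response_text)`.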
Any help would be appreciated.