How can the OpenAI API clean CSV data?

In this case, I don’t think you can just throw 2 lists and ask for a result.

It works best, like I mentioned, by providing a list on company names you want, show some examples of incorrect and correct versions, and ask it to give it a guess. One prompt per row of data to classify , only the necessary columns or at most just a few.
It probably won’t work if you have thousands of possible companies.

It is not a trivial task to guide in a simple post though. You are asking for a complete system solution.

I suggest first familiarizing with how to use the API, study structured outputs, and use chatgpt to help you prepare an outline of what you need to learn to implement it.

1 Like