Text classification and text extraction best practices

meri.davtyan · July 2, 2024, 11:53am

Hi there!
I’ve just started to learn how to work with api and currently i’m working on 2 projects(one is text extraction and the other is classification)
So in first project i have medical data that includes transcriptions column, from which api should extract the age of the patient, treatment and ICD code. Now the question is how is it better to give my data to API? Now i’ve just extracted the transcription, converted to a list and gave in prompt, but i’m sure there are better ways to do this. Is it better to give it in one list all together or one transcription at the time? (which way is more accurate, fast and/or cost efficient).
In the second mini project i have list with names that api should classify by gender. Here i have same question. Also since those are singe turn requests do i need to use both user and system roles or can just give everything in one?
I’ll be glad to get any best practices and advise that will make this whole topic clearer for me. Thank you so much for your time!)

kristinholt305 · August 7, 2024, 12:02pm

You’re diving into some exciting projects with APIs. I’d recommend feeding the transcriptions to the API one at a time for your first project. This approach typically leads to more accurate results since the API can process each transcription individually without getting confused by multiple entries simultaneously. It might be slower than batching them together, but the improved accuracy often outweighs the time cost.

meri.davtyan · August 15, 2024, 10:27am

Thank you so much for a recommendation! Thats exactly how i did it and the result was pretty accurate. I also used api in some other more complicated projects and this method was the best working one

Topic		Replies	Views
Is it possible to give a answer for each line? API	3	188	May 31, 2024
How to make classification and information extraction task cost-effective API	4	1270	March 20, 2024
Help needed: sending segmented long document to api and create long text API	3	99	November 6, 2024
Best practice scanned PDF / What model to use? API chatgpt , plugin-development , api , gpt-4-vision	3	1172	February 19, 2025
API choice for research question API	6	243	July 4, 2024

Text classification and text extraction best practices

Related topics