Hi,
So here’s the problem I’d like to solve. I pick a random patent from Google Patents, give it to ChatGPPT, and ask it to generate claims. It does a bad job, so I thought I could fine-tune it for this task. The problem is that the input context is so large the model never sees the whole patent, and in most cases not even the full required output.
I need advice on how to design my instruction dataset for this use case, and how to handle the token limitation.
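For context, here is a rough sketch of the only workaround I can think of so far: splitting the patent description into overlapping word windows so each (chunk, claims) training example fits under the context limit. The window and overlap sizes are placeholders, and counting whitespace-separated words is only an approximation of real tokenizer counts.

```python
def chunk_words(text: str, window: int = 3000, overlap: int = 300) -> list[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Each window shares `overlap` words with the previous one so that
    claim-relevant context is not cut off at a hard boundary.
    """
    words = text.split()
    chunks = []
    step = window - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # last window already covers the tail of the text
    return chunks
```

I’m not sure this is the right approach, since each chunk would be paired with the same target claims even though most chunks don’t contain the material those claims are based on. That is part of what I’m asking about.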