Hi everyone, we’re building Elicit, a research assistant powered by GPT-3.
Our main task is a literature review task, where users ask a question and we answer it based on published research. We use GPT-3 in a lot of ways:
To rank papers based on relevance
To find the most relevant sentence from a paper’s abstract and convert it into an answer to the question
To identify how other authors have critiqued the papers, and which critiques are methodological
We also supplement it with other tools e.g. the Semantic Scholar paper corpus and other libraries for tagging papers by type.
We’re now working on full-text search & extracting key information from papers like details about interventions, outcomes, effects, etc.
We’ll probably focus on empirical questions (e.g. research involving randomized controlled trials and systematic reviews) for the next quarter but hope to help you all with more ML-relevant research soon.
It’s open & free now so feel free to try it out and share feedback. There are other tasks in the app you can explore as well.
Very nice. I am working on something quite similar. Can I ask what kind of database you are using to store the research papers? I built my website with WordPress because I didn’t have a technical team at the time and WordPress enabled me to build on my own, so I’ve got a MySQL database storing my text. Soon we’re going to move to a better database and custom site and I’m looking for suggestions for the best way to store long pieces of text. MongoDB? Also, I love your UX. Thanks for any insight. Leslie
I have an idea for you guys. It would be very interesting to customize the literature search section of papers to the background knowledge, interests, and intentions of each reader. This would remove the need to include significant but redundant material in a technical paper, although the result still might not suit every individual reader. Good luck!
Definitely interested in this and doing some early prototypes, but we expect this to be pretty hard. Our early prototypes let people do a form of “custom Q&A” over a large set of papers, e.g. figuring out what the sample population was across many randomized controlled trials.
We use the Semantic Scholar API and don’t store most of the papers ourselves right now. We have been using MongoDB but are thinking about moving to Postgres.
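For anyone curious what fetching papers on demand (rather than storing them) can look like, here is a rough sketch against the public Semantic Scholar Graph API paper search endpoint. This is only an illustration, not Elicit's actual code: the function names `build_search_url` and `fetch_papers`, the chosen fields, and the assumption that results arrive under a `data` key are mine.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public paper search endpoint of the Semantic Scholar Graph API.
SEARCH_ENDPOINT = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(query: str, fields: str = "title,abstract", limit: int = 10) -> str:
    """Build a search URL; `fields` and `limit` defaults are illustrative choices."""
    params = {"query": query, "fields": fields, "limit": limit}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def fetch_papers(query: str, limit: int = 10) -> list:
    """Fetch matching papers over the network.

    Assumes the JSON response carries the papers in a "data" list;
    returns an empty list if that key is missing.
    """
    with urlopen(build_search_url(query, limit=limit)) as response:
        payload = json.load(response)
    return payload.get("data", [])
```

Calling `fetch_papers("creatine cognition", limit=5)` would hit the network, so in practice you would probably add rate limiting, error handling, and some caching on top of this.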
I took another look at Elicit to see how it’s coming along and it looks great! Love the ability to add your own columns and how clean the interface is.
We get back an initial batch of papers from Semantic Scholar (I believe they use a primarily keyword-match-based search). Then we rerank those papers using the babbage search engine.
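The two-stage flow (keyword retrieval first, then reranking) can be sketched roughly as below. This is a minimal illustration under stated assumptions: `token_overlap_score` is a simple stand-in scorer I made up for the example, not the babbage search engine, and `rerank` is a hypothetical helper name. In a real system the scoring function would be a call to a semantic relevance model.

```python
from typing import Callable, List, Optional, Tuple


def token_overlap_score(query: str, document: str) -> float:
    """Stand-in relevance scorer (an assumption, not the babbage search engine):
    the fraction of query tokens that also appear in the document."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = token_overlap_score,
    top_k: Optional[int] = None,
) -> List[Tuple[float, str]]:
    """Score every candidate from the first-stage search and sort best-first.

    The sort is stable, so candidates with equal scores keep the order
    the first-stage search returned them in.
    """
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Example: `candidates` plays the role of the initial keyword-search batch.
candidates = [
    "Effects of sleep on memory",
    "Creatine supplementation and cognition in adults",
    "A survey of transformer models",
]
ranked = rerank("creatine cognition", candidates, top_k=2)
```

The point of splitting it this way is that the cheap first stage narrows the corpus to a small batch, so the more expensive scorer only has to run on a handful of candidates.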
Hi Jungwon, I just got your most recent newsletter with your product roadmap. I’m following you closely because what I’m doing for law is analogous to what you are doing for research papers. It might be nice to compare notes over Zoom some time. I’m a Canadian/U.S. qualified lawyer, with a research specialty. What’s neat about U.S. law, compared to Canadian, is that the rules are available in machine-ready format. It would be neat to brainstorm with you. Very few other companies, if any, are using GPT-3 for legal rules. Best, Leslie
I apologize if this question is inappropriate, but I was wondering how you implement the function that allows users to ask questions about a paper they have uploaded. As far as I know, a full paper is much longer than the model’s maximum length limit.