Foundational must read GPT/LLM papers

Recently I have been doing some work on RAG / retrieval-augmented generation.

[2307.03172] Lost in the Middle: How Language Models Use Long Contexts
[2004.04906] Dense Passage Retrieval for Open-Domain Question Answering
[2309.09117] Contrastive Decoding Improves Reasoning in Large Language Models
[2209.10063] Generate rather than Retrieve: Large Language Models are Strong Context Generators
[2304.14856] A Unified Generative Retriever for Knowledge-Intensive Language Tasks via Prompt Learning
Curious idea - [2212.02027] Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer
[2004.12832] ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Combining Embedding and Keyword Based Search for Improved Performance | by Zachariah Zhang | Medium
[2308.14963] Vector Search with OpenAI Embeddings: Lucene Is All You Need
[2212.09146] Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
[2305.15294] Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
[2205.01230] Retrieval-Enhanced Machine Learning

One big takeaway was the power of blending BM25, TF-IDF, and DPR / embedding-based retrieval. Different fusion strategies can be used, such as reciprocal rank fusion (see the sketch below). One task that is always important, though inevitably challenging, is defining evaluation criteria, especially if you intend to train your retriever.
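
A minimal sketch of reciprocal rank fusion over hypothetical ranked lists from BM25, TF-IDF, and a dense retriever. The doc IDs and the k=60 constant are illustrative assumptions, not taken from any of the papers above.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.

    Each document's fused score is sum(1 / (k + rank)) over the lists
    in which it appears (rank is 1-based).
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: three retrievers return partially overlapping top-5 lists.
bm25_hits = ["d3", "d1", "d7", "d2", "d9"]
tfidf_hits = ["d1", "d3", "d2", "d8", "d7"]
dense_hits = ["d7", "d1", "d4", "d3", "d6"]

print(reciprocal_rank_fusion([bm25_hits, tfidf_hits, dense_hits]))
# Documents ranked highly by several retrievers (d1, d3, d7) float to the top.
```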

Another takeaway is that by precomputing things like TF-IDF vectors or sentence embeddings offline, you can achieve significant speedups at query time. See the ColBERT paper above for other approaches along these lines.
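
A minimal sketch of that precompute-then-search pattern: document embeddings are computed once offline, so query time is just one encode plus a matrix product. The sentence-transformers model name and the toy corpus are assumptions for illustration.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Model choice is an assumption; any sentence-transformers model works here.
model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "BM25 is a classic lexical ranking function.",
    "Dense passage retrieval encodes queries and passages into vectors.",
    "ColBERT uses late interaction over token-level embeddings.",
]

# Offline: embed and normalize the corpus once, then store the matrix.
doc_emb = model.encode(docs, normalize_embeddings=True)  # shape (n_docs, dim)

# Online: embed only the query and take cosine similarities via a dot product.
query_emb = model.encode(["how does dense retrieval work"], normalize_embeddings=True)
scores = query_emb @ doc_emb.T
print(docs[int(np.argmax(scores))])
```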

Two of the papers above, while not particularly 'foundational', capture some key constraints of RAG quite well: "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model" and the curious "Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer". From the former it quickly becomes apparent that there is a clear tension between the two main architectural components, and the latter, as a potential and novel resolution, helps bring color to the discussion.
