What is the current best option for providing documentation to LLMs so they can use an open-source library?


There is an open-source library I am using that has a few .rst (Sphinx documentation) files in addition to clean code. I am interested in providing the package to LLMs, either just the .rst files meant for humans or the .py files themselves.

What are the best open-source options to provide either a small codebase or human-readable documentation to LLMs so they can use the tool?

What do you mean by “open-source”? This sounds more like a 20-30 line Python script…

something like this:

#!/usr/bin/env python
import argparse
from pathlib import Path

def gather_files(root, ext):
    # Recursively collect every file under root with the given extension
    return list(Path(root).rglob(f'*{ext}'))

def main():
    parser = argparse.ArgumentParser(description="Aggregate .py or .rst files for LLM ingestion")
    parser.add_argument("root", help="Root directory of the package")
    parser.add_argument("mode", choices=["code", "doc"], help='Choose "code" for .py or "doc" for .rst files')
    parser.add_argument("--outfile", default="llm_input.txt", help="Output file to aggregate content")
    args = parser.parse_args()

    ext = ".py" if args.mode == "code" else ".rst"
    files = gather_files(args.root, ext)
    
    # Concatenate all files into one text file, each prefixed with a header
    # so the LLM can tell where one file ends and the next begins
    with open(args.outfile, "w", encoding="utf-8") as out:
        for file in files:
            out.write(f"=== {file} ===\n")
            out.write(file.read_text(encoding="utf-8"))
            out.write("\n\n")
    print(f"Aggregated {len(files)} {ext} files into {args.outfile}")

if __name__ == "__main__":
    main()

You could also use a shell script, or run a script that analyses them all and writes a small summary of what each one is useful for, then feed that combined summary with every request (a rough Python sketch follows the example below), like so:

hey bot… you could use stuff from this list:

[list]
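
If it helps, here is a minimal sketch of that idea using the OpenAI Python SDK; the model name, prompt wording, and the docs/ path are placeholders, not anything the library ships with:

#!/usr/bin/env python
# Sketch: summarize each file once, then prepend the summary list to every request.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def summarize_file(path):
    text = path.read_text(encoding="utf-8")
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any cheap summarization model works
        messages=[
            {"role": "system", "content": "Summarize what this file is useful for in one or two sentences."},
            {"role": "user", "content": f"File: {path.name}\n\n{text}"},
        ],
    )
    return response.choices[0].message.content.strip()

def build_summary_prompt(root, ext):
    lines = ["hey bot… you could use stuff from this list:"]
    for path in sorted(Path(root).rglob(f"*{ext}")):
        lines.append(f"- {path}: {summarize_file(path)}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Prepend this string to every later request as a system message.
    print(build_summary_prompt("docs/", ".rst"))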

Or you go full overkill and use LangChain or LlamaIndex… or combine a Neo4j graph database with several analyzers that feed an information graph built from the .rst files and the code, together with embeddings.

And then maybe train a GNN from the graph :smile:

The problem is that the full Python files and all the RST files can’t fit into a context window, which is why I was thinking of using RAG. And code is a little different from plain text. But maybe the solution here is just thoughtful chunking of the code, so the LLM can ask for a particular piece of information, which then pulls in the relevant documentation, etc.?

So a single RST doesn’t fit into the o3-mini context of what, 200k?

Let it give you a summary and a file location, then give that to the model and tell it that it can get the full description with a tool call (a rough sketch follows the example below)…

Like so:

hey bot you can read full rst descriptions when using the rst reader tool. Here is a short summary of each file:

1.rst - contains x…
2.rst - contains y…
3.rst - contains z…
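
A minimal sketch of that “rst reader tool” with OpenAI function calling; the tool name, summaries, and file paths are made up for illustration:

import json
from pathlib import Path
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "read_rst",
        "description": "Return the full text of an .rst documentation file.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Path to the .rst file"}},
            "required": ["path"],
        },
    },
}]

messages = [
    {"role": "system", "content": (
        "You can read full rst descriptions with the read_rst tool. Here is a short summary of each file:\n"
        "1.rst - contains x…\n2.rst - contains y…\n3.rst - contains z…"
    )},
    {"role": "user", "content": "How do I configure x?"},
]

response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
msg = response.choices[0].message

# If the model asked for a file, read it and send the content back for a final answer.
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        path = json.loads(call.function.arguments)["path"]
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": Path(path).read_text(encoding="utf-8"),
        })
    response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

print(response.choices[0].message.content)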

Or tell o3 to give you a summary of an RST but only every second token haha

@edwinarbus wouldn’t that also be an idea for summarizing older chat messages, so the chats get better memory?

To efficiently manage long conversations within token limits, chat messages can be stored in a fast-access database like Redis, keeping both the full version and a compressed summary using an nth-token approach. On each follow-up request, the system sends only the summarized past messages to the model, preserving context while reducing token usage. If the user asks for specific details, the system can retrieve and send the full, original message that best matches the query, using semantic search or keyword matching on the original text (or use embeddings, just to select the best-matching full context that needs to be sent). This approach ensures efficient memory management while still allowing access to precise information when needed.
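
Here is a rough sketch of that flow; the key names and the nth-token summarize() helper are just assumptions for illustration:

import redis

r = redis.Redis(decode_responses=True)

def summarize(text):
    # Placeholder nth-token compression; in practice you might call a model instead.
    return " ".join(text.split()[::2])

def store_message(chat_id, index, text):
    # Keep both the full message and its compressed summary for each turn.
    r.hset(f"chat:{chat_id}:{index}", mapping={"full": text, "summary": summarize(text)})

def build_context(chat_id, n_messages):
    # Default context sent to the model: only the summaries of past turns.
    return [r.hget(f"chat:{chat_id}:{i}", "summary") for i in range(n_messages)]

def expand_best_match(chat_id, n_messages, query):
    # Naive keyword match on the original text; embeddings would be the better
    # way to pick which full message to send back when details are requested.
    best, best_score = "", -1
    for i in range(n_messages):
        full = r.hget(f"chat:{chat_id}:{i}", "full")
        score = sum(word in full.lower() for word in query.lower().split())
        if score > best_score:
            best, best_score = full, score
    return best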

Or here is a compressed version

To manage long within limits, messages be in fast-access like keeping full and compressed using nth-token or summarization On follow-up, system only summarized messages model, context reducing usage. user for details, system retrieve send full, message best query semantic keyword This efficient management allowing precise needed.

Or give this to the devs:

tokens = tokens[::2]

which could be applied with tiktoken and potentially reduce the API costs by 50% :smiley:
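
For completeness, roughly what that looks like with tiktoken (the encoding name is an assumption, and the decoded output is of course as garbled as the compressed paragraph above):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "To efficiently manage long conversations within token limits..."
tokens = enc.encode(text)
tokens = tokens[::2]  # drop every second token, roughly halving the token count
print(enc.decode(tokens))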