Building an AI coding assistant for the OpenAI API

I was thinking of building a quick assistant to help with writing Python code related to the OpenAI API. Instead of using the built-in retrieval tool to access the documentation and API reference, I thought it might yield better results to upload the documentation as PDF files, or to add it to a vector store and build a search tool via function calling.

Has anyone already created a similar tool to help answer questions about the OpenAI API, or is there an official way to download the documentation? Additionally, has anyone written a script to automate this, or to fetch the documentation and API reference?

I also welcome other ideas on how to improve the functionality! I thought of maybe also adding in the OpenAI Cookbook examples, but it might be better to concentrate on the documentation.


Great idea! Did you end up doing this? Would like to try it.

I made a GPT which does exactly this. It also does a bunch of other unrelated stuff so I’m not able to share it without making a privacy policy and such.

Essentially I gave it the OpenAPI spec from openai-openapi/openapi.yaml at master · openai/openai-openapi · GitHub, which it’s able to understand and parse. In its instructions I explain what the file is, tell it to ignore legacy/deprecated aspects, and include its commit hash and date for the purpose of change tracking.
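
For anyone who wants to script the fetch (per the question above), here is a rough sketch, not something I’ve hardened, that pulls the spec from the raw GitHub URL and drops operations marked deprecated before uploading it as knowledge (assumes requests and PyYAML are installed):

import requests
import yaml

# Raw URL for the spec file linked above (master branch)
SPEC_URL = "https://raw.githubusercontent.com/openai/openai-openapi/master/openapi.yaml"

raw_spec = requests.get(SPEC_URL, timeout=30).text
spec = yaml.safe_load(raw_spec)

# Drop deprecated operations so the GPT doesn't surface legacy endpoints
for path, methods in spec.get("paths", {}).items():
    for method in list(methods):
        if isinstance(methods[method], dict) and methods[method].get("deprecated"):
            del methods[method]

with open("openapi_trimmed.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(spec, f, sort_keys=False)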

It works really well: you can ask it to implement the data structures and HTTP calls for the various endpoints. The thing I need to be most particular about is complex types, like polymorphism via oneOf definitions. There’s no de facto correct answer to how these should be implemented in your code, so you need to tell it how you prefer to handle them. It’s also useful to allow it to use placeholder types so that it won’t start implementing the entire schema at once.

It’s also not likely to intuit things like the fact that you won’t need to deserialize request payloads (or serialize response payloads). This can lead to it forcing out functions which don’t make sense and aren’t needed.

It’s actually not able to browse the pages at https://platform.openai.com/docs all that well, since a lot of the information is tucked behind JavaScript. I ended up stripping out the HTML for certain stuff, like for Assistants. Even if the resulting HTML has a ton of cruft, it’s able to read it really well.
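
If you want to automate that stripping, a rough sketch using BeautifulSoup on HTML saved from the browser (the function name here is just illustrative) would be something like:

import re
from bs4 import BeautifulSoup

def strip_doc_html(saved_html_path):
    with open(saved_html_path, "r", encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")
    # Remove the cruft the model doesn't need
    for tag in soup(["script", "style", "noscript", "svg", "nav", "footer"]):
        tag.decompose()
    text = soup.get_text(separator="\n")
    # Collapse runs of blank lines
    return re.sub(r"\n{3,}", "\n\n", text).strip()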

After pasting in the HTML along with uploading the openapi.yaml to its knowledge, it’s able to give me this kind of advice and summary, which I find extremely useful.

Edit: I went and stripped out the openai stuff and put it in a new GPT for anyone curious. It has some issues reading the files from knowledge since I tried to make it aware of the entire API, but it’s functional if you are specific enough.

https://chat.openai.com/share/1c0cccfc-384d-4171-8c7e-aacb66b8d4c0


This is pretty good. Just a few more steps and you could have your agents obtain API keys through a fetch-and-screen-capture system to build your own AI interfacing with your agent, but that might break OpenAI’s rules about using your assistant or the OpenAI APIs to train other agents… so be careful not to do anything outside the rules.

Came across this randomly, just when I needed it, after bashing around cluelessly on my own.
I’m trying to develop tools for interfacing with a data API, based on a Python program I wrote and a Gradio interface to help with queries. I was hoping I could then give them to a GPT as tools to use, and further get that GPT to learn to generalize the approach to other data APIs.
Long story short: the OpenAI GPTs don’t even seem to know what OpenAI GPTs are, instead generalizing themselves to “AI models” with no knowledge of how to build themselves. I thought that if I could feed one the OAI API docs it could learn… looks like you’ve already achieved that.

Again… wanted to say thanks. Any chance you have a public GitHub repo or whatever you used to do this, or did you just feed documents to a GPT?

I have tried to implement something like this -
Technical Implementation
The system is designed to ingest, index, search, and retrieve Python code snippets efficiently while providing a chatbot interface to assist users in understanding and exploring code. Let’s break it down into key components:

1. Code Ingestion and Indexing
To enable efficient code search, we extract functions and classes from Python files and store their metadata in a database.

1.1 Extracting Function/Class Names, Code, and Docstrings
We use Python’s built-in ast module to parse files and extract:

Function/Class Names
Code Body
Docstrings (documentation within functions/classes)

Example Extraction Logic

import ast

def extract_functions_from_file(file_path):
    with open(file_path, "r", encoding="utf-8") as f:
        content = f.read()
    tree = ast.parse(content)
    functions = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            name = node.name
            start_line = node.lineno
            # end_lineno (Python 3.8+) covers multi-line final statements;
            # fall back to the last body statement's lineno on older versions
            end_line = getattr(node, "end_lineno", None) or (node.body[-1].lineno if node.body else start_line)
            function_code = "\n".join(content.splitlines()[start_line - 1:end_line])
            docstring = ast.get_docstring(node) or ""
            functions.append((name, function_code, docstring))
    return functions

Why store file path, docstring, chunk id?
file_path: Helps retrieve the original file where a function/class is defined.
docstring: Enables documentation-based search for better code understanding.
chunk_id: A unique identifier for each function (combines file name + function name).
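
Putting the extraction and this metadata together, a rough sketch of building the records to index (the directory walking and field names are illustrative) might look like:

import os

def build_records(repo_dir):
    records = []
    for root, _, files in os.walk(repo_dir):
        for file_name in files:
            if not file_name.endswith(".py"):
                continue
            file_path = os.path.join(root, file_name)
            for name, code, docstring in extract_functions_from_file(file_path):
                records.append({
                    "chunk_id": f"{file_name}:{name}",  # file name + function name
                    "title": name,
                    "content": code,
                    "docstring": docstring,
                    "file_path": file_path,
                })
    return records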

1.2 Storing and Indexing Code with Embeddings

Once we extract function details, we store them in a database and generate embeddings using OpenAI’s text-embedding-ada-002 model.

Embedding Generation

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

embedding_response = client.embeddings.create(
    input=[code + " " + docstring],
    model="text-embedding-ada-002"
)
embedding = embedding_response.data[0].embedding

Why embeddings?
Traditional keyword-based search fails when looking for semantically similar code. By converting text into vector embeddings, we can perform similarity-based retrieval.
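
As a rough sketch of the storage step (assuming the same document table that the pgvector search in section 3.1 queries, written via Django’s raw connection):

from django.db import connection

def store_record(record, embedding):
    # pgvector accepts the vector as a string literal like "[0.1,0.2,...]"
    embedding_str = f"[{','.join(map(str, embedding))}]"
    with connection.cursor() as cursor:
        cursor.execute(
            """
            INSERT INTO document (title, content, docstring, file_path, embedding)
            VALUES (%s, %s, %s, %s, %s::vector);
            """,
            [record["title"], record["content"], record["docstring"],
             record["file_path"], embedding_str],
        )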

Word2Vec for Code Understanding
We train a Word2Vec model on function names, docstrings, and code tokens.

from gensim.models import Word2Vec

# `tokens` is a list of token lists (one per function/class)
w2v_model = Word2Vec(tokens, vector_size=100, window=5, min_count=1, workers=4)
Why Word2Vec?
Helps in synonym expansion (e.g., search_function vs. lookup_method).
Enables suggestions for similar function names.
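
For example, a quick sketch of querying the trained model for related terms (the actual neighbours depend entirely on the training corpus):

# Suggest related tokens for query expansion
similar_terms = w2v_model.wv.most_similar("search", topn=5)
# e.g. [("lookup", 0.81), ("find", 0.78), ...] for a corpus where these co-occur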

2. Query Processing
To ensure efficient search and retrieval, we process user queries before searching.

2.1 Query Preprocessing (preprocess_query)

import re
import spacy

nlp = spacy.load("en_core_web_md")

# SYNONYM_DICT maps terms to their synonyms for query expansion,
# e.g. {"search": ["lookup", "find"]}; populate it for your domain
SYNONYM_DICT = {}

def preprocess_query(query):
    query = query.lower()
    query = re.sub(r"[^a-z0-9\s]", "", query)  # remove special characters
    doc = nlp(query)
    words = {token.lemma_ for token in doc if token.pos_ in {"NOUN", "VERB", "PROPN"} and not token.is_stop}
    # Synonym expansion
    words.update({syn for word in words if word in SYNONYM_DICT for syn in SYNONYM_DICT[word]})
    return " ".join(words)

Why NLP & Lemmatization?
Removes stop words (the, is, a, etc.).
Keeps only the meaningful lemmas ("find a similar function" → find function similar).
Expands synonyms for better recall.
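
A quick usage sketch of the preprocessing above (the exact output depends on the spaCy model and the synonym dictionary):

print(preprocess_query("How do I search for a function that parses files?"))
# e.g. "file function parse search" (word order is arbitrary because a set is used)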

3. Code Search Mechanism

3.1 Vector Similarity Search
Once a query is processed, we generate query embeddings and retrieve similar documents.

Query Embedding

query_embedding = client.embeddings.create(
    input=query,
    model="text-embedding-ada-002"
).data[0].embedding

Vector Search (PostgreSQL with pgvector)

from django.db import connection

def search_similar_documents(query_embedding, top_k=3):
    # pgvector expects the query vector as a string literal like "[0.1,0.2,...]"
    embedding_array = f"[{','.join(map(str, query_embedding))}]"
    with connection.cursor() as cursor:
        cursor.execute(
            """
            SELECT id, title, content, docstring, file_path, embedding <=> %s::vector AS distance
            FROM document
            ORDER BY distance ASC
            LIMIT %s;
            """,
            [embedding_array, top_k]
        )
        results = cursor.fetchall()
    return results

Why pgvector?
Performs fast vector similarity searches.
Uses the cosine distance operator (<=>) to rank the most relevant results.
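
For reference, a rough sketch of the table and index the query above assumes (1536 dimensions for text-embedding-ada-002; run once, e.g. from a migration):

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS document (
            id SERIAL PRIMARY KEY,
            title TEXT,
            content TEXT,
            docstring TEXT,
            file_path TEXT,
            embedding vector(1536)
        );
        """
    )
    # Approximate nearest-neighbour index using cosine distance
    cursor.execute(
        "CREATE INDEX IF NOT EXISTS document_embedding_idx "
        "ON document USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);"
    )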

3.2 Keyword-Based Search (Trigram Similarity)

To complement vector search, we use PostgreSQL’s TrigramSimilarity for text matching.

from django.contrib.postgres.search import TrigramSimilarity

keyword_results = Document.objects.annotate(
    similarity=TrigramSimilarity("title", query) + TrigramSimilarity("docstring", query)
).filter(similarity__gt=0.3).order_by("-similarity")[:3]

Why Trigram Similarity?
Helps when the query is misspelled (serch function → search function).
Matches partial words (find meth → find_method).
Requires PostgreSQL’s pg_trgm extension to be enabled.
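
One way to combine the two signals (a sketch; hybrid_search is illustrative and the merge policy is arbitrary):

def hybrid_search(query, query_embedding, top_k=3):
    # Vector hits come back as raw rows (id is the first column)
    vector_ids = [row[0] for row in search_similar_documents(query_embedding, top_k=top_k)]
    keyword_ids = list(
        Document.objects.annotate(
            similarity=TrigramSimilarity("title", query) + TrigramSimilarity("docstring", query)
        )
        .filter(similarity__gt=0.3)
        .order_by("-similarity")
        .values_list("id", flat=True)[:top_k]
    )
    # Vector matches first, then any keyword-only matches
    return vector_ids + [doc_id for doc_id in keyword_ids if doc_id not in vector_ids]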

4. Chat Session Management
We store chat sessions so users can interact with the system over time.

4.1 Rate Limiting and Session Handling

from datetime import timedelta
from django.utils.timezone import now

MAX_QUERIES_PER_HOUR = 100

one_hour_ago = now() - timedelta(hours=1)
user_message_count = Message.objects.filter(
    chat_session=chat_session, role="user", created_at__gte=one_hour_ago
).count()
if user_message_count >= MAX_QUERIES_PER_HOUR:
    return Response({"error": "Query limit reached. Try again in an hour."}, status=429)

Why rate limiting?
Prevents abuse/spam.
Ensures API cost control (OpenAI API calls).

4.2 Summarizing Older Chats (Using tiktoken)
To prevent exceeding OpenAI’s token limits, we summarize old chats.

import tiktoken
TOKEN_LIMIT = 3000

encoding = tiktoken.encoding_for_model("gpt-4o-mini")
total_tokens = sum(len(encoding.encode(msg["content"])) for msg in chat_history)
if total_tokens > TOKEN_LIMIT:
    summary_prompt = f"Summarize the chat:\n\n{chat_history}"
    summary_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": summary_prompt}],
        max_tokens=300
    )
    summary = summary_response.choices[0].message.content.strip()
    Message.objects.create(chat_session=chat_session, role="system", content=summary)

Why tiktoken?
Calculates exact token count before sending requests.
Prevents exceeding the model’s context window.
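
After creating the summary message, the in-memory history also needs to shrink before the next completion call; a sketch of one possible trimming policy:

# Replace the older turns with the summary, keeping only the most recent exchanges
chat_history = [{"role": "system", "content": summary}] + chat_history[-4:]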

5. AI Chat Completion
Finally, after retrieving relevant code snippets, we generate an AI response.

completion_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=chat_history,
    max_tokens=500
)

answer = completion_response.choices[0].message.content.strip()

Why GPT-based completion?
Generates contextual answers based on retrieved code.
Supports natural language explanations.
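
For completeness, a sketch of how the retrieved snippets could be folded into chat_history before that call (search_results is assumed to come from search_similar_documents above, and user_query is the raw question):

# Build a context block from the retrieved rows: (id, title, content, docstring, file_path, distance)
context = "\n\n".join(
    f"# {title} ({file_path})\n{content}"
    for _id, title, content, docstring, file_path, _distance in search_results
)
chat_history.append({
    "role": "system",
    "content": f"Use the following code snippets when answering:\n\n{context}",
})
chat_history.append({"role": "user", "content": user_query})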