I’m building a domain-specific chatbot assistant with a few hundred documents.
I understand I can supply those documents to the assistant directly, so embeddings would no longer be needed on my side.
The bot needs to be fine-tuned to respond in a defined style.
My question is: do I still need LangChain with a vector database like Weaviate, or can I leave them out?
Actual fine-tuned models cannot be used with assistants.
Embeddings are still used for “retrieval” on the “assistant files” you upload.
You cannot “attach” more than 20 files per assistant, but they can be quite large.
You are billed per day for storage…times the number of assistants the files are attached to.
You have no control over the amount of chat context, metadata enhancement, or prompt enhancement techniques used for the semantic search retrieval.
The model will be filled to the max context length with retrieved content, regardless of quality.
In short: there are only about 100 different reasons why you would NOT want to use assistants.
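For reference, the attachment flow itself is short. A minimal sketch, assuming the openai Python SDK (1.x) and the built-in “retrieval” tool; the file path, assistant name, and model string are placeholders:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Upload a document for use with assistants (hypothetical path)
uploaded = client.files.create(
    file=open("docs/knowledge.pdf", "rb"),
    purpose="assistants",
)

# Create an assistant with the retrieval tool and attach the file.
# file_ids is capped at 20 files per assistant, and a fine-tuned model
# cannot be passed as the model here.
assistant = client.beta.assistants.create(
    name="Docs bot",
    instructions="Answer questions using the attached documents.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[uploaded.id],
)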
Got it. So the Assistants API is not an adequate replacement for LangChain and a local vector database. That’s a pity, as it would have simplified the technical implementation a lot.
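For anyone weighing the two routes, here is a minimal sketch of the do-it-yourself retrieval side, assuming the openai 1.x SDK and plain cosine similarity in NumPy in place of Weaviate; the chunk texts, query, and embedding model name are placeholders only:

import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    # Embed a list of strings with the OpenAI embeddings endpoint
    response = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([item.embedding for item in response.data])

# Your own document chunks (placeholders)
chunks = ["First document chunk...", "Second document chunk..."]
chunk_vectors = embed(chunks)

def top_k(query, k=3):
    # Brute-force cosine similarity between the query and every chunk
    q = embed([query])[0]
    scores = chunk_vectors @ q / (np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

print(top_k("How do I reset my password?"))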
Hi aringa, you could potentially use this Python code to pack all your files into a maximum of 20 chunk files:
import os
import tiktoken

def create_chunks(directory_path, model_name="gpt-4", max_tokens_per_chunk=1500000, max_chunks=20):
    # Get the tokenizer encoding for the specified model
    encoding = tiktoken.encoding_for_model(model_name)

    # Initialize variables
    chunks = []
    current_chunk = ""
    current_token_count = 0
    chunk_count = 0

    # Iterate over each file in the directory
    for file_name in os.listdir(directory_path):
        file_path = os.path.join(directory_path, file_name)
        # Ensure that it's a file
        if os.path.isfile(file_path):
            # Read the file
            with open(file_path, 'r') as file:
                text = file.read()
            # Divide the text into chunks based on tokens
            for line in text.split('\n'):
                line_tokens = encoding.encode(line)
                line_token_count = len(line_tokens)
                if current_token_count + line_token_count > max_tokens_per_chunk:
                    # Current chunk is full: store it and start a new one
                    chunks.append(current_chunk)
                    current_chunk = line + '\n'
                    current_token_count = line_token_count
                    chunk_count += 1
                    if chunk_count == max_chunks:
                        break
                else:
                    current_chunk += line + '\n'
                    current_token_count += line_token_count
            if chunk_count == max_chunks:
                break

    # Keep whatever is left over as the final chunk
    if current_chunk and chunk_count < max_chunks:
        chunks.append(current_chunk)

    # Save each chunk to a separate text file
    for i, chunk in enumerate(chunks):
        chunk_file_path = f'chunk{i+1}.txt'
        with open(chunk_file_path, 'w') as chunk_file:
            chunk_file.write(chunk)

# Example usage
create_chunks('/path/to/directory', model_name="gpt-4")
based on this tutorial: Tutorial
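As a follow-up, once create_chunks() has written chunk1.txt through chunkN.txt, the chunk files still need to be uploaded and attached. A minimal sketch, assuming the openai 1.x SDK; the glob pattern is a placeholder:

import glob
from openai import OpenAI

client = OpenAI()

# Upload each generated chunk file for use with assistants
file_ids = [
    client.files.create(file=open(path, "rb"), purpose="assistants").id
    for path in sorted(glob.glob("chunk*.txt"))
]

# Pass these as file_ids when creating the assistant (20 files maximum)
print(file_ids)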
This is interesting, I didn’t know that. Do you have a reference, by any chance?
It’s the old “I came here with questions so I can doubt the answers”.
You mean like an “API reference”? Yeah, I’ve got one. There’s a link on the side of the forum.
Once there, you can either read the technical reference, or use the top bar to open the documentation and read the assistants and threads sections there as well.
You’ll find that both threads and retrieval promise to fill the AI context to the max. That’s especially bad for you if you let random people loose on a chatbot with no “hang up” function and you have lots of knowledge documents to share with potential or existing customers who get into a long chat.
The engagement can end up costing you more than a human would.
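To put a rough number on that, here is a back-of-the-envelope estimate, assuming a 128k-token context filled on every turn and an illustrative input price of $0.01 per 1k tokens (both figures are assumptions; check the current pricing page):

# Cost per turn if retrieval fills the context window every time
context_tokens = 128_000          # assumed context size
price_per_1k_input = 0.01         # USD per 1k input tokens, assumed

cost_per_turn = context_tokens / 1000 * price_per_1k_input
print(f"~${cost_per_turn:.2f} per turn")                  # ~$1.28
print(f"~${cost_per_turn * 30:.2f} for a 30-turn chat")   # ~$38.40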