@ruby_coder The database cost (for me) is still the dominant term. For whatever reason, data in the cloud is expensive (why? … I have no idea). Luckily I don’t need a vector database for search, just in-memory data structures that I code myself. But to scale up to billions of embeddings, you should look into vector databases for fast vector search. There are theoretical reasons for this; I am thinking of the approximate nearest-neighbor algorithms in FAISS.
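For a sense of what that looks like, here is a minimal FAISS sketch (a sketch only: it assumes unit-normalized embeddings so inner product equals cosine similarity, and the dimension, data, and index type are just placeholders):

import numpy as np
import faiss  # pip install faiss-cpu

d = 1536                                             # embedding dimension, e.g. text-embedding-ada-002
vecs = np.random.rand(10_000, d).astype("float32")   # placeholder embeddings
faiss.normalize_L2(vecs)                             # unit vectors: inner product == cosine similarity

index = faiss.IndexFlatIP(d)                         # exact search; consider IndexIVFFlat or HNSW at scale
index.add(vecs)

q = np.random.rand(1, d).astype("float32")
faiss.normalize_L2(q)
scores, ids = index.search(q, 5)                     # top-5 nearest neighbors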

As for SQL DBs being cheap … in the cloud … no way! Go serverless and forget about it! For any SQL DB in the cloud you pay by the hour; serverless, I pay per read/write, or I go provisioned.

@bill.french I think revectoring is the cost of maintenance!

Everyone’s infrastructure requirements are different.

As mentioned, I was not commenting on cloud services; I said it was free to host these DBs on your own network.

In addition, there is a big difference between running your own databases (which my clients and I do, many internally and many hosted in data centers) and buying DB “cloud services”.

The issue I have here is that people often post “solutions” without getting down into the weeds of an organization’s capabilities, current IT infrastructure, employee skill sets, prior investments, etc.

The fact of the matter is, and I am sure you agree, that there is no technical reason you cannot use a SQL database to store vectors as serialized data in the DB; storing vectors, arrays, etc. as serialized data in a SQL DB has been around for a very long time and it works very well.
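For illustration, a minimal sketch of that pattern using SQLite and NumPy (the table name, column names, and dimension are just placeholders):

import sqlite3
import numpy as np

con = sqlite3.connect("embeddings.db")
con.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, text TEXT, vec BLOB)")

vec = np.random.rand(1536).astype("float32")           # stand-in for a real embedding
con.execute("INSERT INTO docs (text, vec) VALUES (?, ?)",
            ("some document", vec.tobytes()))          # serialize the vector to bytes
con.commit()

row = con.execute("SELECT text, vec FROM docs WHERE id = 1").fetchone()
restored = np.frombuffer(row[1], dtype=np.float32)     # deserialize back to a vector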

It is not a “requirement” in this field to use vector DBs “as a service” to successfully build reliable and scalable embedding applications.

HTH

:slight_smile:

2 Likes

I agree that vector DBs are WAAY overhyped, and only fill a niche of maybe 1% of the systems out there. Most folks, like me, can get away with two things: in-memory search (using naive/linear techniques), followed up by simple DB lookups (serverless here, it’s cheap!).

It’s better to start simple, for sure. NO VECTOR DBs!!! Grow into them if you have to, but don’t make them your first choice.

2 Likes

Agreed, of course.

I have had good luck with a standard SQL DB, as mentioned, and if speed becomes an issue on the server side, I can easily add Redis.
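For anyone curious, “adding Redis” can be as simple as caching serialized embeddings in front of the SQL lookup. A rough sketch (the key scheme and the fetch_from_sql helper are hypothetical):

import numpy as np
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)

def get_embedding(doc_id: int) -> np.ndarray:
    key = f"emb:{doc_id}"                    # hypothetical key scheme
    cached = r.get(key)
    if cached is not None:
        return np.frombuffer(cached, dtype=np.float32)
    vec = fetch_from_sql(doc_id)             # hypothetical SQL lookup returning a float32 array
    r.set(key, vec.tobytes(), ex=3600)       # cache the serialized vector for an hour
    return vec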

From what I have seen here, the limiting performance factor is not the DB on the user/client side, but OpenAI’s infrastructure performance, especially with the recent turbo models.

I agree that for people building their first apps and creating a user base, etc., there is really no need to go “vector DB” when this can easily be done with “traditional” SQL DBs.

In addition, when searching text, embedding keywords and short phrases gives very poor results compared to DB keyword and/or full-text searches.

When a system designer moves (for example) to a “fully vectorized DB approach”, they can lose the ability to use standard full-text DB searches in the cases where those are more effective than vector-based semantic search.
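To make that concrete, here is a minimal full-text search sketch using SQLite’s FTS5 module (assuming your SQLite build includes FTS5; the documents are placeholders). This kind of exact keyword matching, ranked by BM25, is what a pure vector pipeline gives up:

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
con.execute("INSERT INTO docs VALUES (?, ?)",
            ("DynamoDB pricing", "Pay per read/write request in on-demand mode."))
con.execute("INSERT INTO docs VALUES (?, ?)",
            ("FAISS intro", "Approximate nearest-neighbor search for dense vectors."))

# Exact keyword match, best hits first -- no embeddings involved.
rows = con.execute(
    "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", ("pricing",)
).fetchall()
print(rows)  # [('DynamoDB pricing',)]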

That is why, in my view, it is good not to fall “for the hype” and to start off simple, as you have said @curt.kennedy.

There is no shortage of “hype” and no shortage of people, start-ups, etc hoping to profit off OpenAI technology and the current hype.

:slight_smile:

2 Likes

This is really good to know. I was convinced by an OpenAI blog post that a vector database was required.

This is very helpful. Imagine I wanted to perform embedding search on, say, a Jetson. Could the database provide an edge-based solution without connectivity?

You want to search for something on the cartoon “The Jetsons”?

Sorry, but it’s hard to answer your question without knowing what “a Jetson” is. Do you mean George Jetson, his son Elroy, his wife Jane, or his daughter Judy?

:slight_smile:

Ha ha! Well, Jane was hot, but not as hot as a NVIDIA Jetson running at MAXN with six cores.

Yeah, words have meaning; we need to make sure we use enough of them.

Pinecone, as you know, cannot run on-prem. My requirement for this product is to perform embedding searches during periods of disconnectivity.

Your GPU might pair well with the open-source Facebook AI Similarity Search (FAISS) library. But if you have fewer than 1 million embeddings, as discussed above, you can do this “by hand” with a naive search like this:

import numpy as np

def mips_naive(q, vecs):
    """Maximum inner product search via brute-force linear scan."""
    mip = -1e10
    idx = -1
    for i, v in enumerate(vecs):
        c = np.dot(q, v)  # dot product is the same as cosine similarity for unit vectors
        if c > mip:
            mip = c
            idx = i
    return idx, mip
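For what it’s worth, the same search vectorizes to a one-liner in NumPy (assuming vecs is a 2-D float array of unit vectors):

idx = int(np.argmax(vecs @ q))  # vecs @ q computes all the dot products at once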

Also you could use Redis, see this thread: Using Redis for embeddings

1 Like

Doesn’t Pinecone provide the ability to query by cosine similarity, meaning Pinecone performs the task of both storing the vectors and performing the linear algebra?

How do you find similarity in your method? Do you query (import) all the stored vectors and compute cosine similarity in a loop against your query?

Do you have a link to this post that mentions their use of PostgreSQL DB and the announcement of scaling up infrastructure? I’m curious to read more. Thanks!

1 Like

The claim that “OpenAI uses PG, ergo vector DBs are not useful” is about as credible as all the startups that put Fortune 100 company logos on their websites because, somewhere, one mid-level dev with a corporate card once signed up for a trial, might have forgotten to turn it off, and billed a month: ergo “Apple uses our product”.

Databases are tools.

While I share your skepticism of “hype” and think VCs have rushed into raising enormous rounds for vector DB startups at insane valuations without truly understanding the specialized nature of the product and segment, I think your post here might do the opposite: discount the very utility of such a specialized tool.

Can you fasten or remove a Torx bolt by jamming a just-right-sized Phillips driver onto it, and will it work in a jam, for a single little project, or a few times? SURE. Will it bite you later if you keep treating it as a Torx driver? Yes, without a doubt. Is it the right tool for the job for a professional, at scale, wanting to give their customer the best work? No. Objectively: no.

Postgres is amazing. What a wonderful general-purpose data store it is! It even has some incredible plug-ins. But the very fact that its wire protocol has been used to reimplement the actual engine for things like time series, active-active replication, sharding, and horizontal scale tells us something important: it is not the silver bullet you are making it out to be.

Your commentary is neither objective nor based in “system engineering”. I am sure OpenAI uses Postgres, and I am sure they use it for its strengths (like transactional data, HRIS applications, or the myriad other things any business does). If it underpins their actual technology as a primary vector store, I would guess it is only with some very, very advanced, proprietary pg_* plugins, storage layers, etc. that basically turn it into a CockroachDB-style implementation, where it’s just the PG wire protocol talking to an enormously different storage engine (read: NOT at all Postgres).

I love me some PG just as much as the next guy, and I think “Vector DBs are the greatest thing since sliced bread, will solve all my problems, and make everything else obsolete” is just as crazy as “Vector DBs are just a fad, meh, Postgres FTW”.

If you can actually substantiate that OpenAI is using vanilla-ish (or close to it) PG, with its actual storage and query engines, for its vector or embedding workloads, I encourage you to do so, but I suspect that’s not possible, because 1) that information is largely proprietary, and 2) we know that PG as a datastore is not built for that at even a fraction of a percent of OpenAI’s scale. I am sure vanilla-ish PG exists in their ERP, CRM, etc. systems, but it is a specious argument to conflate that with the actual service-delivery stack in order to discredit vector DBs.

EDIT: Also, let’s not confuse pg_openai and attendant end user functions/stored procedures/UDFs with what it takes to run OpenAI’s service delivery fabric.

1 Like

Don’t hold your breath. OpenAI is not using Postgres in lieu of specialized vector data stores.

Keep in mind that people out there pay a monthly fee for feature flags as a service. There’s definitely a market for OP’s product.

@curt.kennedy can you expand on what you mean by “followed up by simple DB lookups (serverless here, it’s cheap!)”?

Serverless is an increasingly meaningless hype term to my ears. What is a serverless database? A flat-file database on an edge node?

An example of a serverless database would be DynamoDB.

Google search:

Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale.
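For a concrete picture, a minimal DynamoDB lookup with boto3 looks something like this (the table name and key are made up; in on-demand mode you are billed per request rather than per hour):

import boto3  # pip install boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("my-chunks")         # hypothetical table name

# Pay-per-request lookup: fetch a stored record by its key.
resp = table.get_item(Key={"chunk_id": "doc-42"})
item = resp.get("Item")                     # None if the key does not exist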

The original question asked about LlamaIndex as well, but it really didn’t get touched on in this thread.

I’m trying to get a better understanding of LlamaIndex vs vector databases. I’m sure a bunch of you are going to scoff at my question, thinking they are two wildly different tools, but I’ve been doing a bunch of reading and they SEEM to serve somewhat similar purposes. I have looked at a bunch of tutorials on how to build chatbots based on data feeds or specific files. The tutorials always use LangChain and a GPT integration, but what varies is whether they use Pinecone, LlamaIndex, or both.
Note: when I say Pinecone, I’m really referring to the concept of a vector database, but the tutorials pretty much exclusively use Pinecone (when they use a vector DB at all).

I would love it if one of you experts would give me a brief rundown of the core difference between LlamaIndex and vector DBs (whether cloud-hosted or locally hosted), and when you would want to use one, the other, or both.

From the little I’ve been able to piece together, I think it has to do with:

  • number of embeddings
  • type of content (e.g. querying a book vs a product feature database)
  • how dynamic your data is (e.g. is it constantly being updated every single day)
  • I’m sure there are other categories that I’m missing

For context, I’m looking to build a chatbot that queries a product feature database (about 20k records).

Any help would be greatly appreciated.
Thanks in advance

Not to be that guy… but this is similar to asking what’s the difference between MySQL and WordPress.

  • MySQL is a database where you can store data for many many different types of applications
  • WordPress is a package of code that uses MySQL and other tools to accomplish a specific goal (blogging).

There are many uses for vector databases outside of chatbot applications.
LlamaIndex packages up a lot of helpful scripts to accomplish a specific goal: chatbots on your data (I know you can do more low-level stuff, but that’s the general purpose). LlamaIndex can use vector databases under the hood.

If you’re just getting started in this space, I’d say use LlamaIndex or LangChain to do a proof-of-concept with your data and start to learn all the moving parts. The same way that many websites outgrow WordPress, you may outgrow LlamaIndex and decide to research other vector databases or write more of the code yourself.
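If it helps, a minimal LlamaIndex proof-of-concept looks roughly like this (the API changes quickly, so check the current docs; the data folder and the question are placeholders):

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()     # load your files from ./data
index = VectorStoreIndex.from_documents(documents)        # chunk, embed, and index them
query_engine = index.as_query_engine()

print(query_engine.query("Which products support SSO?"))  # placeholder question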

1 Like

No, I appreciate the feedback. I knew the answer was something like this, but asking saves so much time up front compared to going down a rabbit hole.

The analogy really helps. I am definitely at the WordPress level with this chatbot implementation, so I will take your advice and go the LlamaIndex route.

If there are any resources you can suggest to accelerate my learning curve, I’d really appreciate it.
Thanks

Hi @ruby_coder, I found your response very helpful. I wanted to ask whether it would also be a good idea to keep my embeddings inside my Azure Function App, instead of using a separate vector DB, in a production application.
I understand this approach could have some limitations; could you help me understand what those are?
I would really appreciate your response, as it will help a lot in my current use case.