Vector Similarity Search in Postgres with pgvector, text-embedding-ada-002, and bit.io

We used the pgvector Postgres extension and embeddings from the text-embedding-ada-002 model to make our documentation semantically searchable in Postgres.

Why does this matter? Quite a few companies now offer specialized vector databases for semantic search, and more than one of them explicitly targets the searchable-docs use case. In some cases, these specialized solutions might be useful or even necessary.

This project showed that there are fast and cost-effective alternatives. The bit.io free tier is more than sufficient for just about any project’s documentation. Generating embeddings for our docs with the text-embedding-ada-002 model cost about $0.02 (and that was after generating embeddings for all of our docs multiple times as we were getting the pipeline set up and tested). The whole thing took us an afternoon to set up, and 80% of that time was spent figuring out how to export our docs.

It’s worth giving this approach a try if you want to use vector similarity search but don’t want to pay for a specialized solution. Best of all, it’s Postgres, so it integrates easily with anything that works with Postgres.
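
For anyone who wants to see the general shape of such a setup, here is a minimal sketch. It is not the exact pipeline described above. The table name doc_chunks, its columns, and the helper function names are hypothetical and chosen for illustration. The 1536 dimension is the output size of text-embedding-ada-002, and <=> is pgvector's cosine distance operator. The search function assumes a psycopg2-style connection and that the query embedding has already been generated.

```python
import math

EMBEDDING_DIM = 1536  # output size of text-embedding-ada-002

# Hypothetical table and column names, used only for illustration.
SETUP_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS doc_chunks (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector({EMBEDDING_DIM})
);
"""

# <=> is pgvector's cosine distance operator (smaller means more similar).
SEARCH_SQL = """
SELECT id, content, embedding <=> %s::vector AS distance
FROM doc_chunks
ORDER BY distance
LIMIT %s;
"""


def to_vector_literal(embedding):
    """Format a sequence of numbers as pgvector text input, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def cosine_distance(a, b):
    """Pure-Python counterpart of <=>: 1 minus the cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(conn, query_embedding, k=5):
    """Return the k nearest chunks, assuming a psycopg2-style connection."""
    with conn.cursor() as cur:
        cur.execute(SEARCH_SQL, (to_vector_literal(query_embedding), k))
        return cur.fetchall()
```

The cosine_distance helper is only there to sanity-check results locally. In the database, the ORDER BY on <=> does the ranking.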

Congrats. Sounds cool. Thanks for sharing with us.
