Embedding - text length vs accuracy?

ruby_coder · March 14, 2023, 7:31am

It's easy if you store both the text (what you call “chunks”) and the embedding vectors in DB table rows.

Then, when you run a query, you can select the search method based on the search text (length, context, etc.).
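As a rough sketch of what selecting the search method could look like: the helper name `choose_search_method`, the 4-word threshold, and the strategy symbols below are all made up for illustration, so tune them to your own data.

```ruby
# Hypothetical helper: pick a search strategy from the query text.
# The threshold and the strategy names are illustrative assumptions.
def choose_search_method(query, short_query_words: 4)
  words = query.to_s.split
  return :none if words.empty?

  # Very short queries carry little semantic context, so a plain keyword
  # search over the stored text column may be the better fit.
  return :keyword if words.length < short_query_words

  # Longer queries go to a similarity search over the stored vectors.
  :embedding
end
```

For example, `choose_search_method("cat food")` returns `:keyword`, while a full-sentence question returns `:embedding`.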

HTH

Appendix: Example DB schema (FYI only)

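  create_table "embeddings", force: :cascade do |t|
    t.string "model"    # name of the embedding model used
    t.string "text"     # the original text ("chunk")
    t.string "vector"   # the embedding vector, serialized as a string
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
  end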
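Since the schema keeps the vector in a string column, here is a minimal sketch of one way to round-trip it and rank rows by cosine similarity. It assumes the vector is serialized as JSON (the schema does not say which format is used), and the helper names `dump_vector`, `load_vector`, `cosine_similarity`, and `rank_rows` are hypothetical. Plain hashes stand in for the DB rows.

```ruby
require "json"

# Assumption: the "vector" column holds a JSON array of floats.
def dump_vector(vector)
  JSON.generate(vector)
end

def load_vector(string)
  JSON.parse(string)
end

# Cosine similarity of two equal-length numeric arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# rows: array of hashes with "text" and "vector" keys, standing in for
# records loaded from the embeddings table.
def rank_rows(rows, query_vector, limit: 3)
  rows.map { |row| [row, cosine_similarity(load_vector(row["vector"]), query_vector)] }
      .sort_by { |_, score| -score }
      .first(limit)
end
```

`rank_rows` returns `[row, score]` pairs, best match first. Scanning every row in Ruby like this is fine for small tables; for larger ones you would likely want a dedicated vector index instead.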
