Using gpt-4 API to Semantically Chunk Documents

I think the best (and easiest) course is to remove the crossed-out text before embedding, since the whole point of crossing it out is that it is being replaced. Including it in the embeddings would give the impression that the crossed-out text is still valid.
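
For what it's worth, here is a rough sketch of what that removal step could look like in Python. It assumes the crossed-out text is marked either with Markdown `~~...~~` or with HTML `<del>`, `<s>`, or `<strike>` tags; I don't know which format your documents actually use, so treat the patterns and the `strip_struck_text` name as placeholders to adapt.

```python
import re

# Assumed markup for crossed-out text; adjust to whatever the source documents use.
_STRUCK_PATTERNS = [
    # Markdown strikethrough: ~~old text~~
    re.compile(r"~~.*?~~", re.DOTALL),
    # HTML strikethrough tags: <del>, <s>, <strike> (with optional attributes)
    re.compile(r"<(del|s|strike)\b[^>]*>.*?</\1\s*>", re.DOTALL | re.IGNORECASE),
]


def strip_struck_text(text: str) -> str:
    """Remove crossed-out spans, then collapse the extra spaces left behind."""
    for pattern in _STRUCK_PATTERNS:
        text = pattern.sub("", text)
    return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    print(strip_struck_text("The fee is ~~$10~~ $12 per month."))
    # prints: The fee is $12 per month.
```

You would run this over each document before chunking and embedding, so the replaced text never reaches the index. If the strikethrough lives in a format like DOCX or PDF, the same idea applies but the detection has to happen at the parser level instead of with regexes.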
