Hello Pinecone Community,
Let’s take the example of a mid-size vector database with 50k vectors. When you retrieve the most relevant documents for a given query, you get back the vectors, not the original documents. To retrieve the original documents, you have two options:
- Either attach the original document to the metadata, so that Pinecone returns the most relevant vectors with their documents included in the metadata.
- Or store only the document ID in the metadata and keep the original document in a separate database, which you query in a second stage to fetch the document associated with the ID returned by Pinecone.
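To make the two options concrete, here is a minimal sketch. It does not use the Pinecone client: `ToyIndex` is a hypothetical in-memory stand-in for a vector index, and an in-memory SQLite table stands in for the separate document database. The names `retrieve_inline` and `retrieve_two_stage`, the metadata keys `text` and `doc_id`, and the sample data are all assumptions made for illustration.

```python
import math
import sqlite3


class ToyIndex:
    """Hypothetical stand-in for a vector index. This is NOT the Pinecone API."""

    def __init__(self):
        self.rows = {}  # vector id -> (vector, metadata)

    def upsert(self, vec_id, vector, metadata):
        self.rows[vec_id] = (vector, metadata)

    def query(self, vector, top_k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.hypot(*a) * math.hypot(*b))

        # Rank all stored vectors by cosine similarity, best match first.
        ranked = sorted(
            self.rows.items(),
            key=lambda kv: cosine(vector, kv[1][0]),
            reverse=True,
        )
        return [{"id": i, "metadata": meta} for i, (_, meta) in ranked[:top_k]]


# Made-up sample data for the illustration.
docs = {"doc-1": "Pinecone stores vectors.", "doc-2": "SQLite stores rows."}
vectors = {"doc-1": [1.0, 0.0], "doc-2": [0.0, 1.0]}

# Option 1: the full document text travels in the vector's metadata.
index_inline = ToyIndex()
for doc_id, vec in vectors.items():
    index_inline.upsert(doc_id, vec, {"text": docs[doc_id]})


def retrieve_inline(query_vec, top_k=1):
    matches = index_inline.query(query_vec, top_k=top_k)
    return [m["metadata"]["text"] for m in matches]


# Option 2: metadata holds only the document id; the text lives elsewhere.
index_ids = ToyIndex()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT)")
for doc_id, vec in vectors.items():
    index_ids.upsert(doc_id, vec, {"doc_id": doc_id})
    db.execute("INSERT INTO documents VALUES (?, ?)", (doc_id, docs[doc_id]))


def retrieve_two_stage(query_vec, top_k=1):
    # Stage 1: the vector search returns ids only.
    ids = [m["metadata"]["doc_id"] for m in index_ids.query(query_vec, top_k=top_k)]
    if not ids:
        return []
    # Stage 2: one batched lookup in the document store.
    placeholders = ",".join("?" * len(ids))
    sql = f"SELECT id, body FROM documents WHERE id IN ({placeholders})"
    bodies = dict(db.execute(sql, ids).fetchall())
    # Preserve the ranking order returned by the vector search.
    return [bodies[i] for i in ids]
```

Both functions return the same documents in the same order. The difference is where the text is stored, and that the second option adds one extra lookup per query.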
Which solution do you recommend?
I guess the first solution is more expensive in terms of metadata size (although the second requires paying for a second database). Is it also slower because of the metadata attached to each vector?