
Updated 2 months ago

Pinecone refresh

How do you go about refreshing Pinecone data? The data will come from Google Docs files. Do I delete the data and re-upload it?
13 comments
And can llama-index handle this?
Yea that's probably the best approach I think. Working on making this easier in the future too!
Cool! Would it be possible to update only a specific doc by passing the doc ID? Is this currently possible or is this what you are looking into improving in the future?
Yea if you are explicitly setting doc_ids, I think the refresh function should work too (it works by checking doc ids and hashes of documents)
There's also the update function, which does the same thing as refresh but for a single document rather than a list of documents
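To make the "checking doc ids and hashes" idea concrete, here is a minimal plain-Python sketch. It is not llama-index's actual implementation: `content_hash`, `refresh`, and the `store` dict are hypothetical names for illustration only, and SHA-256 is an assumed choice of hash.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a hex digest that changes whenever the text changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh(store: dict, documents: dict) -> list:
    """Hypothetical refresh: `store` maps doc_id -> stored hash,
    `documents` maps doc_id -> current text.

    A document is (re)written only if its id is new or its hash differs.
    Returns one bool per document, in input order: True if it was written.
    """
    written = []
    for doc_id, text in documents.items():
        new_hash = content_hash(text)
        if store.get(doc_id) == new_hash:
            # Same id, same content: nothing to do.
            written.append(False)
        else:
            # New id or changed content: insert or overwrite.
            store[doc_id] = new_hash
            written.append(True)
    return written
```

In this sketch, calling `refresh` twice with unchanged documents writes nothing the second time, and editing one document causes only that document to be rewritten. Updating a single document is the same check applied to one entry instead of a list.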
Thanks. I'll look into this later
@Logan M I'm struggling with this a bit :T.
This is how I create my index locally:
    index = GPTVectorStoreIndex.from_documents(
        documents, service_context=get_service_context()
    )

Now if I just want to refresh that index, how do I do that using the refresh method?
It looks like I need to somehow load that index and then call the refresh method? What would this look like? My docs do have doc_id and doc_hash
Yea refresh works assuming the doc ids are the same (they won't be unless you manually set them though 🥲)

index.refresh(documents) is all it should take after loading your saved index
You can also call index.update(document) for a single document (again, that assumes the doc ids are consistent)
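The point about consistent doc ids can be sketched as follows. This is plain Python rather than llama-index code, under the assumption that each Google Doc has a stable file ID you can reuse; the `ingest` function, the `gdoc:` prefix, and the dict standing in for the index are all hypothetical.

```python
import uuid


def ingest(index: dict, docs: list, stable: bool) -> None:
    """Hypothetical ingest: `docs` is a list of (file_id, text) pairs.

    With stable=True the doc id is derived from the source file ID, so
    re-ingesting the same file overwrites its earlier entry. With
    stable=False a fresh random id is generated on every run, so the same
    file is stored again under a new key.
    """
    for file_id, text in docs:
        doc_id = f"gdoc:{file_id}" if stable else str(uuid.uuid4())
        index[doc_id] = text
```

Ingesting the same two files twice leaves two entries with stable ids and four with random ids, which is why a refresh cannot match documents unless the ids are set deterministically. How you attach the id to a document object depends on your llama-index version.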
Got it.
Do I need to pass the service_context when loading the data to refresh it?

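# Import path assumes the legacy llama_index release that provides GPTVectorStoreIndex
from llama_index import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")

index = load_index_from_storage(storage_context, service_context=get_service_context())

index.refresh(documents)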
Mmm, yea it's good practice to do that (not 100% sure if it's needed here, but better safe than sorry lol)
True. Thanks ;D