How should I use `raw` and `additional_kwargs` when defining a custom LLM `complete` method? I'd like to get back additional information besides the text from the full response when running `query()` with my custom LLM class. Thanks!

```python
@llm_completion_callback()
def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
    return CompletionResponse(text="MY RESPONSE", raw=full_response, additional_kwargs=full_response)

...

response = query_engine.query(query_str)

# Can't seem to access raw or additional_kwargs in the response...
print(response.raw)                # Errors
print(response.additional_kwargs)  # Also errors
```
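
For reference, a minimal, self-contained sketch of where those fields live (the class name `MyLLM` and the payload contents are made up): they come back on the `CompletionResponse` when the LLM is called directly, whereas the `Response` object returned by `query()` wraps the generated text and source nodes rather than the LLM-level `CompletionResponse`, which is consistent with the errors above.

```python
from typing import Any

from llama_index.llms import (
    CompletionResponse,
    CompletionResponseGen,
    CustomLLM,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback


class MyLLM(CustomLLM):
    """Placeholder custom LLM; the name and payload below are illustrative only."""

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(model_name="my-llm")

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        full_response = {"text": "MY RESPONSE", "usage": {"total_tokens": 42}}  # stand-in payload
        return CompletionResponse(
            text=full_response["text"],
            raw=full_response,  # raw is meant for the provider's full payload
            additional_kwargs={"usage": full_response["usage"]},  # extra structured fields
        )

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        yield CompletionResponse(text="MY RESPONSE", delta="MY RESPONSE")


# Calling the LLM directly hands back the CompletionResponse itself, so both fields are visible.
llm = MyLLM()
completion = llm.complete("test prompt")
print(completion.raw)
print(completion.additional_kwargs)
```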
Is there a way to get all of the embeddings out of an index? I've tried `index.vector_store._data.embedding_dict`, but it doesn't seem to work with all vector stores. My docstore is also in Postgres, so it would be nice to get all the document data and the embeddings together, if possible.
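
If the embeddings live in a `PGVectorStore`, one rough option is to read them straight out of Postgres. A sketch follows; the DSN, table name (`data_my_table`), and column names are assumptions based on a default `PGVectorStore` setup and should be checked against the actual schema.

```python
import psycopg2  # assumes psycopg2 is installed

# Placeholder DSN; the table/column names below are guesses at a default PGVectorStore
# layout (tables are typically named data_<table_name>) -- verify against your database.
conn = psycopg2.connect("postgresql://user:password@localhost:5432/vectordb")
with conn, conn.cursor() as cur:
    cur.execute("SELECT node_id, text, metadata_, embedding FROM data_my_table;")
    rows = cur.fetchall()

# node_id -> (text, metadata, embedding), i.e. the document data and embeddings together.
records = {node_id: (text, metadata, embedding) for node_id, text, metadata, embedding in rows}
```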
`refresh_ref_docs()` works when my `vector_store`, `docstore`, and `index_store` are all in Postgres but the files I want to ingest are in the file system. It seems to handle changes to the existing files, but what about deleted files? Is there a way to delete the associated rows in the `docstore` and `vector_store` when I've deleted a file from the file system since the last ingestion? Or do I need to handle that manually? (If so, sample code would really help.) Here's a snippet of my indexing code. Please let me know if I'm doing anything wrong!

```python
storage_context = StorageContext.from_defaults(
    vector_store=postgres_vector_store,
    index_store=postgres_index_store,
    docstore=postgres_docstore,
)

# Add filename to metadata.
filename_fn = lambda filename: {"file_name": filename}

documents = SimpleDirectoryReader(
    "./sources",
    recursive=True,
    file_metadata=filename_fn,
    filename_as_id=True,
).load_data()

try:
    print("Loading index from docstore...")
    index = load_index_from_storage(
        storage_context=storage_context, service_context=service_context
    )
except:
    print("Creating initial docstore...")
    index = VectorStoreIndex.from_documents(
        documents=documents,
        store_nodes_override=True,  # Do I need to set this override?
        storage_context=storage_context,
        service_context=service_context,
        show_progress=True,
    )

print("Refreshing vector database with only new documents from the file system. TO DO: Handle deleted files.")
refreshed_docs = index.refresh_ref_docs(documents=documents)
```
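
On the deleted-files part, one possible approach (a sketch, not tested against this exact setup) is to compare the ref doc ids the index knows about with the doc ids just loaded from disk, and call `delete_ref_doc` for anything that no longer exists. This assumes the docstore is tracking ref doc info, which `store_nodes_override=True` should give you.

```python
# Sketch: after refresh_ref_docs(), remove anything the index still knows about but
# that no longer exists on disk. With filename_as_id=True, doc ids are file paths
# (possibly with a "_part_N" suffix), so both sides of the comparison use doc ids.
current_doc_ids = {doc.doc_id for doc in documents}

for ref_doc_id in list(index.ref_doc_info.keys()):
    if ref_doc_id not in current_doc_ids:
        # delete_from_docstore=True also removes the docstore rows, not just the vectors.
        index.delete_ref_doc(ref_doc_id, delete_from_docstore=True)
```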
```python
from llama_index.vector_stores.types import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="url", operator=FilterOperator.EQ, value="https://mysite.com/page"),
    ]
)
```
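
For context, filters like these are typically handed to a retriever or query engine built from an existing index. A small usage sketch; `index` and the query strings are placeholders.

```python
# `index` is assumed to be an existing VectorStoreIndex whose nodes carry a "url" metadata key.
retriever = index.as_retriever(similarity_top_k=5, filters=filters)
nodes = retriever.retrieve("What does this page cover?")

query_engine = index.as_query_engine(filters=filters)
response = query_engine.query("Summarize this page.")
```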
I'm trying to configure my own `llm` and `embed_model`, but even just importing basic things from `llama_index` seems to trigger errors, like this one: "You tried to access openai.Completion, but this is no longer supported in openai>=1.0.0..."
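
That particular error usually means the installed `llama_index` predates the `openai` 1.x client, so the usual fix is to reconcile the versions (pin `openai` below 1.0 or upgrade `llama-index`) before anything else. Once imports work, a minimal sketch of wiring in an explicit `llm` and `embed_model` might look like this; `MyLLM` is the hypothetical custom class from the first sketch above, and `embed_model="local"` resolves to a small local HuggingFace model.

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# MyLLM stands in for your own CustomLLM subclass (see the earlier sketch).
service_context = ServiceContext.from_defaults(
    llm=MyLLM(),
    embed_model="local",  # local HuggingFace embeddings; avoids the OpenAI defaults
)

documents = SimpleDirectoryReader("./sources").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
```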
Is there a way to read the persisted JSON files (`docstore.json`, `vector_store.json`, etc.) and somehow get a list of all the nodes and their embeddings (call it `all_nodes`) so that `all_nodes` can be added to another type of index? Like this: `vector_store.add(all_nodes_from_json)`, where `vector_store` is an `OpenSearchVectorStore`?
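
A rough sketch of that migration path, assuming the source index was persisted locally with the default `SimpleDocumentStore`/`SimpleVectorStore`; the `./storage` path and `target_vector_store` are placeholders, and `_data` is a private attribute, so this may need adjusting across versions.

```python
from llama_index.storage.docstore import SimpleDocumentStore
from llama_index.vector_stores import SimpleVectorStore

# Load the persisted stores from disk (docstore.json, vector_store.json, ...).
docstore = SimpleDocumentStore.from_persist_dir("./storage")
simple_vector_store = SimpleVectorStore.from_persist_dir("./storage")

# node_id -> embedding, the same mapping referenced in the question.
embedding_dict = simple_vector_store._data.embedding_dict

all_nodes = []
for node_id, embedding in embedding_dict.items():
    node = docstore.get_node(node_id)
    node.embedding = embedding  # reattach the stored embedding to the node
    all_nodes.append(node)

# target_vector_store would be the OpenSearchVectorStore (or any other vector store) instance.
target_vector_store.add(all_nodes)
```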
Is there a way to make sure that the `MetadataExtractor` can operate on the entire document (provided it fits within the context window)? I've tried setting `chunk_size` to a large number and `chunk_overlap` to 0 in both the `text_splitter` and the `node_parser`, and set a large `context_window` in the `prompt_helper`, but my input files still continue to be split into small chunks. What am I missing here?
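
One workaround sketch is to sidestep the splitter entirely: build one node per document yourself and run the extractor over those nodes. This assumes each file fits in the context window and uses the 0.9-style `llama_index.extractors` / `IngestionPipeline` imports, with `TitleExtractor` as a stand-in for whichever extractor you actually use.

```python
from llama_index.extractors import TitleExtractor
from llama_index.ingestion import IngestionPipeline
from llama_index.schema import TextNode

# One node per document, so no text splitter ever sees (or chunks) the content.
whole_doc_nodes = [
    TextNode(text=doc.text, metadata=doc.metadata, id_=doc.doc_id) for doc in documents
]

# Run only the extractor -- no splitter in the transformation list -- so the metadata
# is generated from the full document text.
pipeline = IngestionPipeline(transformations=[TitleExtractor()])
nodes_with_metadata = pipeline.run(nodes=whole_doc_nodes)
```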
Given a `VectorStoreIndex` of chunked documentation pages, how would I implement a "related pages" search where a filename (that is part of the index) is given as input and the filenames that are associated with semantically related chunks in the same index are returned as output? (The index was created with `filename_as_id=True`.)
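
One way to sketch this, assuming the nodes live in the docstore (as with `store_nodes_override=True`) and each node carries the `file_name` metadata added at ingest time: use the input file's own chunks as queries against the same index and tally which other files the nearest chunks belong to.

```python
from collections import Counter


def related_pages(index, file_name: str, top_k: int = 10, per_chunk_k: int = 5):
    """Return files whose chunks are semantically close to the chunks of `file_name`."""
    retriever = index.as_retriever(similarity_top_k=per_chunk_k)
    scores = Counter()

    # ref_doc_info maps ref doc ids (file paths when filename_as_id=True) to their node ids.
    for ref_doc_id, info in index.ref_doc_info.items():
        if file_name not in ref_doc_id:
            continue
        for node_id in info.node_ids:
            node = index.docstore.get_node(node_id)
            # Use each chunk's text as a query and tally which other files come back.
            for hit in retriever.retrieve(node.get_content()):
                hit_file = hit.node.metadata.get("file_name")
                if hit_file and hit_file != file_name:
                    scores[hit_file] += hit.score or 0.0

    return [name for name, _ in scores.most_common(top_k)]
```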