richie404
Joined September 25, 2024
Is it possible to consume the raw and additional_kwargs values that I set when defining a custom LLM complete method? I'd like to get back additional information besides the text from the full response when running query() with my custom LLM class. Thanks!

Plain Text
    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        return CompletionResponse(text="MY RESPONSE", raw=full_response, additional_kwargs=full_response)

...

response = query_engine.query(query_str)
# Can't seem to access raw or additional_kwargs in the response...
print(response.raw) # Errors
print(response.additional_kwargs) # Also errors
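
Here's a stripped-down sketch of the custom LLM class I'm working with, for context (MyLLM is a placeholder, full_response is just a literal stand-in for whatever my backend actually returns, and the import paths may differ slightly between llama_index versions):

Plain Text
from typing import Any

from llama_index.llms import (
    CompletionResponse,
    CompletionResponseGen,
    CustomLLM,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback


class MyLLM(CustomLLM):
    """Sketch of my custom LLM wrapper (placeholder implementation)."""

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(model_name="my-backend")

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # Stand-in for whatever my backend returns: text plus extra fields.
        full_response = {"text": "MY RESPONSE", "model": "my-backend", "tokens_used": 42}
        return CompletionResponse(
            text=full_response["text"],
            raw=full_response,
            additional_kwargs=full_response,
        )

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        raise NotImplementedError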
13 comments
richie404 · Data

Is there a way to return all the embeddings in a Postgres vector store (or in an index built on that vector store) without running a query? I tried index.vector_store._data.embedding_dict, but it doesn't seem to work with all vector stores. My docstore is also in Postgres, so it would be nice to get all the document data and the embeddings together, if possible.
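
The closest I've gotten is going straight to the database with SQL. A rough sketch (this assumes PGVectorStore's default data_<table_name> layout with node_id, text, metadata_, and embedding columns; the connection string and table name are placeholders):

Plain Text
import psycopg2

# Dump every row of the vector table directly from Postgres.
conn = psycopg2.connect("postgresql://user:pass@localhost:5432/vectordb")
with conn, conn.cursor() as cur:
    cur.execute("SELECT node_id, text, metadata_, embedding FROM data_my_vectors;")
    for node_id, text, metadata, embedding in cur.fetchall():
        # embedding comes back as a string unless a pgvector adapter is registered
        print(node_id, metadata)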
2 comments
I'm trying to understand how refresh_ref_docs works when my vector_store, docstore, and index store are all in Postgres but the files I want to ingest are in the file system. It seems to handle changes to existing files, but what about deleted files? Is there a way to delete the associated rows in the docstore and vector_store when I've deleted a file from the file system since the last ingestion, or do I need to handle that manually? (If so, sample code would really help.) Here's a snippet of my indexing code. Please let me know if I'm doing anything wrong!

Plain Text
storage_context = StorageContext.from_defaults(
    vector_store=postgres_vector_store,
    index_store=postgres_index_store,
    docstore=postgres_docstore,
)

# Add filename to metadata.
filename_fn = lambda filename: {"file_name": filename}

documents = SimpleDirectoryReader(
    "./sources",
    recursive=True,
    file_metadata=filename_fn,
    filename_as_id=True,
).load_data()

try:
    print("Loading index from docstore...")
    index = load_index_from_storage(
        storage_context=storage_context, service_context=service_context
    )
except Exception:
    print("Creating initial docstore...")
    index = VectorStoreIndex.from_documents(
        documents=documents,
        store_nodes_override=True, # Do I need to set this override?
        storage_context=storage_context,
        service_context=service_context,
        show_progress=True,
    )

print("Refreshing vector database with only new documents from the file system. TO DO: Handle deleted files.")
refreshed_docs = index.refresh_ref_docs(
    documents=documents,
)
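
The workaround I'm considering for deletions is to diff the doc ids from the current load against index.ref_doc_info and delete whatever has disappeared. A sketch (this assumes the ids produced by filename_as_id=True line up with the stored ref_doc ids, and that delete_ref_doc cleans up both the Postgres docstore and vector_store):

Plain Text
# Remove docstore/vector_store entries for files deleted since the last ingestion.
current_ids = {doc.doc_id for doc in documents}
stale_ids = [
    ref_doc_id
    for ref_doc_id in index.ref_doc_info
    if ref_doc_id not in current_ids
]

for ref_doc_id in stale_ids:
    print(f"Deleting stale document {ref_doc_id}")
    index.delete_ref_doc(ref_doc_id, delete_from_docstore=True)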
11 comments
Are metadata filters supported for Postgres databases, or am I missing something? I'm trying to use this code snippet, which is, admittedly, adapted from a Pinecone page in the docs rather than a Postgres one.

Plain Text
from llama_index.vector_stores.types import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="url", operator=FilterOperator.EQ, value="https://mysite.com/page"),
    ]
)
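
For completeness, this is how I'm trying to apply the filters (assuming they can be passed through to the Postgres-backed retriever the same way the Pinecone example does):

Plain Text
# Build a query engine on the existing Postgres-backed index with the filters applied.
query_engine = index.as_query_engine(filters=filters)
response = query_engine.query("my question here")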
4 comments
Is there a way to create a simple LlamaIndex app that does not involve OpenAI at all? I'd like to avoid any risk of communicating with external LLMs or embedding models. I have my own classes for llm and embed_model, but even just importing basic things from llama_index seems to trigger errors, like this one: "You tried to access openai.Completion, but this is no longer supported in openai>=1.0.0..."
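
For reference, this is the pattern I'm aiming for (a sketch; MyLocalLLM and MyLocalEmbedding stand in for my own classes, and I'm assuming that wiring them into ServiceContext and setting it globally is enough to keep the OpenAI defaults from ever being touched):

Plain Text
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
    set_global_service_context,
)

# MyLocalLLM / MyLocalEmbedding are my own local-only classes (placeholders here).
service_context = ServiceContext.from_defaults(
    llm=MyLocalLLM(),
    embed_model=MyLocalEmbedding(),
)
set_global_service_context(service_context)  # avoid any fallback to OpenAI defaults

documents = SimpleDirectoryReader("./sources").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)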
1 comment
Is it possible to load a JSON-based vector index from storage (docstore.json, vector_store.json, etc.) and somehow get a list of all the nodes and their embeddings (call it all_nodes) so that all_nodes can be added to another type of index? Like this: vector_store.add(all_nodes_from_json) where vector_store is an OpenSearchVectorStore?
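
Something like this sketch is what I have in mind (assuming SimpleDocumentStore and SimpleVectorStore can be loaded from the persist directory, that the private embedding_dict is keyed by node id, and that opensearch_vector_store is an already-configured OpenSearch vector store):

Plain Text
from llama_index.storage.docstore import SimpleDocumentStore
from llama_index.vector_stores import SimpleVectorStore

# Load the JSON-persisted docstore and vector store.
docstore = SimpleDocumentStore.from_persist_dir("./storage")
vector_store = SimpleVectorStore.from_persist_dir("./storage")
embedding_dict = vector_store._data.embedding_dict  # private attr, may change

# Reattach each node's embedding, then hand the whole list to the other store.
all_nodes_from_json = []
for node_id, node in docstore.docs.items():
    node.embedding = embedding_dict.get(node_id)
    all_nodes_from_json.append(node)

opensearch_vector_store.add(all_nodes_from_json)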
4 comments
Is it possible to create an index without chunking, so that a MetadataExtractor can operate on an entire document (one that fits within the context window)? I've tried setting chunk_size to a large number and chunk_overlap to 0 in both the text_splitter and the node_parser, and setting a large context_window in the prompt_helper, but my input files are still split into small chunks. What am I missing here?
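
The workaround I'm leaning toward is skipping the splitter entirely and building one node per document, then letting the extractor run over those whole-document nodes (a sketch; documents and service_context come from my existing ingestion code):

Plain Text
from llama_index import VectorStoreIndex
from llama_index.schema import TextNode

# One node per document, so nothing ever gets chunked.
nodes = [
    TextNode(text=doc.text, metadata=doc.metadata, id_=doc.doc_id)
    for doc in documents
]
index = VectorStoreIndex(nodes=nodes, service_context=service_context)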
7 comments
richie404 · Docs

Hiya! If I have a VectorStoreIndex of chunked documentation pages, how would I implement a "related pages" search where a filename (that is part of the index) is given as input and the filenames that are associated with semantically related chunks in the same index are returned as output? (The index was created with filename_as_id=True.)
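
In case it helps clarify what I'm after, here's the rough shape of it (a sketch; it assumes ref_doc_info is keyed by the filename because of filename_as_id=True, and that each node carries a file_name metadata entry):

Plain Text
def related_pages(index, filename, top_k=5):
    """Return filenames whose chunks are semantically close to the given file's chunks."""
    retriever = index.as_retriever(similarity_top_k=top_k)
    ref_info = index.ref_doc_info.get(filename)
    if ref_info is None:
        return set()

    related = set()
    for node_id in ref_info.node_ids:
        chunk = index.docstore.get_node(node_id)
        for hit in retriever.retrieve(chunk.get_content()):
            name = hit.node.metadata.get("file_name")
            if name and name != filename:
                related.add(name)
    return related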
2 comments