So I guess the real question is: why does the text_splitter behave this way, and how can I index metadata in my nodes so that it is searchable but not part of the segmentation process? My current workaround looks like this:
# Build one Document per raw input and chunk it immediately, so each batch of
# nodes can be tagged with the metadata of the document it came from.
for idx, raw_document in enumerate(documents):
    turns = []
    raw_document = raw_document['raw_doc']
    ...
    # Core metadata: keep only keys without a double underscore
    # (keys containing '__' are treated as internal/bookkeeping fields).
    metadata_core = {k: v for k, v in raw_document.items() if '__' not in k}
    excluded_keys = list(metadata_core.keys())
    # NOTE: metadata is deliberately NOT passed to the Document here, so the
    # node parser segments on the text alone; it is attached to the nodes
    # after chunking (below).
    document = Document(
        text=conversation,
        # NOTE(review): 'metadata_seperator' (sic) is the field name
        # llama_index historically uses — confirm spelling against the
        # installed version before "fixing" it.
        metadata_seperator="::",
        metadata_template="{key}=>{value}",
        text_template="Metadata: {metadata_str}\n-----\nContent: {content}",
    )
    formatted_documents.append(document)
    # Process document by document so the correct metadata
    # remains associated with the resulting nodes.
    raw_nodes = node_parser.get_nodes_from_documents([document])
    # Attach the custom metadata post-chunking and exclude it from both the
    # LLM view and the embedding view, so it is stored/filterable but never
    # re-enters the text the splitter or embedder sees.
    for node in raw_nodes:
        node.metadata.update(metadata_core)
        node.excluded_llm_metadata_keys = excluded_keys
        node.excluded_embed_metadata_keys = excluded_keys
        formatted_nodes.append(node)