captam_morgan
Joined September 25, 2024
I have three questions:

  1. Are there use cases where a decomposable graph makes more sense than a sub-question query engine? I feel like sub-questions can handle everything a graph does.
  2. Maybe this has to do with node post-processing: is there a dynamic way to set `similarity_top_k` so I always use the maximum number of nodes that fit inside the context window?
  3. Does LlamaIndex offer any smart chunking algorithms? For example, instead of a fixed-length cutoff, can I split by paragraphs or contextual topics?
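The dynamic `similarity_top_k` question can be approximated outside the library: retrieve a generous candidate set sorted by similarity, then keep only as many nodes as fit in a token budget. A minimal sketch, assuming a rough 4-characters-per-token heuristic instead of a real tokenizer; the function name and parameters are illustrative, not a LlamaIndex API:

```python
def dynamic_top_k(node_texts, context_window, prompt_tokens, reserve_for_answer,
                  chars_per_token=4):
    """Pick the largest top_k whose node texts fit the remaining token budget.

    Illustrative heuristic only: token cost is approximated as
    len(text) / chars_per_token. Swap in a real tokenizer (e.g. tiktoken)
    for accurate budgeting.
    """
    budget = context_window - prompt_tokens - reserve_for_answer
    used = 0
    top_k = 0
    for text in node_texts:  # assumed sorted best-match first
        cost = max(1, len(text) // chars_per_token)
        if used + cost > budget:
            break
        used += cost
        top_k += 1
    return top_k

# Example: 4096-token window, 500 tokens of prompt, 512 reserved for output.
chunks = ["x" * 4000, "y" * 4000, "z" * 4000, "w" * 4000]  # ~1000 tokens each
print(dynamic_top_k(chunks, context_window=4096, prompt_tokens=500,
                    reserve_for_answer=512))  # → 3
```

The computed `top_k` can then be passed to whatever retriever you use, so the node count adapts per query instead of being a fixed constant.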
1 comment
Curious if anyone has tried using llama.cpp with LlamaIndex to easily access quantized models in a CPU-only setup. Will using CustomLLM do the trick?
4 comments
Anyone tried using text2sql on Databricks Delta Tables and Snowflake tables?
1 comment
High-level question on structuring nodes/docs. So far, I broke down one 10-K document and set each section as a node, along with metadata for that section. But that covers one year and one company. How should I think about the structure if I want to have multiple years and documents? I'd like to start simple, but maybe this requires composability.
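One common pattern for scaling the per-section approach above is to keep one flat collection of section nodes and attach company/year/section metadata, so a single index can be filtered at query time. A plain-Python sketch; the dict shape and field names are illustrative stand-ins, not LlamaIndex's actual node schema:

```python
def make_section_node(text, company, year, section):
    """Build a plain-dict node carrying filterable metadata.

    The keys (company/year/section) are illustrative; with a real framework
    the same keys would go into a node's metadata and drive a
    metadata-filtered retriever.
    """
    return {
        "text": text,
        "metadata": {"company": company, "year": year, "section": section},
    }

# Multiple years and companies flow into one flat list of nodes.
nodes = [
    make_section_node("Risk factors ...", "UBER", 2021, "Item 1A"),
    make_section_node("Risk factors ...", "UBER", 2022, "Item 1A"),
    make_section_node("MD&A ...", "LYFT", 2022, "Item 7"),
]

# A list comprehension stands in for a metadata-filtered retrieval step.
uber_2022 = [n for n in nodes
             if n["metadata"]["company"] == "UBER"
             and n["metadata"]["year"] == 2022]
print(len(uber_2022))  # → 1
```

This keeps the structure simple (one index, richer metadata) and defers composability until a single filtered index stops being enough.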
40 comments
Anyone else having issues loading the 70B Llama 2 model with LlamaCPP? I was successful with the 7B and 13B models, but I'm getting a vague error for 70B. (See attached image.)

My cluster is CPU-only but has up to 96 workers and 768 GB of RAM.
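A quick sanity check on whether 70B should fit: weight memory is roughly parameter count times bits per weight, divided by 8, plus overhead for the KV cache and compute buffers. A back-of-the-envelope sketch; the quantization labels are common llama.cpp formats, but the sizes are approximations, not exact file sizes:

```python
def approx_weight_gb(n_params_billion, bits_per_weight):
    """Rough model-weight footprint in GB (1e9 bytes): params * bits / 8.

    Approximation only: real quantized files add per-block scales and
    metadata, and inference needs extra RAM for KV cache and buffers.
    """
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits, name in [(16, "f16"), (8, "q8_0"), (4, "q4_0")]:
    print(f"70B @ {name}: ~{approx_weight_gb(70, bits):.0f} GB")
# → 70B @ f16: ~140 GB
# → 70B @ q8_0: ~70 GB
# → 70B @ q4_0: ~35 GB
```

Even full-precision 70B weights (~140 GB) fit comfortably in 768 GB of RAM, which suggests the vague error may be format- or version-related (e.g. a llama.cpp build that predates 70B support) rather than a memory limit.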
11 comments
“Llama.generate: prefix-match hit”
3 comments
Has anyone tried to load the Llama 2 model via HuggingFaceLLM instead of Replicate? I'm also trying to do this on Azure Databricks.
4 comments