Find answers from the community

Home
Members
BioHacker
B
BioHacker
Offline, last seen 2 months ago
Joined September 25, 2024
Has anyone ever gotten an error like this kth(=-12) out of bounds (13)?? I am getting this error when using .retrieve in llamaindex.
11 comments
L
B
s
Have you seen this error before?
Calculated available context size -4316 was not non-negative
I just updated my llama-index and am getting this.
11 comments
L
B
d
Is there notebook regarding source retrieval for chunks? For example if my chunks are 512 tokens and my query engine returns 3 of the top chunks I can't return those to the user because 512 tokens is like multiple paragraphs.
12 comments
y
L
B
V0.10 issue with Simple DirectoryReader
I am loading a text document using
documents = SimpleDirectoryReader("/Users/sina/Downloads/Uflo Platform/extract_pdf/transcript-merge/transcript.txt").load_data()
Imported using
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

Getting the following error:
llama-index-readers-file package not found
ModuleNotFoundError: No module named 'llama_index.readers'
23 comments
B
a
L
When running chat gpt models, whenver i try to run the same query twice, it says: In addition to the...(what is already mentioend)...
How do i make this behavior more like davinci where it does not have history of the chat everytime i ask it?
Also where can i see the history of chat? Lastly, how do i provide system message?
40 comments
B
L
L
that seems feasible as well though it assumes the docs have been paginated into single pages.
9 comments
j
L
B
Hello, I seem to have encountered a error regarding the base GPT Index class. I am generating custom nodes through a for loop. As you can see I am including the doc_id for each node and have checked that all are filled. Yet, I get this error: ValueError: Reference doc id is None. It refers me to this file /site-packages/llama_index/indices/base.py and highlight the following line:

if index_struct is None:
...
--> 108 raise ValueError("Reference doc id is None.")
109 result_tups.append(
110 NodeEmbeddingResult(id, id_to_node_map[id], embed, doc_id=doc_id)

Code
nodes = []
#transcript_array refers to an array of phrases that Whisper outputs.
for index,phrase in enumerate(transcript_array):
#current obj index
node = Node(text=phrase['content'] + " " + str(phrase['start']), doc_id=index)
if index > 0 and index < len(transcript_array) - 1:
node.relationships[DocumentRelationship.PREVIOUS] = index - 1
node.relationships[DocumentRelationship.NEXT] = index + 1
elif index == 0:
node.relationships[DocumentRelationship.NEXT] = index + 1
elif index == len(transcript_array) - 1:
node.relationships[DocumentRelationship.PREVIOUS] = index - 1
nodes.append(node)
index = GPTSimpleVectorIndex(nodes)

Could it be from my custom nodes? I have attached a txt file of how they look like when i print(nodes)I am following the tutorial from here so some help would be really appreciated.https://gpt-index.readthedocs.io/en/latest/guides/primer/usage_pattern.html
5 comments
j
B
oh I get the ... When I use response.source_nodes I can see the whole text but not with .get_formatted_sources
So baesd on what you said, if I want the whole thing don't use get_formatted_sources? what if I don't wnat it truncated? Is there an option to get the whole source using .get_formatted_sources? or do i always need to use .source_nodes?
4 comments
L
B
Hello what are the best resources to learn about GraphRAG and put it into production? Also is Neo4j the best DB to deploy a graph rag solution? If so how does adding and removing documents work exactly? I can’t seem to find much info on document/chunk insertion and deletion.
4 comments
B
b
@Logan M Hey Logan, thanks for your work on RAPTOR. I have two questions as I try to deploy this for my product:
How would I save RAPTOR into a hosted vector DB like pinecone? Do I basically just load the vector DB and say in the raptor pack: pack = RaptorPack(..., vector_store=pinecone_vector_store)?
How do I add more documents to an already existing RAPTOR pack? Do I simply load the vector store and then fill the document parameter with more documents?
15 comments
b
L
B
Question on TokenCountingHandler:

I am using this tutorial to create a hybrid retriever with re-ranking https://docs.llamaindex.ai/en/stable/examples/retrievers/bm25_retriever.html#advanced-hybrid-retriever-re-ranking
I am trying to count tokens via
Plain Text
token_counter = TokenCountingHandler(
tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
callback_manager = CallbackManager([token_counter])
Settings.callback_manager = CallbackManager([token_counter])

I am able to count embedding tokens but not completion tokens. Is this because of using a custom retrieval?
6 comments
B
L
I tried upgrading to alpha 0.9 using pip install llama_index --pre but my version is still at 0.8.68. My python is 3.11 and am using conda. Any thoughts on how to upgrade?
2 comments
B
L