BioHacker

Has anyone ever gotten an error like

Has anyone ever gotten an error like this kth(=-12) out of bounds (13)?? I am getting this error when using .retrieve in llamaindex.

11 comments

BBioHacker

@Logan M Have you seen this error before

Have you seen this error before?
Calculated available context size -4316 was not non-negative
I just updated my llama-index and am getting this.

11 comments

BBioHacker

Is there notebook regarding source

Is there notebook regarding source retrieval for chunks? For example if my chunks are 512 tokens and my query engine returns 3 of the top chunks I can't return those to the user because 512 tokens is like multiple paragraphs.

12 comments

BBioHacker

V0.10 issue with Simple DirectoryReader

V0.10 issue with Simple DirectoryReader
I am loading a text document using

documents = SimpleDirectoryReader("/Users/sina/Downloads/Uflo Platform/extract_pdf/transcript-merge/transcript.txt").load_data()

Imported using
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

Getting the following error:
llama-index-readers-file package not found
ModuleNotFoundError: No module named 'llama_index.readers'

23 comments

BBioHacker

Prompt stuff

When running chat gpt models, whenver i try to run the same query twice, it says: In addition to the...(what is already mentioend)...
How do i make this behavior more like davinci where it does not have history of the chat everytime i ask it?
Also where can i see the history of chat? Lastly, how do i provide system message?

40 comments

BBioHacker

Document loading

that seems feasible as well though it assumes the docs have been paginated into single pages.

9 comments

BBioHacker

Hello I seem to have encountered a error

Hello, I seem to have encountered a error regarding the base GPT Index class. I am generating custom nodes through a for loop. As you can see I am including the doc_id for each node and have checked that all are filled. Yet, I get this error: ValueError: Reference doc id is None. It refers me to this file /site-packages/llama_index/indices/base.py and highlight the following line:

if index_struct is None:
...
--> 108 raise ValueError("Reference doc id is None.")
109 result_tups.append(
110 NodeEmbeddingResult(id, id_to_node_map[id], embed, doc_id=doc_id)

Code
nodes = []
#transcript_array refers to an array of phrases that Whisper outputs.
for index,phrase in enumerate(transcript_array):
#current obj index
node = Node(text=phrase['content'] + " " + str(phrase['start']), doc_id=index)
if index > 0 and index < len(transcript_array) - 1:
node.relationships[DocumentRelationship.PREVIOUS] = index - 1
node.relationships[DocumentRelationship.NEXT] = index + 1
elif index == 0:
node.relationships[DocumentRelationship.NEXT] = index + 1
elif index == len(transcript_array) - 1:
node.relationships[DocumentRelationship.PREVIOUS] = index - 1
nodes.append(node)
index = GPTSimpleVectorIndex(nodes)

Could it be from my custom nodes? I have attached a txt file of how they look like when i print(nodes)I am following the tutorial from here so some help would be really appreciated.https://gpt-index.readthedocs.io/en/latest/guides/primer/usage_pattern.html

5 comments

BBioHacker

Seeing source nodes

oh I get the ... When I use response.source_nodes I can see the whole text but not with .get_formatted_sources
So baesd on what you said, if I want the whole thing don't use get_formatted_sources? what if I don't wnat it truncated? Is there an option to get the whole source using .get_formatted_sources? or do i always need to use .source_nodes?

4 comments

BBioHacker

Hello what are the best resources to

Hello what are the best resources to learn about GraphRAG and put it into production? Also is Neo4j the best DB to deploy a graph rag solution? If so how does adding and removing documents work exactly? I can’t seem to find much info on document/chunk insertion and deletion.

4 comments

BBioHacker

@Logan M Hey Logan, thanks for your work

@Logan M Hey Logan, thanks for your work on RAPTOR. I have two questions as I try to deploy this for my product:
How would I save RAPTOR into a hosted vector DB like pinecone? Do I basically just load the vector DB and say in the raptor pack: pack = RaptorPack(..., vector_store=pinecone_vector_store)?
How do I add more documents to an already existing RAPTOR pack? Do I simply load the vector store and then fill the document parameter with more documents?

15 comments

BBioHacker

Question on TokenCountingHandler:

Question on TokenCountingHandler:

I am using this tutorial to create a hybrid retriever with re-ranking https://docs.llamaindex.ai/en/stable/examples/retrievers/bm25_retriever.html#advanced-hybrid-retriever-re-ranking
I am trying to count tokens via

Plain Text

token_counter = TokenCountingHandler(
tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
callback_manager = CallbackManager([token_counter])
Settings.callback_manager = CallbackManager([token_counter])

I am able to count embedding tokens but not completion tokens. Is this because of using a custom retrieval?

6 comments

BBioHacker

I tried upgrading to alpha 0.9 using

I tried upgrading to alpha 0.9 using pip install llama_index --pre but my version is still at 0.8.68. My python is 3.11 and am using conda. Any thoughts on how to upgrade?

2 comments

Find answers from the community

Has anyone ever gotten an error like

@Logan M Have you seen this error before

Is there notebook regarding source

V0.10 issue with Simple DirectoryReader

Prompt stuff

Document loading

Hello I seem to have encountered a error

Seeing source nodes

Hello what are the best resources to

@Logan M Hey Logan, thanks for your work

Question on TokenCountingHandler:

I tried upgrading to alpha 0.9 using