thoraxe
Joined September 25, 2024
is there a place to manipulate the cache settings to prevent llamaindex from checking "upstream" to find a newer embedding model version?
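As far as I know there isn't a LlamaIndex-specific cache knob for this; for Hugging Face-backed embedding models, the revision check against the Hub can be suppressed with the standard huggingface_hub/transformers environment variables. A minimal sketch, assuming the model has already been downloaded once:
Python
import os

# Set these before anything imports transformers/huggingface_hub: no revision
# checks against the Hub, local cache only.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from llama_index.embeddings import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
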
5 comments
hm... with llama_index.set_global_handler("simple") I am not seeing any verbosity/debug messages for building the vector store index
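For what it's worth, the "simple" handler mostly prints LLM prompts/completions; the usual way to get verbose output while the vector store index is being built is plain DEBUG-level Python logging:
Python
import logging
import sys

# DEBUG level surfaces chunking/embedding activity during index construction.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
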
16 comments
thoraxe · Gpu
it looks like it's been asked previously but no clear answer and the docs aren't super clear either -- how do I make the embedding/indexing use GPU? or, how do I know if it used/might use GPU?
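A hedged sketch, assuming a local Hugging Face embedding model: torch reports whether a GPU is visible at all, and HuggingFaceEmbedding takes a device argument so it can be pinned explicitly instead of guessed at:
Python
import torch
from llama_index.embeddings import HuggingFaceEmbedding

# If this prints False, embedding/indexing is running on CPU no matter what.
print(torch.cuda.is_available())

# Pin the embedder to the GPU explicitly.
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5", device="cuda")
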
6 comments
thoraxe · Huggingface
Plain Text
Traceback (most recent call last):
  File "/opt/app-root/src/llamaindex-rag-example/starter.py", line 8, in <module>
    embed_model = HuggingFaceEmbedding(model_name="Cohere/Cohere-embed-english-v3.0")
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/llama_index/embeddings/huggingface.py", line 82, in __init__
    model = AutoModel.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 526, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1132, in from_pretrained
    raise ValueError(
ValueError: Unrecognized model in Cohere/Cohere-embed-english-v3.0. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: ...


looks like you don't support all embedding models from the MTEB leaderboard.
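Cohere/Cohere-embed-english-v3.0 appears to be served through Cohere's API and its Hub repo has no transformers-loadable config, which is why AutoModel rejects it; HuggingFaceEmbedding can only load checkpoints with standard transformers weights. A sketch with a leaderboard model that does load locally:
Python
from llama_index.embeddings import HuggingFaceEmbedding

# bge-large-en-v1.5 is a sentence-transformers-style checkpoint with a normal
# config.json, so AutoModel/AutoTokenizer can load it.
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
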
7 comments
stupid question - looking at https://docs.llamaindex.ai/en/stable/module_guides/observability/callbacks/token_counting_migration.html#token-counting-migration-guide and thinking about token counting.
the callback manager is explicitly using tiktoken, which counts tokens for OpenAI. But what if I'm not using OpenAI? Is it "close enough"?
also, how does the embedding model (e.g. BAAI/bge-base-en-v1.5) relate? Or does it maybe not relate?
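If the counts need to match a non-OpenAI model exactly, TokenCountingHandler accepts any tokenizer callable, so the embedding model's (or LLM's) own Hugging Face tokenizer can stand in for tiktoken; tiktoken numbers are usually close for other models, but not exact. A sketch:
Python
from transformers import AutoTokenizer
from llama_index.callbacks import CallbackManager, TokenCountingHandler

# Count with the tokenizer the model actually uses rather than tiktoken.
token_counter = TokenCountingHandler(
    tokenizer=AutoTokenizer.from_pretrained("BAAI/bge-base-en-v1.5").encode
)
callback_manager = CallbackManager([token_counter])
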
5 comments
probably user error. I am using the HF TEI server to embed, but then when I try to do a lookup, I get this error:
ValueError: shapes (1024,) and (768,) not aligned: 1024 (dim 0) != 768 (dim 0)
https://gist.github.com/thoraxe/583ee9f8d2a21a562f42535da47cee0d
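That shape error usually means the index was built with one embedding model (768-dimensional vectors) and queried with another (1024-dimensional), e.g. a different default at index time than the TEI model at query time. A sketch of keeping them consistent, where embed_model is the TEI-backed embedder:
Python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# Use the same embed_model for both indexing and querying so stored vectors
# and query vectors have matching dimensions.
service_context = ServiceContext.from_defaults(embed_model=embed_model)
documents = SimpleDirectoryReader("private-data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
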
36 comments
I'm having a problem using TextEmbeddingsInference remotely:
https://gist.github.com/thoraxe/14d34e2aa85f602f7af89813a13ce010

When I index the same documents using local embedding with the same embedder even, I don't get this error.
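For reference, a sketch of pointing the embedder at a remote TEI server, with the knobs that most often differ from a local run (URL and batch size here are illustrative; remote TEI servers can reject batches above their configured limit, which a local embedder never hits):
Python
from llama_index.embeddings import TextEmbeddingsInference

embed_model = TextEmbeddingsInference(
    model_name="BAAI/bge-base-en-v1.5",      # must match the model TEI is serving
    base_url="http://tei.example.com:8080",  # placeholder URL
    embed_batch_size=10,                     # keep under the server's max batch size
)
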
100 comments
so I'm not sure what I'm doing wrong here with Milvus, but it seems like it just keeps throwing away everything I recently indexed. If I index some documents and then try to set up an index from that same vector store, I don't get any results
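One common cause, sketched below on the assumption that MilvusVectorStore is being constructed with overwrite=True each time: that flag drops the existing collection on connect, so the second script wipes what the first one indexed. Reconnect with overwrite=False and build the index from the existing store:
Python
from llama_index import VectorStoreIndex
from llama_index.vector_stores import MilvusVectorStore

# Reuse the collection that already holds the embeddings instead of dropping it.
vector_store = MilvusVectorStore(overwrite=False)  # plus your usual connection args
index = VectorStoreIndex.from_vector_store(vector_store)
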
6 comments
I'm trying to follow the Milvus tutorial, but I'm using Azure OpenAI rather than OpenAI, and it keeps trying to talk to OpenAI during the embedding step:
Plain Text
Retrying llama_index.embeddings.openai.base.get_embeddings in 0.275948059787665 seconds as it raised AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: xxx. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}.
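Those retries against api.openai.com usually mean the default OpenAI embedding is still in play; the Azure embedder has to be set on the service context explicitly. A sketch with placeholder deployment values (pre-0.10 imports; newer releases moved the class to llama_index.embeddings.azure_openai), where llm is the Azure OpenAI LLM already configured:
Python
from llama_index import ServiceContext
from llama_index.embeddings import AzureOpenAIEmbedding

embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="my-embedding-deployment",           # placeholder
    azure_endpoint="https://example.openai.azure.com/",   # placeholder
    api_key="...",
    api_version="2023-07-01-preview",
)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
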
7 comments
are older versions of the LlamaIndex docs available on RTD, or how can I see them? e.g. 0.9.39
4 comments
thoraxe · React
have you come across any ReAct-specific datasets? I've not found any of the open models to be good at it. I'm about to try Zephyr
40 comments
interesting situation which is probably a weird edge case --
my index only has one document. I asked it a stupid question ("how do I do it?") and it returned that one document. It's not REALLY relevant, but I suppose it's not entirely irrelevant either.
Is it guaranteed that with only one indexed document, you'll always get it back? I used "chicken?" as the query and it still retrieved that document -- there is nothing to do with animals in it at all.
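As far as I know, yes: the retriever just returns the top-k nearest nodes however weak the similarity, and with one document that document is always nearest. A similarity cutoff is the usual way to drop unrelated hits; the threshold below is illustrative:
Python
from llama_index.indices.postprocessor import SimilarityPostprocessor

# Nodes scoring below the cutoff are discarded instead of being returned anyway.
query_engine = index.as_query_engine(
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)
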
3 comments
rather dumb question -- if I simply want to query the LLM directly (service_context) without sending any index/reference information, how do I do that?
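A short sketch, assuming llm is whatever LLM object was handed to the service context; calling it directly skips retrieval entirely:
Python
# No index involved; this goes straight to the configured LLM.
response = llm.complete("How do I set up cluster autoscaling?")
print(response.text)
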
2 comments
thoraxe · Output
is there a convenient way to override the output formatting?
https://github.com/run-llama/llama_index/blob/v0.8.38/llama_index/callbacks/simple_llm_handler.py
Since I'm going to be post-processing the output in Python and then doing other things with it, I'd like to change the ** Prompt ** and ** Completion ** formatting
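One alternative to patching that handler's print format, sketched here: attach a LlamaDebugHandler and pull the raw prompt/completion pairs out in Python afterwards, then format them however the later steps need (for chat models the payload keys are MESSAGES/RESPONSE rather than PROMPT/COMPLETION):
Python
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.callbacks.schema import EventPayload

debug_handler = LlamaDebugHandler()
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([debug_handler]),
)

# ... run queries, then post-process the captured LLM calls:
for start_event, end_event in debug_handler.get_llm_inputs_outputs():
    prompt = end_event.payload.get(EventPayload.PROMPT)
    completion = end_event.payload.get(EventPayload.COMPLETION)
    # print/format these however you like instead of ** Prompt ** / ** Completion **
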
12 comments
so it's unclear where this part of the prompt is coming from. it's not the system prompt, and it's not the query wrapper. when passing context, I am seeing this in the total prompt:

Plain Text
Context information is below.
---------------------
file_name: summary-docs/cluster-autoscaling.md


now, I understand that I have metadata on the file, so that likely explains why that is passed. But I'm curious about the "Context information is below." part and how to alter it
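That wrapper text comes from the default text_qa_template on the response synthesizer (the file_name line comes from node metadata, as you guessed); passing your own template replaces it. A sketch with illustrative wording:
Python
from llama_index.prompts import PromptTemplate

# Same {context_str}/{query_str} slots as the default, different wrapper text.
qa_template = PromptTemplate(
    "Reference material is shown below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Using only the reference material, answer the question: {query_str}\n"
)
query_engine = index.as_query_engine(text_qa_template=qa_template)
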
8 comments
it looks like storing multiple indexes locally in the same folder is unsupported. Despite defining multiple index ids, only the last persisted index "wins"
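For what it's worth, multiple indexes can share one persist directory if they also share one StorageContext; persisting each index separately into the same folder overwrites the previous docstore/index_store, which matches the "last one wins" behavior. A sketch:
Python
from llama_index import (
    StorageContext,
    SummaryIndex,
    VectorStoreIndex,
    load_index_from_storage,
)

# Build both indexes against a single shared storage context.
storage_context = StorageContext.from_defaults()
vector_index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
summary_index = SummaryIndex.from_documents(documents, storage_context=storage_context)
vector_index.set_index_id("vector")
summary_index.set_index_id("summary")
storage_context.persist(persist_dir="./storage")

# Later: reload either one by its id from the same folder.
storage_context = StorageContext.from_defaults(persist_dir="./storage")
vector_index = load_index_from_storage(storage_context, index_id="vector")
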
6 comments
hmm... llamaindex appears to be looking at ALL indices in redis and not restricting itself to only the specified index
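In case it's relevant, a sketch of scoping a RedisVectorStore to one index by giving it its own index name and key prefix (values are illustrative), so lookups don't sweep in documents written under other prefixes:
Python
from llama_index.vector_stores import RedisVectorStore

vector_store = RedisVectorStore(
    index_name="cluster-docs",    # illustrative name
    index_prefix="cluster-docs",  # keys for this index live under this prefix
    redis_url="redis://localhost:6379",
)
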
59 comments
FWIW, llama2-13b-chat is terrible at being a ReAct agent
6 comments
UnboundLocalError: local variable 'default_template' referenced before assignment

hmm
24 comments
so i'm looking at the logs of the tgis server and the output of print(summary.response) and it looks like it's doing a double-summary
79 comments
I'm trying to use SummaryIndex via a TGIS server (and not run the LLM locally) but llamaindex seems like it's ignoring the TGIS predictor. Maybe I'm using this wrong?

Plain Text
service_context = ServiceContext.from_defaults(chunk_size=512,
                                               llm=tgis_predictor, 
                                               context_window=2048,
                                               prompt_helper=prompt_helper,
                                               embed_model=embed_model)

# Load data
documents = SimpleDirectoryReader('private-data').load_data()

index = SummaryIndex.from_documents(documents)
summary = index.as_query_engine(response_mode="tree_summarize").query("Summarize the text, describing what it might be most useful for")


but then it tries to download an HF model:
Plain Text
Downloading url https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin to path /tmp/llama_index/models/llama-2-13b-chat.ggmlv3.q4_0.bin
total size (MB): 7323.31


And ultimately blows up my machine trying to use this model via CPU
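The likely culprit, sketched below: SummaryIndex.from_documents falls back to a default service context (which is what ends up downloading that local llama-2 GGML model) because the one carrying tgis_predictor is never passed in. Either hand it over explicitly or set it globally:
Python
from llama_index import SummaryIndex, set_global_service_context

# Option 1: pass the TGIS-backed service context explicitly.
index = SummaryIndex.from_documents(documents, service_context=service_context)

# Option 2: make it the default for anything that doesn't get one.
set_global_service_context(service_context)
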
8 comments
looking at https://docs.llamaindex.ai/en/stable/examples/vector_stores/SimpleIndexDemoLlama-Local.html if I don't import torch or set the torch kwargs, does it default to using CPU, or will it automatically use GPU regardless?
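A hedged sketch based on that example: without device_map/dtype kwargs, transformers generally loads the weights on CPU in float32, so being explicit removes the guesswork:
Python
import torch
from llama_index.llms import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="meta-llama/Llama-2-13b-chat-hf",
    tokenizer_name="meta-llama/Llama-2-13b-chat-hf",
    device_map="auto",                            # let accelerate place layers on available GPUs
    model_kwargs={"torch_dtype": torch.float16},  # half precision to fit in VRAM
)
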
9 comments
More of a pure langchain question but llamaindex may also have its own solution here.
trying to wrap my head around how to do something:
I'm submitting a question via RAG which gets turned into a list of tasks:
  1. do a thing
  2. do some other thing
for each step I am trying to:
  1. figure out if the original question has enough information to complete the task
  2. if it does, perform the task (via the LLM)
  3. pass the original question, the output of the task, and the next step along
  4. see if there is enough information to complete the next task
and kind of stay in that loop until everything is complete, then put it all together and send it back to the user.
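A rough sketch of that loop, assuming query_engine is the RAG engine that produces the task list and llm is the bare LLM; all prompts here are illustrative:
Python
# 1. Turn the question into a task list via RAG.
tasks = str(query_engine.query(question)).splitlines()

context = question
for task in tasks:
    # 2. Check whether there is enough information to attempt this task.
    check = llm.complete(
        f"Given the following so far:\n{context}\n\n"
        f"Is there enough information to complete this task: {task}\n"
        "Answer yes or no."
    )
    if "yes" not in check.text.lower():
        break  # or go back to the user for the missing details

    # 3. Perform the task and carry its output forward to the next step.
    result = llm.complete(
        f"Given the following so far:\n{context}\n\nComplete this task: {task}"
    )
    context = f"{context}\n\n{task}\n{result.text}"

# 4. Everything accumulated in context can be summarized and sent back to the user.
print(context)
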
33 comments
so I'm trying to think through how to effectively make this assistant. I think ReAct out of the gate is probably not quite going to work.
What I'm envisioning is a chain where the first step is "Task breakdown", then each task is separately processed, and then the final result is either summarized or just spit out at the end.
For example, trying to set up cluster autoscaling in OpenShift (Kubernetes) involves 2 steps - creating a clusterautoscaler object and then creating a machineautoscaler object.

  • I have various forms of docs that can be queried/indexed to spit out that task list
  • I have docs that can be queried/indexed that describe both cluster and machine autoscaler objects
I haven't tried setting up multiple "tools" (task breakdown tool, documentation search) yet, but was just curious about people's thoughts here
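A sketch of the two-tool shape, assuming task_docs_index and reference_docs_index are the two document sources described above (names and descriptions are illustrative); whatever sits on top (a router, an agent, or a manual chain) can then pick between them:
Python
from llama_index.tools import QueryEngineTool, ToolMetadata

task_tool = QueryEngineTool(
    query_engine=task_docs_index.as_query_engine(),
    metadata=ToolMetadata(
        name="task_breakdown",
        description="Returns the ordered list of steps for an OpenShift task, e.g. cluster autoscaling.",
    ),
)
docs_tool = QueryEngineTool(
    query_engine=reference_docs_index.as_query_engine(),
    metadata=ToolMetadata(
        name="documentation_search",
        description="Looks up details about clusterautoscaler and machineautoscaler objects.",
    ),
)
tools = [task_tool, docs_tool]
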
5 comments