Wizboar
Offline, last seen 3 months ago
Joined September 25, 2024
Anyone have issues with the RagDatasetGenerator?

I get this when running it.

sys:1: RuntimeWarning: coroutine 'BaseQueryEngine.aquery' was never awaited

Looks like the run_jobs function takes the coroutines though, so they shouldn't need to be awaited...
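A minimal sketch of my mental model, assuming run_jobs from llama_index.async_utils is itself a coroutine that awaits the jobs (query_engine and questions stand in for my own setup):

Plain Text
# sketch: hand bare coroutines to run_jobs and drive it with asyncio.run
# (assumes run_jobs is itself a coroutine that awaits the jobs)
import asyncio

from llama_index.async_utils import run_jobs

def run_queries(query_engine, questions):
    jobs = [query_engine.aquery(q) for q in questions]  # bare coroutines
    return asyncio.run(run_jobs(jobs, workers=4))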
3 comments
Is there a way to avoid asyncio.run()? I am running llama index with langchain and I am getting the following error.

RuntimeError: asyncio.run() cannot be called from a running event loop

For reference, I am using the QueryFusionRetriever.
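The only workaround I've found so far is patching the loop with nest_asyncio, which the llama index docs suggest for environments that already have a running loop:

Plain Text
# workaround I'm using for now: allow nested event loops so asyncio.run()
# inside llama_index doesn't clash with langchain's already-running loop
import nest_asyncio

nest_asyncio.apply()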
3 comments
How does PGVector store avoid duplicates? Does it need the docstore to do so?

I am setting doc = Document(doc_id="some calculated unique id hash") and I am not sure whether the database will raise an integrity error if the doc_id is not unique, given that each generated node gets a random node_id...
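For reference, roughly what I'm doing (sha256 over the text is just my choice):

Plain Text
# derive doc_id deterministically from the content, hoping re-ingestion
# of the same text maps to the same document (sha256 is just my choice)
import hashlib

from llama_index import Document

text_blob = "..."  # placeholder for my actual document text
doc = Document(
    text=text_blob,
    doc_id=hashlib.sha256(text_blob.encode()).hexdigest(),
)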
5 comments
I have quite a large database. Is there a way I can iterate on changing some of my service_context (e.g. mix and change metadata extractors) without rebuilding the index?

Also, what would be the easiest way to push NULL as the embedding? I would like to do the embedding step in batch with a script on a rented GPU rather than as part of a pipeline.
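To make the second part concrete, this is the shape of what I want; get_text_embedding_batch is my assumption for the batch API:

Plain Text
# the shape of what I'm after: build nodes with embedding=None in the
# pipeline, then fill embeddings in one batch pass on the GPU box
# (get_text_embedding_batch is my assumption for the batch API)
from llama_index.schema import MetadataMode

def embed_in_batch(nodes, embed_model):
    texts = [n.get_content(metadata_mode=MetadataMode.EMBED) for n in nodes]
    embeddings = embed_model.get_text_embedding_batch(texts, show_progress=True)
    for node, emb in zip(nodes, embeddings):
        node.embedding = emb
    return nodes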
4 comments
Is it possible to fuse two query_engines? I would like to use the SubQuestionQueryEngine but get outputs like the CitationQueryEngine. Or will this require a new custom class?
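What I had in mind was wrapping a CitationQueryEngine as the tool under the SubQuestionQueryEngine; sketch below, with the tool name/description as placeholders:

Plain Text
# sketch: use CitationQueryEngine as the per-tool engine under
# SubQuestionQueryEngine (tool name/description are placeholders)
from llama_index.query_engine import CitationQueryEngine, SubQuestionQueryEngine
from llama_index.tools import QueryEngineTool, ToolMetadata

def build_fused_engine(index):
    citation_engine = CitationQueryEngine.from_args(index)
    tool = QueryEngineTool(
        query_engine=citation_engine,
        metadata=ToolMetadata(name="docs", description="my document corpus"),
    )
    return SubQuestionQueryEngine.from_defaults(query_engine_tools=[tool])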
1 comment
Hey people,

I am struggling to load an index that uses a PGVector store. The loaded index array is []

Plain Text
def setup_storage_context():
    import os

    from dotenv import load_dotenv
    from sqlalchemy import make_url
    from llama_index import StorageContext
    from llama_index.vector_stores import PGVectorStore
    from llama_index.storage.docstore import SimpleDocumentStore
    from llama_index.storage.index_store import SimpleIndexStore

    load_dotenv()
    url = make_url(os.getenv("DATABASE_URL"))
    storage_context = StorageContext.from_defaults(
        docstore=SimpleDocumentStore(),
        index_store=SimpleIndexStore(),
        vector_store=PGVectorStore.from_params(
            database="test",
            host=url.host,
            port=url.port,
            user=url.username,
            password=url.password,
            embed_dim=768,
            hybrid_search=True,
            debug=True
        )
    )
    return storage_context

from llama_index import load_indices_from_storage

storage_context = setup_storage_context()
indices = load_indices_from_storage(storage_context)
print(indices)


Storage persisted with index.storage_context.persist()

Not much data is being saved in the storage files.
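For what it's worth, building directly off the vector store does give me a usable index; my assumption is that pgvector holds the embeddings, so no docstore/index_store round-trip is needed:

Plain Text
# this does give me a working index, bypassing load_indices_from_storage:
# build directly from the vector store, which holds the embeddings
from llama_index import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(
    vector_store=storage_context.vector_store
)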
1 comment
Given a task, how can you check what tool the task is for?
4 comments
Does llama index have an LLM class for the TextGenerationInference server?
1 comment
Hey people.

How can I take advantage of vLLM's continuous batching with the llama_index LLM? I want to do metadata extraction and summarization on a large number of documents on a rented A100 as fast as possible.

Using Mistral-7B-Instruct-v0.2 as my LLM with the following container.

docker run --gpus all -p 8000:8000 ghcr.io/mistralai/mistral-src/vllm:latest \
    --host 0.0.0.0 \
    --model mistralai/Mistral-7B-Instruct-v0.2 \
    --tensor-parallel-size 1
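On the client side, this is what I'm pointing at it. I'm assuming the container exposes vLLM's OpenAI-compatible API on port 8000, and OpenAILike is my best guess at the glue; firing requests concurrently should let continuous batching kick in:

Plain Text
# client side, assuming an OpenAI-compatible vLLM endpoint on :8000;
# concurrent requests should let vLLM's continuous batching kick in
import asyncio

from llama_index.llms import OpenAILike

llm = OpenAILike(
    api_base="http://localhost:8000/v1",
    api_key="not-needed",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    is_chat_model=True,
)

async def summarise_all(texts):
    return await asyncio.gather(
        *[llm.acomplete(f"Summarise:\n{t}") for t in texts]
    )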
6 comments
Any hunches on how perplexity.ai does its websearch? Have they crawled the web and vectorised everything above a pagerank score of x, or are they using Agents / Tools with a web-browsing tool?
1 comment
So the bge embedding models are the leading open-source embedding models; however, they have a 512-token context window. I can see that the default chunk size in llama index is 1024. I presume they just take the first 512 tokens when embedding a 1024-token chunk with such a model.

Shouldn't the upper bound of the chunk size be equal to the max_position_embeddings of the embedding model?
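For now I'm just pinning the chunk size to the model's window myself (sketch; the bge model name is just an example):

Plain Text
# what I'm doing for now: pin chunk_size to the embedder's 512-token window
from llama_index import ServiceContext
from llama_index.embeddings import HuggingFaceEmbedding

service_context = ServiceContext.from_defaults(
    embed_model=HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5"),
    chunk_size=512,
)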
3 comments

Hey,

Any reason why there isn't a Postgres KV store implementation? This could then have postgres working as the document store, index store and vector store (with pgvector).

New to llama index and want my existing PGVector RAG to work with how they do things here.
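Roughly the kind of thing I mean, sketched with psycopg2; the table layout and the put/get method names (mirroring the in-memory SimpleKVStore) are my assumption:

Plain Text
# rough sketch: one table of (collection, key, jsonb value); method names
# mirror SimpleKVStore (my assumption), backed by Postgres upserts
import json

import psycopg2

class PostgresKVStore:
    def __init__(self, dsn: str, table: str = "kv_store"):
        self.conn = psycopg2.connect(dsn)
        self.table = table
        with self.conn, self.conn.cursor() as cur:
            cur.execute(
                f"CREATE TABLE IF NOT EXISTS {self.table} ("
                "collection TEXT NOT NULL, key TEXT NOT NULL, value JSONB, "
                "PRIMARY KEY (collection, key))"
            )

    def put(self, key: str, val: dict, collection: str = "data") -> None:
        with self.conn, self.conn.cursor() as cur:
            cur.execute(
                f"INSERT INTO {self.table} (collection, key, value) "
                "VALUES (%s, %s, %s) "
                "ON CONFLICT (collection, key) DO UPDATE SET value = EXCLUDED.value",
                (collection, key, json.dumps(val)),
            )

    def get(self, key: str, collection: str = "data"):
        with self.conn, self.conn.cursor() as cur:
            cur.execute(
                f"SELECT value FROM {self.table} "
                "WHERE collection = %s AND key = %s",
                (collection, key),
            )
            row = cur.fetchone()
            return row[0] if row else None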

Jordan
1 comment
Does anyone know any services that host vector datasets of public data? E.g. a hosted Wikipedia vector DB? Would love to connect a query engine to some service providers.
2 comments
Hey! Anyone have any good techniques for handling this issue with memory?

I find that my assistant focuses too much on the memory and will summarise some of the past conversation in the response. Any ways to avoid this, or to ensure memory is only used when it needs to address something not covered in the last user message?
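The only lever I've found so far is shrinking the memory buffer (the token_limit value is arbitrary, and index here is my existing vector index):

Plain Text
# only lever I've found so far: cap how much history the engine sees
# (token_limit value is arbitrary)
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)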
2 comments

I ran into a bug with the Pipeline and the SemanticNodeParser.

Can't pickle local object 'split_by_sentence_tokenizer.<locals>.split'

Seems to be a pickle error when running multiple workers on the pipeline.
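Dropping to a single worker sidesteps it for me (sketch; the transformations list stands in for my own parsers/extractors):

Plain Text
# workaround: run single-process so the parser's local closure never
# needs to be pickled (transformations list is a placeholder for mine)
from llama_index.ingestion import IngestionPipeline

pipeline = IngestionPipeline(transformations=[...])  # my parsers/extractors
nodes = pipeline.run(documents=documents, num_workers=1)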
1 comment
I am having issues with the compact and refine prompt (the default) for the Faithfulness evaluator. It should be compacting my context into the context window and refining, but it keeps throwing an error saying that it is breaching the context window. Is this a bug?

Plain Text
huggingface_hub.inference._text_generation.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4096. Given: 3896 `inputs` tokens and 256 `max_new_tokens`
make: *** [evals] Error 1
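What unblocked me for now was declaring a smaller window so compact-and-refine leaves headroom for max_new_tokens (the 3840 = 4096 - 256 arithmetic is mine):

Plain Text
# current workaround: declare a smaller window so compact/refine leaves
# headroom for the 256 generated tokens (3840 = 4096 - 256)
from llama_index import ServiceContext
from llama_index.evaluation import FaithfulnessEvaluator

service_context = ServiceContext.from_defaults(
    llm=llm,  # my TGI-backed LLM
    context_window=3840,
)
evaluator = FaithfulnessEvaluator(service_context=service_context)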
12 comments
This may be a dumb question....

Is ColBERTv2 able to use vector stores for storage, or do you need to use the data structures provided and load them into RAM?

I obviously know very little about ColBERT
7 comments
Anyone able to elaborate on the difference between the use cases of the SummaryIndex and the DocumentSummaryIndex? It looks like the SummaryIndex is a linked list over the document nodes, which then uses refine mode to run multiple LLM calls and summarise the document on the fly?

My RAG is poor at summaries atm because it's just chunked and vector based. I was thinking of pre-summarising all the documents with something like a 7B model and then putting the summaries into a keyword lookup store.

What is the default store for these indexes? The document store?
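For the pre-summarising idea, I was picturing something like this (tree_summarize as the response mode is my guess):

Plain Text
# what I was picturing: build one summary per document up front
# (tree_summarize as the response mode is my guess)
from llama_index import DocumentSummaryIndex, get_response_synthesizer

index = DocumentSummaryIndex.from_documents(
    documents,  # my document list
    response_synthesizer=get_response_synthesizer(response_mode="tree_summarize"),
)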
5 comments
I can't seem to get streaming working with the OpenAIAgent chat.

Simple example

Plain Text
resp = agent.stream_chat("My name is Jordan")

for part in resp.chat_stream:
    print(part)


It recognises that it's a generator, but it doesn't loop and print anything... If I print the resp it returns the output...
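For completeness, iterating response_gen instead does print tokens for me, if that's the intended attribute:

Plain Text
resp = agent.stream_chat("My name is Jordan")

# iterating response_gen (string deltas) instead of chat_stream
for token in resp.response_gen:
    print(token, end="")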
8 comments
Anyone got llama index and langchain working together with streaming?
1 comment