Find answers from the community

osiworx
Offline, last seen 3 months ago
Joined September 25, 2024
I'm having a hard time using some different GGUF models with llama-cpp in llama-index. Some work just fine and some only answer garbage. My first guess was that it's the prompting structure, and I played with that, but with very little to no success... Now I found that when initializing the LLM with LlamaCPP there is this:

# transform inputs into Llama2 format
messages_to_prompt=messages_to_prompt,
completion_to_prompt=completion_to_prompt,

OK, so it is kind of "hard-coded" to Llama 2 prompting, even though it still does not work with llama-2-chat...

Is there anything else I could add here to point to different types of models?
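
For reference, a minimal sketch of how custom prompt-format functions can be passed to LlamaCPP; the ChatML-style template and the model path are assumptions, and the actual template has to match whatever the model card specifies:

from llama_index.llms.llama_cpp import LlamaCPP

def messages_to_prompt(messages):
    # Convert chat messages into a ChatML-style template (assumption).
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m.role.value}\n{m.content}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"

def completion_to_prompt(completion):
    # Wrap a plain completion request in the same template.
    return f"<|im_start|>user\n{completion}<|im_end|>\n<|im_start|>assistant\n"

llm = LlamaCPP(
    model_path="path/to/model.gguf",  # hypothetical path
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)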
6 comments
Hello, I'm using a query engine to fetch data from Qdrant and then generate a response. I found that the user input is sometimes very poor, so I try to enhance it by asking the LLM for a list of semantic keywords for the user's input. That as such works well... Now I wonder how I can use the semantic keywords as search parameters while still using the user's input as the query. Or is there a way where I prepare the context myself and then do the LLM response in a second step?
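
A rough sketch of the two-step approach (retrieve with the keyword string, then synthesize against the original question); keyword_string, user_query, index and llm are assumed to already exist:

from llama_index.core import get_response_synthesizer

# Step 1: retrieve context using the LLM-generated keywords.
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve(keyword_string)

# Step 2: generate the answer against the original user input.
synthesizer = get_response_synthesizer(llm=llm)
response = synthesizer.synthesize(user_query, nodes=nodes)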
2 comments
I'm trying to run an IngestionPipeline, but it seems I'm missing something, as the data does not show up in the vector store:

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=100, chunk_overlap=10),
        # TitleExtractor(),  # needs an LLM
    ],
    vector_store=vector_store,
)

pipeline.run(documents=docs)
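
A guess at what may be missing: when a vector_store is attached, the pipeline only writes nodes that already carry embeddings, so an embedding step usually has to be part of the transformations. A sketch, assuming a Hugging Face embedding model:

from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L12-v2")

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=100, chunk_overlap=10),
        embed_model,  # generates the embeddings the vector store needs
    ],
    vector_store=vector_store,
)
pipeline.run(documents=docs)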
13 comments
Is it possible to import a CSV file and use every single line as a separate document? So that after reading that file I would have an object with as many documents as there were lines in the CSV file?
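
One way to sketch this, building the Document objects by hand (the file name is a placeholder):

from llama_index.core import Document

docs = []
with open("data.csv", encoding="utf-8") as f:
    for line in f:
        # one Document per CSV row
        docs.append(Document(text=line.strip()))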
4 comments
osiworx · Docs

Hello, I'd like to be able to add custom metadata to any of the documents I embed. The example I found looks like, instead of the simple data loader, I just create documents manually, add my custom metadata, and then run them through the embedding. Is that the right way, or did I just not find the more sophisticated way?
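
A small sketch of attaching custom metadata directly on a Document; the keys and values here are made up:

from llama_index.core import Document

doc = Document(
    text="some text to embed",
    metadata={"source": "confluence", "department": "engineering"},  # hypothetical keys
)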
4 comments
Is there a built-in way to handle conversation history? I have kind of built my own thing, but I guess that's not how it is supposed to be done 🙂 Having asked that, is there a way to count tokens so that the history will not grow too big?
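
A sketch of what a built-in approach could look like, assuming a chat engine over an existing index; the token_limit value is an arbitrary example:

from llama_index.core.memory import ChatMemoryBuffer

# keeps the history trimmed to roughly token_limit tokens
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

chat_engine = index.as_chat_engine(chat_mode="context", memory=memory, llm=llm)
response = chat_engine.chat("hello")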
2 comments
Is there a nice example of using Hugging Face models in RAG? I think I'm still doing something wrong; maybe I can spot what I do differently.
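
For comparison, a minimal end-to-end sketch with Hugging Face models; the model names and the data folder are assumptions:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
llm = HuggingFaceLLM(model_name="HuggingFaceH4/zephyr-7b-beta")  # assumed local HF model

docs = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(docs, embed_model=embed_model)

query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What does the document say about X?")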
7 comments
How do I prevent the file information from becoming part of the prompt context?
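
If this is about the file metadata that SimpleDirectoryReader attaches, one sketch is to exclude those keys from what the LLM sees; the key names assume the default file metadata:

for doc in docs:
    # hide file metadata from the LLM prompt (keys assumed from the default file reader)
    doc.excluded_llm_metadata_keys = ["file_name", "file_path", "file_type", "creation_date"]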
13 comments
Hi there, I learned that PDF parsing seems to be a very complex task; how about Word parsing? Is that the same story in different clothes, or is it less complex? What would be the easiest format to parse for the best results, besides pure text?
2 comments
Hi there, for my company project I'm trying to ingest data from Confluence to then create a RAG assistant that helps find information in the vast amount of data we have. The first tests show that we only have very poor data to ingest into RAG: it's tables, many images, and little text if any, and it's mostly just single words. I'm learning that people are very, very lazy about writing useful documentation... that's the ranting part 😉 Is there any strategy for dealing with such messy data? Are there maybe tutorials or tips around? I would guess it's not just my company that has people writing documentation that only they themselves understand, and only on the day they write it.
8 comments
But here is another question: for an unusual use case I'd like to run a query against the vector store and only want results that are farther away than a given distance. Yes, you read that right, I'm looking for results far away from my query 🙂 Is it somehow possible to define what the minimum distance should be when doing a query?
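
One workaround sketch, retrieving a large candidate set and keeping only the low-similarity hits; the threshold of 0.3 is an arbitrary assumption and the score semantics depend on the vector store's distance metric:

retriever = index.as_retriever(similarity_top_k=100)
nodes = retriever.retrieve("sail to italy")

# keep only nodes whose similarity score is below the (assumed) threshold,
# i.e. the ones farthest from the query
far_nodes = [n for n in nodes if n.score is not None and n.score < 0.3]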
3 comments
I'm running into errors with some of the embedding models.

sentence-transformers/all-distilroberta-v1
It states it has 768 dimensions, and then while generating the embeddings I get this error:
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [59,0,0], thread: [64,0,0] Assertion srcIndex < srcSelectDimSize failed.
It happens with all the embedding models stating they have 768 dimensions that I have tested so far; 384 and 512 dimensions were no issue and the embedding worked fine.

I'm running tests on which embedding model would be best for our data, and that's how I came across this issue. Am I doing something wrong, or do 768 dimensions not work at all?
8 comments
osiworx · Retrieval

Is it possible to see the context text retrieved from the vector store after this?

self.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L12-v2")
self.vector_store = QdrantVectorStore(client=self.document_store, collection_name=self.index)
self.vector_index = VectorStoreIndex.from_vector_store(vector_store=self.vector_store, embed_model=self.embed_model)

self.query_engine = self.vector_index.as_query_engine(similarity_top_k=self.top_k, llm=self.llm)
response = self.query_engine.query(query)
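
A small sketch of inspecting the retrieved context on the response object (attribute names as in the core response class):

# each source node carries the retrieved text chunk and its similarity score
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.get_content())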
6 comments
I'd like to embed HTML data from files; how would I do that?
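
A sketch assuming the files sit in a local folder and the default file readers are installed; the folder name is a placeholder:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs = SimpleDirectoryReader("html_folder", required_exts=[".html"]).load_data()
index = VectorStoreIndex.from_documents(docs)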
8 comments
I just found that Hugging Face offers hosting for embeddings. That would be a good place for someone like me who likes to share the vector data. Is there an integration yet, or is there any plan to ever do that?
6 comments
We'd like to create a system that can generate test questions for students based on documents we upload. Think of it as an automated school test generator. To help it a little, it should be possible to add topics that are important to ask about, as well as give the number of questions to be created. Would something like this be possible with LlamaIndex more or less out of the box, or would you expect loads of custom code to be written? Ignoring the UI and other things that would be needed anyway.
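
A very rough sketch of the out-of-the-box end: a plain query engine over the uploaded documents, prompted for a fixed number of questions on a given topic (the topic and count are example inputs):

query_engine = index.as_query_engine(llm=llm, similarity_top_k=5)
response = query_engine.query(
    "Write 5 exam questions for students about the topic 'photosynthesis', "
    "based only on the provided context."
)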
4 comments
OK, another one. This one is odd, as the same import works in my custom script but not here:

File "C:\Users\user\miniconda3\envs\llama_index_new\lib\site-packages\llama_index\core\prompts\base.py", line 500, in get_template
    from llama_index.core.llms.langchain import LangChainLLM
ModuleNotFoundError: No module named 'llama_index.core.llms.langchain'
9 comments
OK, it seems to be getting better. I just started with a fresh environment; it seems that helps a lot 😄
6 comments
I guess I found a way: I check if doc.text != '' and then add it to a separate docs array, which I then pass to the embedding process. So far that has gotten around the error.
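
In code, the described filtering amounts to something like this (names assumed):

# keep only documents that actually contain text before embedding
docs_filtered = [doc for doc in docs if doc.text != ""]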
4 comments
Thanks to Logan M I found some good examples to get RAG working with Hugging Face models. Now I'd like to get this use case working: I'm building a system that should help create text2image prompts, based on millions of prompts I have stored in the vector store. So I'd like to put in a simple text2image prompt, let it search for matching prompts in the vector store, and then have the LLM make a nice prompt out of the context. My guess is that for this I need to play with the prompting in some way. With the sample from Logan M it comes back and tells me my prompt would not result in any possible answer based on the context. That's true, as I had no chance to say what to do with the context; there was just my query, which in my example is something like "sail to italy". That's mainly the search term. How do I make it create a prompt from the context now?
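
A sketch of one way to steer the generation step: overriding the query engine's text QA prompt so the context is treated as example prompts rather than facts to answer from. The template wording is an assumption:

from llama_index.core import PromptTemplate

prompt_tmpl = PromptTemplate(
    "Here are existing text2image prompts related to the request:\n"
    "{context_str}\n"
    "Write one new, detailed text2image prompt for: {query_str}\n"
)

query_engine = index.as_query_engine(
    llm=llm,
    similarity_top_k=5,
    text_qa_template=prompt_tmpl,
)
response = query_engine.query("sail to italy")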
2 comments
osiworx · Llama-cpp

It's maybe an off-topic question, but I'm somehow breaking my fingers here, so I'll just ask 😉 I'm trying to get llama-cpp running on my GPU instead of the CPU. I'm on Windows 11 and I followed the instructions to get it working, but maybe I got the wrong instructions. Has anyone here gotten it working on Windows 11 with an NVIDIA GPU?
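
Assuming llama-cpp-python was built with CUDA support, the LlamaIndex side usually also needs the GPU offload flag; a sketch with an assumed model path:

from llama_index.llms.llama_cpp import LlamaCPP

llm = LlamaCPP(
    model_path="path/to/model.gguf",    # hypothetical path
    model_kwargs={"n_gpu_layers": -1},  # offload all layers to the GPU
)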
4 comments