Find answers from the community

Tech explorer
Offline, last seen 18 hours ago
Joined September 25, 2024

CSV

Hi all, I'm using PagedCSVReader along with ChromaDB. When I query with similarity_top_k of 10, none of the retrieved documents are relevant to the query, even though I'm directly specifying important keywords in the query. What can I do to improve RAG with CSV data?
10 comments
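One thing worth checking: PagedCSVReader emits one document per CSV row, formatted as "column: value" lines, so retrieval quality depends on queries matching those column names and cell values. A minimal stdlib sketch of that row-per-document formatting (an illustration of the idea, not the library's code):

```python
import csv
import io

def rows_to_documents(csv_text: str) -> list[str]:
    """Format each CSV row as 'column: value' lines, one document per row,
    mirroring the row-per-page output style of a paged CSV reader."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "\n".join(f"{col}: {val}" for col, val in row.items())
        for row in reader
    ]

# Sample data is hypothetical.
sample = "name,city\nAda,London\nAlan,Cambridge"
docs = rows_to_documents(sample)
# Each document carries its column names, which is what keyword-style
# queries need to match against.
```

If queries use synonyms for the column names rather than the names themselves, embedding similarity over these tiny row documents can easily miss.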
I'm implementing a RAG chatbot, but it always answers from the first retrieved documents only. Whatever question I ask, the retrieved documents are the same; it's not able to retrieve a different set of documents.
20 comments
How to use a chat engine with CodeLlama to create a coding assistant?
38 comments
I'm getting a segmentation fault (core dumped) in my chat engine.
8 comments
I'm using a document summary index for my context chat engine. Will my chat engine only answer based on the summary, or can it answer from the original document if some information is not captured in the summary?
43 comments
How can I use a streaming response from a chat engine in FastAPI?
54 comments
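FastAPI's StreamingResponse accepts any iterator, so a common pattern is to wrap the token generator a chat engine exposes (in LlamaIndex, stream_chat(...).response_gen) in a formatter. The sketch below is stdlib-only and shows just the Server-Sent-Events formatting step; the chat-engine and FastAPI wiring are assumed, not shown:

```python
from typing import Iterable, Iterator

def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Format a stream of LLM tokens as Server-Sent Events lines, suitable
    for fastapi.responses.StreamingResponse(..., media_type='text/event-stream')."""
    for tok in tokens:
        yield f"data: {tok}\n\n"
    # Conventional end-of-stream sentinel used by OpenAI-style streams.
    yield "data: [DONE]\n\n"

# In a real endpoint, the iterable would be chat_engine.stream_chat(msg).response_gen.
chunks = list(sse_events(["Hel", "lo"]))
```

A browser or client reading the stream then consumes one `data:` event per token.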
Hey, how can I use my local URL (a llama.cpp Python server hosted locally on port 8000) in LlamaIndex for LLM inference?
5 comments
How can I use LLMLingua with a chat engine?
2 comments
I'm hosting a llama.cpp server locally. How can I use this server in a RAG implementation to make LLM calls?
12 comments
How to make llama.cpp model inference faster? I'm using LlamaIndex RAG with a local GGUF model and it's taking more than 2 minutes for a single query.
12 comments
Is there any memory tool to keep track of history, similar to LangChain's memory buffer?
6 comments
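LlamaIndex does ship a comparable component, ChatMemoryBuffer (constructed via ChatMemoryBuffer.from_defaults(token_limit=...) and passed to a chat engine's memory argument). The trimming idea behind such a buffer can be sketched in plain Python; this is a simplification using whitespace token counts, not the library's implementation:

```python
class SimpleMemoryBuffer:
    """Keep recent chat turns under a rough token budget."""

    def __init__(self, token_limit: int = 1500):
        self.token_limit = token_limit
        self.history: list[tuple[str, str]] = []  # (role, message) pairs

    def put(self, role: str, message: str) -> None:
        self.history.append((role, message))
        # Drop the oldest turns until the estimated token count fits the budget,
        # always keeping at least the newest turn.
        while self._tokens() > self.token_limit and len(self.history) > 1:
            self.history.pop(0)

    def _tokens(self) -> int:
        # Crude estimate: whitespace-separated words stand in for tokens.
        return sum(len(m.split()) for _, m in self.history)

mem = SimpleMemoryBuffer(token_limit=5)
mem.put("user", "one two three")
mem.put("assistant", "four five six")
# The oldest turn is evicted once the budget is exceeded.
```

The real ChatMemoryBuffer counts tokens with a tokenizer rather than splitting on whitespace.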
How can I make chat_engine execution faster by using all CPU cores if needed?
5 comments
I have multiple CSV files and a data dictionary describing each column. I want to use only open-source LLMs and RAG to create a conversational chatbot with memory. It should also be able to perform aggregations on my CSV data files.
19 comments
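One caveat for the aggregation requirement: retrieval alone won't sum or group rows, so the usual pattern is to have the LLM emit a structured operation (or pandas code, as in LlamaIndex's PandasQueryEngine) and execute it against the data. A stdlib sketch of executing one such operation; the column names and data are hypothetical:

```python
import csv
import io
from collections import defaultdict

def aggregate_sum(csv_text: str, group_col: str, value_col: str) -> dict[str, float]:
    """Group rows by one column and sum another: the kind of operation an
    LLM-driven query engine dispatches to code instead of answering from text."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_col]] += float(row[value_col])
    return dict(totals)

sales = "region,amount\nEU,10\nUS,5\nEU,2.5"
result = aggregate_sum(sales, "region", "amount")
```

The data dictionary is useful here too: including the column descriptions in the prompt helps the LLM pick the right group and value columns.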
https://docs.llamaindex.ai/en/stable/examples/pipeline/query_pipeline_pandas.html

The Pandas query engine is not working properly with open-source LLMs like Zephyr 7B. Any guide to working with CSV files and building a RAG chatbot?
3 comments
Is continuous batching available in the llama.cpp CPU server?
3 comments
@kapa.ai Hey, how can I create a separate context engine in FastAPI for different users in LlamaIndex?
14 comments
Hi, how to use a hosted llama.cpp server in LlamaIndex for a chat engine? I'm trying by importing OpenAILike and passing the model name and api_base, but I'm still getting an error.
30 comments
@kapa.ai How to use a llama.cpp server in LlamaIndex? I launched an OpenAI-compatible llama.cpp server. How can I use this URL?
11 comments
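Since the llama-cpp-python server exposes OpenAI-compatible routes such as POST /v1/chat/completions, any OpenAI-style client pointed at the local base URL should work (LlamaIndex's OpenAILike wrapper takes an api_base for exactly this). A stdlib sketch of building such a request; the payload follows the OpenAI chat schema, and the request is constructed but not sent here:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Construct an OpenAI-compatible chat-completion request for a local
    llama.cpp server; pass the result to urllib.request.urlopen to send it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The model name is whatever the local server was launched with (hypothetical here).
req = build_chat_request("http://localhost:8000", "local-gguf", "Hello")
```

Verifying the endpoint with a raw request like this is a quick way to separate server problems from LlamaIndex configuration problems.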
Hi all, do we have any LLM server framework like vLLM or OpenLLM that can run on CPU only and make use of multiple CPUs in a cluster (master and worker) to serve multiple inferences in parallel?
8 comments
@kapa.ai Does LlamaIndex support Xinference on CPU?
9 comments
@kapa.ai I'm building a RAG chatbot with a chat engine. How can I limit the output strictly to the context and keep it as short as possible? Right now I'm getting the answer plus full follow-up information about my question.
28 comments
@kapa.ai How to set a prompt template for the Microsoft Phi-2 model, and how to pass it to llama.cpp?
6 comments
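Phi-2's model card describes an "Instruct: ... / Output:" format, and LlamaIndex's LlamaCPP wrapper accepts completion_to_prompt and messages_to_prompt callables to apply such a template. A sketch of the completion formatter (pure Python; wiring it into LlamaCPP is assumed, not shown):

```python
def completion_to_prompt(completion: str) -> str:
    """Wrap a raw completion request in Phi-2's instruction format,
    as described on the model card."""
    return f"Instruct: {completion.strip()}\nOutput:"

prompt = completion_to_prompt("Summarize this document.")
```

The callable is then passed as LlamaCPP(..., completion_to_prompt=completion_to_prompt) so every request reaches the model in the format it was trained on.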
Hi, I've seen LLMLingua and tried it with LlamaIndex, also using llama.cpp for loading the LLM. For each question, the time taken to get a prefix-match hit is too high. Although LLM inference time is reduced, the time taken to reach the LLM is so high that my chat engine responds faster without LLMLingua. Any idea about this? I'm using CPU only.
4 comments
Hi, is there a way to make the Pandas query engine more contextual, e.g., taking previous input/output into context for follow-up questions?
8 comments
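One workaround is condensing the follow-up question together with recent history before it reaches the stateless query engine; LlamaIndex's CondenseQuestionChatEngine does this with an LLM. The sketch below just concatenates recent turns, which is the crudest version of the same idea (the helper name is hypothetical):

```python
def contextualize(history: list[tuple[str, str]], question: str, max_turns: int = 3) -> str:
    """Prefix a follow-up question with recent Q/A turns so a stateless
    query engine sees the context it needs."""
    recent = history[-max_turns:]
    lines = [f"Previous Q: {q}\nPrevious A: {a}" for q, a in recent]
    lines.append(f"Question: {question}")
    return "\n".join(lines)

query = contextualize([("What is the mean price?", "42.0")], "And the max?")
# The enriched query string is then passed to pandas_query_engine.query(...).
```

An LLM-based condenser instead rewrites "And the max?" into a standalone question like "What is the max price?", which tends to work better with small open-source models.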