Find answers from the community

Tech explorer
Offline, last seen 18 hours ago
Joined September 25, 2024

CSV

Hi all, I'm using PagedCSVReader along with ChromaDB. When I query with similarity_top_k of 10, none of the retrieved documents are relevant to the query, even though I'm directly specifying important keywords in the query. What can I do to improve RAG with CSV data?
10 comments
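One thing worth checking: PagedCSVReader emits one document per CSV row, formatted as "column: value" lines, so retrieval quality depends on queries matching those column names and cell values. A minimal stdlib sketch of that row-per-document formatting (an illustration of the idea, not the library's code):

```python
import csv
import io

def rows_to_documents(csv_text: str) -> list[str]:
    """Format each CSV row as 'column: value' lines, one document per row,
    mirroring the row-per-page output style of a paged CSV reader."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "\n".join(f"{col}: {val}" for col, val in row.items())
        for row in reader
    ]

# Sample data is hypothetical.
sample = "name,city\nAda,London\nAlan,Cambridge"
docs = rows_to_documents(sample)
# Each document carries its column names, which is what keyword-style
# queries need to match against.
```

If queries use synonyms for the column names rather than the names themselves, embedding similarity over these tiny row documents can easily miss.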
I'm implementing a RAG chatbot, but it always answers from the first retrieved documents only. Whatever question I ask, the retrieved documents are the same; it's not able to retrieve a different set of documents.
20 comments
How to use a chat engine with CodeLlama to create a coding assistant?
38 comments
I'm getting a segmentation fault (core dumped) in my chat engine.
8 comments
I'm using a document summary index for my context chat engine. Will my chat engine only answer based on the summary, or can it answer from the original document if some information is not captured in the summary?
43 comments
How can I use a streaming response from a chat engine in FastAPI?
54 comments
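FastAPI's StreamingResponse accepts any iterator, so a common pattern is to wrap the token generator a chat engine exposes (in LlamaIndex, stream_chat(...).response_gen) in a formatter. The sketch below is stdlib-only and shows just the Server-Sent-Events formatting step; the chat-engine and FastAPI wiring are assumed, not shown:

```python
from typing import Iterable, Iterator

def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Format a stream of LLM tokens as Server-Sent Events lines, suitable
    for fastapi.responses.StreamingResponse(..., media_type='text/event-stream')."""
    for tok in tokens:
        yield f"data: {tok}\n\n"
    # Conventional end-of-stream sentinel used by OpenAI-style streams.
    yield "data: [DONE]\n\n"

# In a real endpoint, the iterable would be chat_engine.stream_chat(msg).response_gen.
chunks = list(sse_events(["Hel", "lo"]))
```

A browser or client reading the stream then consumes one `data:` event per token.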
Hey, how can I use my local URL (a llama.cpp Python server hosted locally on port 8000) in LlamaIndex for LLM inference?
5 comments
How can I use LLMLingua with a chat engine?
2 comments
I'm hosting a llama.cpp server locally. How can I use this server in a RAG implementation to make LLM calls?
12 comments
How to make llama.cpp model inference faster? I'm using LlamaIndex RAG with a local GGUF model and it's taking more than 2 minutes for a single query.
12 comments
Is there any memory tool to keep track of history, similar to LangChain's memory buffer?
6 comments
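LlamaIndex does ship a comparable component, ChatMemoryBuffer (constructed via ChatMemoryBuffer.from_defaults(token_limit=...) and passed to a chat engine's memory argument). The trimming idea behind such a buffer can be sketched in plain Python; this is a simplification using whitespace token counts, not the library's implementation:

```python
class SimpleMemoryBuffer:
    """Keep recent chat turns under a rough token budget."""

    def __init__(self, token_limit: int = 1500):
        self.token_limit = token_limit
        self.history: list[tuple[str, str]] = []  # (role, message) pairs

    def put(self, role: str, message: str) -> None:
        self.history.append((role, message))
        # Drop the oldest turns until the estimated token count fits the budget,
        # always keeping at least the newest turn.
        while self._tokens() > self.token_limit and len(self.history) > 1:
            self.history.pop(0)

    def _tokens(self) -> int:
        # Crude estimate: whitespace-separated words stand in for tokens.
        return sum(len(m.split()) for _, m in self.history)

mem = SimpleMemoryBuffer(token_limit=5)
mem.put("user", "one two three")
mem.put("assistant", "four five six")
# The oldest turn is evicted once the budget is exceeded.
```

The real ChatMemoryBuffer counts tokens with a tokenizer rather than splitting on whitespace.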
How can I make chat_engine execution faster by using all CPU cores if needed?
5 comments
I have multiple CSV files and a data dictionary describing each column. I want to use only open-source LLMs and RAG to create a conversational chatbot with memory. It should also be able to perform aggregations on my CSV data files.
19 comments
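One caveat for the aggregation requirement: retrieval alone won't sum or group rows, so the usual pattern is to have the LLM emit a structured operation (or pandas code, as in LlamaIndex's PandasQueryEngine) and execute it against the data. A stdlib sketch of executing one such operation; the column names and data are hypothetical:

```python
import csv
import io
from collections import defaultdict

def aggregate_sum(csv_text: str, group_col: str, value_col: str) -> dict[str, float]:
    """Group rows by one column and sum another: the kind of operation an
    LLM-driven query engine dispatches to code instead of answering from text."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_col]] += float(row[value_col])
    return dict(totals)

sales = "region,amount\nEU,10\nUS,5\nEU,2.5"
result = aggregate_sum(sales, "region", "amount")
```

The data dictionary is useful here too: including the column descriptions in the prompt helps the LLM pick the right group and value columns.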
https://docs.llamaindex.ai/en/stable/examples/pipeline/query_pipeline_pandas.html

The Pandas query engine is not working properly with open-source LLMs like Zephyr 7B. Any guide to working with CSV files and building a RAG chatbot?
3 comments
Is continuous batching available in the llama.cpp CPU server?
3 comments
@kapa.ai Hey, how can I create a separate context engine in FastAPI for different users in LlamaIndex?
14 comments
Hi, how to use a hosted llama.cpp server in LlamaIndex for a chat engine? I'm trying by importing OpenAILike and passing the model name and api_base, but I'm still getting an error.
30 comments
@kapa.ai How to use a llama.cpp server in LlamaIndex? I launched an OpenAI-compatible llama.cpp server. How can I use this URL?
11 comments
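Since the llama-cpp-python server exposes OpenAI-compatible routes such as POST /v1/chat/completions, any OpenAI-style client pointed at the local base URL should work (LlamaIndex's OpenAILike wrapper takes an api_base for exactly this). A stdlib sketch of building such a request; the payload follows the OpenAI chat schema, and the request is constructed but not sent here:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Construct an OpenAI-compatible chat-completion request for a local
    llama.cpp server; pass the result to urllib.request.urlopen to send it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The model name is whatever the local server was launched with (hypothetical here).
req = build_chat_request("http://localhost:8000", "local-gguf", "Hello")
```

Verifying the endpoint with a raw request like this is a quick way to separate server problems from LlamaIndex configuration problems.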
Hi all, do we have any LLM server framework like vLLM or OpenLLM that can run on CPU only and make use of multiple CPUs in a cluster (master and worker) to serve multiple inferences in parallel?
8 comments
@kapa.ai Does LlamaIndex support Xinference on CPU?
9 comments
@kapa.ai I'm building a RAG chatbot with a chat engine. How can I limit the output strictly to the context and keep it as short as possible? Right now I'm getting the answer plus full follow-up information about my question.
28 comments
@kapa.ai How to set a prompt template for the Microsoft Phi-2 model, and how to pass it to llama.cpp?
6 comments
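Phi-2's model card describes an "Instruct: ... / Output:" format, and LlamaIndex's LlamaCPP wrapper accepts completion_to_prompt and messages_to_prompt callables to apply such a template. A sketch of the completion formatter (pure Python; wiring it into LlamaCPP is assumed, not shown):

```python
def completion_to_prompt(completion: str) -> str:
    """Wrap a raw completion request in Phi-2's instruction format,
    as described on the model card."""
    return f"Instruct: {completion.strip()}\nOutput:"

prompt = completion_to_prompt("Summarize this document.")
```

The callable is then passed as LlamaCPP(..., completion_to_prompt=completion_to_prompt) so every request reaches the model in the format it was trained on.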
Hi, I've seen LLMLingua and tried it with LlamaIndex, also using llama.cpp for loading the LLM. For each question, the time taken to get a prefix-match hit is too high. Although LLM inference time is reduced, the time taken to reach the LLM is so high that my chat engine responds faster without LLMLingua. Any idea about this? I'm using CPU only.
4 comments
Hi, is there a way to make the Pandas query engine more contextual, e.g., taking previous input/output into context for follow-up questions?
8 comments
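One workaround is condensing the follow-up question together with recent history before it reaches the stateless query engine; LlamaIndex's CondenseQuestionChatEngine does this with an LLM. The sketch below just concatenates recent turns, which is the crudest version of the same idea (the helper name is hypothetical):

```python
def contextualize(history: list[tuple[str, str]], question: str, max_turns: int = 3) -> str:
    """Prefix a follow-up question with recent Q/A turns so a stateless
    query engine sees the context it needs."""
    recent = history[-max_turns:]
    lines = [f"Previous Q: {q}\nPrevious A: {a}" for q, a in recent]
    lines.append(f"Question: {question}")
    return "\n".join(lines)

query = contextualize([("What is the mean price?", "42.0")], "And the max?")
# The enriched query string is then passed to pandas_query_engine.query(...).
```

An LLM-based condenser instead rewrites "And the max?" into a standalone question like "What is the max price?", which tends to work better with small open-source models.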