LoLiPoPMaN
Offline, last seen 3 weeks ago
Joined September 25, 2024
Will there be integration with NVIDIA nv-ingest, which was recently published?
6 comments
Are there any good resources for building a parser similar to LlamaParse? My issue is that, due to data-safety requirements, we cannot use any cloud providers and must rely on a local LLM and embedding model. I'm watching Jerry Liu's Ray Summit 2024 video and am considering how I might replicate at least some of the basics. Cheers!
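Not a full LlamaParse replacement, but a minimal fully local baseline is straightforward; a sketch, assuming an Ollama server and a local HF embedding model are available (model names are illustrative, not recommendations):
Plain Text
# Minimal fully local RAG sketch -- no cloud calls.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

Settings.llm = Ollama(model="llama3", request_timeout=120.0)  # local LLM via Ollama
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-m3")  # local embeddings

documents = SimpleDirectoryReader("./docs").load_data()  # basic local file parsing
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()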
6 comments
Can someone offer any insight into why my specified HF embeddings do not work? I use one .py file for indexing with the following content (only the parts relevant to the question shown) and the same pgvector DB for retrieving the data.
App part:
Plain Text
# Imports assumed (llama-index >= 0.10 package layout):
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-m3")
vector_store = PGVectorStore.from_params(
    database=db_name,
    host=url.host,
    password=db_password,
    port=url.port,
    user=db_user,
    table_name=table_name,
    embed_dim=1024,  # bge-m3 embedding dimension
)
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

Separate indexer .py:
Plain Text
# Imports assumed (llama-index >= 0.10 package layout):
from sqlalchemy import make_url
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore

vector_store_connection_string = (
    f"postgresql://{db_user}:{db_password}@{db_host}:{db_port}/{db_name}"
)
url = make_url(vector_store_connection_string)

vector_store = PGVectorStore.from_params(
    database=db_name,
    host=url.host,
    password=db_password,
    port=url.port,
    user=db_user,
    table_name=table_name,
    embed_dim=1024,  # bge-m3 embedding dimension
)

# Create the storage context and index
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents,  # `documents` is assumed to be loaded earlier in this script
    storage_context=storage_context,
    show_progress=True,
)

print("Indexing complete. Data stored in PGVector.")

The error:
Plain Text
DataError: (psycopg2.errors.DataException) different vector dimensions 1536 and 1024 [SQL: SELECT public.data_building_permits_data.id, public.data_building_permits_data.node_id

Any insight is appreciated.
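One hedged reading of the two snippets above, an inference rather than a confirmed diagnosis: the indexer script never sets Settings.embed_model, so VectorStoreIndex.from_documents would fall back to the default 1536-dim OpenAI embedding while the table expects 1024. A sketch of the fix:
Plain Text
# Pin the same 1024-dim HF model in the indexer before calling from_documents.
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-m3")  # 1024 dims

# Any rows already embedded at 1536 dims must be re-indexed; pointing at a fresh
# table_name (or dropping the old table in psql) lets PGVectorStore recreate the
# column as vector(1024).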
17 comments
I have two examples of chatbot implementations: in the first I use GPT-3.5-turbo and in the second GPT-4-turbo. I am getting better results from GPT-3.5-turbo than from GPT-4-turbo. Context: I use Pinecone vector DB storage and query the data using a query engine. I switched to a new embedding model like so:

Plain Text
"llm = OpenAI(model="gpt-4-turbo", temperature=0)
Settings.llm = llm

embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.embed_model = embed_model
#logging.info(" LLM MODEL OPENAI" + llm.model)
logging.info("Initialized OpenAI")
query_engine = loaded_index.as_query_engine(streaming=False) #, text_qa_template=text_qa_template, llm=llm)
logging.info("Initialized QueryEngine")"

The second, simpler implementation of my bot:
Plain Text
# Imports assumed:
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Initialize your index
pinecone_index = pc.Index(index_name)  # `pc` is an existing Pinecone client

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
loaded_index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
logging.info("Index loaded from Pinecone")

query_engine = loaded_index.as_query_engine()

Is the embedding model problematic? I used LlamaCloud's parser and then indexed the data... I am using the newer embedding model "text-embedding-3-small" in the GPT-4-turbo version. Could this be making my GPT-4-turbo results worse? I have Langfuse connected to better analyze the retrieval process.
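One hedged check, not a diagnosis from the thread: if the Pinecone index was built with a different embedding model than the one used at query time, retrieval silently degrades, since e.g. text-embedding-ada-002 and text-embedding-3-small are both 1536-dim but live in different vector spaces, so Pinecone accepts the query vector without error. Pinning one model in both scripts rules this out:
Plain Text
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

EMBED_MODEL = "text-embedding-3-small"  # must be the model used at indexing time
Settings.embed_model = OpenAIEmbedding(model=EMBED_MODEL)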
1 comment
How can I stop the chatbot from answering questions unrelated to the topic and context in which it should operate? I am familiar with the Pydantic library but have not implemented it. I have implemented my own system and user prompts based on the LlamaIndex template. Despite that, I can, for example, ask what the size ratio between the moon and the earth is and get an answer without a problem. Can I somehow mitigate that? I am using a basic setup of VectorStoreIndex and the QA/refine templates.
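A minimal sketch of the usual first step, assuming the stock RAG prompt is being overridden; the template wording is illustrative. The idea is to instruct the model, inside text_qa_template, to refuse anything not grounded in the retrieved context:
Plain Text
from llama_index.core import PromptTemplate

qa_prompt = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n{context_str}\n---------------------\n"
    "Answer ONLY using the context above. If the question is not related to "
    "the context, reply exactly: 'I can only answer questions about <topic>.'\n"
    "Question: {query_str}\nAnswer: "
)
query_engine = index.as_query_engine(text_qa_template=qa_prompt)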
7 comments
Is there any predetermined format for storing queries and using them as context in future use cases -> having some sort of answer base where you can reuse previously given answers? I would then use this as part of indexing. Is this viable?
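There is no predetermined LlamaIndex format for this as far as I know; one possible shape is to store past Q&A pairs as plain Documents with metadata and index them alongside the rest of the data (the names below are illustrative):
Plain Text
from llama_index.core import Document, VectorStoreIndex

qa_docs = [
    Document(
        text=f"Q: {q}\nA: {a}",
        metadata={"source": "answer_base"},  # hypothetical metadata key
    )
    for q, a in past_answers  # past_answers: list of (question, answer) tuples
]
index = VectorStoreIndex.from_documents(qa_docs)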
1 comment
I have found CitationQueryEngine, which I presume is the closest I will get.
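For reference, a minimal CitationQueryEngine setup looks roughly like this (parameter values illustrative):
Plain Text
from llama_index.core.query_engine import CitationQueryEngine

query_engine = CitationQueryEngine.from_args(
    index,
    similarity_top_k=3,
    citation_chunk_size=512,  # how finely sources are split for citation
)
response = query_engine.query("...")
# response.source_nodes carries the cited chunks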
3 comments
Is there any chance to read the images and tables in .pdf files? For instance, if I use GPT-4 as the model for my service context? As of now I am using GPT-3.5-turbo.
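GPT-3.5-turbo is text-only. One possible route, sketched under the assumption that the PDF pages are pre-rendered to images, is to describe them with a multimodal OpenAI model and index the descriptions as text (model name and paths illustrative):
Plain Text
from llama_index.core import SimpleDirectoryReader
from llama_index.multi_modal_llms.openai import OpenAIMultiModal

mm_llm = OpenAIMultiModal(model="gpt-4o", max_new_tokens=512)
image_docs = SimpleDirectoryReader("./pdf_page_images").load_data()  # pre-rendered pages
resp = mm_llm.complete(
    prompt="Describe the tables and figures on this page.",
    image_documents=image_docs,
)
# resp.text can then be indexed like any other document text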
13 comments
Is there any way llamaindex can serve links? For instance, suggest navigating to a specific subpage? If llamaindex could serve as a navigator... a clothing expert, for instance.
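One way to get link suggestions, sketched under the assumption that each indexed page carries its URL in metadata so the bot can surface links from the retrieved nodes:
Plain Text
from llama_index.core import Document, VectorStoreIndex

docs = [
    Document(text=page_text, metadata={"url": page_url})
    for page_text, page_url in crawled_pages  # hypothetical (text, url) pairs from a crawler
]
index = VectorStoreIndex.from_documents(docs)

response = index.as_query_engine().query("Where can I find winter jackets?")
links = [n.node.metadata.get("url") for n in response.source_nodes]  # subpages to suggest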
6 comments
Does anyone run Ollama locally for llamaindex use? Or is the best practice to just use OpenAI GPT models?
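Many people do; a minimal Ollama hookup, assuming the Ollama server is running locally on its default port (model name illustrative):
Plain Text
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

Settings.llm = Ollama(model="mistral", request_timeout=120.0)
print(Settings.llm.complete("Say hi"))  # quick smoke test against the local server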
46 comments
For my chat_engine I used this code:
Plain Text
vector_query_engine = vector_index.as_chat_engine(
    chat_mode="condense_plus_context",  # as_chat_engine expects chat_mode, not response_mode
    text_qa_template=text_qa_template,
    refine_template=refine_template,
)
The responses from this chat engine are not concise. Each time I get a different length. How can I standardize the response size?
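One common lever, sketched as an assumption rather than a fix confirmed for condense_plus_context: cap the completion length on the LLM itself and ask for brevity in the prompt:
Plain Text
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4-turbo", temperature=0, max_tokens=256)  # hard cap on output tokens
# Pairing this with an instruction like "Answer in at most three sentences."
# in the system/QA template keeps lengths consistent rather than truncated.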
6 comments
@kapa.ai How do I limit the query response or define the output size?
2 comments
Hello everyone! About 3 months ago I started searching for tools and frameworks that would enable me to create a useful and knowledgeable chatbot for a potential client. I am impressed with what is possible with llama-index. Is there any possibility I can showcase the problems I am facing or get any insight on my problem and an overall potential solution? I am a second-year Information Systems Masters student. For starters: I cannot use "OpenAIAssistantAgent.from_new" as a custom agent. I saw an example using it, but it cannot be imported. Thanks in advance.
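On the import problem specifically, a hedged guess: in llama-index >= 0.10 OpenAIAssistantAgent lives in the separate llama-index-agent-openai package rather than the core namespace:
Plain Text
# pip install llama-index-agent-openai  (assumed package split, llama-index >= 0.10)
from llama_index.agent.openai import OpenAIAssistantAgent

agent = OpenAIAssistantAgent.from_new(
    name="client-chatbot",                      # illustrative
    instructions="You are a helpful assistant.",
)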
11 comments