LangchainLLM
I believe the correct method is something like this

Plain Text
from llama_index import ServiceContext, VectorStoreIndex

llm = <create llm from langchain, e.g. Bedrock>
service_context = ServiceContext.from_defaults(llm=llm)

index = VectorStoreIndex.from_documents(documents, service_context=service_context)
I tried what you suggested, but it's still expecting an OpenAI key.
AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details.
Ah, because there are two models in llama-index that both default to openai -- the llm and the embed_model
https://gpt-index.readthedocs.io/en/latest/how_to/customization/embeddings.html#custom-embeddings

You can run local embeddings if you want to skip openai. This example shows how to use huggingface (if you don't provide a model_name, it defaults to mpnet-v2)
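Roughly like this (a minimal sketch, assuming sentence-transformers is installed; it's the same wrapper pattern as for bedrock below):

Plain Text
from llama_index import ServiceContext, LangchainEmbedding
from langchain.embeddings import HuggingFaceEmbeddings

# local sentence-transformers model -- with no model_name it falls back to the mpnet default
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
service_context = ServiceContext.from_defaults(embed_model=embed_model)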
I want to use bedrock embeddings or some top ones from https://huggingface.co/spaces/mteb/leaderboard. eventually store in a vector store. so, for now I can't use VectorStoreIndex? (right?)
You can still use the vector store index. Just setup like this

Plain Text
from llama_index import ServiceContext, LangchainEmbedding, VectorStoreIndex
from langchain.embeddings.bedrock import BedrockEmbeddings

llm = <create llm from langchain, e.g. Bedrock>
embed_model = LangchainEmbedding(BedrockEmbeddings(...))
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

index = VectorStoreIndex.from_documents(documents, service_context=service_context)
Or you can use the huggingface embeddings to use any embed model from huggingface

You can use any embeddings from langchain, just need to provide that wrapper
Getting closer. I tried what you said, but it seems there is an issue between what the bedrock embeddings expect and how the abstractions happen in langchain/llama-index. I get the error below:
ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: The provided inference configurations are invalid
What's the full error/traceback? I wonder if that's a langchain or llama index error
it's definitely langchain, but wondering if we can override it in LangchainEmbedding?!
It sounds like you just didn't initialize the bedrock embeddings properly? 🤔
yes, I have that working fine!
embeddings = BedrockEmbeddings(client=bedrockClient) works fine but embeddings = LangchainEmbedding(BedrockEmbeddings(client=bedrockClient)) doesn't!
I mean the error doesn't happen with those statements. but the moment I wrap the embeddings in LangchainEmbedding and pass it to the service context, it has issues.
I tried embeddings.embed_query("This is a content of the document") with the first statement and it's fine.
hmmm, does this work?

Plain Text
embed_model = LangchainEmbedding(BedrockEmbeddings(...))
embed_model.get_text_embedding("test string")
yes that works!
this worked: embeddings = LangchainEmbedding(BedrockEmbeddings(client=bedrockClient)); embeddings.get_text_embedding("test string")
but when passing it through the service context it didn't work
🤔

Ok, one last attempt to make this work lol

Set a global service context, and then don't worry about passing it in

Plain Text
from llama_index import ServiceContext, set_global_service_context

service_context = ServiceContext.from_defaults(embed_model=embed_model, ...)
set_global_service_context(service_context)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embeddings) is my service context
let me try set_global_service_context
I have a feeling that the path of the file I am passing is not valid in sagemaker jupyter notebook!
no, I am loading fine! It's the VectorStoreIndex that fails!
Plain Text
from llama_index import SimpleDirectoryReader, VectorStoreIndex

if not index_loaded:
    # load data
    seller_guide = SimpleDirectoryReader(input_files=["./aws-marketplace-ug.pdf"]).load_data()
    print(len(seller_guide))  # this did print the length fine

    # build index
    seller_index = VectorStoreIndex.from_documents(seller_guide, service_context=service_context)  # this one failed
Yea seems like the embeddings are causing an issue for some reason 🤔
i am gonna try a different embedding model and see
I'm pretty stumped haha
yea that sounds good!
I know the huggingface one works well
that's what I am trying now. do you have any recommendation? hkunlp/instructor-xl?
Yea that one seems to perform pretty well on the leaderboard
I haven't actually tried to use it yet haha but I know others have!
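If you do try it, something like this should work (an untested sketch; the instructor models also need the InstructorEmbedding package installed):

Plain Text
from llama_index import ServiceContext, LangchainEmbedding
from langchain.embeddings import HuggingFaceInstructEmbeddings

# sketch -- wraps the instructor model through langchain, same pattern as before
embed_model = LangchainEmbedding(HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl"))
service_context = ServiceContext.from_defaults(embed_model=embed_model)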
when using SubQuestionQueryEngine, how do I pass the llm? it's expecting openai by default
btw, the huggingface embeddings worked
and when querying the engine, why are the answers super short (one sentence or word)?
I would just set the global service context to avoid worrying about how to pass everything in -- it really simplifies a lot
Hmmm, not sure! You are using the bedrock llm right?
i am using bedrock llm and passing that in service context. but for embeddings i am using huggingface
individual engines worked fine
i am experimenting with QueryEngineTool
and trying to figure out how to ask a question that spans multiple docs and have it come up with an answer
s_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=query_engine_tools)
now I need to ask a question
when I ask, it gives an error: AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details.
If you don't set the global service context, then you'll also need to pass it in there too (unclear if that's still an issue lol)

Not sure on the short responses though, I've never used bedrock. Do you know the max input size for bedrock?
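For reference, the explicit (non-global) version would look roughly like this -- a sketch assuming indexes like seller_index/buyer_index were already built with your service context, and the tool names/descriptions here are just placeholders:

Plain Text
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.query_engine import SubQuestionQueryEngine

query_engine_tools = [
    QueryEngineTool(
        query_engine=seller_index.as_query_engine(),
        metadata=ToolMetadata(name="seller_guide", description="AWS Marketplace seller guide"),
    ),
    QueryEngineTool(
        query_engine=buyer_index.as_query_engine(),
        metadata=ToolMetadata(name="buyer_guide", description="AWS Marketplace buyer guide"),
    ),
]

# pass the service context explicitly instead of relying on the global one
s_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    service_context=service_context,
)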
I am going to re-run the global context (maybe I didn't set it before) and see
it seems the global context worked. but now I'm getting another error in the sub query, even though it says it's generating sub questions
Yea that happens when using async in a notebook, easy fix
Plain Text
import nest_asyncio

nest_asyncio.apply()
run that first
ah, I thought I had those. interesting that it didn't say it couldn't find the module. that fixed it
why do I get this error sometimes? OutputParserException: Got invalid return object. Expected markdown code snippet with JSON object, but got:
This is because the LLM did not generate a valid response when generating sub-questions

The LLM has to generate a json containing which sub-index to query and which question to ask. But if it doesn't write proper json, then that happens
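For context, the sub-question generator expects the LLM to return that json wrapped in a markdown code block, roughly of this shape (illustrative only -- the exact field names can differ between versions):

Plain Text
[
    {"sub_question": "How do sellers list a product?", "tool_name": "seller_guide"},
    {"sub_question": "How do buyers subscribe to a product?", "tool_name": "buyer_guide"}
]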
my experience so far with generating sub questions is not that great. quite a few invalid questions. I would probably get better responses if I switched to openai. I am using claude for now, and tried titan. what is the ideal querying technique for querying multiple pdf documents?!
do you know how to solve this using other LLMs? guidance (https://github.com/microsoft/guidance)?
yea, guidance or using the openai function calling api will improve this somewhat 🙏
i actually missed seeing guidance. this is the notebook I followed. I don't have much flexibility, as I want to use models like claude, and I see limited llm support in guidance. https://github.com/microsoft/guidance/tree/main/guidance/llms/transformers
well, I used a similar one, not the same! (the one that does the 10-K analysis)
Yea, guidance does not have great llm support sadly (booo microsoft)
ok, so here is my question. for a simple RAG approach, without hallucinations, which type of querying is best for searching across multiple docs and coming up with an answer?! I understand the LLM dictates what answer it comes up with (I will adjust temperature etc. and do some prompt engineering to get a more detailed/exact response)
I am looking at ContextRetrieverOpenAIAgent, but need a non-openai alternative
right now it seems most of the features are influenced by / dependent on openai.
Openai definitely pushes the features, because that's just what everyone uses 😅

Does it not work well to just use a single vector index and go from there? Or does that not achieve what you expect?

The sub question engine or the router query engine would be the next "level up" from there I think
But how reliable the level-up options are (or really any approach is) depends on how smart the LLM is that you are using 😅
You could also adjust the internal prompt templates for the normal index queries, to better match whichever LLM you are using
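For example, something along these lines overrides the QA prompt with wording tuned for whichever LLM you use (a sketch; the template text is just an illustration, and index stands in for one of your indexes):

Plain Text
from llama_index import Prompt

# sketch -- a QA template reworded for a non-openai model like claude
qa_template = Prompt(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Using only the context above, answer the question in detail.\n"
    "Question: {query_str}\n"
    "Answer: "
)

query_engine = index.as_query_engine(text_qa_template=qa_template)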
agreed on openai being the superior option here. but my plan is to evaluate multiple models (open, proprietary) to make a better decision for customers.
I have different types of docs (buyer, seller, api docs, youtube video transcripts, etc.) and created an index for each instead of combining them all into one. that's why I was thinking to create query engine tools and pass them to the llm so it queries the appropriate doc and summarizes at the end based on the consensus. I haven't tried the router yet; I will try that. I also want the source doc in the response, apart from the response text. what param is that available under?!
You can access the list of node(s) that were used to create the response in response.source_nodes

The ID of the document that the node came from shows up in source_node.node.ref_doc_id

Additionally, any metadata set on the input documents gets inherited by the nodes created from that document
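So after querying, something like this should show where the answer came from (a sketch, where query_engine is one of your index query engines; depending on the version, node metadata may live under node.extra_info instead):

Plain Text
response = query_engine.query("How do I list a product on AWS Marketplace?")

# inspect the chunks the answer was built from
for source_node in response.source_nodes:
    print(source_node.node.ref_doc_id)  # id of the source document
    print(source_node.score)            # similarity score (when a vector index was used)
    print(source_node.node.metadata)    # metadata inherited from the input document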
I see. I will explore the documentation on that. If you think there are any specific links that are beneficial for my use case, please pass them along.
multistep is openai based, so that didn't work for me.
the Router one is throwing an error saying KeyError: 'choice'.
the sub questions being generated are invalid and going off track.
what api gives the source doc reference?!
in my experience so far, I get the best results if the query runs directly on individual indexes. am I doing anything wrong with any of the advanced querying techniques to not get the best results?! (it gets worse)
I take that back. if I load all docs into one, it doesn't give me better results either, especially since the docs started getting big (500+ pages). need to figure out a way to make the similarity search work better!
nvm. found it.
Yea, at that point with everything in a single index, you might need to increase the top k, and maybe also decrease the chunk size a bit at the same time?
for now, I am facing difficulty even with single-file indexing. basically it's kind of only using one page when the information spans multiple pages. so, should I increase the chunk size? what's the default chunk size in the service context?
the default chunk size is 1024, and the default top k is 2
are you talking about top_k or similarity_top_k?
what is the default overlap?!
where can we find the default values in the service context?!
similarity_top_k yes
overlap is not really important tbh. But the default is 20 tokens. the default chunk size is 1024
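Both are configurable, roughly like this (a sketch, reusing the llm/embed_model and index from earlier):

Plain Text
# smaller chunks plus a larger top k, so answers can draw from more pages
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    chunk_size=512,    # default is 1024
    chunk_overlap=20,  # default is 20 tokens
)
query_engine = index.as_query_engine(similarity_top_k=5)  # default is 2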
does the LLM summarize across all results returned or just top 1?
It will be all nodes returned
so in the case of content spread across multiple pages, I should be increasing the chunk size, right?! or decreasing it?! sometimes when I ask the llm to provide a list of things, it provides only half of what is present, as the data is spread across multiple pages
Ok, let's take a step back and ensure we both understand the flow of how things work

When documents are put into llama-index, they are chunked into nodes that are 1024 tokens by default, with 20 tokens of overlap by default.

At query time, it depends on which index you used. There are two main indexes in llama-index, a VectorStoreIndex and a ListIndex

A VectorStoreIndex will embed your query, retrieve the top 2 (by default) matching nodes. Then it sends those nodes and your query to the LLM to answer the query. If you are missing information, you need to either increase the top k, or increase the chunk size.

A list index will not use embeddings, and instead return every node in the index. This is usually useful for queries that need to read everything in the index to answer, or for generating summaries with something like index.as_query_engine(response_mode="tree_summarize").query("Summarize the context.")

You can probably tell there are times when a query is best answered by a vector index, and other times by a list index. Using a router query engine, you can best support both use cases
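A rough sketch of that router setup, assuming a vector index and a list index built over the same documents (the tool descriptions are just placeholders):

Plain Text
from llama_index import VectorStoreIndex, ListIndex
from llama_index.tools import QueryEngineTool
from llama_index.query_engine import RouterQueryEngine
from llama_index.selectors.llm_selectors import LLMSingleSelector

vector_index = VectorStoreIndex.from_documents(documents, service_context=service_context)
list_index = ListIndex.from_documents(documents, service_context=service_context)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_index.as_query_engine(),
    description="Useful for answering specific questions about the documents",
)
list_tool = QueryEngineTool.from_defaults(
    query_engine=list_index.as_query_engine(response_mode="tree_summarize"),
    description="Useful for summarizing the documents as a whole",
)

# LLMSingleSelector works with non-openai LLMs (the pydantic selector needs openai function calling)
router_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(service_context=service_context),
    query_engine_tools=[vector_tool, list_tool],
)
response = router_engine.query("Summarize the seller onboarding process.")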