Which keyword index are you using?
One uses a regex to extract keywords; the other asks the LLM to generate them. The latter is more flexible.
@Logan M This is how I built the graph:
graph = ComposableGraph.from_indices(
    GPTSimpleKeywordTableIndex,
    indices,
    index_summaries,
    max_keywords_per_chunk=50
)
And this is how I structure my query:
decompose_transform = DecomposeQueryTransform(
    llm_predictor,
    verbose=True
)
# set query config
query_configs = [
    {
        "index_struct_type": "simple_dict",
        "query_mode": "default",
        "query_kwargs": {
            "similarity_top_k": 3
        },
        # NOTE: set query transform for subindices
        "query_transform": decompose_transform
    },
    {
        "index_struct_type": "keyword_table",
        "query_mode": "simple",
        "query_kwargs": {
            "response_mode": "tree_summarize",
            "verbose": True
        },
    },
]
query_str = query
# the query transform is already set in the first config above
query_configs[1]["query_kwargs"]["num_chunks_per_query"] = 3
response = index.query(
    query_str,
    query_configs=query_configs,
    service_context=service_context,
)
I also saw another weird thing: I ran the code several times on a query that doesn't crash, but I get a different answer every time because the transformed query is not consistent between calls.
It looks like there's some random logic when transforming the original query
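One rough way to confirm that would be to call the transform repeatedly on the same query and count the distinct outputs. This is only a sketch: `count_distinct_outputs` and `fake_transform` are made-up names, and the stub stands in for whatever callable wraps the real decompose transform.

```python
from collections import Counter
from typing import Callable


def count_distinct_outputs(
    transform: Callable[[str], str], query: str, runs: int = 5
) -> Counter:
    """Call transform(query) `runs` times and tally each distinct result."""
    return Counter(transform(query) for _ in range(runs))


# Hypothetical stand-in for the real transform; swap in a wrapper around
# the actual decompose transform to test it.
def fake_transform(query: str) -> str:
    return query.strip().lower()


counts = count_distinct_outputs(fake_transform, "  What is X?  ")
# More than one key in `counts` would mean the transform is not deterministic.
print(len(counts), dict(counts))
```

If the real transform yields more than one distinct output with temperature 0, the variation is coming from somewhere other than the sampling temperature you set.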
The decompose transform might work differently if the LLM temperature is not zero I think
Where do I set the LLM of the decompose?
My llm predictor is defined as:
llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-4", temperature=0))
So I did set the temp to 0
Try using GPTKeywordTableIndex instead; it should be better at extracting keywords (but might be a bit slower). The index you are using now is very simplistic: it uses a regex and removes stopwords.
Yea idk then haha there might be some non-deterministic thing happening here
Interesting, will try that. Still don't understand though why there's inconsistency in the transformed query
And also why it crashes on some queries
It crashes because the simple regex approach did not find any keywords in the query
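To illustrate how that can happen, here is a rough sketch of a regex-plus-stopwords extractor. This is not the library's actual code: `simple_extract_keywords` and the tiny stopword list are made up for illustration, and the real index uses its own list and logic.

```python
import re
from typing import List

# Tiny illustrative stopword list (an assumption, not the library's list).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "what", "which", "who", "how",
    "do", "does", "did", "it", "this", "that", "of", "in", "on", "to",
    "and", "or", "so",
}


def simple_extract_keywords(text: str, max_keywords: int = 10) -> List[str]:
    """Split text into lowercase word tokens and drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = dict.fromkeys(t for t in tokens if t not in STOPWORDS)
    return list(unique)[:max_keywords]


print(simple_extract_keywords("What did the author do in college?"))
print(simple_extract_keywords("So what is it?"))
```

The second query is made up entirely of stopwords, so the extractor returns an empty list and there is nothing to look up in the keyword table.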
But there's still an LLM that transforms the query?
Not sure I understand the flow here with the regex and query transformation
Hey @jerryjliu0
Any idea where the randomness in query transformation comes from?