`SummaryExtractor` is returning node summaries attached to the incorrect nodes. The node to which the summary is incorrectly attached seems to be related to the `num_workers` value, but even at 1 worker I still get misplaced summaries. My code is:

```python
pipeline = IngestionPipeline(
    transformations=[
        MyCustomNodeTransformer(),
        SummaryExtractor(
            llm=llm,
            metadata_mode=MetadataMode.NONE,
            prompt_template=SUMMARY_EXTRACT_TEMPLATE,
            num_workers=1,
            summaries=["self"],
        ),
        Settings.embed_model,
    ]
)
nodes = pipeline.run(
    show_progress=True,
    nodes=nodes,
    num_workers=1,
)
```
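A quick way to see where each summary actually landed is to print every node id next to its attached summary; a minimal diagnostic sketch, assuming `summaries=["self"]` writes the default `section_summary` metadata key:

```python
# Diagnostic sketch: show which summary ended up on which node.
# Assumes summaries=["self"], which writes the "section_summary" key.
for node in nodes:
    summary = node.metadata.get("section_summary", "<no summary>")
    print(node.node_id, "->", summary[:80])
```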
Inference is slow with the version of `llama-cpp-python` I have installed (older versions are much faster, about 100 tokens per second versus about 30). `llama-cpp-python==0.2.20` is the last fast version with the latest LlamaIndex (3090 on Ubuntu, using Mistral 7B). I believe it has to do with the KV cache and is solved by the suggestions in this GitHub issue: https://github.com/abetlen/llama-cpp-python/issues/1054. How do we add the required `offload_kqv=True` to LlamaIndex to regain fast inference? Is this a regression in LlamaIndex or something users should handle?
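A minimal sketch of one way to get the flag through, assuming `LlamaCPP`'s `model_kwargs` are forwarded verbatim to the underlying `llama_cpp.Llama` constructor (the model path and layer count below are placeholders):

```python
from llama_index.llms import LlamaCPP

# Sketch: pass offload_kqv through model_kwargs, which LlamaCPP
# forwards to llama_cpp.Llama at load time.
llm = LlamaCPP(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
    context_window=3900,
    model_kwargs={
        "n_gpu_layers": -1,   # offload all layers to the 3090
        "offload_kqv": True,  # keep the KV cache on the GPU (per issue #1054)
    },
)
```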
How do I exclude node metadata from `IngestionPipeline` transformations? My nodes have lots of metadata for some logging and post-processing tasks; if the metadata gets included in a transformation, it hits the 3900-token limit set in the LlamaCpp configs, so I need to exclude it in transformations that rely on the LLM. I'm trying to use `SummaryExtractor()`, which I have set to use the Mistral 7B model, but the code I try doesn't ever exclude the metadata from what goes to Mistral 7B under `SummaryExtractor()`. My code (a bit duplicative for extra certainty) looks like this:

```python
pipeline = IngestionPipeline(
    transformations=[
        CustomTransformation(),
        SummaryExtractor(
            llm=llm,
            excluded_embed_metadata_keys=[
                DEFAULT_WINDOW_METADATA_KEY,
                DEFAULT_OG_TEXT_METADATA_KEY,
            ],
            excluded_llm_metadata_keys=[
                DEFAULT_WINDOW_METADATA_KEY,
                DEFAULT_OG_TEXT_METADATA_KEY,
            ],
        ),
        service_context.embed_model,
    ]
)

excluded_embed_metadata_keys = [
    DEFAULT_WINDOW_METADATA_KEY,
    DEFAULT_OG_TEXT_METADATA_KEY,
]
excluded_llm_metadata_keys = [
    DEFAULT_WINDOW_METADATA_KEY,
    DEFAULT_OG_TEXT_METADATA_KEY,
]
nodes = pipeline.run(
    nodes=nodes,
    excluded_embed_metadata_keys=excluded_embed_metadata_keys,
    excluded_llm_metadata_keys=excluded_llm_metadata_keys,
)
```
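Those exclusion lists live on the nodes themselves rather than on the extractor or on `pipeline.run`; a minimal sketch, assuming the nodes expose the standard `excluded_llm_metadata_keys` / `excluded_embed_metadata_keys` attributes:

```python
# Sketch: set the exclusion lists on each node before running the pipeline.
# LLM-based extractors serialize node content via metadata_mode, which
# honors these per-node attributes.
for node in nodes:
    node.excluded_llm_metadata_keys = [
        DEFAULT_WINDOW_METADATA_KEY,
        DEFAULT_OG_TEXT_METADATA_KEY,
    ]
    node.excluded_embed_metadata_keys = [
        DEFAULT_WINDOW_METADATA_KEY,
        DEFAULT_OG_TEXT_METADATA_KEY,
    ]

nodes = pipeline.run(show_progress=True, nodes=nodes)
```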
I'm hitting an error with `TitleExtractor()`. My code is:

```python
service_context = ServiceContext.from_defaults(embed_model=embed_model, llm=llm)
pipeline = IngestionPipeline(
    service_context=service_context,
    transformations=[
        SentenceSplitter(),
        TitleExtractor(),
    ]
)
```

I get `Could not load OpenAI model.` but my llm is defined as Llama 2.
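The error suggests the extractor is falling back to a default OpenAI LLM; a minimal sketch of one likely fix, assuming `TitleExtractor` accepts an explicit `llm` argument like the other metadata extractors:

```python
pipeline = IngestionPipeline(
    service_context=service_context,
    transformations=[
        SentenceSplitter(),
        # Pass the Llama 2 LLM explicitly instead of relying on the default.
        TitleExtractor(llm=llm),
    ]
)
```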
I'm following the `lm-format-enforcer` readme: https://docs.llamaindex.ai/en/stable/community/integrations/lmformatenforcer.html#lm-format-enforcer. When defining a program, how do I pass in my list of `Node` or `Document` objects so the program runs in the context of my data? I would have expected that I can pass in `nodes` here:

```python
nodes = node_parser.get_nodes_from_documents(documents)  # existing setup

program = LMFormatEnforcerPydanticProgram(
    output_cls=Album,
    prompt_template_str=(
        "Generate an example album, with an artist and a list of songs. "
        "Using the movie {movie_name} as inspiration. "
        "You must answer according to the following schema: \n{json_schema}\n"
    ),
    llm=LlamaCPP(),
    verbose=True,
)
```
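A minimal sketch of one way to do that (an assumption, not a documented pattern): inline the node text into the prompt through an extra template variable, since the program only sees what its template receives. The `context_str` name is introduced here for illustration:

```python
# Sketch: feed node content to the program via a template variable.
context_str = "\n\n".join(node.get_content() for node in nodes)

program = LMFormatEnforcerPydanticProgram(
    output_cls=Album,
    prompt_template_str=(
        "Context information is below.\n{context_str}\n"
        "Generate an example album, with an artist and a list of songs, "
        "grounded in the context. You must answer according to the "
        "following schema: \n{json_schema}\n"
    ),
    llm=LlamaCPP(),
    verbose=True,
)
output = program(context_str=context_str)
```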
I'm getting `'TextNode' object has no attribute 'get_doc_id'`. My code is:

```python
document: Document = Document(text=txt_str, metadata=metadata or {})  # same approach as SimpleDirectoryReader
documents: list[Document] = [document]

node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
nodes: list[TextNode] = node_parser.get_nodes_from_documents(documents)  # this returns TextNodes rather than a Document

llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    tokenizer_name="HuggingFaceH4/zephyr-7b-beta",
    query_wrapper_prompt=PromptTemplate(
        "<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"
    ),
    context_window=3900,
    max_new_tokens=256,
    # model_kwargs={"quantization_config": quantization_config},
    # tokenizer_kwargs={},
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    device_map="cpu",
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
vector_index = VectorStoreIndex.from_documents(nodes, service_context=service_context)  # Error happens here
```
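`from_documents` expects `Document` objects (it calls `get_doc_id` on each input), so passing `TextNode`s in trips this error; a minimal sketch of the likely fix, building the index from nodes directly:

```python
# Sketch: construct the index from nodes instead of from_documents,
# which is the call that invokes get_doc_id on its inputs.
vector_index = VectorStoreIndex(nodes=nodes, service_context=service_context)
```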
`LLMPredictor` is deprecated. Does that apply to `StructuredLLMPredictor` too? Should we still use code like `llm_predictor = StructuredLLMPredictor()` when writing new code, or is it better to set the `llm` value in the service context?
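A minimal sketch of the service-context route the question mentions, assuming structured output is requested per query engine rather than through a predictor wrapper (`SomePydanticModel` is a placeholder):

```python
# Sketch: attach the LLM via the service context instead of a predictor wrapper.
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Structured output can then be requested on the query engine itself.
query_engine = index.as_query_engine(output_cls=SomePydanticModel)
```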
I'm extracting structured data with this Pydantic model:

```python
class WorkWeek(BaseModel):
    hours_per_week: Optional[float]
    employee_type: Optional[str]
```

I always get back `hours_per_week = 40` and `employee_type = 'full-time'` regardless of whether the document contains relevant text or not. I could create a document that just has the text `This is a test` and I would still get results back that are clearly not from the document, like those shown above. Interestingly, when I simply ask the model for a basic freeform text response, I get an answer that makes sense when the document doesn't mention hours of work (something like `Based on the context, there isn't enough information to answer your question`).
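A minimal mitigation sketch, all assumptions rather than anything from the original question: give the fields explicit `None` defaults and license the model to return nulls in the prompt:

```python
from typing import Optional
from pydantic import BaseModel

class WorkWeek(BaseModel):
    hours_per_week: Optional[float] = None  # default to null instead of forcing a value
    employee_type: Optional[str] = None

# Hypothetical prompt wording: explicitly allow null answers.
prompt = (
    "Extract the work week details from the context. "
    "If the context does not mention a field, return null for it.\n"
    "Context:\n{context_str}\n"
)
```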
My prompt is being formatted with `<<SYS>>` and `<</SYS>>`:

```
<s> [INST] <<SYS>>\n You are a helpful, respectful and honest assistant. Always answer as helpfully as possible and follow ALL given instructions. Do not speculate or make up information. Do not reference any given instructions or context. \n<</SYS>>\n\n Context information is below.\n---------------------\n[Excerpt from document]...
```
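Those tags are the Llama 2 chat format; a minimal sketch of where they typically come from in LlamaIndex, assuming the legacy `llama_utils` helpers are in play (the model path is a placeholder):

```python
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

# Sketch: these helpers produce the <s> [INST] <<SYS>> ... <</SYS>> wrapping
# shown above when formatting messages for Llama-2-style chat models.
llm = LlamaCPP(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # placeholder path
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)
```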
I have a standard `query_engine` defined, and my question is how do I pass the `program` variable (as shown on the documentation page) to the query engine:

```python
### program defined based on the documentation
program = GuidancePydanticProgram(
    output_cls=Result,
    prompt_template_str="Generate a response using the query asked as follows: {{query}}",
    guidance_llm=self.llm,
    verbose=True,
)

### standard query_engine defined, but how do I pass in the program variable
### or get the program to see the vector index?
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
    output_cls=Result,
)
response = query_engine.query(query)
```
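A minimal sketch of one way to connect the two (an assumption, not a documented API): run retrieval yourself and hand the retrieved text to the program through an extra template variable; `{{context_str}}` is introduced here for illustration:

```python
# Sketch: retrieve context manually, then pass it into the guidance program.
retriever = index.as_retriever(similarity_top_k=2)
retrieved_nodes = retriever.retrieve(query)
context_str = "\n\n".join(n.get_content() for n in retrieved_nodes)

program = GuidancePydanticProgram(
    output_cls=Result,
    prompt_template_str=(
        "Context:\n{{context_str}}\n\n"
        "Generate a response using the query asked as follows: {{query}}"
    ),
    guidance_llm=self.llm,
    verbose=True,
)
result = program(context_str=context_str, query=query)
```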