Find answers from the community

theta
hi gurus! I'm looking to try out the create-llama CLI tool, but I'm not sure what it creates or how to integrate what it generates into an existing app. I already have a FastAPI backend, a vector DB, and a graph DB set up, and now I want to build the LlamaIndex layer to manage chunking, insertion, retrieval, etc. I'm using Neo4j for the graph DB, have all the nodes and relationships specified, and I want to use an LLM with LlamaIndex to extract the nodes and their properties. Can someone point me to a tutorial or blog post that presents the process I should adopt? Thanks kindly
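For context, here's roughly what I've pieced together from the docs so far — an untested sketch, with placeholder Neo4j credentials and a placeholder ./data folder — in case it helps frame the question:

from llama_index import KnowledgeGraphIndex, ServiceContext, StorageContext, SimpleDirectoryReader
from llama_index.graph_stores import Neo4jGraphStore
from llama_index.llms import OpenAI

# connect to the existing Neo4j instance (credentials are placeholders)
graph_store = Neo4jGraphStore(username="neo4j", password="password", url="bolt://localhost:7687", database="neo4j")
storage_context = StorageContext.from_defaults(graph_store=graph_store)
service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo"))

# let the LLM extract triplets from the documents and write them into Neo4j
documents = SimpleDirectoryReader("./data").load_data()
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
    max_triplets_per_chunk=2,
)

What I don't see yet is how to constrain the extraction to my predefined node labels and relationship types — that's the part I'd most like a tutorial on.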
6 comments
Morning llama ninjas! Looking for some guidance on how to set up the KnowledgeGraphRAGRetriever for an existing Neo4j database. All the examples I've found start from raw documents and then use LlamaIndex to build a graph index, but I already have a Neo4j database I want to start using directly. I got the Neo4jGraphStore class to connect to my database correctly, but I'm not sure what to do next. Any tips greatly appreciated! cheers 😄
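For reference, here's the wiring I'm experimenting with — an untested sketch, with placeholder credentials — built on top of the Neo4jGraphStore connection I already have:

from llama_index import ServiceContext, StorageContext
from llama_index.graph_stores import Neo4jGraphStore
from llama_index.retrievers import KnowledgeGraphRAGRetriever
from llama_index.query_engine import RetrieverQueryEngine

# reuse the existing graph: no KnowledgeGraphIndex build step
graph_store = Neo4jGraphStore(username="neo4j", password="password", url="bolt://localhost:7687")
storage_context = StorageContext.from_defaults(graph_store=graph_store)
service_context = ServiceContext.from_defaults()

graph_rag_retriever = KnowledgeGraphRAGRetriever(
    storage_context=storage_context,
    service_context=service_context,
    verbose=True,
)
query_engine = RetrieverQueryEngine.from_args(graph_rag_retriever, service_context=service_context)
print(query_engine.query("Tell me about ..."))

Is that the right next step after getting the Neo4jGraphStore connected?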
44 comments
Question about the new pipeline structure. In the example code, you create a custom class to perform a preprocessing step. What happens when we have 10 steps? Do we add all of those steps to the one custom class, or do we make 10 custom classes? What is the recommended approach?
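Assuming this is about the IngestionPipeline (apologies if the question was about a different pipeline), here's the pattern I'm imagining — an untested sketch with two made-up toy steps — rather than one monolithic class:

from llama_index.ingestion import IngestionPipeline
from llama_index.schema import TransformComponent
from llama_index.text_splitter import SentenceSplitter

class LowercaseCleaner(TransformComponent):
    # hypothetical step 1: lowercase the node text
    def __call__(self, nodes, **kwargs):
        for node in nodes:
            node.text = node.text.lower()
        return nodes

class WhitespaceCleaner(TransformComponent):
    # hypothetical step 2: collapse runs of whitespace
    def __call__(self, nodes, **kwargs):
        for node in nodes:
            node.text = " ".join(node.text.split())
        return nodes

# each step is its own small component, chained in order
pipeline = IngestionPipeline(transformations=[SentenceSplitter(), LowercaseCleaner(), WhitespaceCleaner()])
nodes = pipeline.run(documents=documents)

Is a list of small components like this the recommended way, or is one class with 10 internal steps preferred?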
12 comments
Help getting output classes to work. Hoping that someone can take a peek at my code and give me some guidance on how to get output classes to format the response. I'm getting key errors and attribute-not-found errors. In the prompt I instruct the LLM to generate a key/value pair for its classification. How do I debug this? thank you!
from enum import Enum
from pydantic import BaseModel

class TechnicalNoteClassifcations(Enum):
    CRITICAL = "Critical"
    NEW_FEATURE = "New feature"
    SOLUTION_PROVIDED = "Solution provided"
    INFORMATION_ONLY = "Information only"

class TechnicalNoteResponseData(BaseModel):
    classification: TechnicalNoteClassifcations
    summary: str

response = index.as_query_engine(
    text_qa_template=qa_prompt_templates[item],
    similarity_top_k=num_k,
    output_cls=TechnicalNoteResponseData,
).query(f"{title}")
).query(f"{title}")
KeyError: 'classification'
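One idea I had for debugging — an untested sketch, and the prompt text here is made up — is to take retrieval out of the loop and test the schema directly with a pydantic program, to see whether the model can produce the keys at all:

from llama_index.llms import OpenAI
from llama_index.program import OpenAIPydanticProgram

# if this also throws, the problem is the model/schema pairing rather than the query engine
program = OpenAIPydanticProgram.from_defaults(
    output_cls=TechnicalNoteResponseData,
    prompt_template_str="Classify the following technical note and summarize it.\n{note_text}\n",
    llm=OpenAI(model="gpt-3.5-turbo"),
    verbose=True,
)
result = program(note_text="Example note text here")
print(result.classification, result.summary)

Does that seem like a sensible way to isolate it?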
24 comments
llama peeps, can someone explain the relationship between the 'max_tokens' argument of the LLM class versus the 'context_window' and 'num_output' arguments of the PromptHelper? I keep getting this error:
InvalidRequestError: This model's maximum context length is 4097 tokens. However, you requested 4529 tokens (529 in the messages, 4000 in the completion). Please reduce the length of the messages or completion.
my llm definition:
llm = OpenAI(temperature=0.1, model="gpt-3.5-turbo", max_tokens=3000)
my prompt_helper
prompt_helper = PromptHelper(
    context_window=4097,
    num_output=1000,
    tokenizer=tiktoken.encoding_for_model('text-davinci-002').encode,
    chunk_overlap_ratio=0.01,
)
I don't understand the '529' number or where the '4000' is coming from. thanks kindly!
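The arithmetic in the error seems to be: 529 tokens for the rendered prompt (the messages) plus a 4,000-token completion budget reserved by the request, which overflows the 4,097-token window — what I can't see is which setting is reserving 4,000 when my max_tokens is 3,000. My working assumption (untested sketch below) is that max_tokens on the LLM and num_output on the PromptHelper both describe the completion budget and should agree, and that prompt + completion must fit inside context_window:

import tiktoken
from llama_index import PromptHelper, ServiceContext
from llama_index.llms import OpenAI

# keep the two "completion budget" knobs in agreement, and small enough
# that prompt + completion fits in the 4097-token window
llm = OpenAI(temperature=0.1, model="gpt-3.5-turbo", max_tokens=1000)
prompt_helper = PromptHelper(
    context_window=4097,
    num_output=1000,  # matches the llm's max_tokens
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
    chunk_overlap_ratio=0.01,
)
service_context = ServiceContext.from_defaults(llm=llm, prompt_helper=prompt_helper)

Is that the right mental model?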
7 comments
Prompts

hi all, does anyone have any reading materials on building prompts for index construction? In the docs (https://docs.llamaindex.ai/en/stable/core_modules/model_modules/prompts.html#modify-prompts-used-in-index-construction) what exactly do these prompts look like?
Second, is it possible to create prompt templates for query_engines where you can pass a system and a user instruction, or is that only for chat completions? thanks kindly for any guidance!
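To make the second question concrete, this is the kind of thing I'm hoping is possible for a query engine — an untested sketch, with made-up message text and assuming an existing index:

from llama_index.llms import ChatMessage, MessageRole
from llama_index.prompts import ChatPromptTemplate

# a chat-style QA template with an explicit system + user message; the query
# engine fills in {context_str} and {query_str}
chat_text_qa_msgs = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content="You are a precise technical analyst. Answer only from the provided context.",
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="Context information is below.\n---------------------\n{context_str}\n---------------------\nAnswer the question: {query_str}\n",
    ),
]
text_qa_template = ChatPromptTemplate(chat_text_qa_msgs)
query_engine = index.as_query_engine(text_qa_template=text_qa_template)

Does a query engine accept a template like that, or are system/user roles only honored in chat engines?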
6 comments
Hey all, is the prompt framework in LlamaIndex compatible with LangChain prompt solutions? I'm just learning these things and I was wondering if it's possible to use both at the same time. LangChain has a library tool for prompts that looks neat, and I'm curious what is involved in working with the two frameworks together (if that's possible at all) and any considerations I should have going into this phase. I'm currently building a query engine tool that doesn't chat with end users; I just need to build prompts to guide the LLM's answers and output. Thanks kindly for any guidance getting this going!
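From what I can tell there is a LangchainPromptTemplate wrapper in LlamaIndex that adapts a LangChain prompt; here's the shape of what I'm imagining — an untested sketch, with made-up template text, and the variable-mapping direction is my assumption:

from langchain.prompts import PromptTemplate as LangchainTemplate
from llama_index.prompts import LangchainPromptTemplate

# wrap a LangChain prompt and map LlamaIndex's variable names onto its variables
lc_template = LangchainTemplate.from_template(
    "Use the context below to answer.\n{context}\nQuestion: {question}\nAnswer:"
)
li_template = LangchainPromptTemplate(
    template=lc_template,
    template_var_mappings={"context_str": "context", "query_str": "question"},
)
query_engine = index.as_query_engine(text_qa_template=li_template)

Is that roughly how the two are meant to interoperate?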
2 comments
Do we need to specify the embedding function when loading persisted collections from chromadb? Based on the guidance from here and the docs, I was using the following to load chroma collections for use with vector_stores.
vectordb = chromadb.PersistentClient(path="some/path/here")
chroma_collection = vectordb.get_collection('collection_name')  # <-- can we/should we specify an embedding function here? I hadn't noticed it in the docs
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
vector_store_index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
The reason I'm asking is that I tried to access the chroma collection directly, to run some queries and figure out why the query_engine is doing so poorly for me, and when I ran a query from the 'chroma_collection' object it defaulted to chromadb's default embedding function, which is not OpenAIEmbedding. For example, I tried:
data = chroma_collection.query(
    query_texts='some string',
    n_results=5,
    where_document={'$contains': 'CVE-2023-4351'},
    include=['metadatas', 'distances'],
)
Running the above generated an error indicating that the embedding dimensions between the query and the collection didn't match (350 vs 1536). So I next loaded the chroma collection and then passed an embedding function to the chroma "get_collection()" function. Once I did that, I was able to query the chroma collection as expected.
from chromadb.utils import embedding_functions

openai_ef = embedding_functions.OpenAIEmbeddingFunction(api_key=openai.api_key, model_name="text-embedding-ada-002")
collection = vectordb.get_collection(name='msrc_security_update', embedding_function=openai_ef)
data = chroma_collection.query(
    query_texts='some string',
    n_results=5,
    where_document={'$contains': 'CVE-2023-4351'},
    include=['metadatas', 'distances'],
)
Normally, the embedding function is set by the service_context...
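One workaround I'm considering (untested sketch): embed the query myself with the same model the index used and query chroma by vector, so its default embedding function never runs:

from llama_index.embeddings import OpenAIEmbedding

embed_model = OpenAIEmbedding()  # text-embedding-ada-002, 1536 dims
query_vec = embed_model.get_query_embedding("some string")
data = chroma_collection.query(
    query_embeddings=[query_vec],
    n_results=5,
    where_document={'$contains': 'CVE-2023-4351'},
    include=['metadatas', 'distances'],
)

But I'd still like to know whether passing the embedding function to get_collection() is expected when the collection is only used through LlamaIndex.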
3 comments
Question about retrievers and metadata filters. I'm trying to use metadata filters to get the correct nodes with a query engine because I'm finding the results are often wrong. All of my source documents have a metadata key "source" that contains the URL to the document. I tried the following code to implement it in conjunction with a RetrieverQueryEngine, but the results don't appear filtered because I'm still getting back nodes from the wrong documents. Can someone let me know if the code is implemented correctly?
filters = MetadataFilters(
    filters=[
        ExactMatchFilter(
            key="source",
            value="https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-4351",
        )
    ]
)
retriever = VectorIndexRetriever(
    index=vector_store_indicies['msrc_security_update'],
    similarity_top_k=5,
    metadata_filters=filters,
)
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    node_postprocessors=[metadata_replace],
)
response = query_engine.query(
    "fully explain with details 'CVE-2023-4351'",
)
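One thing I'm now wondering (untested sketch below): should the retriever receive the filters via a filters kwarg rather than metadata_filters? If an unknown keyword is silently ignored, that would explain why nothing gets filtered:

from llama_index.retrievers import VectorIndexRetriever
from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(
    filters=[
        ExactMatchFilter(
            key="source",
            value="https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-4351",
        )
    ]
)
retriever = VectorIndexRetriever(
    index=vector_store_indicies['msrc_security_update'],
    similarity_top_k=5,
    filters=filters,  # <- `filters` instead of `metadata_filters`
)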
8 comments
Question on using QueryEngineTools. Reference -> https://github.com/run-llama/llama_docs_bot/blob/main/3_eval_baseline/3_eval_basline.ipynb
In the sample, all the indices are VectorStoreIndex and the description just references the content topics. What if we have a collection of VectorStoreIndex and a collection of SummaryIndex? Does putting "Vector Index" or "Summary Index" in the description mean anything to the LLM, and is it a bad idea to mix index types in this way?
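For what it's worth, here's how I'm currently writing the tools — descriptions that say what each engine is good for rather than what index type backs it (the names and descriptions below are made up):

from llama_index.tools import QueryEngineTool, ToolMetadata

tools = [
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name="msrc_lookup",
            description="Answers specific questions about individual MSRC security updates.",
        ),
    ),
    QueryEngineTool(
        query_engine=summary_query_engine,
        metadata=ToolMetadata(
            name="msrc_summary",
            description="Produces high-level summaries across all MSRC security updates.",
        ),
    ),
]

Is that the right instinct, or does naming the index type in the description actually help the selector?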
2 comments
hi llamatonians, anyone got PineconeVectorStore working? I ran init() with my api_key and environment, and I can list_indexes() and describe_index() the index that exists on Pinecone, but when I call vector_store_index = VectorStoreIndex.from_documents(docs, storage_context=storage_context, service_contenxt=service_context) I always get the following errors:
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
PineconeProtocolError: Failed to connect; did you specify the correct index name?
active_indexes = pinecone.list_indexes()
# -> ['report-vector-store']
pinecone.describe_index("report-vector-store")
# -> IndexDescription(name='report-vector-store', metric='cosine', replicas=1, dimension=1536.0, shards=1, pods=1, pod_type='starter', status={'ready': True, 'state': 'Ready'}, metadata_config=None, source_collection='')
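And here's the full wiring I believe I'm supposed to have — a sketch with placeholder api_key/environment, using the classic pinecone client — in case I've mis-constructed the vector store itself:

import pinecone
from llama_index import StorageContext, VectorStoreIndex
from llama_index.vector_stores import PineconeVectorStore

pinecone.init(api_key="...", environment="...")  # placeholders
pinecone_index = pinecone.Index("report-vector-store")

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
vector_store_index = VectorStoreIndex.from_documents(
    docs, storage_context=storage_context, service_context=service_context
)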
48 comments
Got a simple question about using the storage context for multiple indices. So far I've only created/used a single index, but now I want to try a SummaryIndex alongside my VectorStoreIndex. In the storage context I'm using, I define the docstore, index store, and vector store, so I understand how all of that relates to the VectorStoreIndex. But now I want to create a SummaryIndex, and I'm unclear whether I create a new storage context or just pass my existing storage context to the constructor of the SummaryIndex. When I persist the storage context, what controls the file name of the SummaryIndex data structure? For example, for the docstore I can pass a file path for whatever I want the docstore to be called, but there is no 'summary_store' argument. Or is there? Right now I'm using the same documents as the vector store. Thanks for any clarity!
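Here's what I'm guessing the pattern is — an untested sketch, assuming both indices can share the one storage context and the index_store records one entry per index_id (so there's no separate 'summary_store' file to name):

from llama_index import SummaryIndex, VectorStoreIndex

vector_index = VectorStoreIndex(nodes, storage_context=storage_context, service_context=service_context)
summary_index = SummaryIndex(nodes, storage_context=storage_context, service_context=service_context)

# give each index a stable id so it can be found in index_store.json later
vector_index.set_index_id("vector_index")
summary_index.set_index_id("summary_index")
storage_context.persist(persist_dir="./storage")

Is that right, or should each index type get its own storage context?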
21 comments
anyone successfully persist and load a chroma vector_store + docstore? I've tried everything I can think of but can never load anything back. Please help, I'm totally stuck. After running vector_index.storage_context.persist(persist_dir=persist_dir) I can see the "chroma.sqlite3" file and the docstore/index_store files, but I can never get any of it to load back. I've tried the following to no avail:

persist_dir2 = "C:/projects/technical-notes-llm-report/data/06_models/"
chroma_client2 = chromadb.PersistentClient(path=persist_dir2)
chroma_collection2 = chroma_client2.get_or_create_collection(collection_name)
vector_store2 = ChromaVectorStore(chroma_collection=chroma_collection2)
storage_context2a = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_path("C:/projects/technical-notes-llm-report/data/06_models/docstore.json"),
    vector_store=vector_store2,
    index_store=SimpleIndexStore.from_persist_path("C:/projects/technical-notes-llm-report/data/06_models/index_store.json"),
)
vector_index2 = VectorStoreIndex.from_vector_store(vector_store2, storage_context=storage_context2a, service_context=service_context, store_nodes_override=True)
vector_index3a = VectorStoreIndex([], storage_context=storage_context2a, store_nodes_override=True)
vector_index3a.ref_doc_info   # -> {}
vector_index3b.ref_doc_info   # -> {}
docstorea = storage_context2a.docstore
docstorea.get_all_ref_doc_info()   # -> {}

thanks for any insight!
6 comments
Hi everyone, can someone point me to tutorials or videos that walk through how to manage a vector store and index store once we've created them and then need to maintain them as new data comes in? The part I don't really get right now: documents are chunked into nodes, so if data changes in the parent document, how do we propagate those changes to the nodes? In my case, only the metadata of the documents changes. And when we add new documents, what are the steps required? I'm using chromadb to store the index and the embeddings, and I'm trying to build pipelines to handle the maintenance (a sketch of what I have in mind follows below). Any tips or wisdom greatly appreciated!
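The closest thing I've found so far is refresh_ref_docs(); here's the sketch I'm planning to build the pipeline around — it assumes every document carries a stable doc_id (I'm using the source URL) so changed docs replace their old nodes and unchanged docs are skipped:

# give each document a stable id before refreshing
for doc in new_or_updated_documents:
    doc.doc_id = doc.metadata["source"]

refreshed = vector_index.refresh_ref_docs(new_or_updated_documents)
print(refreshed)  # one boolean per doc: True where it was (re)inserted

Is that the intended maintenance path, and does it also cover metadata-only changes?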
23 comments
Azure

Hi everyone! I'm getting started with LlamaIndex and really liking it so far. Has anyone been successful creating a storage_context backed by an Azure blob container? I was able to persist() the index no problem using an fsspec instance, and I can list the files with that same instance. However, when I try to create a storage context from it, I only get an HttpResponseError (no resource).

fs1 = fsspec.filesystem("abfs", account_name="name", account_key="key")
AZURE_CONTAINER = "report-stores"
sentence_index.storage_context.persist(persist_dir=f'{AZURE_CONTAINER}', fs=fs1)
print(fs1.ls(AZURE_CONTAINER))
# ['report-stores/docstore.json', 'report-stores/graph_store.json', 'report-stores/index_store.json', 'report-stores/vector_store.json']
sc = StorageContext.from_defaults(persist_dir=f'{AZURE_CONTAINER}', fs=fs1)  # <-- HttpResponseError

Does anyone have any recommendations? cheers!
6 comments
Hi Logan, I built a Neo4j graph database separately from the KnowledgeGraphIndex, so I inserted all the Cypher on my own. Now I want to use a SubQuestionQueryEngine to merge a VectorStoreIndex and my graph store, but the documentation says I have to pass query engines. As far as I understand it, I need to use the KnowledgeGraphRAGRetriever to access my graph database. Can I use a retriever as a tool for SubQuestionQueryEngine? If I need to pass a query engine, is there an easy way to use my existing graph data to populate an index? PS. I know you're refactoring the knowledge graph stuff, so this can wait until that is released. Thanks kindly!
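Here's the shape of what I'm hoping is allowed — an untested sketch where the retriever gets wrapped into a query engine first and then exposed as a tool (tool names/descriptions are made up):

from llama_index.query_engine import RetrieverQueryEngine, SubQuestionQueryEngine
from llama_index.tools import QueryEngineTool, ToolMetadata

# promote the graph retriever to a query engine, then wrap both engines as tools
graph_query_engine = RetrieverQueryEngine.from_args(graph_rag_retriever, service_context=service_context)
tools = [
    QueryEngineTool(
        query_engine=vector_index.as_query_engine(),
        metadata=ToolMetadata(name="vector", description="Chunk-level lookups over the documents."),
    ),
    QueryEngineTool(
        query_engine=graph_query_engine,
        metadata=ToolMetadata(name="graph", description="Entity and relationship questions against the Neo4j graph."),
    ),
]
sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=tools, service_context=service_context
)

Is that a reasonable way to bring the existing graph in, or is populating a KnowledgeGraphIndex from the existing data the better route?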
8 comments
Anyone had experience using VectorIndexAutoRetriever and getting it to work? I think I have the syntax correct, but when I try to run the query_engine I get an opaque error I'm not sure how to debug.
for item in collection_names:
    retriever = VectorIndexAutoRetriever(
        index=vector_store_indicies[item],
        vector_store_info=index_infos['vector_index'][item],
        prompt_template_str=retriever_prompt_strings[item],
        similarity_top_k=num_k,
        max_top_k=5,
    )
    response_synthesizer = TreeSummarize(summary_template=qa_prompt_templates[item])
    query_engine = RetrieverQueryEngine(
        retriever=retriever,
        response_synthesizer=response_synthesizer,
        node_postprocessors=[metadata_replace],
    )
That code appears to be valid, but when the interpreter gets to the following, it chokes:
response = query_engine.query(
f"{query_str}",
)
-> in pydantic.main.BaseModel.parse_obj:522
TypeError: 'NoneType' object is not iterable
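One thing I'm double-checking (sketch below, with made-up descriptions): whether my vector_store_info entries are well-formed, since the auto-retriever asks the LLM to emit a structured query spec from them and a malformed spec might parse to None:

from llama_index.vector_stores.types import MetadataInfo, VectorStoreInfo

vector_store_info = VectorStoreInfo(
    content_info="MSRC security update notes",
    metadata_info=[
        MetadataInfo(
            name="source",
            type="str",
            description="Full URL of the source document, containing the CVE id",
        ),
    ],
)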
2 comments
hi all llama peeps 🙂 any chromadb masters out here? What do we do when the nodes returned by a query_engine are all from the wrong documents? I'm creating Chroma-backed VectorStoreIndices, and when I build the query_engine and pass some sample queries, sometimes the results are from the correct document and sometimes they are from the wrong documents. All my documents have a metadata key 'source' with the full URL, which contains a unique code ("CVE-2023-36898", for example), and that code is mentioned again in the very first few sentences at the top of each document. I don't understand how, with such specific strings in each document, so many returned nodes can be so far off.
  • Do I need to increase the top_k and then use a reranker? (A sketch of that follows after the code below.)
  • When I created the chroma collections, I set them to cosine distance.
  • Is there a way to improve Chroma's accuracy?
I'm using the following:
query_engine = vector_store_indicies['msrc_security_update'].as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[metadata_replace],
    response_mode="tree_summarize",
)
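The reranker idea I mentioned above would look something like this, I think — an untested sketch: over-retrieve, then let a cross-encoder pick the best few:

from llama_index.postprocessor import SentenceTransformerRerank

rerank = SentenceTransformerRerank(model="cross-encoder/ms-marco-MiniLM-L-2-v2", top_n=5)
query_engine = vector_store_indicies['msrc_security_update'].as_query_engine(
    similarity_top_k=20,
    node_postprocessors=[metadata_replace, rerank],
    response_mode="tree_summarize",
)

Would that be the recommended fix here, or is there something more fundamental wrong with how the collection was built?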
1 comment
oh man 😄 So I'm trying to create a basic SummaryIndex and build a query_engine from it. As far as I can tell, everything that's required is there, but the query_engine only returns "Empty Response". Can anyone spot anything?
docstore = MongoDocumentStore.from_uri(db_name="report_docstore", namespace=f"docstore_{item}", uri="mongodb+srv://")

index_store = MongoIndexStore.from_uri(db_name="report_docstore", namespace=f"index_store_{item}", uri="mongodb+srv://")

storage_context=StorageContext.from_defaults(docstore=docstore, index_store=index_store)

service_context=ServiceContext.from_defaults(embed_model=OpenAIEmbedding(), callback_manager=callback_manager, node_parser=node_parser)

When I use:
index = SummaryIndex.from_documents(docs_for_collection, storage_context=storage_context, service_context=service_context)

The documents are correctly chunked by node_parser and both the nodes and index data are stored in Mongo.

The Llama chatbot says I'm supposed to use the following:
nodes = node_parser.get_nodes_from_documents(docs_for_collection)
docstore.add_documents(nodes)

I've tried both approaches. It doesn't seem like I need to do both, since when running .from_documents() the documents are pushed to Mongo...

regardless, when I try to create a query engine from the index, I always get an empty response.

I can confirm that there are TextNodes in the docstore (docstore.docs returns a dict) and when I look at the index_store/data collection in Mongo, I see:
{"index_id": "msrc_security_update", "summary": null, "nodes": ["9e2fa74a-cc21-411c-9d6b-bc4b15aba5b7", "efb84b1c-499a-47e3-abc3-7316722e262c", "39df1ae1-b3fe-440b-ade4-56b225ac3f74", "05ecc315-065d-4850-8db8-19382daaf52b"]}

but when I run:
response = index.as_query_engine().query("What documents do you have access to?")
print(response)

it's always:
**
Trace: query
|_query -> 0.0 seconds
|_retrieve -> 0.0 seconds
**
Empty Response
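One thing I plan to try (untested sketch): in a fresh session, load the index back from the Mongo-backed stores by its id instead of re-constructing it, in case the query engine I'm building is sitting on an index object that has no nodes:

from llama_index import load_index_from_storage

index = load_index_from_storage(
    storage_context=storage_context,
    index_id="msrc_security_update",
    service_context=service_context,
)
print(index.index_struct.nodes)  # should list the same node ids stored in Mongo

Does that sound like the likely culprit?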
24 comments
is it possible to pull out the nodes from a chroma vector store that were created when we use either .from_documents() or .refresh_ref_docs(), for use in another type of index? I want to save on computing embeddings and just insert the nodes + embeddings from the first index into another, for example when creating a basic SummaryIndex or a TreeIndex. Or is it better to compute embeddings manually, store them separately ahead of time, and then build the various indices? cheers!
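Here's the reuse pattern I have in mind — a sketch that assumes the nodes were actually kept in the docstore (e.g. the index was built with store_nodes_override=True), otherwise there's nothing to pull out:

from llama_index import SummaryIndex

# reuse the already-parsed (and already-embedded) nodes for a second index
nodes = list(storage_context.docstore.docs.values())
summary_index = SummaryIndex(nodes, storage_context=storage_context, service_context=service_context)

Is that the sanctioned way, or is precomputing and storing the embeddings separately the better practice?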
9 comments
How do I inspect the reference docs that I'm attempting to insert into a chroma vector store? I'm creating a persistent chroma vector store and then loading it and refreshing the ref docs with the following, but I can't tell if the docs I'm passing to refresh are actually being stored and saved to disk properly.
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_dir(storage_params["persist_dir"]),
    vector_store=vector_store,
    index_store=SimpleIndexStore.from_persist_dir(storage_params["persist_dir"]),
)
service_context = ServiceContext.from_defaults(
    callback_manager=callback_manager,
    llm=llm,
    embed_model=OpenAIEmbedding(embed_batch_size=50),
    node_parser=node_parser,
)
vector_index = VectorStoreIndex(
    [],
    storage_context=storage_context,
    service_context=service_context,
    store_nodes_override=vector_index_params["store_nodes_override"],
)
results = vector_index.refresh_ref_docs(data)
Now that I've run refresh_ref_docs(), how do I verify the refresh worked and persisted the docs/nodes/embeddings? Thanks kindly!
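These are the checks I've come up with so far (a sketch — chroma_collection is the same collection the vector store wraps), but I'm not sure they're the right ones:

print(results)                              # refresh_ref_docs() returns one bool per input doc
print(chroma_collection.count())            # embeddings actually sitting in chroma
print(len(storage_context.docstore.docs))   # nodes kept in the docstore
print(vector_index.ref_doc_info.keys())     # ref doc ids the index knows about

# the Simple* stores only hit disk when persisted explicitly
storage_context.persist(persist_dir=storage_params["persist_dir"])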
4 comments
Hi everyone, is there a good tutorial on how to use a docstore + vector DB and an LLM to generate reports instead of a chatbot? Each week we gather documents from various sources and insert them into our chromadb. Now I want to create a report that describes/summarizes/elevates the posts that came in over the week. I'd like to have the LLM go over the posts for the week and generate a summary, and then output the top 5 or 10 posts for the week. Is this where something like a pydantic data model is used? Do I query chroma for the posts of the week and then pass those document IDs to the LLM, or does the LLM handle that? Any wisdom or recommendations are greatly appreciated! thanks!
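The rough shape I'm imagining — a sketch that assumes each document carries a 'published' date in its metadata (mine do) — is to select the week's documents and then summarize them with a SummaryIndex:

from llama_index import SummaryIndex

weekly_docs = [d for d in all_documents if d.metadata.get("published", "") >= "2023-10-02"]
weekly_index = SummaryIndex.from_documents(weekly_docs, service_context=service_context)
report = weekly_index.as_query_engine(response_mode="tree_summarize").query(
    "Summarize this week's posts and list the 5 most significant items, one line each."
)
print(report)

Is that the right direction, or is this where a pydantic output class / structured program is the better tool?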
3 comments
Chroma

I am trying to use Chromadb to store the vectors and documents. The chroma documentation says chroma uses a default sentence transformer to chunk and a default embedding function to create embeddings when you add documents. My question is: when we use a LlamaIndex service_context that specifies, for example, the SentenceWindowNodeParser to create the chunks/nodes, does that mean LlamaIndex chunks the documents into nodes, computes the embeddings, and then passes the node embeddings, the original documents, and the LlamaIndex-created nodes to chroma without chroma performing its own processing? Does the embed_model argument of the LlamaIndex service_context override chroma's embedding-function setting? Sorry if that sounds convoluted! I'm just not clear where the work is happening and who is doing it... LOL! 😆
PS. According to the chroma docs, you're supposed to pass an embedding function when you create or load a collection; does LlamaIndex manage this on our behalf?
collection = client.get_collection(name="my_collection", embedding_function=emb_fn)  # <- from the chroma documentation
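My current mental model — please correct me if it's wrong — is that LlamaIndex does both the chunking (node_parser) and the embedding (embed_model) and then writes precomputed vectors into the collection, so chroma's default embedding function never runs for these inserts. A sketch of the setup I mean:

from llama_index import ServiceContext, StorageContext, VectorStoreIndex
from llama_index.embeddings import OpenAIEmbedding
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index.vector_stores import ChromaVectorStore

service_context = ServiceContext.from_defaults(
    embed_model=OpenAIEmbedding(),
    node_parser=SentenceWindowNodeParser.from_defaults(window_size=3),
)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)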
2 comments
Is there a way to control the message length when using one of the metadata extractors? I'm trying to use the summary extractor on a set of nodes I've created, and every time I try to process them I get the following error:
InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4513 tokens. Please reduce the length of the messages.
I'm using a sentence node parser with a window of 3, so the nodes are already small. Nevertheless, I keep wasting a lot of tokens because I'll be about 300 nodes in when the process terminates. How do I control the length of the message for the metadata extractors? Cheers!
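Two levers I'm considering (untested sketch — the import path may differ by version, and the excluded keys assume the sentence-window parser's 'window'/'original_text' metadata): only summarize the node itself, and hide the bulky window metadata from the LLM:

from llama_index.extractors import SummaryExtractor  # older releases: llama_index.node_parser.extractors
from llama_index.llms import OpenAI

extractor = SummaryExtractor(
    summaries=["self"],                            # skip prev/next sections to shrink the prompt
    llm=OpenAI(model="gpt-3.5-turbo", max_tokens=256),
)
for node in nodes:
    node.excluded_llm_metadata_keys = ["window", "original_text"]

metadata_list = extractor.extract(nodes)

Would that actually reduce the per-node message size, or is there a dedicated knob for it?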
15 comments