Here's the code:
# Version-dependent imports (llama_index 0.4/0.5-era API; adjust paths to your install)
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from llama_index import LangchainEmbedding, LLMPredictor, PromptHelper, ServiceContext, SimpleDirectoryReader, GPTSimpleVectorIndex, GPTSimpleKeywordTableIndex, GPTKnowledgeGraphIndex
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter
from llama_index.node_parser import SimpleNodeParser

# Embedding model, chunking, and the service context shared by all indexes below
embed_model = LangchainEmbedding(OpenAIEmbeddings(query_model_name="text-embedding-ada-002"))
chunk_len = 256
chunk_overlap = 32
splitter = TokenTextSplitter(chunk_size=chunk_len, chunk_overlap=chunk_overlap)
node_parser = SimpleNodeParser(text_splitter=splitter, include_extra_info=True, include_prev_next_rel=False)
llm_predictor_gpt3 = LLMPredictor(llm=ChatOpenAI(temperature=0.2, model_name='gpt-3.5-turbo', max_tokens=2000))
prompt_helper_gpt3 = PromptHelper.from_llm_predictor(llm_predictor=llm_predictor_gpt3)
service_context_gpt3 = ServiceContext.from_defaults(llm_predictor=llm_predictor_gpt3, prompt_helper=prompt_helper_gpt3, embed_model=embed_model, node_parser=node_parser, chunk_size_limit=chunk_len)
Vector:
reader = JSONReader()
documents = SimpleDirectoryReader('/content/drive/Shareddrives/AI/docs').load_data()
index_conf_vec = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context_gpt3)
Keyword:
reader = JSONReader()
documents = SimpleDirectoryReader('/content/drive/Shareddrives/AI/docs').load_data()
index_conf_kw = GPTSimpleKeywordTableIndex.from_documents(documents, service_context=service_context_gpt3)
Knowledge Graph:
reader = JSONReader()
documents = SimpleDirectoryReader('/content/drive/Shareddrives/AI/docs').load_data()
index_conf_kg = GPTKnowledgeGraphIndex.from_documents(documents, max_triplets_per_chunk=3, service_context=service_context_gpt3)
index_conf_kg_embedded = GPTKnowledgeGraphIndex.from_documents(documents, max_triplets_per_chunk=3, service_context=service_context_gpt3, include_embeddings=True)
Your chunk size is pretty small. Maybe try decreasing the triplets per chunk to 1, or increasing the chunk size?
Each chunk is a call to the LLM, so if you have a large index it can take some time (especially when OpenAI might already be slow).
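For example, a rough sketch of those two knobs, reusing the setup above (the values and the service_context_big name are just illustrative):
# Illustrative only: bigger chunks + fewer triplets per chunk = fewer LLM extraction calls
chunk_len = 512
splitter = TokenTextSplitter(chunk_size=chunk_len, chunk_overlap=chunk_overlap)
node_parser = SimpleNodeParser(text_splitter=splitter, include_extra_info=True, include_prev_next_rel=False)
service_context_big = ServiceContext.from_defaults(llm_predictor=llm_predictor_gpt3, prompt_helper=prompt_helper_gpt3, embed_model=embed_model, node_parser=node_parser, chunk_size_limit=chunk_len)
index_conf_kg = GPTKnowledgeGraphIndex.from_documents(documents, max_triplets_per_chunk=1, service_context=service_context_big)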
Makes sense. Does increasing the triplets per chunk also increase the time?
a little bit, but mostly it's the number of chunks
Can you briefly explain the difference between GPTKnowledgeGraphIndex.from_documents(documents) and GPTKnowledgeGraphIndex.from_documents(documents, include_embeddings=True)?
One will generate embeddings for each triplet; the other just extracts the triplets themselves. Extracting triplets is done with LLM calls, which can be slow-ish.
Then at query time, it extracts keywords from the query and uses those keywords to find triplets that overlap with the query keywords. If you use embeddings, it will also return triplets that have similar embeddings.
If include_text=True is in your query call, it will use the text where those triplets were found to generate an answer. If it's False, it will use only the triplets themselves to generate an answer.
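Roughly, with the old index.query API (exact kwargs can differ between versions, so treat this as a sketch):
# keyword-based triplet retrieval, answer generated from the triplets alone
response = index_conf_kg.query("your question", include_text=False)
# embedding-built index: also retrieve triplets by embedding similarity, and pull in
# the source text the triplets came from when generating the answer
response = index_conf_kg_embedded.query("your question", include_text=True, embedding_mode="hybrid", similarity_top_k=5)
print(response)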
So without include_text=True it won't use the text those triplets were generated from? Sounds like that defeats my purpose.
The doc that took around 1 min to index with the tree, vector, and keyword indexes still isn't done with KG after 30 min; it's still running.
Yea, because all those other indexes don't make as many LLM calls as the KG lol. I wonder if OpenAI is throttling the requests too.
Would you recommend KG for really small docs only? The first doc is still running at 1h35m, and I have 3 docs to do each with and without embeddings, so I guess I'll cancel that lol.
I've really only experimented with the Paul Graham essay example + the NYC Wikipedia page lol
In my opinion, it's more useful when you have a dedicated model that extracts the triplets for you, or an existing ontology. Then you don't have to rely on the LLM to extract triplets (slow and uses tokens), and you can insert the triplets directly.
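Something along these lines; this assumes your version exposes an upsert_triplet method on the KG index (later versions call the class KnowledgeGraphIndex), so double-check before relying on it:
# Hypothetical sketch: start from an empty KG index and insert pre-extracted triplets,
# skipping the LLM extraction step entirely
index_manual_kg = GPTKnowledgeGraphIndex.from_documents([], service_context=service_context_gpt3)
my_triplets = [("LlamaIndex", "is", "a data framework")]  # from your own extractor or ontology
for triplet in my_triplets:
    index_manual_kg.upsert_triplet(triplet)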
Gotcha, skipping KG for now. Have enough to playground-around with.