```python
from llama_index import SimpleDirectoryReader, ServiceContext
from llama_index.llms import OpenAI

# wiki_titles is the list of city page names defined earlier
city_docs = {}
for wiki_title in wiki_titles:
    city_docs[wiki_title] = SimpleDirectoryReader(
        input_files=[f"data/{wiki_title}.txt"]
    ).load_data()

# Use a deterministic LLM for all indices and agents
llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
service_context = ServiceContext.from_defaults(llm=llm)
```
2. **Build Document Agent for each Document**: Next, define a "document agent" for each document. Each document gets both a vector index (for semantic search) and a summary index (for summarization). The two query engines are then wrapped as tools and passed to an OpenAI function-calling agent, which can dynamically choose whether to perform semantic search or summarization within its document. Here's the code snippet for this step (the agent assembly itself is sketched just after it):
```python
from llama_index.agent import OpenAIAgent
from llama_index import (
    VectorStoreIndex,
    load_index_from_storage,
    StorageContext,
)
from llama_index.node_parser import SimpleNodeParser
import os

node_parser = SimpleNodeParser.from_defaults()

agents = {}
query_engines = {}
all_nodes = []

for idx, wiki_title in enumerate(wiki_titles):
    nodes = node_parser.get_nodes_from_documents(city_docs[wiki_title])
    all_nodes.extend(nodes)

    if not os.path.exists(f"./data/{wiki_title}"):
        # Build the vector index and persist it to disk
        vector_index = VectorStoreIndex(nodes, service_context=service_context)
        vector_index.storage_context.persist(persist_dir=f"./data/{wiki_title}")
    else:
        # Reload the previously persisted vector index
        vector_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=f"./data/{wiki_title}"),
            service_context=service_context,
        )
```
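The snippet above only builds (or reloads) the per-document vector index. To finish the document agent described in step 2, each document also needs a summary index, two query-engine tools, and the OpenAI function-calling agent itself. Here is a minimal sketch of that continuation, meant to run inside the same loop once `vector_index` is available; the tool names and descriptions (`vector_tool`, `summary_tool`) are illustrative placeholders, not fixed API values:

```python
from llama_index import SummaryIndex
from llama_index.tools import QueryEngineTool, ToolMetadata

# Inside the for-loop above, once vector_index has been built or reloaded:
summary_index = SummaryIndex(nodes, service_context=service_context)

vector_query_engine = vector_index.as_query_engine()
summary_query_engine = summary_index.as_query_engine()

query_engine_tools = [
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name="vector_tool",  # illustrative name
            description=f"Useful for specific factual questions about {wiki_title}.",
        ),
    ),
    QueryEngineTool(
        query_engine=summary_query_engine,
        metadata=ToolMetadata(
            name="summary_tool",  # illustrative name
            description=f"Useful for summarizing everything about {wiki_title}.",
        ),
    ),
]

# The agent dynamically picks between semantic search and summarization per query
agent = OpenAIAgent.from_tools(query_engine_tools, llm=llm, verbose=True)
agents[wiki_title] = agent
query_engines[wiki_title] = vector_index.as_query_engine()
```

After the loop, each document agent can answer questions about its own document, e.g. `agents["Boston"].chat("Give me a summary of Boston")` (assuming `"Boston"` is one of your `wiki_titles`).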