
HABBYMAN
Offline, last seen 2 months ago
Joined September 25, 2024
If I have multiple indexes to query against, is it advisable to compose a graph or to merge them into one index?
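For the graph route, a minimal sketch using the legacy llama_index composability API (the two sub-indexes and their summaries below are placeholders):
Python
from llama_index import ListIndex
from llama_index.indices.composability import ComposableGraph

# Compose two existing indexes under a root list index; the summaries
# tell the graph what each sub-index covers so queries get routed.
graph = ComposableGraph.from_indices(
    ListIndex,
    [sales_index, support_index],
    index_summaries=[
        "Questions about sales documents",
        "Questions about support tickets",
    ],
)
response = graph.as_query_engine().query("your question here")

Merging into one index is simpler when the documents are homogeneous; a graph tends to help when the sub-indexes cover clearly distinct corpora.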
6 comments
Hey guys - I'm looking to leverage the GoogleDriveReader. The application I'm building is multi-tenanted, and I have a service that indexes each organisation's selected Drive folder. I've just noticed that this loader requires a credentials.json file to index. I already have the user's access_token; is there a way to pass it directly without creating the credentials file every time? I figure if two people make a request at the same time, this file will just get overwritten?
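One possible workaround (a sketch, not a documented loader feature): build the Google credentials object in memory from the stored access token and call the Drive API directly, so no shared credentials.json is ever written. `user_access_token` and `folder_id` are placeholders:
Python
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

# Per-request, in-memory credentials built from the tenant's stored
# access token; nothing touches disk, so concurrent requests can't
# clobber each other's files.
creds = Credentials(token=user_access_token)
drive = build("drive", "v3", credentials=creds)
files = (
    drive.files()
    .list(q=f"'{folder_id}' in parents", fields="files(id, name)")
    .execute()
)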
5 comments
Does anyone have any ideas around best practices to ingest documents with Slack?

I want to retain as much metadata as possible for individual messages, but I want document retrieval to maintain the context of the entire conversation. I've tried this in two ways:

  1. Each message is its own document. I can successfully store the permalink, timestamp, and user in the metadata, but conversational context is lost.
  2. Multiple messages are stored per document. I can't link to individual messages, and the user/timestamp metadata isn't stored because there are multiple messages in the doc.
Any ideas how to solve this problem?
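One approach worth sketching, assuming the messages have already been fetched with slack_sdk (field names are illustrative): keep one Document per message so the per-message metadata survives, and also stamp each message with its channel and thread so the surrounding conversation can be regrouped at retrieval time:
Python
from llama_index import Document

# One Document per message keeps permalink/user/timestamp metadata
# intact; channel and thread ids let a retrieval step pull the
# neighbouring messages back in for conversational context.
documents = [
    Document(
        text=msg["text"],
        extra_info={  # called `metadata` in newer llama_index versions
            "permalink": msg["permalink"],
            "user": msg["user"],
            "timestamp": msg["ts"],
            "channel": msg["channel"],
            "thread_ts": msg.get("thread_ts", msg["ts"]),
        },
    )
    for msg in messages
]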
3 comments
Looks like it can't find an id for the metadata?
4 comments
Hi all, I’m attempting to build a tool that allows users to upload various documents to an S3 bucket, and then an API and front end that can allow a user to query those documents after they have been stored and processed.

My understanding of AI / LlamaIndex is limited; I'm coming from a backend Golang discipline and trying to learn the ropes. My proposed architecture is this:

API Upload
  • Upload documents to a backend server, which forwards them to S3.
Upload Processing
  • An AWS S3 event triggers a Python script (example below) to process the documents and store the nodes + indexes. This is where my lack of knowledge comes in: how can I make this storage step work so that users don't need to re-index these documents, and so everything is faster? (See the sketch below.)
Process Completed Notifications
  • Alert users that their documents are now queryable
Front-end
  • Query documents
Firstly, is my understanding of the LlamaIndex project accurate?
Secondly, is my application of these technologies correct?
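The read of LlamaIndex here is broadly right. On the re-indexing concern, the usual pattern is to embed and persist once at ingest time, then only load at query time. A minimal sketch of the S3-triggered step, assuming the legacy llama_index API (bucket, key, and paths are placeholders):
Python
import os

import boto3
from llama_index import SimpleDirectoryReader, VectorStoreIndex

def handle_s3_event(bucket: str, key: str) -> None:
    """Triggered by the S3 upload event: index the new object once, then persist."""
    os.makedirs("/tmp/docs", exist_ok=True)
    s3 = boto3.client("s3")
    s3.download_file(bucket, key, os.path.join("/tmp/docs", os.path.basename(key)))

    # Embed and persist here so query-time code only loads, never re-embeds.
    documents = SimpleDirectoryReader("/tmp/docs").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir="/tmp/index")
    # From here, sync /tmp/index back to S3, or point a StorageContext
    # at a hosted vector store (e.g. Pinecone) instead of local files.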
1 comment
Hi all, I have a question about the SlackReader. Currently it ingests multiple messages per document. Is there a built-in method to get one document per message? I need to add metadata to each document, such as the permalink, author, etc.
2 comments
I've got the Confluence loader successfully pulling documents down, but when I attempt to create a vector store index I get the following error:
Plain Text
ERROR:root:error: 'tuple' object has no attribute 'get_doc_id'


here is the code before that:
Python
import logging
import os

import pinecone
from llama_index import download_loader
from llama_index.vector_stores import PineconeVectorStore

c = download_loader('ConfluenceReader')
reader = c(base_url=r["base_url"])
documents = reader.load_data(space_key=r["space_key"], include_attachments=False, page_status="current")
logging.info("downloading documents")
for document in documents:  # was `for documents in documents:`, which rebinds the list
    # TODO: fix this with actual values
    logging.info("adding confluence link to document")

logging.info("storing in pinecone")
logging.info("pinecone index name: " + os.environ['PINECONE_INDEX_NAME'])
logging.info("pinecone environment: " + os.environ['PINECONE_ENVIRONMENT'])
pinecone.init(api_key=os.environ['PINECONE_API_KEY'], environment=os.environ['PINECONE_ENVIRONMENT'])
pinecone.Index("astoria").delete(delete_all=True, namespace=workspace_id + "-confluence")

vector_store = PineconeVectorStore(
    index_name=os.environ['PINECONE_INDEX_NAME'],
    environment=os.environ['PINECONE_ENVIRONMENT'],
    namespace=workspace_id + "-confluence",
)


Any ideas what's breaking?
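The snippet stops before the index is actually built, and get_doc_id is called on each item when an index is constructed from a documents list, so the error suggests something other than a Document (e.g. a tuple) ended up in that list. A sketch of that missing step with a guard, assuming the legacy llama_index API:
Python
from llama_index import Document, StorageContext, VectorStoreIndex

# Filter out anything that is not a Document before indexing; a stray
# tuple in `documents` produces exactly the get_doc_id error above.
docs = [d for d in documents if isinstance(d, Document)]

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)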
13 comments
👋 I'm trying to get the Confluence loader to work with an OAuth 2.0 (3LO) Confluence app.

I've set up the application with the correct callback URL and scopes, but when I use the access token with the loader, I get the following error:
Plain Text
{"message":"Current user not permitted to use Confluence","statusCode":403}


I'm setting up the loader as follows:
Python
import logging
import os

from llama_index import download_loader

token = {
    "access_token": result["access_token"],
    "token_type": "Bearer",
}
oauth2_dict = {
    "client_id": os.environ['CONFLUENCE_CLIENT_ID'],
    "token": token,
}

logging.info("oauth2 dict: " + str(oauth2_dict))  # note: this logs the raw access token
c = download_loader('ConfluenceReader')
reader = c(base_url=r["base_url"], oauth2=oauth2_dict)

Does anyone have any pointers as to why this doesn't work?
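One thing worth checking (an assumption, not confirmed by the post): with OAuth 2.0 (3LO), Atlassian routes API calls through api.atlassian.com with the site's cloud ID rather than the site's own URL, and using the site base_url with a 3LO token can return exactly this 403. A sketch of resolving that URL:
Python
import requests

# Look up the cloud id for the site this token was authorised against.
resp = requests.get(
    "https://api.atlassian.com/oauth/token/accessible-resources",
    headers={"Authorization": f"Bearer {result['access_token']}"},
)
cloud_id = resp.json()[0]["id"]

# 3LO requests are routed through api.atlassian.com, not the site URL.
base_url = f"https://api.atlassian.com/ex/confluence/{cloud_id}"
reader = c(base_url=base_url, oauth2=oauth2_dict)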
6 comments

Top_K

When I build my index, I'm adding a URL to each document's extra_info.

When a response is returned, it contains multiple source_nodes, one of which is the node I need to pull the link from. Is there a way I can select only this node, or have only it returned as the source node?
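Two sketches that might fit, assuming the legacy llama_index API ("url" stands in for whatever key was stored in extra_info):
Python
# Option 1: only retrieve the single best node in the first place.
query_engine = index.as_query_engine(similarity_top_k=1)
response = query_engine.query("your question here")

# Option 2: keep top_k > 1 for answer quality, but pull the link
# from the highest-scoring source node only.
best = max(response.source_nodes, key=lambda n: n.score or 0.0)
url = best.node.extra_info.get("url")  # extra_info is `metadata` in newer versions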
10 comments
Is anyone using the google drive reader in an API?

My workflow is:
  • User auths with google and I store their access_token and refresh token
  • User sends some folder_ids over to the API
  • I refresh credentials using the refresh token if necessary
  • GoogleDriveReader attempts to download and index the documents
My problem with this process is that the GoogleDriveReader launches its own callback process and waits for OAuth callbacks... This means my API is launching web pages rather than just processing the data.
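For the refresh step, a sketch using google-auth directly (client id/secret and stored tokens are placeholders); as long as the refresh token is valid, this happens server-side with no callback page:
Python
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials

# Rebuild credentials from the tokens stored at auth time; no
# interactive browser flow is launched.
creds = Credentials(
    token=stored_access_token,
    refresh_token=stored_refresh_token,
    token_uri="https://oauth2.googleapis.com/token",
    client_id=GOOGLE_CLIENT_ID,
    client_secret=GOOGLE_CLIENT_SECRET,
)
if creds.expired and creds.refresh_token:
    creds.refresh(Request())  # server-side refresh, no web page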
4 comments
Would love some insight from the community here:

I've got a use case in which a user can send multiple chat messages to a chatbot; its end goal is to solve their problem, and if it can't, it will raise an issue in my platform.
What's the best approach to mark the start and end of conversations with the bot? The conversations will live in Slack, so there is no defined start and end to them. Are there any tools within LlamaIndex that I can leverage that I'm unaware of?
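One common heuristic worth sketching (an arbitrary threshold, assuming time-ordered messages with datetime timestamps; not a LlamaIndex built-in) is to treat a long silence as a conversation boundary:
Python
from datetime import timedelta

GAP = timedelta(minutes=30)  # arbitrary boundary threshold

def split_conversations(messages):
    """Group time-ordered messages; a gap longer than GAP starts a new conversation."""
    conversations, current, last_ts = [], [], None
    for msg in messages:
        if last_ts is not None and msg["ts"] - last_ts > GAP:
            conversations.append(current)
            current = []
        current.append(msg)
        last_ts = msg["ts"]
    if current:
        conversations.append(current)
    return conversations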
2 comments
This is useful, thank you, but it doesn't quite solve what I'm trying to do. I have a vector index saved in Redis, but I can't figure out how to load that index in another function call: when I connect to Redis in the storage context, it says there are no indexes, yet I can see them in Redis.
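One common gotcha worth sketching (an assumption about the setup): an index living in an external vector store like Redis is usually reopened through the vector store itself rather than load_index_from_storage, and the RedisVectorStore must be constructed with the same index_name and prefix used at write time. Names below are placeholders:
Python
from llama_index import VectorStoreIndex
from llama_index.vector_stores import RedisVectorStore

# Reconnect with the exact index_name/index_prefix used when writing.
vector_store = RedisVectorStore(
    index_name="my_index",
    index_prefix="llama",
    redis_url="redis://localhost:6379",
)
index = VectorStoreIndex.from_vector_store(vector_store)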
8 comments
I'm querying 2 documents that are indexed and stored in MongoDB. When I use a ListIndex, I get a solid response; when I use a vector index over the same docs, I get nothing in the response. Why does this happen?
1 comment
Hey guys, struggling to get the MongoDocumentStore/MongoIndexStore working as expected. Here's some code if anyone has any tips: 🧵
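The thread's code isn't shown here; for reference, a typical setup with the legacy llama_index Mongo stores looks roughly like this (URI and database name are placeholders):
Python
from llama_index import StorageContext, VectorStoreIndex
from llama_index.storage.docstore import MongoDocumentStore
from llama_index.storage.index_store import MongoIndexStore

# Both stores must point at the same URI and database on every call,
# or a second process will see an empty docstore/index store.
storage_context = StorageContext.from_defaults(
    docstore=MongoDocumentStore.from_uri(uri="mongodb://localhost:27017", db_name="llama"),
    index_store=MongoIndexStore.from_uri(uri="mongodb://localhost:27017", db_name="llama"),
)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)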
6 comments