I want to create a corporate support bot. I have 10-20 gigs of documents to index for alpaca.
Can llama index handle that big amount of data?
3 comments
Please stand by @George Kozyrev as I search the knowledge base. We are experiencing especially high traffic right now, so this might take longer than usual. :wave:
----------------------
Based on the provided extracts, LlamaIndex is designed to make it easier to use large external knowledge bases with LLMs. It provides a set of data structures for this purpose. However, the extracts do not specify the maximum amount of data that LlamaIndex can handle.

To create a corporate support bot, you would typically follow these steps:

  1. Load in your documents. This can be done manually or through a data loader. LlamaIndex provides a variety of data loaders which will load in Documents through the load_data function. You can also construct documents manually.
Python
from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader('./data').load_data()


  2. Parse the Documents into Nodes.
  3. Construct an Index (from Nodes or Documents).
  4. Optionally, build indices on top of other indices.
  5. Query the index.
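To make step 2 concrete, node parsing boils down to splitting each document's text into overlapping chunks. The sketch below is a simplified illustration of that idea, not LlamaIndex's actual node parser; the `Node` class, chunk size, and overlap are assumptions chosen for the example.

```python
# Illustrative sketch of the "parse Documents into Nodes" step:
# split each document's text into overlapping chunks, which is
# roughly what a node parser does. The Node structure and the
# default sizes here are simplified assumptions, not LlamaIndex's
# real implementation.
from dataclasses import dataclass

@dataclass
class Node:
    doc_id: str  # which source document this chunk came from
    text: str    # the chunk of text itself

def parse_into_nodes(documents, chunk_size=512, overlap=64):
    """Split each (doc_id, text) pair into overlapping text chunks."""
    nodes = []
    for doc_id, text in documents:
        start = 0
        while start < len(text):
            nodes.append(Node(doc_id, text[start:start + chunk_size]))
            # advance by chunk_size minus overlap so adjacent
            # chunks share some context at their boundaries
            start += chunk_size - overlap
    return nodes

# Example: one long "document" becomes several overlapping nodes.
docs = [("handbook.txt", "x" * 1200)]
nodes = parse_into_nodes(docs)
print(len(nodes))  # 3 chunks of up to 512 chars each
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one node, which matters for answer quality in a support bot.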
For more detailed instructions, you can refer to the LlamaIndex Usage Pattern and How to Build a Chatbot guides.

To handle a large amount of data, you might need to customize the storage layer of LlamaIndex. By default, LlamaIndex hides away the complexities and lets you query your data in under 5 lines of code. However, it also supports a swappable storage layer that allows you to customize where ingested documents (i.e., Node objects), embedding vectors, and index metadata are stored. You can find more information about this in the Customizing Storage guide.
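The "swappable storage layer" idea can be illustrated with a minimal interface: ingestion code depends only on a put/get contract, so the backing store can be swapped from in-memory to disk (or a database) without touching the rest of the pipeline. This is a conceptual sketch of the pattern, not LlamaIndex's real storage API; the class and function names are invented for the example.

```python
# Conceptual sketch of a swappable document store. Ingestion code
# is written against the DocStore interface, so an in-memory dict
# or a JSON file on disk can be used interchangeably. This mirrors
# the idea behind LlamaIndex's customizable storage layer but is
# not its actual API.
import json
from pathlib import Path
from typing import Protocol

class DocStore(Protocol):
    def put(self, node_id: str, text: str) -> None: ...
    def get(self, node_id: str) -> str: ...

class InMemoryDocStore:
    """Fast and simple; fine for small corpora and tests."""
    def __init__(self):
        self._data = {}
    def put(self, node_id, text):
        self._data[node_id] = text
    def get(self, node_id):
        return self._data[node_id]

class JsonFileDocStore:
    """Persists nodes to a JSON file; suits data that must
    survive restarts (a real deployment would use a database)."""
    def __init__(self, path):
        self._path = Path(path)
        self._data = json.loads(self._path.read_text()) if self._path.exists() else {}
    def put(self, node_id, text):
        self._data[node_id] = text
        self._path.write_text(json.dumps(self._data))
    def get(self, node_id):
        return self._data[node_id]

def ingest(store: DocStore, nodes: dict) -> None:
    """Indexing code only sees the DocStore interface."""
    for node_id, text in nodes.items():
        store.put(node_id, text)

store = InMemoryDocStore()
ingest(store, {"n1": "refund policy", "n2": "shipping policy"})
print(store.get("n1"))  # refund policy
```

With 10-20 GB of documents, the point of the pattern is that you can start with the default store and later swap in a persistent backend without rewriting ingestion or query code.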
If you need to handle 10-20 gigs of documents, I would recommend reaching out to the LlamaIndex team or checking their documentation for more specific guidance on handling large datasets.
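One general precaution for a corpus that size, regardless of framework, is to ingest in batches rather than loading everything into memory at once. A minimal sketch of that pattern follows; the batch size and the stubbed file listing are assumptions for illustration, not a LlamaIndex recommendation.

```python
# Sketch of batched ingestion: iterate over a large corpus lazily
# and process files in fixed-size batches so memory use stays
# bounded. File discovery is stubbed with a generator here; in
# practice you would walk a directory tree or use a data loader.
from itertools import islice

def batched(iterable, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    it = iter(iterable)
    while batch := list(islice(it, batch_size)):
        yield batch

# Stand-in for a lazy listing of tens of thousands of files.
file_paths = (f"doc_{i}.txt" for i in range(10))

batches = list(batched(file_paths, 4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Each batch can then be loaded, parsed into nodes, and inserted into the index before the next batch is read, keeping peak memory proportional to the batch size rather than the corpus size.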
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------