
Updated 11 months ago

I have a documents object made with

I have a documents object made with SimpleDirectoryReader, from multiple files, and I am passing it to VectorStoreIndex.from_documents, like so:

Plain Text
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents, 
                                        storage_context=storage_context,
                                        service_context=service_context, 
                                        show_progress=True)

and I get

Plain Text
AttributeError: 'str' object has no attribute 'get_doc_id'


Why does it see my documents object as a string? It's a dictionary structure.
14 comments
How did you create or load the documents?
They should be a list of Document objects.
Hi Rohan, and thank you for your response. I am reading them from a folder like this:

Plain Text
def load_documents(self, file_paths):
    """
    Load documents from a list of file paths.

    This function takes a list of file paths, reads each file, and stores its contents in a dictionary.
    The keys of the dictionary are the file paths, and the values are the contents of the files.

    Parameters:
    file_paths (list): A list of file paths to load.

    Returns:
    dict: A dictionary where each key is a file path and each value is the content of the corresponding file.
    """
    documents = {}
    for file in file_paths:
        documents[file] = SimpleDirectoryReader(input_files=[file]).load_data()
    return documents
This returns a documents object that is essentially a dictionary. It looks like this:

Plain Text
{'file_path/file.txt': [Document(id=..., embedding=None, metadata={...})]}
It has 3 entries, one for each of my 3 text files.
Is this not how documents should look?
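A likely explanation for the error, sketched in plain Python: iterating over a dictionary yields its keys, so code that loops over `documents` and calls `get_doc_id()` on each item receives the file path strings rather than the Document lists stored as values. `FakeDocument` and `first_error` are hypothetical stand-ins made up for this illustration; they are not part of llama_index, and the library's internal loop is assumed, not quoted.

```python
class FakeDocument:
    """Hypothetical stand-in for a llama_index Document (not the real class)."""

    def __init__(self, doc_id):
        self.doc_id = doc_id

    def get_doc_id(self):
        return self.doc_id


# Same shape as the dictionary returned by load_documents:
# file path -> list of documents loaded from that file.
docs_by_file = {
    "a.txt": [FakeDocument("doc-a")],
    "b.txt": [FakeDocument("doc-b")],
}

# Iterating over a dict yields its keys, which are plain strings here.
iterated_types = [type(item).__name__ for item in docs_by_file]


def first_error(items):
    """Return the message of the first AttributeError raised by get_doc_id()."""
    for item in items:
        try:
            item.get_doc_id()
        except AttributeError as exc:
            return str(exc)
    return None


error_message = first_error(docs_by_file)
print(iterated_types)
print(error_message)
```

The printed message matches the one in the question, which suggests the index builder is walking over the dictionary keys.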
The loop isn't required, I reckon. You can do something like this instead:
Plain Text
reader = SimpleDirectoryReader(
    input_files=file_paths,
)

return reader.load_data()
this will load data from the files in 'file_paths'
Trying it now, but file_paths is a list of strings.
Yeah, it'll check whether those strings are valid file paths and then load the contents of those files.
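If the per-file dictionary is still useful elsewhere, an alternative is to keep it and flatten its values into one list just before indexing. This is only a sketch: `flatten_documents` is a hypothetical helper, and the strings below are placeholders standing in for real Document objects.

```python
from itertools import chain


def flatten_documents(docs_by_file):
    """Flatten {file_path: [doc, ...]} into one list, preserving insertion order."""
    return list(chain.from_iterable(docs_by_file.values()))


# Placeholder strings stand in for the Document objects a reader would return.
docs_by_file = {
    "a.txt": ["doc-a1", "doc-a2"],
    "b.txt": ["doc-b1"],
}

flat = flatten_documents(docs_by_file)
print(flat)
```

The resulting flat list has the shape that `VectorStoreIndex.from_documents` expects, a list of documents rather than a dictionary.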
looks like it works
thanks a lot 🙂
Glad it worked. Cheers