I have a documents object made with

I have a documents object made with SimpleDirectoryReader, from multiple files, and I am passing it to VectorStoreIndex.from_documents, like so:

Plain Text

from llama_index import VectorStoreIndex

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
    show_progress=True,
)

and I get

Plain Text

AttributeError: 'str' object has no attribute 'get_doc_id'

Why does it see my documents object as a string? It's a dictionary structure.

14 comments

Rohan

How did you create or load the documents?

Rohan

They should be a list of Document objects.

bbixqu

Hi Rohan, and thank you for your response. I am reading them from a folder like this:

Plain Text

def load_documents(self, file_paths):
    """
    Load documents from a list of file paths.

    This function takes a list of file paths, reads each file, and stores its contents in a dictionary.
    The keys of the dictionary are the file paths, and the values are the contents of the files.

    Parameters:
    file_paths (list): A list of file paths to load.

    Returns:
    dict: A dictionary where each key is a file path and each value is the content of the corresponding file.
    """
    documents = {}
    for file in file_paths:
        documents[file] = SimpleDirectoryReader(input_files=[file]).load_data()
    return documents

bbixqu

And this returns a Documents object, which is essentially a dictionary. It looks like this:

Plain Text

{'file_path/file.txt': [Document(id=..., embedding=None, metadata={...})]}

bbixqu

It has 3 entries, one for each of my 3 text files.

bbixqu

Is this not how Documents should look?
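
A likely explanation for the error, sketched with a hypothetical stand-in class: iterating over a dictionary yields its keys, which here are the file-path strings, not the Document lists stored as values. FakeDocument is not part of llama_index, and the loop only assumes that the index calls get_doc_id() on each item it is handed.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a llama_index Document (illustration only)."""
    doc_id: str
    text: str

    def get_doc_id(self) -> str:
        return self.doc_id


# Same shape as the dictionary returned by load_documents above.
documents = {
    "file_path/a.txt": [FakeDocument("doc-a", "alpha")],
    "file_path/b.txt": [FakeDocument("doc-b", "beta")],
}

# Iterating a dict walks over its keys, so every item is a str.
items_seen = list(documents)

# Assumption: the index does something like this with each item it receives.
try:
    for item in documents:
        item.get_doc_id()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(items_seen)
print(error_message)
```

So the dictionary itself is not mistaken for a string. Each item pulled out of it is one.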

Rohan

I reckon the loop isn't needed. You can pass the whole list to a single reader instead:

Plain Text

reader = SimpleDirectoryReader(
    input_files=file_paths,
)

return reader.load_data()

Rohan

This will load the data from all the files in file_paths and return it as one flat list of Document objects.
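
If the per-file dictionary is still useful elsewhere, its values can be flattened into a single list before indexing. This is a minimal sketch: flatten_documents is a hypothetical helper, and plain strings stand in for Document objects so that it runs without llama_index.

```python
from itertools import chain
from typing import Dict, List, TypeVar

T = TypeVar("T")


def flatten_documents(per_file: Dict[str, List[T]]) -> List[T]:
    """Concatenate the per-file lists into one flat list, in insertion order."""
    return list(chain.from_iterable(per_file.values()))


# Strings stand in for Document objects here (assumption for the demo).
per_file = {
    "a.txt": ["doc-a1", "doc-a2"],
    "b.txt": ["doc-b1"],
}
flat = flatten_documents(per_file)
print(flat)  # ['doc-a1', 'doc-a2', 'doc-b1']
```

The flat list, rather than the dictionary, is what would then go to VectorStoreIndex.from_documents.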

bbixqu

Trying it now, but file_paths is a list of strings.

Rohan

Yeah, that's fine. It checks whether those strings are valid file paths and then loads the contents of those files.
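
To find out which paths are bad before handing the list to the reader, a check along these lines can be run first. find_missing_files is a hypothetical helper, not something llama_index provides, and how the reader itself reacts to a bad path is not shown here.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_missing_files(file_paths: Iterable[str]) -> List[str]:
    """Return the paths that do not point at an existing regular file."""
    return [p for p in file_paths if not Path(p).is_file()]


# Demo with one real temporary file and one made-up path.
with tempfile.TemporaryDirectory() as tmp_dir:
    real_file = Path(tmp_dir) / "notes.txt"
    real_file.write_text("hello")
    missing = find_missing_files(
        [str(real_file), str(Path(tmp_dir) / "nope.txt")]
    )

print(missing)
```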

bbixqu

looks like it works

bbixqu

thanks a lot 🙂

Rohan

Glad it worked. Cheers

bbixqu

🙂
