Ali | Tali AI
Joined September 25, 2024
I am running into an issue creating documents out of an array of articles.

I am attempting to load in a set of docs, create LI documents out of them, create embeddings, and then add them to a DeepLake dataset via DeepLakeVectorStore. It seems that
Plain Text
    vector_store = DeepLakeVectorStore(dataset_path=dataset_path, ingestion_batch_size=1024).add(nodes)
expects a node structure different from what I currently have.

The error:
Plain Text
AttributeError: 'Node' object has no attribute 'node'


code:
Plain Text
    if not medium_input or medium_input == '':
        print("The string is empty.")
    else:
        print("The string is not empty.")
        print(medium_input)

        publication = medium.publication(publication_slug=medium_input)

        medium_articles = medium.publication(publication_id=str(publication._id)).get_articles_between(_from=datetime.now(), _to=datetime.now() - timedelta(days=70))

        docs = []
        # print("medium_articles bool", medium_articles[0].content)
        for article in medium_articles:
            document = Document(article.content)
            new_dict = {key: article.info[key] for key in ['url', 'published_at', 'title']}
            document.extra_info = new_dict
            docs.append(document)


    parser = SimpleNodeParser()
    nodes = parser.get_nodes_from_documents(docs)
    print('nodes', nodes)

    dataset_path = f"hub://tali/{deeplake_datasets}"
    vector_store = DeepLakeVectorStore(dataset_path=dataset_path, ingestion_batch_size=1024).add(nodes)
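
For reference, a minimal sketch of the pattern that seems to be expected here, assuming a llama_index version where the vector store is wired in through a StorageContext (so the index computes the embeddings and hands the store the wrapped node structure, instead of .add() being called on raw nodes):

Plain Text
from llama_index import GPTVectorStoreIndex, StorageContext
from llama_index.vector_stores import DeepLakeVectorStore

# Wire the DeepLake store into a storage context instead of calling .add() directly.
vector_store = DeepLakeVectorStore(dataset_path=dataset_path, ingestion_batch_size=1024)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Building the index embeds the nodes and writes them into the DeepLake dataset.
index = GPTVectorStoreIndex(nodes, storage_context=storage_context)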
14 comments
How can I configure service_context?
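
A minimal sketch, assuming the ServiceContext API from llama_index at the time (the model name and chunk size below are placeholder values):

Plain Text
from llama_index import GPTSimpleVectorIndex, LLMPredictor, ServiceContext
from langchain.chat_models import ChatOpenAI

# Placeholder LLM and chunk size; swap in your own settings.
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, chunk_size_limit=512)

# 'documents' stands in for whatever docs were loaded earlier.
# Pass the service context wherever an index is built or queried.
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)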
2 comments
Hey llama gang. I noticed something very strange today. I was using a ComposableGraph to query against a set of docs generated from a BeautifulSoupWebReader data loader. Mind you, previously (like two days ago) I was getting beautiful results. Now my results are seriously dumbed down. I can't fathom why this is. The docs have not changed. I tried playing with my versions of langchain and llama_index with no luck. I'm going to leave an example here of the same question asked a few days apart. Note the bottom result is the "dumber" version. Also some code snippets.


Plain Text
index1 = GPTSimpleVectorIndex.from_documents(documents)

graph = ComposableGraph.from_indices(GPTListIndex, [index1], index_summaries=[index1_summary])


example of query:
Plain Text
query_configs = [
    {
        "index_struct_type": "tree",
        "query_mode": "embedding",
        "query_kwargs": {
            "child_branch_factor": 5
        }
    }
]

response = graph.query("Provide a detailed answer to the following question based on the context. Never use the word context. If you don't know, say I don't know. What is a Filecoin storage provider?", query_configs=query_configs)
print(response)
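
For what it's worth, a quick way to confirm exactly which library versions are in play when comparing the two runs (assuming both packages expose __version__):

Plain Text
import langchain
import llama_index

# Print the installed versions so the "good" and "dumber" runs can be compared.
print("llama_index:", llama_index.__version__)
print("langchain:", langchain.__version__)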
4 comments
Hey y'all, I thought this might be helpful for the community.
4 comments
Plain Text
AttributeError: 'dict' object has no attribute 'index_structs'
8 comments
Yes, I understand that part, but how do I create the storage context? For example:

Plain Text
storage_context = StorageContext.from_defaults(docstore=json_data_docstore)

does not work

Plain Text
storage_context = StorageContext.from_defaults(docstore=json_data_docstore,
    vector_store=json_data_vector,
    index_store=json_data_index)

does not work

Plain Text
storage_context = StorageContext.from_defaults(json_data_docstore,json_data_index,json_data_vector)

does not work

What am I missing?
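
If I'm reading the errors right, from_defaults expects store objects (e.g. SimpleDocumentStore) rather than raw JSON data. A minimal sketch of the persist-and-reload route instead, assuming the index was previously persisted to disk (the ./storage path is a placeholder):

Plain Text
from llama_index import StorageContext, load_index_from_storage

# Rebuild the storage context from a directory previously written by
# index.storage_context.persist(persist_dir="./storage").
storage_context = StorageContext.from_defaults(persist_dir="./storage")

# Reload the index that was persisted into that storage context.
index = load_index_from_storage(storage_context)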
3 comments
Hey Llama Gang,

I keep getting the following error when using "GithubRepositoryReader":

Plain Text
ConnectTimeout


Any suggestions?
4 comments
I think it's an issue on my end. I am stuck in some kind of dependency hell.
1 comment