Find answers from the community

BP
Offline, last seen 3 months ago
Joined September 25, 2024

Recordings

What software does the team use to make the YouTube tutorials? I love the circular overlay of the host's face. I'm attending a tech conference in Toronto in a few days and I'm gonna record a POC of my app to show to new connections ❤️

Many blessings 🙏
2 comments

Pdfs

Hey guys, my MVP deals with legal cases before a human rights board.

I have two cases in my 'cases/' directory; the first is 12 pages and the second is 7 pages long. I notice that the simple loader here:

Plain Text
reader = SimpleDirectoryReader(
    input_dir="cases/"
)

documents = reader.load_data()


is loading the PDFs into a list of Document objects, but the thing is, it's loading each page as its own Document object. I turn each Document object into a node and put all 19 nodes into my vector store.
Unfortunately, GPT-4 is mixing facts from the two cases and giving wrong answers.

I think I'd get better results if each case were its own Document object, and subsequently its own Node. Does one of the default loaders have the ability to load an entire PDF as one Document object?
I swear I watched a tutorial on this but I've been looking for it and can't find it for the life of me. Send halp please 🙏 ❤️
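
For now I'm stitching the pages back together by hand before building nodes, roughly like this (an untested sketch; it assumes each page-level Document keeps its source file name under the "file_name" metadata key, which I believe the PDF loader adds by default):

Plain Text
from collections import defaultdict

from llama_index import Document, SimpleDirectoryReader

# Load every PDF in cases/; this yields one Document per page
reader = SimpleDirectoryReader(input_dir="cases/")
page_docs = reader.load_data()

# Group the page texts by their source file
pages_by_file = defaultdict(list)
for doc in page_docs:
    pages_by_file[doc.metadata.get("file_name", "unknown")].append(doc.text)

# Rebuild one Document per case by concatenating its pages
case_docs = [
    Document(text="\n".join(pages), metadata={"file_name": name})
    for name, pages in pages_by_file.items()
]

That feels like fighting the loader though, so I'd love to know if there's a built-in way.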
2 comments

Llama Hub

Hey everyone, out of all the webpage loaders from here: https://llamahub.ai/, which one is the best at scraping all the text from a URL?

Alternatively, is there one that can convert a web page to a PDF before scraping/loading it?
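
Just to show the kind of usage I have in mind, here's the general download_loader pattern from LlamaHub (SimpleWebPageReader is only an example here, not a recommendation; I haven't compared the loaders yet):

Plain Text
from llama_index import VectorStoreIndex, download_loader

# Grab a web page loader from LlamaHub (SimpleWebPageReader is just one example)
SimpleWebPageReader = download_loader("SimpleWebPageReader")

loader = SimpleWebPageReader(html_to_text=True)
documents = loader.load_data(urls=["https://example.com/some-page"])

index = VectorStoreIndex.from_documents(documents)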

Many thanks 🙏
2 comments
Hi guys, I'm following the Text to SQL video (https://www.youtube.com/watch?v=ZIvcVJGtCrY) and I think it may be out of date.

At this part of the code:

Plain Text
# Insert documents into vector index
# Each document has metadata of the city attached
for city, wiki_doc in zip(cities, wiki_docs):
    nodes = node_parser.get_nodes_from_documents([wiki_doc])
    # add metadata to each node
    for node in nodes:
        node.extra_info = {"title": city}
    vector_index.insert_nodes(nodes)

I get this error:

Plain Text
python3 main.py
dict_keys(['city_stats'])
[('Toronto', 2930000, 'Canada'), ('Tokyo', 13960000, 'Japan'), ('Berlin', 3645000, 'Germany')]
Traceback (most recent call last):
  File "/home/bi-ai/ai/txt-to-sql/main.py", line 107, in <module>
    node.extra_info = {"title": city}
  File "pydantic/main.py", line 357, in pydantic.main.BaseModel.__setattr__
ValueError: "TextNode" object has no field "extra_info"


I think extra_info has been deprecated because when I hover over it, VSCode says "TO DO: DEPRECATED"

but I'm having trouble finding what it was replaced with. What should I use instead?
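
My best guess, going just by the field names on TextNode, is that metadata replaced extra_info, so the loop from the video would become something like this (not verified against the version used in the video):

Plain Text
for city, wiki_doc in zip(cities, wiki_docs):
    nodes = node_parser.get_nodes_from_documents([wiki_doc])
    # guessing node.metadata is the replacement for node.extra_info
    for node in nodes:
        node.metadata = {"title": city}
    vector_index.insert_nodes(nodes)

Is that the right field?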

Sorry for the dumb question, I'm really new to the whole Python ecosystem.

Please halp. Many thanks.
26 comments
Has anyone gotten this error:

Plain Text
Traceback (most recent call last):
  File "/home/bi/ai/llama_docs_bot/3_eval_baseline/main.py", line 110, in <module>
    response = query_engine.query(query)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/indices/query/base.py", line 23, in query
    response = self._query(str_or_query_bundle)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/query_engine/sub_question_query_engine.py", line 126, in _query
    sub_questions = self._question_gen.generate(self._metadatas, query_bundle)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/question_gen/openai_generator.py", line 77, in generate
    question_list = self._program(query_str=query_str, tools_str=tools_str)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/program/openai_program.py", line 101, in __call__
    chat_response = self._llm.chat(
  File "/usr/local/lib/python3.10/dist-packages/llama_index/llms/base.py", line 134, in wrapped_llm_chat
    CBEventType.LLM, payload={EventPayload.MESSAGES: args[0]}
IndexError: tuple index out of range


while working through part 3 of the bottoms-up code-along?

https://www.youtube.com/watch?v=LQy8iHOJE2A&t=209s
21 comments

Agent

I've followed the chatbot tutorial, and now I'm changing it to use my own txt documents. It's having a very hard time deciding whether to use a tool or not. I believe this happens because some of the information in my txt documents is too generic and isn't esoteric enough, i.e., the LLM is confusing my documents with knowledge it already has from training.

Is there a way to force the agent_chain to use my tools on every run?
23 comments

Rig

Hi all,

Looking to buy better hardware so I can use LlamaCPP instead of making OpenAI calls. I just tried it on my humble HP and I swear it was about to blue screen 😂

M2 Mac Studio or a custom rig with a phat GPU?

Input from anyone is much appreciated ❤️
20 comments
Can someone explain what the consequence of setting 'in_place' to 'True' would be?

Plain Text
class MetadataExtractor(BaseExtractor):
    """Metadata extractor."""

    ...other code...

    in_place: bool = Field(
        default=True, description="Whether to process nodes in place."
    )


I've been using this class in my pipeline a lot and it's fantastic, but I don't quite know what this field does. A low-level explanation would be blessed ❤️
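
My working guess is that it only controls whether the extractor mutates the nodes you pass in or works on copies and returns those. Purely as an illustration of that general pattern (this is my own toy example, not the library's actual implementation):

Plain Text
import copy
from dataclasses import dataclass, field

@dataclass
class FakeNode:
    text: str
    metadata: dict = field(default_factory=dict)

def add_metadata(nodes, in_place=True):
    # With in_place=False, work on deep copies so the caller's nodes stay untouched
    if not in_place:
        nodes = [copy.deepcopy(node) for node in nodes]
    for node in nodes:
        node.metadata["example_key"] = "example value"  # stand-in for real extracted metadata
    return nodes

original = [FakeNode(text="some text")]
returned = add_metadata(original, in_place=False)
print(original[0].metadata)  # {} -- original untouched
print(returned[0].metadata)  # {'example_key': 'example value'}

Is that roughly it, or are there deeper consequences (memory use, node IDs, etc.)?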

Thank you so much in advance 🙏
4 comments

Embeddings

Hey all,

I ran the custom embeddings code here as is: https://gpt-index.readthedocs.io/en/stable/examples/embeddings/custom_embeddings.html

and I got this:

Plain Text
Traceback (most recent call last):
  File "/home/bi-ai/ai/bottoms-up-embeddings/main.py", line 40, in <module>
    embed_model=InstructorEmbeddings(embed_batch_size=2), chunk_size=512
TypeError: Can't instantiate abstract class InstructorEmbeddings with abstract methods _aget_query_embedding, class_name


Then I put in stub implementations:

Plain Text
    def class_name(self) -> str:
        return "InstructorEmbeddings"

    async def _aget_query_embedding(self, query: str) -> List[float]:
        return self._get_query_embedding(query)



and got this:

Plain Text
Traceback (most recent call last):
  File "/home/bi-ai/ai/bottoms-up-embeddings/main.py", line 46, in <module>
    embed_model=InstructorEmbeddings(embed_batch_size=2), chunk_size=512
  File "/home/bi-ai/ai/bottoms-up-embeddings/main.py", line 19, in __init__
    self._model = INSTRUCTOR(instructor_model_name)
  File "pydantic/main.py", line 357, in pydantic.main.BaseModel.__setattr__
ValueError: "InstructorEmbeddings" object has no field "_model"
25 comments

Hey All

Hey All!

Is the MongoDB guide broken? It's here:
https://gpt-index.readthedocs.io/en/latest/examples/vector_stores/MongoDBAtlasVectorSearch.html

My code is here:
Plain Text
# Provide URI to constructor, or use environment variable

from markdown import Markdown
import pymongo
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch
from llama_index.indices.vector_store.base import VectorStoreIndex
from llama_index.storage.storage_context import StorageContext
from llama_index.readers.file.base import SimpleDirectoryReader

mongo_uri = "mongodb+srv://<username>:<password>@<host>/?retryWrites=true&w=majority"

mongodb_client = pymongo.MongoClient(mongo_uri)
store = MongoDBAtlasVectorSearch(mongodb_client)
storage_context = StorageContext.from_defaults(vector_store=store)
silva_docs = SimpleDirectoryReader(input_files=["data/Anderson_Silva.pdf"]).load_data()

index = VectorStoreIndex.from_documents(silva_docs, storage_context=storage_context)

response = index.as_query_engine().query("When was Anderson Silva born?")
print(f"<b>{response}</b>")


But all I get back is this:

Plain Text
python3 main.py 
<b>None</b>
10 comments
Hey all, I think 0.7.24 broke something. I got:

Plain Text
bi@bi:~/ai/quiz-maker-be$ python3 main.py 
Traceback (most recent call last):
  File "/home/bi/ai/quiz-maker-be/main.py", line 5, in <module>
    from llama_index import SimpleDirectoryReader
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/__init__.py", line 12, in <module>
    from llama_index.data_structs.struct_type import IndexStructType
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/data_structs/__init__.py", line 3, in <module>
    from llama_index.data_structs.data_structs import (
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/data_structs/data_structs.py", line 14, in <module>
    from llama_index.schema import BaseNode, TextNode
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/schema.py", line 9, in <module>
    from llama_index.bridge.langchain import Document as LCDocument
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/bridge/langchain.py", line 21, in <module>
    from langchain.embeddings import HuggingFaceEmbeddings, HuggingFaceBgeEmbeddings
ImportError: cannot import name 'HuggingFaceBgeEmbeddings' from 'langchain.embeddings' (/home/bi/.local/lib/python3.10/site-packages/langchain/embeddings/__init__.py)


I then downgraded to 0.7.23 and it worked fine.
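
Since the traceback shows llama_index trying to import HuggingFaceBgeEmbeddings from langchain, I'm guessing it's a version mismatch between the two packages. Pinning back is what worked for me; upgrading langchain might be the other option (I haven't tried that one):

Plain Text
# pin llama_index back to the version that worked
pip install "llama-index==0.7.23"

# or (untested) upgrade langchain so HuggingFaceBgeEmbeddings is available
pip install --upgrade langchain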
4 comments