Find answers from the community

BP
Offline, last seen 3 months ago
Joined September 25, 2024

Recordings

What software does the team use to make the YouTube tutorials? I love the circular overlay of the host's face. I'm attending a tech conference in Toronto in a few days and I'm gonna record a POC of my app to show to new connections ❤️

Many blessings 🙏
2 comments

Pdfs

Hey guys, my MVP deals with legal cases before a human rights board.

I have two cases in my 'cases/' directory; the first is 12 pages and the second is 7 pages long. I notice that the simple loader here:

Plain Text
reader = SimpleDirectoryReader(
    input_dir="cases/"
)

documents = reader.load_data()


is loading the PDFs into a list of Document objects, but the thing is, it's loading each page as its own Document object. I turn each Document object into a node and put all 19 nodes into my vector store.
Unfortunately, GPT-4 is mixing facts from the two cases and giving wrong answers.

I think I'd get better results if each case were its own Document object, and subsequently its own Node. Does one of the default loaders have the ability to load an entire PDF as one Document object?
I swear I watched a tutorial on this but I've been looking for it and can't find it for the life of me. Send halp please 🙏 ❤️
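
For now I'm stitching the pages back together by hand before building nodes, roughly like this (an untested sketch; it assumes each page-level Document keeps its source file name under the "file_name" metadata key, which I believe the PDF loader adds by default):

Plain Text
from collections import defaultdict

from llama_index import Document, SimpleDirectoryReader

# Load every PDF in cases/; this yields one Document per page
reader = SimpleDirectoryReader(input_dir="cases/")
page_docs = reader.load_data()

# Group the page texts by their source file
pages_by_file = defaultdict(list)
for doc in page_docs:
    pages_by_file[doc.metadata.get("file_name", "unknown")].append(doc.text)

# Rebuild one Document per case by concatenating its pages
case_docs = [
    Document(text="\n".join(pages), metadata={"file_name": name})
    for name, pages in pages_by_file.items()
]

That feels like fighting the loader though, so I'd love to know if there's a built-in way.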
2 comments

Llama Hub

Hey everyone, out of all the webpage loaders from here: https://llamahub.ai/, which one is the best at scraping all the text from a URL?

Alternatively, is there one that can convert a web page to a PDF before scraping/loading it?
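
Just to show the kind of usage I have in mind, here's the general download_loader pattern from LlamaHub (SimpleWebPageReader is only an example here, not a recommendation; I haven't compared the loaders yet):

Plain Text
from llama_index import VectorStoreIndex, download_loader

# Grab a web page loader from LlamaHub (SimpleWebPageReader is just one example)
SimpleWebPageReader = download_loader("SimpleWebPageReader")

loader = SimpleWebPageReader(html_to_text=True)
documents = loader.load_data(urls=["https://example.com/some-page"])

index = VectorStoreIndex.from_documents(documents)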

Many thanks 🙏
2 comments
Hi guys, I'm following the Text to SQL video (https://www.youtube.com/watch?v=ZIvcVJGtCrY) and I think it may be out of date.

At this part of the code:

Plain Text
# Insert documents into vector index
# Each document has metadata of the city attached
for city, wiki_doc in zip(cities, wiki_docs):
    nodes = node_parser.get_nodes_from_documents([wiki_doc])
    # add metadata to each node
    for node in nodes:
        node.extra_info = {"title": city}
    vector_index.insert_nodes(nodes)

I get this error:

Plain Text
python3 main.py
dict_keys(['city_stats'])
[('Toronto', 2930000, 'Canada'), ('Tokyo', 13960000, 'Japan'), ('Berlin', 3645000, 'Germany')]
Traceback (most recent call last):
  File "/home/bi-ai/ai/txt-to-sql/main.py", line 107, in <module>
    node.extra_info = {"title": city}
  File "pydantic/main.py", line 357, in pydantic.main.BaseModel.__setattr__
ValueError: "TextNode" object has no field "extra_info"


I think extra_info has been deprecated because when I hover over it, VSCode says "TO DO: DEPRECATED"

but I'm having trouble finding what it was replaced with. What should I use instead?
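
My best guess, going just by the field names on TextNode, is that metadata replaced extra_info, so the loop from the video would become something like this (not verified against the version used in the video):

Plain Text
for city, wiki_doc in zip(cities, wiki_docs):
    nodes = node_parser.get_nodes_from_documents([wiki_doc])
    # guessing node.metadata is the replacement for node.extra_info
    for node in nodes:
        node.metadata = {"title": city}
    vector_index.insert_nodes(nodes)

Is that the right field?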

Sorry for the dumb question, I'm really new to the whole Python ecosystem.

Please halp. Many thanks.
26 comments
Has anyone gotten this error:

Plain Text
Traceback (most recent call last):
  File "/home/bi/ai/llama_docs_bot/3_eval_baseline/main.py", line 110, in <module>
    response = query_engine.query(query)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/indices/query/base.py", line 23, in query
    response = self._query(str_or_query_bundle)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/query_engine/sub_question_query_engine.py", line 126, in _query
    sub_questions = self._question_gen.generate(self._metadatas, query_bundle)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/question_gen/openai_generator.py", line 77, in generate
    question_list = self._program(query_str=query_str, tools_str=tools_str)
  File "/usr/local/lib/python3.10/dist-packages/llama_index/program/openai_program.py", line 101, in __call__
    chat_response = self._llm.chat(
  File "/usr/local/lib/python3.10/dist-packages/llama_index/llms/base.py", line 134, in wrapped_llm_chat
    CBEventType.LLM, payload={EventPayload.MESSAGES: args[0]}
IndexError: tuple index out of range


while working through part 3 of the bottoms-up code-along?

https://www.youtube.com/watch?v=LQy8iHOJE2A&t=209s
21 comments

Agent

I've followed the chatbot tutorial, and now I'm changing it to use my own txt documents. It's having a very hard time deciding whether to use a tool or not. I believe this happens because some of the information in my txt documents is too generic and isn't esoteric enough, i.e., the LLM is confusing my documents with knowledge it already has from training.

Is there a way to force the agent_chain to use my tools on every run?
23 comments

Rig

Hi all,

Looking to buy better hardware so I can use LlamaCPP instead of making OpenAI calls. I just tried it on my humble HP and I swear it was about to blue screen 😂

M2 Mac Studio or a custom rig with a phat GPU?

Input from anyone is much appreciated ❤️
20 comments
Can someone explain what the consequence of setting 'in_place' to 'True' would be?

Plain Text
class MetadataExtractor(BaseExtractor):
    """Metadata extractor."""

    ...other code...

    in_place: bool = Field(
        default=True, description="Whether to process nodes in place."
    )


I've been using this class in my pipeline a lot and it's fantastic, but I don't quite know what this field does. A low-level explanation would be blessed ❤️
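
My working guess is that it only controls whether the extractor mutates the nodes you pass in or works on copies and returns those. Purely as an illustration of that general pattern (this is my own toy example, not the library's actual implementation):

Plain Text
import copy
from dataclasses import dataclass, field

@dataclass
class FakeNode:
    text: str
    metadata: dict = field(default_factory=dict)

def add_metadata(nodes, in_place=True):
    # With in_place=False, work on deep copies so the caller's nodes stay untouched
    if not in_place:
        nodes = [copy.deepcopy(node) for node in nodes]
    for node in nodes:
        node.metadata["example_key"] = "example value"  # stand-in for real extracted metadata
    return nodes

original = [FakeNode(text="some text")]
returned = add_metadata(original, in_place=False)
print(original[0].metadata)  # {} -- original untouched
print(returned[0].metadata)  # {'example_key': 'example value'}

Is that roughly it, or are there deeper consequences (memory use, node IDs, etc.)?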

Thank you so much in advance 🙏
4 comments

Embeddings

Hey all,

I ran the custom embeddings code here as is: https://gpt-index.readthedocs.io/en/stable/examples/embeddings/custom_embeddings.html

and I got this:

Plain Text
Traceback (most recent call last):
  File "/home/bi-ai/ai/bottoms-up-embeddings/main.py", line 40, in <module>
    embed_model=InstructorEmbeddings(embed_batch_size=2), chunk_size=512
TypeError: Can't instantiate abstract class InstructorEmbeddings with abstract methods _aget_query_embedding, class_name


Then I put in stub implementations:

Plain Text
    def class_name(self) -> str:
        return "InstructorEmbeddings"

    async def _aget_query_embedding(self, query: str) -> List[float]:
        return self._get_query_embedding(query)



and got this:

Plain Text
Traceback (most recent call last):
  File "/home/bi-ai/ai/bottoms-up-embeddings/main.py", line 46, in <module>
    embed_model=InstructorEmbeddings(embed_batch_size=2), chunk_size=512
  File "/home/bi-ai/ai/bottoms-up-embeddings/main.py", line 19, in __init__
    self._model = INSTRUCTOR(instructor_model_name)
  File "pydantic/main.py", line 357, in pydantic.main.BaseModel.__setattr__
ValueError: "InstructorEmbeddings" object has no field "_model"
25 comments

Hey All

Hey All!

Is the MongoDB guide broken? It's here:
https://gpt-index.readthedocs.io/en/latest/examples/vector_stores/MongoDBAtlasVectorSearch.html

My code is here:
Plain Text
# Provide URI to constructor, or use environment variable

from markdown import Markdown
import pymongo
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch
from llama_index.indices.vector_store.base import VectorStoreIndex
from llama_index.storage.storage_context import StorageContext
from llama_index.readers.file.base import SimpleDirectoryReader

mongo_uri = "mongodb+srv://<username>:<password>@<host>/?retryWrites=true&w=majority"

mongodb_client = pymongo.MongoClient(mongo_uri)
store = MongoDBAtlasVectorSearch(mongodb_client)
storage_context = StorageContext.from_defaults(vector_store=store)
silva_docs = SimpleDirectoryReader(input_files=["data/Anderson_Silva.pdf"]).load_data()

index = VectorStoreIndex.from_documents(silva_docs, storage_context=storage_context)

response = index.as_query_engine().query("When was Anderson Silva born?")
print(f"<b>{response}</b>")


But all I get back is this:

Plain Text
python3 main.py 
<b>None</b>
10 comments
Hey all, I think 0.7.24 broke something. I got:

Plain Text
bi@bi:~/ai/quiz-maker-be$ python3 main.py 
Traceback (most recent call last):
  File "/home/bi/ai/quiz-maker-be/main.py", line 5, in <module>
    from llama_index import SimpleDirectoryReader
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/__init__.py", line 12, in <module>
    from llama_index.data_structs.struct_type import IndexStructType
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/data_structs/__init__.py", line 3, in <module>
    from llama_index.data_structs.data_structs import (
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/data_structs/data_structs.py", line 14, in <module>
    from llama_index.schema import BaseNode, TextNode
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/schema.py", line 9, in <module>
    from llama_index.bridge.langchain import Document as LCDocument
  File "/home/bi/.local/lib/python3.10/site-packages/llama_index/bridge/langchain.py", line 21, in <module>
    from langchain.embeddings import HuggingFaceEmbeddings, HuggingFaceBgeEmbeddings
ImportError: cannot import name 'HuggingFaceBgeEmbeddings' from 'langchain.embeddings' (/home/bi/.local/lib/python3.10/site-packages/langchain/embeddings/__init__.py)


I then downgraded to 0.7.23 and it worked fine.
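
Since the traceback shows llama_index trying to import HuggingFaceBgeEmbeddings from langchain, I'm guessing it's a version mismatch between the two packages. Pinning back is what worked for me; upgrading langchain might be the other option (I haven't tried that one):

Plain Text
# pin llama_index back to the version that worked
pip install "llama-index==0.7.23"

# or (untested) upgrade langchain so HuggingFaceBgeEmbeddings is available
pip install --upgrade langchain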
4 comments