thomoliverz
Joined September 25, 2024
trying to build a customer/ prospect feedback slackbot for my company

prototyping based on dummy transcript/ notes data in notion

but one issue rn with the notion reader is that we only load the text and the page id

but it would be quite helpful to also load other fields from the page as metadata. e.g. if we can load organization/ date, then we can possibly get better answers, like "give me top customer pain points from last week" or "give me top frustrations from 'Big Corporation'".

any ideas on how to do this?
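One possible route, sketched with plain dicts: fetch each page's properties from the Notion pages API yourself and merge the interesting ones into the document metadata. The property names "Organization" and "Date" and the payload shapes are assumptions about your Notion schema, not something the Notion reader does for you:

```python
def extract_metadata(properties: dict) -> dict:
    """Keep only the page properties we want as document metadata."""
    meta = {}
    org = (properties.get("Organization") or {}).get("select")
    if org:
        meta["organization"] = org.get("name")
    date = (properties.get("Date") or {}).get("date")
    if date:
        meta["date"] = date.get("start")
    return meta

# Payload shaped like the Notion pages API returns for these property types:
props = {
    "Organization": {"select": {"name": "Big Corporation"}},
    "Date": {"date": {"start": "2024-09-20"}},
}
print(extract_metadata(props))  # -> {'organization': 'Big Corporation', 'date': '2024-09-20'}
```

With organization/date in each document's metadata, a retriever can then filter on them ("top frustrations from 'Big Corporation'") instead of relying on the text alone.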
2 comments
hello - is this page of the docs re chat engine up to date?

i put it in my code and it seemed not to work...

thanks for any help!

https://docs.llamaindex.ai/en/stable/module_guides/deploying/chat_engines/usage_pattern/
6 comments
thomoliverz · Filtering

hi all ( & @Logan M ) - i hope everything is going well. so cool to see the success of llama index since the early days

i have one q

is it possible to filter with a vector store index post-indexing?

I want a user to select a value, and for the response to use only docs/ nodes with that value attached in the metadata.

thanks for any help
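Yes - LlamaIndex exposes this as metadata filters passed to the retriever at query time, so no re-indexing is needed. The sketch below shows just the filtering logic with plain dicts (the node shape is illustrative, not the library's):

```python
def filter_nodes(nodes: list, key: str, value: str) -> list:
    """Post-indexing filter: keep only nodes whose metadata[key] == value."""
    return [n for n in nodes if n.get("metadata", {}).get(key) == value]

nodes = [
    {"text": "quote A", "metadata": {"person": "Paul Graham"}},
    {"text": "quote B", "metadata": {"person": "Elon Musk"}},
]
print(filter_nodes(nodes, "person", "Paul Graham"))  # keeps only quote A
```

In the app, the user's selected value becomes the filter value, and only matching nodes reach the response synthesizer.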
4 comments
Anyone know if it is possible to stream in a flask app? e.g. right now I generate a response in a flask app and post it back to my front end

is it possible to stream that response into the front end?
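It is - Flask can stream any Python generator to the client via `Response`. A minimal sketch, where `generate_tokens` is a stand-in for however your response is produced (a streaming query engine would yield real tokens instead):

```python
from flask import Flask, Response, stream_with_context

app = Flask(__name__)

def generate_tokens(question: str):
    """Stand-in for a streaming LLM response; yields chunks as produced."""
    for word in f"answer to: {question}".split():
        yield word + " "

@app.route("/ask/<question>")
def ask(question):
    # Chunks are flushed to the client as the generator yields them.
    return Response(stream_with_context(generate_tokens(question)),
                    mimetype="text/plain")
```

On the front end, read the body incrementally (e.g. `fetch` plus a `ReadableStream` reader) rather than waiting for the full response.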
1 comment
Hi all - when using the Chat Engine, is it possible to still configure a retriever e.g. topk = X? and, importantly, filters?

& is it also possible to print the source nodes for each chat question/ response?

I used to do this with query engine but now want to do the same with chat engine.

thank you!
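For reference, a hedged sketch - in recent llama_index versions, retriever options like `similarity_top_k` (and metadata filters) can be passed through `as_chat_engine`, and chat responses expose `source_nodes`; check your version's docs for the exact names. The helper below, with small stand-in classes so it runs standalone, shows one way to print the sources:

```python
# chat_engine = index.as_chat_engine(
#     chat_mode="context",
#     similarity_top_k=3,        # retriever top-k
#     filters=my_filters,        # same filters object a query engine accepts
# )
# response = chat_engine.chat("how can I sleep better?")
# print(describe_sources(response.source_nodes))

def describe_sources(source_nodes):
    """Print-friendly view of source nodes: (text snippet, score)."""
    return [(sn.node.get_content()[:60], sn.score) for sn in source_nodes]

# Minimal stand-ins so the helper runs without the library:
class _Node:
    def __init__(self, text):
        self._text = text
    def get_content(self):
        return self._text

class _SourceNode:
    def __init__(self, text, score):
        self.node, self.score = _Node(text), score

demo = [_SourceNode("Not eating a lot before sleep helps.", 0.91)]
print(describe_sources(demo))  # -> [('Not eating a lot before sleep helps.', 0.91)]
```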
5 comments
Hello - have been building with llama index for a while but am still beginner level coding/ engineering

Right now I have a script that loads data from Airtable, builds a Vector Store Index over it, and then creates a chat engine for a user to ask questions over the data. The data is a series of quotes/ advice about productivity. A user asks e.g. "how can I sleep better?" and gets a response based on quotes by several people.

There are a few things I am now looking to improve, specifically:

  • I want to have the index stored so that when a user asks a question, the response from their pov is quicker - I understand I can do this relatively easily using Llama Index or another store e.g. chroma (but could I not just use default? & is it possible to have the index constantly loaded so that when a user queries the index doesn't have to be built again?)
  • Users can also add to the database, so I also want to be able to refresh the index regularly or when an action is taken by a user - is there a classic way to do this?
  • I would like to allow users to apply some filters to their queries & then either load specific data based on those filters or retrieve only some data based on those filters - when is the right moment to do this? I will input some UI features for users to select different filters in the front end.
Grateful for any help/ advice
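On the first two bullets, a hedged sketch of the usual load-or-build flow (the commented names follow recent llama_index releases - `StorageContext`, `load_index_from_storage`, `refresh_ref_docs` - but verify against your installed version; `PERSIST_DIR` is an arbitrary path):

```python
import os

PERSIST_DIR = "./storage"

def index_is_persisted(persist_dir: str) -> bool:
    """True if a previously persisted index appears to exist on disk."""
    return os.path.isdir(persist_dir) and bool(os.listdir(persist_dir))

# if index_is_persisted(PERSIST_DIR):
#     storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
#     index = load_index_from_storage(storage_context)   # fast: no re-embedding
# else:
#     index = VectorStoreIndex.from_documents(documents)
#     index.storage_context.persist(persist_dir=PERSIST_DIR)
#
# When users add rows, refresh only what changed (documents need stable ids):
# index.refresh_ref_docs(updated_documents)

print(index_is_persisted("no-such-directory"))  # -> False
```

Keeping the built index in the long-running app process (built once at startup) avoids rebuilding per query; for the third bullet, attaching the filterable fields as document metadata at load time and filtering at query time avoids rebuilding per filter too.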
3 comments
super grateful for any help..!
31 comments
thomoliverz · Import

is there a reason this now does not work?
2 comments
are there any best practices on formatting text to optimise retrieval?
3 comments
So sorry for yet another question! Is there a problem with the airtable reader? I am getting this error when I use the reader... I am not sure why..

```
Traceback (most recent call last):
  File "main.py", line 27, in <module>
    documents = reader.load_data(table_id=AIRTABLE_TABLE_ID, base_id=AIRTABLE_BASE_ID)
  File "/home/runner/focus-tech-V1/venv/lib/python3.10/site-packages/llama_index/readers/llamahub_modules/airtable/base.py", line 33, in load_data
    return [Document(f"{all_records}", extra_info={})]
  File "pydantic/main.py", line 332, in pydantic.main.BaseModel.__init__
TypeError: __init__() takes exactly 1 positional argument (2 given)
```
12 comments
Hi - sorry for another 1.

I am trying to use GPT-4 via llama index and have made the changes in my code, but it doesn't seem to be working. Any idea why?

See below my code and then the OpenAI usage data.

```python
llm = OpenAI(model="gpt-4", temperature=0.1, max_tokens=256)

# editing prompt & building index
QA_PROMPT_TMPL = (
    "XXX."
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)
QA_PROMPT = QuestionAnswerPrompt(QA_PROMPT_TMPL)

index = GPTVectorStoreIndex.from_documents(documents)
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=5,
)
query_engine = RetrieverQueryEngine.from_args(
    retriever,
    response_mode="compact",
    text_qa_template=QA_PROMPT,
)
```
4 comments
thomoliverz · Quotes

Hi gang... major long shot.

Does anyone know why the airtable reader seemingly loads some text with " and some with ' ?

I am loading text data and then querying it.

And - weirdly - some of my airtable records are loaded with " and some with ' despite being entered without either in the table itself.

E.g. 'Quotes': 'Not eating a lot in the few hours before sleep helps. Not drinking alcohol helps a lot, though I’m not willing to do that all the time.\n'

vs

'Quotes': "Copying is a good way to learn, but copy the right things. When I was in college I imitated the pompous diction of famous professors. But this wasn't what made them eminent — it was more a flaw their eminence had allowed them to sink into. Imitating it was like pretending to have gout in order to seem rich.\n"

Even though in the airtable records, neither have any quotation mark.

This is frustratingly important because I am using regex to manipulate the doc text for the nodes.

not sure if you know as you did the airtable reader?
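This looks like Python's `repr` rather than the Airtable data itself: the loader stringifies the records dict, and when Python reprs a string it uses single quotes by default but switches to double quotes if the text contains an ASCII apostrophe (which is why "wasn't" flips the style while the curly apostrophe in "I’m" doesn't). A quick demonstration:

```python
plain = "Not eating a lot before sleep helps."
with_apostrophe = "This wasn't what made them eminent."

print(repr(plain))            # single quotes: no ASCII apostrophe inside
print(repr(with_apostrophe))  # double quotes: contains an ASCII apostrophe
```

For the regex, matching either quote character (e.g. a `['\"]` character class) sidesteps the inconsistency.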
3 comments
hey team - would love some help here...

the airtable loader is giving me documents like this.

', 'Areas of Improvement': ['Making and changing plans'], 'Source': 'Elon Musk by Ashlee Vance\n\n', 'Quotes': 'Musk also trained employees to make the right trade-offs between spending money and productivity… ‘He would say that everything we did was a function of our burn rate and that we were burning through a hundred thousand dollars per day… Sometimes he wouldn’t let you buy a part for two thousand dollars because he expected you to find it cheaper or invent something cheaper. Other times, he wouldn’t flinch at renting a plane for ninety thousand dollars to get something to Kwaj because it saved an entire workday, so it was worth it. He would place this urgency that he expected the revenue in ten years to be ten million dollars a day and that every day we were slower to achieve our goals was a day of missing out on that money.’\n', 'People (Raw)': ['Elon Musk']}},

I want to get each node to be just the quote. Anyone got any idea how to do that in Python? I am trying to do it but am being told documents is not subscriptable..
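A sketch of one way, assuming the documents are iterated rather than indexed (a Document object is not subscriptable, but a list of them is iterable) and that each document's text looks like the stringified record above - the regex handles both quote styles the stringification can produce:

```python
import re

QUOTE_RE = re.compile(r"'Quotes': (?P<q>['\"])(?P<text>.*?)(?P=q)", re.S)

def extract_quotes(doc_texts):
    """Pull just the quote text out of stringified Airtable records."""
    quotes = []
    for text in doc_texts:
        for m in QUOTE_RE.finditer(text):
            quotes.append(m.group("text").replace("\\n", "").strip())
    return quotes

record = ("{'fields': {'Source': 'Elon Musk by Ashlee Vance\\n\\n', "
          "'Quotes': 'Musk also trained employees to make the right trade-offs.\\n'}}")
print(extract_quotes([record]))
```

Each extracted quote can then become its own node.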
22 comments
hi all (also cc )

I am investigating an enterprise use case whereby I could use a chat interface to search quite a large database of companies.

Can anyone give me a high level explanation on what you would need to do to enable this? How would we actually be able to retrieve information on specific companies and their characteristics? Is it loading and embedding all of the text data? Or is there a way to build an API that exposes the contents of the table and makes it retrievable/ searchable in natural language?

Really grateful for any help esp if there are any great tutorials.
2 comments
hi team - when reading a table, does anyone know whether (1) we are reading columns together, e.g. if I have one column for the quote and one column for the person who said it, can I prompt: what did X person say about Y? & (2) we can alter the reader depending on what a user inputs, e.g. if a user only wanted results returned from quotes by a certain person? really grateful for any help
5 comments
Hi! Totally ignorant question I’m sure (and indicative of my lack of tech expertise). I’m trying to build a bot using my own data as shown by Dan Shipper (link to follow). I’m getting an error saying: TypeError: BaseGPTIndex.__init__() got an unexpected keyword argument ‘verbose’. I wonder if anyone has encountered anything similar and knows how I can fix it? Grateful for help!
2 comments
thomoliverz · Node

hi @Logan M - I wonder if you or anyone can help with a problem.

I load data from airtable - i end up with records that look like this (like 100s of pages).

{'id': 'recaLU8tMlV72bae8', 'createdTime': '2023-01-04T13:53:52.000Z', 'fields': {'Lesson': 'Blow your own glass', 'Areas of improvement': ['Executing a plan'], 'Source': 'http://www.paulgraham.com/marginal.html\', 'Industry': ['Technology'], 'Quotes': "So if you want to beat those eminent enough to delegate, one way to do it is to take advantage of direct contact with the medium. In the arts it's obvious how: blow your own glass, edit your own films, stage your own plays. And in the process pay close attention to accidents and to new ideas you have on the fly.\n", 'People': 'Paul Graham', 'Tool Type': 'Technique', 'Collection': 'Paul Graham Essays'}},

The aim is to split each quote into a node.

& then also to be able to extract the author and the URL when returning sources, so that I can display it in my app.

I was splitting the docs up like this -

```python
nodes = []
for document in documents:
    text = document.text
    matches1 = re.findall(pattern1, text)
    matches2 = re.findall(pattern2, text)
    matches3 = re.findall(pattern3, text)
    # takes quotes & people from matches & puts them into list of nodes
    for quote, person, url in zip(matches1, matches2, matches3):
        node_name = f"node{len(nodes) + 1}"
        node_text = f"{quote}"
        extra_info = {"Person": person, "url": url}
        exec(f"{node_name} = Node(text=node_text, extra_info=extra_info)")
        nodes.append(eval(node_name))
```

then the person & url were appearing in the response source nodes metadata & I was displaying in the app.

But I think there is now a different way to do this, as I am being told: NameError: name 'Node' is not defined. Did you mean: 'nodes'?

Wdyt?
8 comments
hi gang - is there any info on exactly how extracted chunks are put into prompts?

in response.source_nodes, is it just the text part that goes into the prompt?
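Essentially yes - the retrieved nodes' text is what gets substituted into the prompt's `{context_str}` slot (metadata can optionally be included too). A rough illustration, with a template written in the style of the default QA prompt rather than copied from the library:

```python
QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information, answer the query: {query_str}\n"
)

def build_prompt(chunks, query):
    """Join retrieved chunk texts and substitute into the template."""
    return QA_TEMPLATE.format(context_str="\n\n".join(chunks), query_str=query)

prompt = build_prompt(["chunk one", "chunk two"], "what helps sleep?")
print(prompt)
```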
2 comments
actually no, I am still getting embedding=None in the response source nodes..? is that normal?
2 comments
follow up question!

where is the best tutorial on how to use multiple data sources? i am using the notion reader to answer qs based on notion docs. i would also like to feed a slack channel into that using the slack reader.
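At a high level, combining sources is just concatenating the document lists before building one index. The commented reader calls are assumptions about the loader interfaces (check each loader's LlamaHub page for the exact arguments), and tagging each document with its source makes citations possible later:

```python
# notion_docs = NotionPageReader(integration_token=token).load_data(page_ids=ids)
# slack_docs = SlackReader(slack_token=slack_token).load_data(channel_ids=channels)
# index = VectorStoreIndex.from_documents(notion_docs + slack_docs)

def combine_sources(*doc_lists):
    """Merge documents from multiple readers, tagging each with its source."""
    combined = []
    for source, docs in doc_lists:
        for doc in docs:
            combined.append({"source": source, "doc": doc})
    return combined

merged = combine_sources(("notion", ["page1", "page2"]), ("slack", ["msg1"]))
print(len(merged))  # -> 3
```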
4 comments
hi team - re the airtable loader (https://llamahub.ai/l/airtable), I wonder whether anyone has any experience loading views, rather than tables? 1 level more granular than a table & would be immensely helpful for my use case

is this something we would consider looking into? this might be a #💡feature-requests ..

grateful for help!
2 comments
Hi team - hope all are well. I have a slackbot at my company using llama-index to read notion pages and answer questions. This works well, but there's 1 issue. The info in the pages is dynamic and constantly updated, as we improve our knowledge base.

Anyone can make a change to a notion page but I am required to re-run my script (on Replit) whenever a change is made in order for those changes to be updated in the index.

Is there a way to have the script re-run/ load at regular intervals?

Ty for any help.
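One simple shape - a scheduler loop in the long-running process (a cron job or a hosted scheduler would do the same job); `task` is a stand-in for the real "re-read Notion, rebuild/refresh the index" function:

```python
import time

def run_periodically(task, interval_seconds, max_runs=None):
    """Call task() every interval_seconds; max_runs=None runs forever."""
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
    return runs

# Stand-in for the real rebuild, run 3 times with no delay for the demo:
calls = []
run_periodically(lambda: calls.append("rebuilt"), interval_seconds=0, max_runs=3)
print(len(calls))  # -> 3
```

Running this in a background thread keeps the bot responsive while the index refreshes; refreshing only changed pages rather than a full rebuild is the cheaper long-term option.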
7 comments
hi all - anyone else unable to import some things? has there been an update to naming?
3 comments
Hi @Logan M

Am updating my code based on docs and am currently at the screenshot.

Rn I am getting an error saying that the 'default' mode is unknown - ValueError: Unknown mode: default

I wonder wyt and if the rest of my code should be doing what it did before, which is:
  • list index (embedding mode, default response mode)
  • custom prompt
44 comments
Hi! I am doing a simple use case of loading data from airtable and vector indexing it.

but when I print the source_nodes with my response I basically always see the same chunks of text regardless of the question.

i had expected the nodes loaded in when querying to be similar to my query.

but that doesn't seem to be the case..

am i missing something?
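One thing worth checking is how many nodes the index actually contains - if loading produced only a handful of huge documents, the same chunks will win top-k for every query no matter what you ask. The ranking step itself is just cosine similarity over embeddings; a toy version with made-up vectors to show what top-k is doing:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, node_vecs, k=2):
    """Return the k node names most similar to the query vector."""
    ranked = sorted(node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]),
                    reverse=True)
    return ranked[:k]

nodes = {"sleep": [1.0, 0.1], "diet": [0.2, 1.0], "focus": [0.7, 0.7]}
print(top_k([1.0, 0.0], nodes, k=2))  # -> ['sleep', 'focus']
```

If the nodes look sensible but the ranking still never changes, the embeddings themselves (or a mismatch between index-time and query-time embedding models) would be the next thing to inspect.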
16 comments