Ah ok… I think OpenAI embeddings are the ones with the best performance, no?
I think so! At least, they like to brag about them being the best/fastest/cheapest lol
Usually the flow in LlamaIndex is to generate embeddings with OpenAI and then store them in a vector store like Pinecone or Qdrant
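Roughly like this in code (a minimal sketch; class names follow the llama_index API around the time of this thread, and the Pinecone index name, environment, and data path are placeholders):

```python
# Sketch of the usual flow: load docs, embed with OpenAI, store in Pinecone.
# Uses llama_index 0.6.x-era names and pinecone-client v2; newer releases
# have renamed or moved several of these.
import pinecone
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores import PineconeVectorStore

pinecone.init(api_key="YOUR_PINECONE_KEY", environment="us-west1-gcp")  # placeholders
pinecone_index = pinecone.Index("my-demo-index")  # hypothetical index name

documents = SimpleDirectoryReader("./data").load_data()

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Embeddings are generated with OpenAI by default while building the index
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

print(index.as_query_engine().query("What do the docs say about chunking?"))
```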
Yep! I'm using Pinecone now and it's OK, though I'd hoped it would be a bit faster.
Just a question: what weights would you put on each factor for getting a good response? E.g., imo:
  • 65% prompt engineering
  • 30% chunk size
  • 5% text splitter
Also, I suppose the choice of index could have a significant impact, though I've never tried anything but the simple vector index and the GPT Pinecone index
What do you think? @Logan M
I definitely agree with your percentages!

Prompt engineering is important for following instructions and minimizing hallucinations
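For example, you can pin down the question-answering prompt yourself instead of relying on the default (a hedged sketch; `Prompt` and the `text_qa_template` kwarg follow the llama_index API of that era, and `index` is assumed to exist already):

```python
from llama_index import Prompt

# A stricter QA template to reduce hallucinations: the model is told to
# answer only from the retrieved context
qa_template = Prompt(
    "Answer using ONLY the context below. If the answer is not in the context, "
    "say you don't know.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Question: {query_str}\n"
    "Answer: "
)

query_engine = index.as_query_engine(text_qa_template=qa_template)
```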

Chunk size is important for how your embeddings work. But also, if you can pre-split the documents yourself as much as possible (i.e. into clear sections), that helps too
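The pre-splitting idea, sketched out (the heading regex, file path, and `extra_info` field are illustrative assumptions, not from this thread):

```python
import re
from llama_index import Document

raw = open("./data/handbook.md").read()  # hypothetical source file

# Split on top-level markdown headings so each Document is one clear section,
# rather than letting the text splitter cut at arbitrary character counts
sections = [s for s in re.split(r"\n(?=# )", raw) if s.strip()]

documents = [
    # extra_info is the older field name; newer releases use metadata=
    Document(text=s, extra_info={"section": s.splitlines()[0]})
    for s in sections
]
```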
I'm actually working on a little demo using the help channel messages haha, using BERT to split the chat threads into topics and then building a composable index

I hope it works, we will see
This is cool! I've never tried BERT; where do you find it most beneficial? Ahh, everything moves so fast and there's so much material 😂
Definitely lots to learn!

I'm using a Python package called BERTopic to cluster the Discord messages, and then creating an index per topic.
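The clustering step looks roughly like this (sketch; the message list here is a placeholder, and real usage would feed in the exported chat history):

```python
from bertopic import BERTopic

# One string per Discord message; BERTopic needs a reasonably large corpus
# to find meaningful clusters
messages = [
    "How do I change the chunk size?",
    "Pinecone keeps timing out for me",
    # ...
]

topic_model = BERTopic()
topics, probs = topic_model.fit_transform(messages)

# topics[i] is the topic id for messages[i]; group messages by topic id,
# then build one index per topic
```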
I'm still not convinced by composed indices. Have you tried them, and in your experience do they work better than a single vector index?
I mean a single index with related materials (not one about medicine and another about economics, to be clear 😅)
Yea if your data is focused on a single clear topic, a single index will work great 💪
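For reference, the composed-index setup under discussion looks roughly like this (hedged sketch; `ComposableGraph` and `from_indices` follow the llama_index API of that era, and the topic documents and summaries are placeholders):

```python
from llama_index import Document, GPTListIndex, GPTVectorStoreIndex
from llama_index.indices.composability import ComposableGraph

# Placeholder per-topic documents; in the demo these would come from the
# BERTopic clusters above
topic_docs = {
    "pinecone-issues": [Document(text="Pinecone keeps timing out...")],
    "chunking": [Document(text="How do I change the chunk size?")],
}

indices, summaries = [], []
for topic, docs in topic_docs.items():
    indices.append(GPTVectorStoreIndex.from_documents(docs))
    summaries.append(f"Help-channel messages about {topic}")

# A parent list index routes queries across the per-topic indices
graph = ComposableGraph.from_indices(GPTListIndex, indices, index_summaries=summaries)
print(graph.as_query_engine().query("Why is my Pinecone query slow?"))
```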