
Friction

Mini rant, but hopefully, beyond my slightly disgruntled language, the devs see something useful in my friction-filled experience:

Is LlamaIndex going the same way as LangChain now? I've been trying for multiple days to implement what I would think is a stupidly simple pipeline, but between the Discord, the documentation, GitHub, ChatGPT, and other online resources, I have not been able to figure it out.

I have endpoints, embedding and chat, served via Databricks. I have a pipeline that uses a small GUI to let users input documents and add tags, which then get chunked, embedded, and loaded into Milvus. I would have thought it was then trivial to use LlamaIndex and point it at the pre-loaded Milvus DB (because why on earth would the workflow be to have the user generate the embeddings at runtime before you can use them?!). But between the API changes, the deprecations, and the lack of fully fleshed-out community integration examples, it has been a pain. I've been developing with these tools since GPT-3 had its first beta API release, before LangChain had more than a few hundred stars. So I may be dumb, but I'm not a noob.
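(For anyone landing here with the same friction: the pattern being asked for, pointing LlamaIndex at a collection that was embedded and loaded elsewhere, looks roughly like the sketch below. This is a minimal, unverified sketch: the Milvus URI, collection name, dimension, field names, and the Databricks embedding class and its parameters are placeholder assumptions, and the embed model must match whatever was used at ingest time.)

```python
# Sketch: query a Milvus collection that was chunked/embedded by an
# external pipeline, without re-ingesting anything through LlamaIndex.
# pip install llama-index-core llama-index-vector-stores-milvus \
#     llama-index-embeddings-databricks
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.databricks import DatabricksEmbedding
from llama_index.vector_stores.milvus import MilvusVectorStore

# Assumption: this must be the same embedding model used at ingest time;
# swap in whatever class wraps your Databricks embedding endpoint.
Settings.embed_model = DatabricksEmbedding(
    model="databricks-bge-large-en",  # placeholder serving endpoint name
    api_key="<DATABRICKS_TOKEN>",     # placeholder credentials
    endpoint="https://<workspace>.cloud.databricks.com",  # placeholder
)

vector_store = MilvusVectorStore(
    uri="http://localhost:19530",    # placeholder Milvus URI
    collection_name="my_documents",  # placeholder: your pre-loaded collection
    dim=1024,                        # must match the stored embedding dim
    text_key="text",                 # assumption: field holding raw chunk text
    overwrite=False,                 # never touch the existing data
)

# Build an index view over the existing store; nothing is re-embedded here.
index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=5)
print(retriever.retrieve("what did the uploaded docs say about tags?"))
```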

I have a bad feeling that LlamaIndex is an all-or-nothing package, meaning I have to use all of its custom objects (documents, nodes), and use it and only it to load the DB and then to query it, etc., which I'm not a huge fan of. I should be able to use it where I find it strong and not have to kitchen-sink my applications.

Rant over. Love you.
6 comments
Is the main point of friction here not being able to use your existing Milvus DB?
(Tbh I agree that it's either not supported well, or not supported at all in that case; I'm forgetting the state of the Milvus integration. It's definitely something that could be improved in that respect across vector stores.)
Apologies for the rough experience in any case.
My opinion is that, after working on building the framework for the last 1.5 years, it can be hard to build things that are useful standalone (like a vector store class), because in a framework, things are inherently coupled and designed to work well together.

That being said, this is why stuff like our new Workflows module excites me.
  1. It's essentially decoupled from the rest of the library and self-contained. It's focused purely on event-driven workflows (a minimal sketch follows this list).
  2. It promotes lower-level usage of stuff like LLMs, embeddings, retrievers, tool calling, etc., which makes examples and implementations way less black-box. (As a new user, you'd have no idea what a chat engine or query engine does internally, which imo is bad as soon as you need to debug or customize.)
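(To make "event-driven workflows" concrete, here is a minimal sketch of a self-contained workflow built on llama_index.core.workflow; the class name, step body, and query attribute are illustrative, not from the thread.)

```python
# Minimal event-driven workflow sketch (llama-index-core).
# Steps are plain async methods wired together by event types.
import asyncio

from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
)


class EchoWorkflow(Workflow):
    """Hypothetical one-step workflow: take a query, return an answer."""

    @step
    async def answer(self, ev: StartEvent) -> StopEvent:
        # Keyword args passed to .run() show up on the StartEvent.
        return StopEvent(result=f"You asked: {ev.query}")


async def main():
    wf = EchoWorkflow(timeout=10)
    # .run() returns whatever the StopEvent carried as its result.
    print(await wf.run(query="how do I point at an existing Milvus DB?"))


asyncio.run(main())
```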
Ah okay, that would be amazing. Thank you for the response; I need to dive into that. I'm building an enterprise RAG service for my company and trying to decouple things so that I can take full advantage of pre-built components while staying framework-agnostic, so I have a thin layer of abstract base classes. For example, chunkers must have some common methods, etc. But it seemed like LlamaIndex had quite a lot of dependency on its native object types, like a parser needing a Document, etc. What you're describing sounds exactly like what I need! Thank you again for taking the time to reply, as well as for replying to my other question about the Milvus vector store. I get that it is kind of on them to keep it up to date.
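(On the "a parser needs a Document" point: the coupling can usually be kept to a thin wrap/unwrap at the boundary. A minimal sketch, assuming SentenceSplitter from llama-index-core; the adapter function and chunk sizes are hypothetical, not an API from the thread.)

```python
# Sketch: use a LlamaIndex node parser behind a framework-agnostic
# chunker interface, wrapping plain strings in Document at the boundary.
from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter


def chunk_text(text: str, chunk_size: int = 512) -> list[str]:
    """Hypothetical adapter: plain text in, plain chunk strings out."""
    splitter = SentenceSplitter(chunk_size=chunk_size, chunk_overlap=64)
    nodes = splitter.get_nodes_from_documents([Document(text=text)])
    # Unwrap back to plain strings so callers never see LlamaIndex types.
    return [node.get_content() for node in nodes]


print(chunk_text("First sentence. Second sentence. " * 100)[:2])
```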