Rig

Hi all,

Looking to buy better hardware so I can use llama.cpp instead of making OpenAI calls. I just tried it on my humble HP and I swear it was going to blue screen 😂

M2 Studio or Custom RIG with phat GPU?

Input from anyone is much appreciated ❤️
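
For context, the llama.cpp route looks roughly like this — a minimal local-inference sketch using the llama-cpp-python bindings, assuming a GGUF model file you've already downloaded (the path is a placeholder):

```python
# Minimal local-inference sketch with the llama-cpp-python bindings.
# Assumes: pip install llama-cpp-python, plus a GGUF model downloaded locally.
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers to the GPU (Metal on a Mac, CUDA on a 4090);
# set it to 0 to stay CPU-only on a humbler machine.
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_gpu_layers=-1)

# Chat-style call, roughly analogous to an OpenAI chat completion request.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in five words."}],
    max_tokens=32,
)
print(out["choices"][0]["message"]["content"])
```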
Tbh if you have the money, something with a 4090 is probably a good bet

That way you have some headroom to move beyond llama.cpp and into quantized Hugging Face models
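
For reference, "quantized Hugging Face models" usually means something like this 4-bit loading sketch via transformers + bitsandbytes; the model ID is just an example, and this path assumes a CUDA GPU (part of why the 4090 is attractive):

```python
# Rough sketch: loading a Hugging Face model in 4-bit with bitsandbytes.
# Assumes: pip install transformers accelerate bitsandbytes, and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # example model, swap for your own

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)

inputs = tokenizer("A 4090 has enough VRAM to", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```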
you'd recommend a RIG with a 4090 over an M2 Studio?
oh right I need a GPU for the quantized model hahaha
sounds good ser 🫡
Many blessings as always ❤️
I agree. The M2 Studio is a really powerful machine. But most ML tasks are not optimized for Apple architecture and cannot make use of the power.
A PC with a 4090 is easily recommended.

I myself am an Apple fanboi running an M1 Pro Max, but for most ML tasks it just doesn't cut it and I find myself running code on Google Colab.

If you write your own models in TensorFlow the silicon chips are supported fairly well. But most 3rd-party libs (LlamaIndex and others) are typically not as up to date on the TF versions, or just plain incompatible with tensorflow-metal and tensorflow-macos.
Also PyTorch has no support currently.

Again, if you are looking to implement your own TensorFlow models and are happy with some setbacks and debugging, the M2 can also be set up to compete with powerful machines.
But once you go into A100 territory there is no competition
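
If anyone wants to sanity-check that TensorFlow-on-Apple-silicon route, a quick sketch, assuming tensorflow-macos and tensorflow-metal are installed (with matching versions):

```python
# Quick sanity check for TensorFlow on Apple silicon.
# Assumes: pip install tensorflow-macos tensorflow-metal (matching versions).
import tensorflow as tf

# With tensorflow-metal working, the M-series GPU shows up as a device here.
print(tf.config.list_physical_devices("GPU"))

# Tiny smoke test: this matmul runs on the Metal GPU if one was listed above
# (TF2's soft device placement falls back to CPU otherwise).
with tf.device("/GPU:0"):
    a = tf.random.normal((1024, 1024))
    print(tf.reduce_sum(a @ a).numpy())
```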
interesting, Apple always shills itself as being ideal for "AI workloads"--I never looked deeper into that though.

what is an A100 sorry?

I'm thinking of using a GPU API service, or even OVH bare metal with a GPU to start. Could be a good starting point instead of buying a gigantic noisy RIG off the jump. Do you recommend doing this? Cheers! ❤️
Yea renting a GPU service also works!

(An A100 is an enterprise-level chad GPU)
which GPU service would you recommend that's best suited for LlamaIndex and all its related stacks?
enterprise-level chad GPU 😂
dies of laughter
hmm tbh I haven't explored this too much. There's Google Cloud or AWS, but the setup can be a little obnoxious haha

I've heard good things about Modal, but I'm not sure if that's more for training or if hosting works too
yeah it's the overhead of wrestling with these providers' docs and API calls that's anxiety-inducing. It's the price you pay for not buying your own RIG I guess
I'll look into it--cheers fam! ❤️
maybe OVH bare metal is the way to go 👀 👀 👀
Write Python code and execute it in the cloud in seconds
Deploy autoscaling inference endpoints on GPUs (A100s, A10Gs, T4s, L4s)
Run large-scale batch jobs on thousands of containers
Turn your function into a cron job, or serve it as a web endpoint, with one line of code
Define images, hardware and persistent storage intuitively in Python
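
For anyone curious what that list looks like in practice, a minimal sketch of a Modal GPU function, going by their docs — the app name and GPU type here are just example choices; run it with `modal run file.py`:

```python
# Sketch of the Modal workflow being described in the pasted blurb.
# Assumes: pip install modal, plus `modal setup` to authenticate.
import modal

app = modal.App("gpu-hello")  # hypothetical app name

# Request a GPU for just this function; Modal spins up a container on demand.
@app.function(gpu="T4")
def gpu_check():
    import subprocess
    # Modal's GPU containers ship with NVIDIA drivers, so nvidia-smi should work.
    return subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout

@app.local_entrypoint()
def main():
    # Executes remotely on the rented GPU, returns the result locally.
    print(gpu_check.remote())
```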


Bruh on first glance it looks like it's for hosting too 👀

https://modal.com/blog/general-availability-and-series-a

and they became generally available literally 9 days ago

this space is so young 😂
Looks like these are the guys, I'm gonna check out their free version first
Thanks again 👑 🫡
Good luck! I really like how Modal works, very user friendly 🙏