Rig

Hi all,

Looking to buy better hardware so I can use llama.cpp instead of making OpenAI calls. I just tried it on my humble HP and I swear it was going to blue screen 😂

M2 Studio or Custom RIG with phat GPU?

Input from anyone is much appreciated ❤️
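
For context, the llama.cpp route looks roughly like this — a minimal local-inference sketch using the llama-cpp-python bindings, assuming a GGUF model file you've already downloaded (the path is a placeholder):

```python
# Minimal local-inference sketch with the llama-cpp-python bindings.
# Assumes: pip install llama-cpp-python, plus a GGUF model downloaded locally.
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers to the GPU (Metal on a Mac, CUDA on a 4090);
# set it to 0 to stay CPU-only on a humbler machine.
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_gpu_layers=-1)

# Chat-style call, roughly analogous to an OpenAI chat completion request.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in five words."}],
    max_tokens=32,
)
print(out["choices"][0]["message"]["content"])
```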
Tbh if you have the money, something with a 4090 is probably a good bet

That way you have some headroom to move beyond llama.cpp and into quantized Hugging Face models
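
For reference, "quantized Hugging Face models" usually means something like this 4-bit loading sketch via transformers + bitsandbytes; the model ID is just an example, and this path assumes a CUDA GPU (part of why the 4090 is attractive):

```python
# Rough sketch: loading a Hugging Face model in 4-bit with bitsandbytes.
# Assumes: pip install transformers accelerate bitsandbytes, and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # example model, swap for your own

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)

inputs = tokenizer("A 4090 has enough VRAM to", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```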
you'd recommend a RIG with a 4090 over an M2 Studio?
oh right I need a GPU for the quantized model hahaha
sounds good ser 🫡
Many blessings as always ❤️
I agree. The M2 Studio is a really powerful machine. But most ML tasks are not optimized for Apple architecture and cannot make use of the power.
A PC with a 4090 is easily recommended.

I myself am an Apple fanboi running an M1 Pro Max, but for most ML tasks it just doesn't cut it and I find myself running code on Google Colab.

If you write your own models in TensorFlow the silicon chips are supported fairly well. But most 3rd-party libs (LlamaIndex and others) are typically not as up to date on the TF versions, or just plain incompatible with tensorflow-metal and tensorflow-macos.
Also PyTorch has no support currently.

Again, if you are looking to implement your own TensorFlow models and are happy with some setbacks and debugging, the M2 can also be set up to compete with powerful machines.
But once you go into A100 territory there is no competition
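
If anyone wants to sanity-check that TensorFlow-on-Apple-silicon route, a quick sketch, assuming tensorflow-macos and tensorflow-metal are installed (with matching versions):

```python
# Quick sanity check for TensorFlow on Apple silicon.
# Assumes: pip install tensorflow-macos tensorflow-metal (matching versions).
import tensorflow as tf

# With tensorflow-metal working, the M-series GPU shows up as a device here.
print(tf.config.list_physical_devices("GPU"))

# Tiny smoke test: this matmul runs on the Metal GPU if one was listed above
# (TF2's soft device placement falls back to CPU otherwise).
with tf.device("/GPU:0"):
    a = tf.random.normal((1024, 1024))
    print(tf.reduce_sum(a @ a).numpy())
```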
interesting, Apple always shills itself as being ideal for "AI workloads"--I never looked deeper into that though.

what is an A100 sorry?

I'm thinking of using a GPU API service, or even OVH bare metal with a GPU to start. Could be a good starting point instead of buying a gigantic noisy RIG off the jump. Do you recommend doing this? Cheers! ❤️
Yea renting a GPU service also works!

(An A100 is an enterprise-level chad GPU)
which GPU service would you recommend that's best suited for LlamaIndex and all its related stacks?
enterprise-level chad GPU 😂
dies of laughter
hmm tbh I haven't explored this too much. There's Google Cloud or AWS, but the setup can be a little obnoxious haha

I've heard good things about Modal, but I'm not sure if that's more for training or if hosting works too
yeah it's the overhead of wrestling with these providers' docs and API calls that's anxiety-inducing. It's the price you pay for not buying your own RIG I guess
I'll look into it--cheers fam! ❤️
maybe OVH bare metal is the way to go 👀 👀 👀
Write Python code and execute it in the cloud in seconds
Deploy autoscaling inference endpoints on GPUs (A100s, A10Gs, T4s, L4s)
Run large-scale batch jobs on thousands of containers
Turn your function into a cron job, or serve it as a web endpoint, with one line of code
Define images, hardware and persistent storage intuitively in Python
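
For anyone curious what that list looks like in practice, a minimal sketch of a Modal GPU function, going by their docs — the app name and GPU type here are just example choices; run it with `modal run file.py`:

```python
# Sketch of the Modal workflow being described in the pasted blurb.
# Assumes: pip install modal, plus `modal setup` to authenticate.
import modal

app = modal.App("gpu-hello")  # hypothetical app name

# Request a GPU for just this function; Modal spins up a container on demand.
@app.function(gpu="T4")
def gpu_check():
    import subprocess
    # Modal's GPU containers ship with NVIDIA drivers, so nvidia-smi should work.
    return subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout

@app.local_entrypoint()
def main():
    # Executes remotely on the rented GPU, returns the result locally.
    print(gpu_check.remote())
```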


Bruh on first glance it looks like it's for hosting too 👀

https://modal.com/blog/general-availability-and-series-a

and they became generally available literally 9 days ago

this space is so young 😂
Looks like these are the guys, I'm gonna check out their free version first
Thanks again 👑 🫡
Good luck! I really like how Modal works, very user friendly 🙏