Updated 3 weeks ago
Storage
Ryan
3 weeks ago
For those who are creating datasets for evaluating their LLMs, where do you typically store those datasets?
6 comments
Logan M
3 weeks ago
S3, Git LFS, Hugging Face Datasets
Logan M
3 weeks ago
Could also be any SQL-ish DB, or NoSQL, depending on what's in the dataset
Ryan
edited 3 weeks ago
If S3 or Git LFS, are you typically storing it as JSON?
Logan M
3 weeks ago
Yeah, it'd just be a JSON blob. You could compress it if it's really huge, but JSON is nice because it keeps things simple
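For anyone reading later, here is a rough sketch of the "JSON blob, optionally compressed" idea using only the Python standard library. The function names (`save_dataset`, `load_dataset`), the record fields, and the `.json.gz` naming convention are made up for illustration; nothing in this thread prescribes them.

```python
import gzip
import json
import tempfile
from pathlib import Path


def save_dataset(records, path, compress=False):
    """Write a list of records as one JSON document, gzip-compressed if asked."""
    opener = gzip.open if compress else open
    with opener(Path(path), "wt", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_dataset(path):
    """Read the JSON document back; a '.gz' suffix means it is gzip-compressed."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical eval records; the field names are assumptions.
records = [
    {"query": "What is the capital of France?", "expected_answer": "Paris"},
    {"query": "What is 2 + 2?", "expected_answer": "4"},
]

with tempfile.TemporaryDirectory() as tmp:
    plain_path = Path(tmp) / "eval.json"
    gz_path = Path(tmp) / "eval.json.gz"
    save_dataset(records, plain_path)
    save_dataset(records, gz_path, compress=True)
    restored_plain = load_dataset(plain_path)
    restored_gz = load_dataset(gz_path)
```

The same file can then be uploaded to S3 or committed via Git LFS; plain JSON stays diff-friendly in git, while the gzip variant only pays off once the file gets large.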
Logan M
3 weeks ago
(plus then you can just dump/load a pydantic model for example)
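A minimal sketch of the dump/load idea, assuming pydantic v2 (`model_dump_json` and `model_validate_json`; on v1 the equivalents would be `.json()` and `.parse_raw()`). The `EvalExample` and `EvalDataset` models and their fields are hypothetical, just one way an eval set could be shaped.

```python
from pydantic import BaseModel


class EvalExample(BaseModel):
    # Hypothetical fields for a single evaluation case.
    query: str
    expected_answer: str
    tags: list[str] = []


class EvalDataset(BaseModel):
    # Hypothetical container model for the whole dataset.
    name: str
    examples: list[EvalExample]


def dump_dataset(dataset: EvalDataset) -> str:
    """Serialize the whole dataset to a JSON string."""
    return dataset.model_dump_json(indent=2)


def load_dataset_json(raw: str) -> EvalDataset:
    """Parse and validate a JSON string back into the typed model."""
    return EvalDataset.model_validate_json(raw)


dataset = EvalDataset(
    name="smoke-test",
    examples=[
        EvalExample(query="What is 2 + 2?", expected_answer="4", tags=["math"]),
    ],
)
raw_json = dump_dataset(dataset)
restored = load_dataset_json(raw_json)
```

The upside over raw `json.load` is that a malformed or incomplete record fails validation at load time instead of surfacing halfway through an eval run.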
Ryan
edited 3 weeks ago
Okay thanks. For my use-case, the datasets are smaller so I'll probably stick to JSON + git for now until I need something more robust.