
Updated 5 months ago

Hi all I built a vector from csv about

Hi all, I built a vector index from a CSV (about 20 rows), with each row as a Document. When I ask something like "How many rows...", it only reports 2 rows. The answer depends on the similarity_top_k variable: when I set it to 3, it returns 3. Can I get the exact number of rows in the CSV file?

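import csv
import json
# Assumed import (LlamaIndex 0.10+); older releases use `from llama_index import Document`
from llama_index.core import Document

# The function name is hypothetical; the original post shows only the body
def load_documents(filename):
    documents = []
    with open(filename, newline='') as file_obj:
        reader_obj = csv.reader(file_obj)
        # Read the header through the reader so quoted commas are parsed correctly
        header = next(reader_obj)

        for row in reader_obj:
            record = {}
            for i, value in enumerate(row):
                # Collapse runs of whitespace inside each cell
                record[header[i]] = ' '.join(value.split())

            doc_id = row[0]
            content_from_csv = json.dumps(record)
            documents.append(Document(text=content_from_csv, doc_id=doc_id))
    return documents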
3 comments
You can add details like this in the metadata.

Document(text=content_from_csv, doc_id=doc_id, metadata={"total_rows":N})
Also, once they are loaded they are no longer rows; each one becomes a document. So, as @WhiteFang_Jr said, you can store each row's number in the metadata and use it when needed.
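For anyone reading later, here is a minimal sketch of that idea, not a definitive implementation. It assumes the row number and total count are stored under the metadata keys "row_number" and "total_rows" (names picked for illustration). The small Document dataclass is a hypothetical stand-in so the snippet runs without the library installed; the function name and sample data are made up as well.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for the library's Document class, so the sketch runs on its own."""
    text: str
    doc_id: str
    metadata: dict = field(default_factory=dict)


def load_csv_documents(file_obj):
    """Turn each CSV row into a Document carrying its row number and the total row count."""
    rows = list(csv.DictReader(file_obj))  # read everything first so the total is known
    total_rows = len(rows)
    documents = []
    for row_number, record in enumerate(rows, start=1):
        doc_id = next(iter(record.values()))  # first column as the id, as in the question
        documents.append(
            Document(
                text=json.dumps(record),
                doc_id=doc_id,
                # Assumed key names; any consistent names would work
                metadata={"row_number": row_number, "total_rows": total_rows},
            )
        )
    return documents


# Made-up sample data standing in for the real CSV file
sample = io.StringIO("id,name\n1,apple\n2,banana\n3,cherry\n")
docs = load_csv_documents(sample)
print(len(docs))         # 3
print(docs[1].metadata)  # {'row_number': 2, 'total_rows': 3}
```

Since the retriever only passes the top similarity_top_k documents to the model, a pure counting question is probably easier to answer from len(docs) or the stored total than from a query.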
I will try this solution, thanks everyone.