So, the examples would actually be stored in LlamaIndex.
Here's how I might approach this. I'll use a vector index as an example.
Index a bunch of data (in this case, a bunch of tweets). I would make each tweet its own "Document" object, i.e. something like "Here's a tweet: <tweet text>". Each document then gets an embedding generated for it and saved.
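A rough sketch of that indexing step, assuming the older GPTSimpleVectorIndex API (exact imports and constructor may differ between LlamaIndex versions, and the tweets here are made up):

```python
from llama_index import Document, GPTSimpleVectorIndex

# Hypothetical tweet data; in practice this would be your collected tweets
tweets = [
    "Your first 10 customers will teach you more than any accelerator.",
    "Fundraising is a full-time job. Plan for it.",
]

# Wrap each tweet in its own Document object
documents = [Document(f"Here's a tweet: {t}") for t in tweets]

# Build the vector index; an embedding is generated and stored for each document
index = GPTSimpleVectorIndex(documents)
```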
Then, you query your index, e.g.
response = index.query("Given related tweets on startup founders, write a new tweet about XX", similarity_top_k=5, response_mode="compact")
LlamaIndex will create an embedding of the query text and fetch the 5 closest matching tweets. It will then get the LLM to answer the query given those 5 tweets.
Since I specified response_mode="compact", it will stuff as many tweets as possible into each call to the LLM. Without this option, it would make 5 calls to the LLM, one for each matching tweet. If all the text doesn't fit in a single call, it will refine the answer across several calls.
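Putting it together, a sketch of the query step against the index built above (same caveat about API versions; the query text is just an example):

```python
# "compact" packs as many retrieved tweets as possible into each LLM call
response = index.query(
    "Given related tweets on startup founders, write a new tweet about XX",
    similarity_top_k=5,
    response_mode="compact",
)
print(response)

# Without response_mode="compact", it makes one LLM call per retrieved tweet,
# refining the answer as it goes
```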
Check out this page for some more inspiration 💪
https://gpt-index.readthedocs.io/en/latest/use_cases/queries.html