
Hello,
First, I would like to thank all the contributors who created and maintain this great library!
I have two questions:
  1. In the "Embeddings" documentation it says that the default model is text-embedding-ada-002 which can be used for both text search and similarity. Are there any examples / tutorials on how to use it for similarity, or more specifically anomaly detection?
  2. When using embedding for Q&A, what is the actual query that's being sent to GPT-3 along with the matching documents retrieved from the index?
TIA!
3 comments
re: 1) You can set response_mode="no_text" when calling index.query and adjust similarity_top_k. Then you can parse the response object to obtain the source nodes, if you just want to fetch the underlying documents by similarity (see this section: https://gpt-index.readthedocs.io/en/latest/guides/usage_pattern.html#parsing-the-response)
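Under the hood, fetching the top-k source nodes is essentially a nearest-neighbour lookup over the stored embedding vectors. Here's a minimal sketch of that similarity step using plain numpy (the toy vectors and the function name are made up for illustration; real text-embedding-ada-002 vectors have 1536 dimensions):

```python
import numpy as np

def top_k_similar(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q  # cosine similarity of each document to the query
    return np.argsort(sims)[::-1][:k]  # highest similarity first

# Toy 2-d "embeddings" standing in for real ada-002 vectors.
docs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
query = np.array([1.0, 0.05])
print(top_k_similar(query, docs, k=2))  # -> [0 1]
```

similarity_top_k plays the role of k here: it controls how many of these nearest documents get attached to the response as source nodes.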

re: 2) The question answer prompt is here: https://gpt-index.readthedocs.io/en/latest/reference/prompts.html#gpt_index.prompts.prompts.QuestionAnswerPrompt
re 1 - I guess that for anomaly detection I need to cluster the embedding vectors and then find the "outliers" within each cluster. Does this make sense? If so, do you have any suggestions on how to implement it?
re 2 - It seems the default values are None, so how are the query and paragraphs sent to GPT-3 by default? i.e., "based on the following paragraphs: <retrieved documents here>, <query here>"?
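A default of None just means the library falls back to its built-in template. From my recollection of the gpt_index default (the QuestionAnswerPrompt reference linked above is the authoritative version), the assembled prompt looks roughly like this:

```python
# Approximation of the default QuestionAnswerPrompt template; the exact
# wording may differ, so check the prompts reference in the docs.
DEFAULT_QA_TEMPLATE = (
    "Context information is below. \n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

# The retrieved documents are joined into context_str and the user's
# question becomes query_str before the prompt is sent to the LLM.
prompt = DEFAULT_QA_TEMPLATE.format(
    context_str="<retrieved documents here>",
    query_str="<query here>",
)
print(prompt)
```

So it is indeed "paragraphs first, then the question", with an instruction telling the model to answer only from the provided context.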
Hi @jerryjliu0 , just a kind reminder for the above questions. TIA!