Right now the main use-case for this feature is to support HyDE (hypothetical document embeddings). You can take a look at this tweet thread for more explanation/examples: https://twitter.com/jerryjliu0/status/1626255140209717248
Currently the default logic is to embed each string separately and use the "mean" embedding when calculating similarity.
We support customizing the aggregation function (swapping "mean" for something else), but that configuration is not exposed at the Index API level yet. It's possible to subclass BaseEmbedding to implement your desired behavior, though.
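A minimal sketch of that subclassing approach. This assumes the OpenAIEmbedding class and a get_agg_embedding_from_queries hook; the exact method names may differ across gpt_index versions, so treat it as illustrative rather than the official API:

```python
# Sketch: override the aggregation of multiple query embeddings
# (element-wise max instead of the default mean).
# Assumes gpt_index exposes OpenAIEmbedding and a
# get_agg_embedding_from_queries hook; names may vary by version.
from typing import Callable, List, Optional

import numpy as np
from gpt_index.embeddings.openai import OpenAIEmbedding


class MaxPoolEmbedding(OpenAIEmbedding):
    """Aggregate query embeddings with element-wise max instead of mean."""

    def get_agg_embedding_from_queries(
        self,
        queries: List[str],
        agg_fn: Optional[Callable[..., List[float]]] = None,
    ) -> List[float]:
        # Embed each query string separately, then pool.
        query_embeddings = [self.get_query_embedding(q) for q in queries]
        return np.max(np.stack(query_embeddings), axis=0).tolist()
```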
so in the HyDE twitter thread example, is embeddings_strs[0] equivalent to custom_embedding_strs[0] - and is HyDE hallucinating context to pass in for the k-nearest-neighbor retrieval step?
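For reference, the pattern from the thread looks roughly like this. It's a sketch assuming the early-2023 gpt_index API (GPTSimpleVectorIndex, SimpleDirectoryReader, and a query() method accepting custom_embedding_strs); the hallucinated text here is a hypothetical placeholder:

```python
# Rough HyDE sketch. Assumes the gpt_index API of early 2023;
# names may differ in other versions.
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex(documents)

query_str = "What did the author do growing up?"
# Step 1: have an LLM hallucinate a hypothetical answer/document.
hyde_doc = "The author grew up writing short stories and ..."  # placeholder
# Step 2: embed the hallucinated text (instead of the raw query)
# and use that embedding for nearest-neighbor retrieval.
response = index.query(query_str, custom_embedding_strs=[hyde_doc])
```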
I'm trying to create a legal question-&-answer assistant over ~1 million case/legislation documents using GPT Index, e.g. "Summarize this law & cite relevant cases"
Any insight on what tools/classes to use to get the best answers per API token spent? e.g. GPTSimpleVectorIndex in combination with ___
I think we use Davinci for the LLM call by default, which costs $0.02 / 1K tokens. I'd recommend trying a cheaper LLM model and seeing if the quality is still acceptable.
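Swapping in a cheaper model looks roughly like this. A sketch assuming the early-2023 gpt_index + langchain APIs (LLMPredictor wrapping LangChain's OpenAI class); model names and prices are as of that era:

```python
# Sketch: use a cheaper LLM (text-curie-001, ~$0.002/1K tokens vs
# Davinci's $0.02/1K) via the LLMPredictor wrapper.
# Assumes early-2023 gpt_index + langchain; names may have changed.
from gpt_index import GPTSimpleVectorIndex, LLMPredictor, SimpleDirectoryReader
from langchain.llms import OpenAI

# Wrap the cheaper model so the index uses it for synthesis calls.
llm_predictor = LLMPredictor(
    llm=OpenAI(temperature=0, model_name="text-curie-001")
)

documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)
response = index.query("Summarize this law & cite relevant cases.")
```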