Hi all - another question: I'm trying to "augment" the default ChatGPT with some domain-specific knowledge by providing it as documents and building my own index. I tried GPTSimpleVectorIndex and ChatGPTRetrievalPluginIndex, and I also played with different response types. It either answers questions strictly based on the information I provide, and anything beyond that gets a response like "context has no information about ...", or, with the "generation" response type, it completely ignores the information I provide and responds with a generic answer. Is there a way to have it do both, i.e. the "augmenting" behavior? Much appreciated! (Rough sketch of my setup below.)
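For reference, this is roughly what I'm doing. A minimal sketch assuming the pre-0.6 LlamaIndex API (on some older versions it's `GPTSimpleVectorIndex(documents)` instead of `from_documents`); the directory name and questions are made up:

```python
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load my domain-specific docs and build a vector index over them.
documents = SimpleDirectoryReader("my_docs").load_data()
index = GPTSimpleVectorIndex.from_documents(documents)

# Default mode: answers strictly from the retrieved context; anything
# beyond it gets "context has no information about ...".
response = index.query("What is our internal deployment process?")

# "generation" mode: ignores the retrieved context entirely and
# gives a generic answer.
response = index.query(
    "What is our internal deployment process?",
    response_mode="generation",
)
```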
It feels a little hacky but I think it works now! It's just that I noticed all this extra info I provide is passed to the LLM predictor as the context, so it's subject to the token limit, e.g. 4096 tokens for ChatGPT? That doesn't sound right, because I've seen folks talking about feeding large amounts of custom data (as docs) through LlamaIndex. What am I missing here..? I do have tens of thousands of words to provide as the "context", and it'd be a bummer if it's all limited to 4096 tokens.
But if the index retrieves more text for a query than that limit, it refines the answer across multiple LLM calls, so the model gets a chance to read all the text.
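Roughly, the refine-style synthesis works like this. A simplified sketch, not the actual library code, assuming an `llm` callable that takes a prompt string and returns text:

```python
def refine_answer(llm, question, chunks):
    # Draft an answer from the first context-window-sized chunk.
    answer = llm(f"Context: {chunks[0]}\n\nQuestion: {question}\nAnswer:")
    # Feed each remaining chunk in turn, asking the model to refine
    # the running answer (or keep it if the new context isn't useful).
    for chunk in chunks[1:]:
        answer = llm(
            f"The original question is: {question}\n"
            f"We have an existing answer: {answer}\n"
            f"Given the new context below, refine the existing answer, "
            f"or repeat it if the context isn't useful.\n"
            f"Context: {chunk}\nRefined answer:"
        )
    return answer
```

So the total corpus size isn't the constraint; each call only needs one chunk plus the running answer to fit in the context window.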
Ah interesting.. so say I provide N docs as custom context and then ask a question where maybe only 2 docs are relevant. Will the index refine the answer by making many LLM calls, basically iterating through ALL N docs, just so the model has a chance to see the 2 that are relevant? Then the answer would be appropriate, I suppose; it's just going to be a scaling issue if N gets big.. (see the sketch below for what I mean)
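What I'm hoping for is something like this, where only the top-k most similar chunks (ideally my 2 relevant docs) get sent to the LLM instead of all N. `similarity_top_k` is my guess at the relevant knob, so treat the parameter name as an assumption:

```python
# Hopefully only the k nearest chunks by embedding similarity are
# retrieved and passed to the LLM, not the whole corpus.
response = index.query(
    "Question that only touches 2 of the N docs",
    similarity_top_k=2,
)
```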
Going back to my other question from earlier re. the differences between LlamaIndex and the ChatGPT Retrieval Plugin, I guess the use case above could be one of them? Where LlamaIndex provides custom information as context and might be subject to limitations like the one discussed above, while the plugin sort of updates the model itself to reflect the new information, so the model kinda sees all the custom information "all at once".. (for lack of a better expression 🤦‍♂️)
Yea exactly! Tbh relying on an LLM to tell you anything without providing any context is pretty risky haha, who knows what it will say with nothing to reference
Haha yes! I'm quite wary of blindly feeding everything to this black-box LLM and just expecting it to spit out something reasonable. Well, that apparently comes from an old-school programmer who doesn't know much about the power of AI 🤦‍♂️