I want to build a chatbot that can reason over a large codebase

I want to build a chatbot that can reason over a fairly large codebase for a specific software system and answer support- or developer-related questions with the entire source code as context. With the current context-length limits, what is the best way to build this? Is it still using a vector database to pull relevant parts into the prompt? Or are there other, better options? I can use the GPT-4 Turbo model for this project.
4 comments
Yeah, if it's a large codebase, using retrieval makes sense. You could take a look at the CodeSplitter: https://docs.llamaindex.ai/en/stable/module_guides/loading/node_parsers/modules.html#codesplitter
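Roughly what that could look like; a minimal, untested sketch (imports follow the pre-0.10 LlamaIndex layout from those docs, the repo path and query are placeholders, and CodeSplitter needs the tree-sitter packages installed):

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.node_parser import CodeSplitter

# Load the repository's source files (path and extensions are illustrative).
documents = SimpleDirectoryReader(
    "./my_codebase", recursive=True, required_exts=[".py"]
).load_data()

# Split on syntactic boundaries (via tree-sitter) rather than raw characters,
# so chunks tend to line up with functions and classes.
splitter = CodeSplitter(
    language="python",
    chunk_lines=40,          # target lines per chunk
    chunk_lines_overlap=15,  # overlap between neighbouring chunks
    max_chars=1500,          # hard cap on chunk size
)
nodes = splitter.get_nodes_from_documents(documents)

# Embed the chunks, build a vector index, and query it with retrieval.
index = VectorStoreIndex(nodes)
query_engine = index.as_query_engine(similarity_top_k=10)
print(query_engine.query("How does the auth middleware validate tokens?"))
```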
Got everything to work nicely. Is there some way to create an embedding where the context can be 128k?
So I can create an index where the chunks are huge and then passed to gpt-4-turbo?
... or alternatively the chunks are small but the retrieval fetches up to, let's say, 64k into the context window
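The embedding model is what caps the chunk size (text-embedding-ada-002 tops out around 8k tokens, far below 128k), so the second option is the practical one: keep chunks small and raise the retrieval top-k until the packed prompt approaches your token budget. A hedged sketch, again with pre-0.10-style imports and illustrative numbers:

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import OpenAI

documents = SimpleDirectoryReader("./my_codebase", recursive=True).load_data()

# Small chunks at indexing time, GPT-4 Turbo at query time.
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-4-1106-preview", max_tokens=1024),
    chunk_size=512,    # tokens per chunk, well under the embedder's limit
    chunk_overlap=50,
)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Retrieve many small chunks so the packed prompt can reach tens of thousands
# of tokens: 128 chunks x ~512 tokens is roughly 64k of context.
query_engine = index.as_query_engine(
    similarity_top_k=128,
    response_mode="compact",  # pack as many retrieved chunks per LLM call as fit
)
print(query_engine.query("Why does the importer retry failed batches?"))
```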