So just did another test: created a new env and downloaded the latest version. Used test content and GPT Turbo, and the default question alone used 3,000+ tokens.
[Attachment: image.png]
hey @Meathead, you can try adjusting chunk_size_limit when building the index:
index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)
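For reference, a slightly fuller version of that; just a minimal sketch assuming llama-index 0.4.x and a hypothetical ./data folder of documents:
Python
# Minimal sketch, assuming llama-index 0.4.x; "data" is a hypothetical folder of source files
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

# Smaller chunks mean fewer prompt tokens per retrieved node at query time
index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)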
Still, that seems quite high, no? The response token size was 133.
[Attachment: image.png]
Plain Text
Package            Version
------------------ ----------
aiohttp            3.8.4     
aiosignal          1.3.1     
async-timeout      4.0.2
attrs              22.2.0
black              22.6.0
blobfile           2.0.1
certifi            2022.12.7
charset-normalizer 3.1.0
click              8.0.4
colorama           0.4.6
dataclasses-json   0.5.7
filelock           3.9.0
frozenlist         1.3.3
greenlet           2.0.2
idna               3.4
langchain          0.0.106
llama-index        0.4.24
lxml               4.9.2
marshmallow        3.19.0
marshmallow-enum   1.5.1
multidict          6.0.4
mypy-extensions    1.0.0
numpy              1.24.2
openai             0.27.1
packaging          23.0
pandas             1.5.3
pathspec           0.10.3
pip                23.0.1
platformdirs       2.5.2
pycryptodomex      3.17
pydantic           1.10.6
python-dateutil    2.8.2
pytz               2022.7.1
PyYAML             6.0
regex              2022.10.31
requests           2.28.2
setuptools         65.6.3
six                1.16.0
SQLAlchemy         1.4.46
tenacity           8.2.2
tiktoken           0.3.0
tomli              2.0.1
tqdm               4.65.0
typing_extensions  4.5.0
typing-inspect     0.8.0
urllib3            1.26.14
wheel              0.38.4
wincertstore       0.2
yarl               1.8.2
@Meathead 133 tokens in the response is actually not bad (OpenAI models like to be verbose)

In both your experiments, similarity_top_k was 3, which means 3 chunks + prompts are sent to the LLM. So 512*3, plus the LlamaIndex prompts.

You can also try response_mode="compact" in the query to use a bit fewer tokens
Oh ok, so I really should not be worried about a few k tokens
So the reason the LLM token count is higher than the response token count is because we are still relaying information from the data/JSON, correct?
Correct, since the top k is 3, 3 separate text chunks are sent to the LLM. In the default response mode, this means 3 LLM responses are generated.

The compact response mode will stuff as many nodes as possible into each LLM call -> https://gpt-index.readthedocs.io/en/latest/guides/usage_pattern.html#setting-response-mode
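Putting both knobs together, a query could look roughly like this (a sketch assuming llama-index 0.4.x; the question string is just a placeholder):
Python
# Sketch, assuming llama-index 0.4.x: 3 retrieved chunks of ~512 tokens each,
# packed into as few LLM calls as possible by the compact response mode
response = index.query(
    "What does the test content say?",  # placeholder question
    similarity_top_k=3,
    response_mode="compact",
)
print(response)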
So that is where keywords would come into play here as well? *scrolls down a bit* Ya Meathead!

I feel like I would want to separate my JSONs per library/topic, with examples per JSON. Less to call and more direct.

However, I feel like that is what keyword extraction is built for, instead of having to do that manually.
You can do both! As long as organizing your data is relatively easy for you
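The "separate index per topic" idea could look roughly like this; a rough sketch assuming llama-index 0.4.x, with hypothetical per-topic folders:
Python
# Rough sketch, assuming llama-index 0.4.x; topic names and folders are hypothetical
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

topic_indices = {}
for topic in ["billing", "auth"]:
    docs = SimpleDirectoryReader(f"data/{topic}").load_data()
    topic_indices[topic] = GPTSimpleVectorIndex(docs, chunk_size_limit=512)

# Route each question to the relevant index yourself (keyword matching, a UI toggle, etc.)
response = topic_indices["billing"].query("How do refunds work?")
print(response)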
@Meathead Are you sure your llm_predictor argument is being considered in index.query?
I did it like that before, but checking on the OpenAI platform, it said I was sending DaVinci requests.
Err, not exactly sure; are you questioning it because it looks like it's called one on top of another?
Yeah, I looked on OpenAI and it was Turbo and embeddings
I shat myself when I saw this..
[Attachment: image.png]
Thankfully Ada is cheap lol
Plain Text
    def query(
        self,
        query_str: Union[str, QueryBundle],
        mode: str = QueryMode.DEFAULT,
        query_transform: Optional[BaseQueryTransform] = None,
        use_async: bool = False,
        **query_kwargs: Any,
    ) -> Response:
        mode_enum = QueryMode(mode)
        if mode_enum == QueryMode.RECURSIVE:
            ...
        else:
            self._preprocess_query(mode_enum, query_kwargs)
            query_config = QueryConfig(
                index_struct_type=self._index_struct.get_type(),
                query_mode=mode_enum,
                query_kwargs=query_kwargs,
            )
            query_runner = QueryRunner(
                self._llm_predictor,
                self._prompt_helper,
                self._embed_model,
                self._docstore,
                self._index_registry,
                query_configs=[query_config],
                query_transform=query_transform,
                recursive=False,
                use_async=use_async,
            )

From my experience, I had to pass llm_predictor to the index constructor, since the code suggests it uses self._llm_predictor and not what you pass in through kwargs.
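Roughly what worked for me; a sketch assuming llama-index 0.4.x and a langchain version that exposes ChatOpenAI:
Python
# Sketch: pass llm_predictor at construction time so queries hit gpt-3.5-turbo
# instead of the default text-davinci-003 (assumes langchain's ChatOpenAI is available)
from langchain.chat_models import ChatOpenAI
from llama_index import GPTSimpleVectorIndex, LLMPredictor, SimpleDirectoryReader

llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0))

documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)

response = index.query("What does the test content say?")  # placeholder question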
I guess I'll try passing to index.query again and hope it works this time lol
Yeah, I'm just looking at examples, trying and tweaking, nothing fancy atm. I like to ask a lot of questions because I know it will help others. 😄

I am quite new to Python. Old PHP/JS/HTML/CSS programmer here.
Yea that should help @QuietRocket 🤔
Yeah I need to look at more examples and notebooks... My usual approach is to look at function signatures and figure out what to pass, but since there's configuration being passed around as large keyword argument objects it's pretty tough to follow the execution flow. Examples it is!
I do have to admit ChatGPT has helped me a few times trying to understand others' code/software. But no idea if it is "really correct" haha. I only just found out about indexing a few days ago and was like "Waaa, I can make ChatGPT smarter and more focused!" lol
Yeah, indexing and using embeddings changes everything. And I really like the idea of Agents and Tools from Langchain. You can provide your AI with tools, like functions, to solve problems in multiple steps 🤯
A great example of this is kapa.ai.
"You can provide your AI with tools like functions to solve problems in multiple steps" Really!?
Yup. That's the whole deal of langchain. And GPT Index has good integration with it.
you can define a function in Python with a string argument and return type, then basically tell ChatGPT... hey! you can use this function, it's pretty useful for doing X. And ChatGPT (your agent) will use it and its return value to perform further steps.
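A toy version of that idea with langchain's agent + Tool interface (just a sketch; the word-counting function is made up purely for illustration):
Python
# Toy sketch of giving the model a tool via a langchain agent; count_words is hypothetical
from langchain.agents import Tool, initialize_agent
from langchain.llms import OpenAI

def count_words(text: str) -> str:
    """Return the number of words in the given text."""
    return str(len(text.split()))

tools = [
    Tool(
        name="WordCounter",
        func=count_words,
        description="Useful for counting how many words are in a piece of text.",
    ),
]

agent = initialize_agent(tools, OpenAI(temperature=0), agent="zero-shot-react-description", verbose=True)
agent.run("How many words are in the sentence 'GPT Index has good integration with langchain'?")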
🤯 🤯 🤯
You got examples or seen vids of this?
Jesus... I'm just licking the peanut butter on the edge of the lid here. haha
No more junior programmers in a few years. 😛
They will be the new QA / tester. lol
Speaking of langchain, I really wish GPT Index had TypeScript support. Is it anywhere on the roadmap?
That's a question for @jerryjliu0 (although I can't picture how that would look... maybe if your backend is written in TypeScript it makes sense)
Yeah that's what I mean. My main tech stack is TypeScript. With the current state of the project, I'll have to make a connector between a Python API with GPT Index and my actual services written in TypeScript.

If support comes one day, I'll be one happy contributor and early adopter!
@QuietRocket tbh not atm, since we see LlamaIndex as more of a backend tool. That said, you're not the first person to ask for this, so maybe we'll figure something out 🙂
Got it! I'll keep my fingers crossed 🤞 😉