So just did another test: created a new env and downloaded the latest version. Used test content and GPT Turbo, and the default question alone used 3,000+ tokens.
[Attachment: image.png]
hey @Meathead, you can try adjusting chunk_size_limit when building the index:
index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)
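For reference, a slightly fuller version of that; just a minimal sketch assuming llama-index 0.4.x and a hypothetical ./data folder of documents:
Python
# Minimal sketch, assuming llama-index 0.4.x; "data" is a hypothetical folder of source files
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

# Smaller chunks mean fewer prompt tokens per retrieved node at query time
index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)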
Still, that seems quite high, no? The response token size was 133.
[Attachment: image.png]
Plain Text
Package            Version
------------------ ----------
aiohttp            3.8.4     
aiosignal          1.3.1     
async-timeout      4.0.2
attrs              22.2.0
black              22.6.0
blobfile           2.0.1
certifi            2022.12.7
charset-normalizer 3.1.0
click              8.0.4
colorama           0.4.6
dataclasses-json   0.5.7
filelock           3.9.0
frozenlist         1.3.3
greenlet           2.0.2
idna               3.4
langchain          0.0.106
llama-index        0.4.24
lxml               4.9.2
marshmallow        3.19.0
marshmallow-enum   1.5.1
multidict          6.0.4
mypy-extensions    1.0.0
numpy              1.24.2
openai             0.27.1
packaging          23.0
pandas             1.5.3
pathspec           0.10.3
pip                23.0.1
platformdirs       2.5.2
pycryptodomex      3.17
pydantic           1.10.6
python-dateutil    2.8.2
pytz               2022.7.1
PyYAML             6.0
regex              2022.10.31
requests           2.28.2
setuptools         65.6.3
six                1.16.0
SQLAlchemy         1.4.46
tenacity           8.2.2
tiktoken           0.3.0
tomli              2.0.1
tqdm               4.65.0
typing_extensions  4.5.0
typing-inspect     0.8.0
urllib3            1.26.14
wheel              0.38.4
wincertstore       0.2
yarl               1.8.2
@Meathead 133 tokens in the response is actually not bad (OpenAI models like to be verbose)

In both your experiments, similarity_top_k was 3, which means 3 chunks + prompts are sent to the LLM. So 512*3, plus the LlamaIndex prompts.

You can also try response_mode="compact" in the query to use a bit fewer tokens
Oh ok, so I really should not be worried about a few k tokens
So the reason the LLM token count is higher than the response token count is because we are still relaying information from the data/JSON, correct?
Correct, since the top k is 3, 3 separate text chunks are sent to the LLM. In the default response mode, this means 3 LLM responses are generated.

The compact response mode will stuff as many nodes as possible into each LLM call -> https://gpt-index.readthedocs.io/en/latest/guides/usage_pattern.html#setting-response-mode
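Putting both knobs together, a query could look roughly like this (a sketch assuming llama-index 0.4.x; the question string is just a placeholder):
Python
# Sketch, assuming llama-index 0.4.x: 3 retrieved chunks of ~512 tokens each,
# packed into as few LLM calls as possible by the compact response mode
response = index.query(
    "What does the test content say?",  # placeholder question
    similarity_top_k=3,
    response_mode="compact",
)
print(response)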
So that is where keywords would come into play here as well? *scrolls down a bit* Ya Meathead!

I feel like I would want to separate my JSONs per library/topic, with examples per JSON. Less to call and more direct.

However, I feel like that is what keyword extraction is built for, instead of having to do that manually.
You can do both! As long as organizing your data is relatively easy for you
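The "separate index per topic" idea could look roughly like this; a rough sketch assuming llama-index 0.4.x, with hypothetical per-topic folders:
Python
# Rough sketch, assuming llama-index 0.4.x; topic names and folders are hypothetical
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

topic_indices = {}
for topic in ["billing", "auth"]:
    docs = SimpleDirectoryReader(f"data/{topic}").load_data()
    topic_indices[topic] = GPTSimpleVectorIndex(docs, chunk_size_limit=512)

# Route each question to the relevant index yourself (keyword matching, a UI toggle, etc.)
response = topic_indices["billing"].query("How do refunds work?")
print(response)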
@Meathead Are you sure your llm_predictor argument is being considered in index.query?
I did it like that before, but checking on the OpenAI platform, it said I was sending DaVinci requests.
Err, not exactly sure; are you questioning it because it looks like it's called one on top of another?
Yeah, I looked on OpenAI and it was Turbo and embeddings
I shat myself when I saw this..
[Attachment: image.png]
Thankfully Ada is cheap lol
Plain Text
    def query(
        self,
        query_str: Union[str, QueryBundle],
        mode: str = QueryMode.DEFAULT,
        query_transform: Optional[BaseQueryTransform] = None,
        use_async: bool = False,
        **query_kwargs: Any,
    ) -> Response:
        mode_enum = QueryMode(mode)
        if mode_enum == QueryMode.RECURSIVE:
            ...
        else:
            self._preprocess_query(mode_enum, query_kwargs)
            query_config = QueryConfig(
                index_struct_type=self._index_struct.get_type(),
                query_mode=mode_enum,
                query_kwargs=query_kwargs,
            )
            query_runner = QueryRunner(
                self._llm_predictor,
                self._prompt_helper,
                self._embed_model,
                self._docstore,
                self._index_registry,
                query_configs=[query_config],
                query_transform=query_transform,
                recursive=False,
                use_async=use_async,
            )

From my experience, I had to pass llm_predictor to the index constructor, since the code suggests it uses self._llm_predictor and not what you pass in through kwargs.
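Roughly what worked for me; a sketch assuming llama-index 0.4.x and a langchain version that exposes ChatOpenAI:
Python
# Sketch: pass llm_predictor at construction time so queries hit gpt-3.5-turbo
# instead of the default text-davinci-003 (assumes langchain's ChatOpenAI is available)
from langchain.chat_models import ChatOpenAI
from llama_index import GPTSimpleVectorIndex, LLMPredictor, SimpleDirectoryReader

llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0))

documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)

response = index.query("What does the test content say?")  # placeholder question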
I guess I'll try passing to index.query again and hope it works this time lol
Yeah, I'm just looking at examples, trying and tweaking, nothing fancy atm. I like to ask a lot of questions because I know it will help others. 😄

I am quite new to Python. Old PHP/JS/HTML/CSS programmer here.
Yea that should help @QuietRocket 🤔
Yeah I need to look at more examples and notebooks... My usual approach is to look at function signatures and figure out what to pass, but since there's configuration being passed around as large keyword argument objects it's pretty tough to follow the execution flow. Examples it is!
I do have to admit ChatGPT has helped me a few times trying to understand others' code/software. But no idea if it is "really correct" haha. I only just found out about indexing a few days ago and was like "Waaa, I can make ChatGPT smarter and more focused!" lol
Yeah, indexing and using embeddings changes everything. And I really like the idea of Agents and Tools from Langchain. You can provide your AI with tools, like functions, to solve problems in multiple steps 🤯
A great example of this is kapa.ai.
"You can provide your AI with tools like functions to solve problems in multiple steps" Really!?
Yup. That's the whole deal of langchain. And GPT Index has good integration with it.
you can define a function in Python with a string argument and return type, then basically tell ChatGPT... hey! you can use this function, it's pretty useful for doing X. And ChatGPT (your agent) will use it and its return value to perform further steps.
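A toy version of that idea with langchain's agent + Tool interface (just a sketch; the word-counting function is made up purely for illustration):
Python
# Toy sketch of giving the model a tool via a langchain agent; count_words is hypothetical
from langchain.agents import Tool, initialize_agent
from langchain.llms import OpenAI

def count_words(text: str) -> str:
    """Return the number of words in the given text."""
    return str(len(text.split()))

tools = [
    Tool(
        name="WordCounter",
        func=count_words,
        description="Useful for counting how many words are in a piece of text.",
    ),
]

agent = initialize_agent(tools, OpenAI(temperature=0), agent="zero-shot-react-description", verbose=True)
agent.run("How many words are in the sentence 'GPT Index has good integration with langchain'?")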
🤯 🤯 🤯
You got examples or seen vids of this?
Jesus... I'm just licking the peanut butter on the edge of the lid here. haha
No more junior programmers in a few years. 😛
They will be the new QA / tester. lol
Speaking of langchain, I really wish GPT Index had TypeScript support. Is it anywhere on the roadmap?
That's a question for @jerryjliu0 (although I can't picture how that would look... maybe if your backend is written in TypeScript it makes sense)
Yeah that's what I mean. My main tech stack is TypeScript. With the current state of the project, I'll have to make a connector between a Python API with GPT Index and my actual services written in TypeScript.

If support comes one day, I'll be one happy contributor and early adopter!
@QuietRocket tbh not atm, since we see LlamaIndex as more of a backend tool. That said, you're not the first person to ask for this, so maybe we'll figure something out 🙂
Got it! I'll keep my fingers crossed 🤞 😉