
Bing Chat

πŸ™‹πŸ»β€β™‚οΈ
Hello, I have a small question about how "Bing Chat" and "Bard" work behind the scenes.

When we ask something in their chatbox, they typically search the internet and, I think, fetch the webpages. Now, there may be multiple pages, say 3, that contain the information I am looking for.

Do they summarize all of that information? Because then there would be a context-length issue...

πŸ€”
Will anyone please enlighten me on how they work? Do they combine all the information and generate the response (with citations)?
2 comments
No one knows for sure how they work. They are probably doing something similar to the refine process in LlamaIndex, or maybe some kind of summarization process.
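
If it is a refine-style process, it might look roughly like the sketch below. This is only a minimal illustration, assuming a hypothetical `call_llm` helper, not any real Bing/Bard or LlamaIndex API:

```python
# Minimal sketch of a "refine"-style synthesis over several fetched pages,
# similar in spirit to LlamaIndex's refine response mode. Not the real
# Bing Chat / Bard pipeline; `call_llm` is a hypothetical stand-in.

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call -- replace with your provider's API."""
    raise NotImplementedError

def refine_answer(question: str, pages: list[str]) -> str:
    # Start with an answer drawn from the first page only.
    answer = call_llm(
        f"Answer the question using only this context.\n"
        f"Context:\n{pages[0]}\n\nQuestion: {question}"
    )
    # Refine the existing answer one page at a time, so no single prompt
    # has to hold every page -- this sidesteps the context-length limit.
    for page in pages[1:]:
        answer = call_llm(
            f"The original question is: {question}\n"
            f"The existing answer is: {answer}\n"
            f"Refine the existing answer (only if needed) using the new context below.\n"
            f"Context:\n{page}"
        )
    return answer
```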

For the citations, I have a feeling they just present the data with a source and do some prompt engineering to get the LLM to include the citations in-text.
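
That guess could look something like this: label each scraped page as a numbered source and ask the model to cite the numbers in-text. Again, purely a hypothetical sketch with the same made-up `call_llm` stand-in:

```python
# Rough sketch of citations via prompt engineering: number each source and
# instruct the model to reference those numbers after each claim.

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call -- replace with your provider's API."""
    raise NotImplementedError

def answer_with_citations(question: str, pages: dict[str, str]) -> str:
    # `pages` maps URL -> extracted page text.
    numbered = [
        f"[{i}] ({url})\n{text}"
        for i, (url, text) in enumerate(pages.items(), start=1)
    ]
    prompt = (
        "Answer the question using the numbered sources below. "
        "After every claim, cite the supporting source like [1] or [2].\n\n"
        + "\n\n".join(numbered)
        + f"\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```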
Right, right... I was really curious about this, because many, I mean many, user queries require them to search, scrape, and summarize content to give an appropriate response. But this is a costly process: in my trials, each page has around 1600+ tokens, and often they have to scrape more than one page to get a relevant result.

This becomes a costly process! And think of it: OpenAI Davinci costs around $0.12 per 1K tokens! 🤯
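
Just to put rough numbers on that, using the figures above (~1600 tokens per scraped page and $0.12 per 1K tokens):

```python
# Back-of-the-envelope cost check using the numbers mentioned above.
TOKENS_PER_PAGE = 1600   # approximate tokens per scraped page
PRICE_PER_1K = 0.12      # the Davinci price quoted above, per 1K tokens

for n_pages in (1, 3, 5):
    tokens = n_pages * TOKENS_PER_PAGE
    print(f"{n_pages} pages -> {tokens} tokens -> ~${tokens / 1000 * PRICE_PER_1K:.2f} per query")
# e.g. 3 pages -> 4800 tokens -> ~$0.58 per query (input alone, before output tokens)
```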