Find answers from the community

Updated 10 months ago

I saw that Llama markdown and JSON

At a glance

The community members are discussing the use of large language models (LLMs) in parsing and processing data, specifically in the context of a Llama markdown and JSON parser. The main points are:

- The community member who posted the original question is unsure about the usefulness of using an LLM to parse data and then store it in a vector store.

- Other community members suggest that the LLM can be used to write a summary of extracted tables, and that it can help with retrieval of the data.

- There is a discussion about how to view the metadata extracted from the transformations when the nodes are retrieved, and whether this is the optimal way to do it.

- The community members also discuss the use of a retriever to retrieve the nodes without the chat, and the potential issues with a "fake OpenAI" model that seems too good to be true.

- There is no explicitly marked answer, but the community members provide suggestions and examples on how to work with LLMs and retrievers in this context.

I saw that the Llama markdown and JSON parser utilizes an LLM. Is there a point in using an LLM to parse, then using it in a vector store later down the line? Not sure what the LLM does lol, and it's kind of rough with other LLMs... sigh, OpenAI moat
16 comments
The LLM is just there to write a summary of extracted tables
helps with retrieval
oh gotcha, so if there are no tables, not much use?
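For context on why that table summary helps: raw markdown tables share few words with natural-language queries, so indexing the LLM-written summary (and returning the raw table it points to) retrieves better. A stdlib-only sketch of the idea, not the llama_index internals — the summary string here is pretend LLM output:

```python
# Sketch: index an LLM-written summary of a table, return the raw table.
raw_table = "| year | revenue |\n| 2022 | 10M |\n| 2023 | 14M |"
summary = "Table of company revenue by year, 2022 and 2023."  # pretend LLM output

# Index maps the searchable summary text -> the original content to return.
index = {summary: raw_table}

def retrieve(query):
    """Toy keyword-overlap retrieval; a real store would use embeddings."""
    q = set(query.lower().split())
    best = max(index, key=lambda text: len(q & set(text.lower().split())))
    return index[best] if q & set(best.lower().split()) else None

retrieve("revenue by year")  # matches the summary, returns the raw table
```

The query "revenue by year" shares almost nothing with the pipe-delimited table itself, but overlaps the summary, which is the whole point of having the LLM write one.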

Would you know how to view only the metadata extractions from the transformations when the nodes are retrieved?

Not sure if this is the optimal way to do it:

Plain Text
import chainlit as cl
from llama_index.core.schema import MetadataMode

@cl.on_message  # Chainlit message handler
async def main(message: cl.Message):
    chat_engine = cl.user_session.get("chat_engine")
    response = await cl.make_async(chat_engine.chat)(message.content)

    elements = []  # Initialize an empty list to collect elements
    label_list = []

    for count, sr in enumerate(response.source_nodes, start=1):
        metadata = sr.node.get_content(metadata_mode=MetadataMode.LLM)
        content = sr.node.text  # Adjust based on actual method to get content

        # Create a cl.Text element for each source node and add it to the elements list
        element = cl.Text(
            name="S" + str(count),
            content=f"Content: {content},\nMetadata: {metadata}",
            display="side",  # Adjust display as needed
        )
        elements.append(element)  # Add the created element to the list
        label_list.append("S" + str(count))

    # Attach the elements so they are actually sent with the answer
    response_message = cl.Message(content=response.response, elements=elements)
    await response_message.send()
That looks right to me tbh. If you only want the nodes and not the chat, you could also use a retriever
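On the metadata question above: `get_content(metadata_mode=...)` controls which metadata keys are rendered alongside the node text for a given consumer. A stdlib-only sketch of that idea with a toy stand-in class — not llama_index's actual `Node` implementation:

```python
from enum import Enum

class MetadataMode(Enum):
    ALL = "all"
    LLM = "llm"
    EMBED = "embed"
    NONE = "none"

class ToyNode:
    """Toy node: some metadata keys can be hidden per consumer."""
    def __init__(self, text, metadata, excluded_llm_keys=(), excluded_embed_keys=()):
        self.text = text
        self.metadata = metadata
        self.excluded_llm_keys = set(excluded_llm_keys)
        self.excluded_embed_keys = set(excluded_embed_keys)

    def get_content(self, metadata_mode=MetadataMode.NONE):
        if metadata_mode is MetadataMode.NONE:
            return self.text  # bare text, no metadata
        excluded = {
            MetadataMode.ALL: set(),
            MetadataMode.LLM: self.excluded_llm_keys,
            MetadataMode.EMBED: self.excluded_embed_keys,
        }[metadata_mode]
        meta = "\n".join(f"{k}: {v}" for k, v in self.metadata.items()
                         if k not in excluded)
        return f"{meta}\n\n{self.text}"

node = ToyNode("Some chunk.",
               {"file_name": "report.md", "summary": "Q2 numbers"},
               excluded_llm_keys={"file_name"})
node.get_content(MetadataMode.LLM)  # shows summary, hides file_name
```

So the snippet above is showing the LLM-visible metadata rendered together with the text; for the raw metadata dict alone, `sr.node.metadata` is the place to look.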
I don't know if the fake-OpenAI-compatible endpoint is working. I'm seeing no calls made on the HF backend. Also it's way too good to be a 7B haha. Pretty sure it's on HF's end
Are you passing in the LLM to your chat engine?
I have to do that too?
Settings.llm doesn't do that?
I knew it was too good to be true
Settings.llm should do that tbh, but sure :p
The things I do for a chat bot
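A minimal sketch of the global default being discussed, assuming the `llama_index.core` Settings pattern (the model name is just a placeholder):

```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI  # or any OpenAI-compatible wrapper

# Global default LLM; placeholder model name
Settings.llm = OpenAI(model="gpt-3.5-turbo")

# Engines built afterwards (e.g. index.as_chat_engine()) pick this up,
# unless you override per-engine: index.as_chat_engine(llm=some_other_llm)
```

If `Settings.llm` is set before the chat engine is built, passing the LLM again should be redundant; if no backend calls are showing up, the engine may have been constructed before the default was assigned.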
could you show an example... I haven't touched retrievers lol, been so focused on indexes and engines
Plain Text
retriever = index.as_retriever(similarity_top_k=2)

nodes = retriever.retrieve("query")
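What `similarity_top_k=2` is doing, sketched stdlib-only with keyword overlap standing in for embedding similarity (the real `retrieve` returns scored node objects backed by the index, not plain strings):

```python
from heapq import nlargest

# Toy corpus standing in for indexed nodes
nodes = [
    "Q2 revenue grew 40% year over year.",
    "The office moved to a new building.",
    "Revenue by segment is broken out in table 3.",
]

def retrieve(query, k=2):
    """Score every node against the query, keep the k best non-zero matches."""
    q = set(query.lower().split())
    scored = [(len(q & set(n.lower().split())), n) for n in nodes]
    return [n for score, n in nlargest(k, scored) if score > 0]

top = retrieve("revenue table", k=2)
```

Each result from the real retriever also carries a score and the node's metadata, so it's a direct way to inspect what the chat engine would have been fed, without invoking the LLM at all.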