nbulkz

when you use Huggingface embeddings and

when you use Huggingface embeddings and download them, how does that work with attention mechanisms? Is there any over-the-wire transaction when I vectorize a document if I use hf?

13 comments

nnbulkz

my prompt template just ends in `return

my prompt template just ends in return json. I use the {context_str} and {query_str} correctly I think, they are being included in the prompt output that goes over the wire.

15 comments

nnbulkz

Conda

conflict between llama-index-vector-stores-chroma 0.1.15 depends on onnxruntime<2.0.0 and >=1.17.0

21 comments

nnbulkz

mrm8488/longformer-base-4096-finetuned-s...

The end goal is to use this model to do extraction using llama index for the convenience ... https://huggingface.co/mrm8488/longformer-base-4096-finetuned-squadv2

4 comments

nnbulkz

I'm just trying to pass text directly to

I'm just trying to pass text directly to a Longformer model using SummaryIndex I think.

5 comments

nnbulkz

Starter pack

does requirements.txt need to be updated to 0.6.X?

7 comments

nnbulkz

oh ok so it is unrelated to my data

oh ok so it is unrelated to my data?

3 comments

nnbulkz

Llm differences

Will I see a major difference using text-davinci-003 vs gpt-3.5-turbo?

1 comment

nnbulkz

Costs

any strategies people here would recommend using to mitigate this worry?

18 comments

nnbulkz

what if the model in question, say a

what if the model in question, say a small custom model, gets hung up trying to return a specific format implied by the Pydantic model?

4 comments

nnbulkz

some finetuned LLMs will not always

some finetuned LLMs will not always return a string. The one I am using returns

Plain Text

{'score': 0.85932', 'answer':'the answer'}

,

This is actually what I would prefer returned, since I am using the accumulate method over a SummaryIndex, so I can compare confidences and select the best answer. Is there a way to do this? Right now, when my CustomLLM's _call method returns anything other than a string, I get an error in langchain_core/language_models/llms.py

4 comments

nnbulkz

gettin this error with 6 10 0

gettin this error with 6.10.0

raise TypeError("Instance and class checks can only be used with"
2023-05-24 17:20:38 TypeError: Instance and class checks can only be used with @runtime_checkable protocols

6 comments

nnbulkz

Loading model

oh ok! how do I do that with the model that llama index downloads ( I can set a cache dir, etc.)

22 comments

nnbulkz

Yeah I see those but where do I plug

Yeah I see those, but where do I plug them in like I would … summary_template=SUMMARY_PROMPT

11 comments

nnbulkz

Post files

is there an easy way, and hopefully also a safe way, to read a document in through an API file POST and ask it a cached question?

3 comments

nnbulkz

Query engine

also, trying to follow this tutorial https://github.com/jerryjliu/llama_index/blob/main/examples/paul_graham_essay/TestEssay.ipynb

I get: GPTTreeIndex object has no attribute as_query_engine

17 comments

nnbulkz

Json prompting

how can I use {"foo":{}, "bar":{}} in a prompt?

8 comments

nnbulkz

Use case

My use case is I want to go through each document node and extract information that might be relevant to an arbitrary question/prompt. Then put anything relevant into a json object and return that, and keep concatenating to that

6 comments

nnbulkz

what is the link to the FAQ again Logan

what is the link to the FAQ again @Logan M ?

1 comment

nnbulkz

Llm pipeline

Plain Text

class CustomLLM(LLM):
    model_name = "facebook/opt-iml-max-30b"
    pipeline = pipeline("text-generation", model=model_name, device="cuda:0", model_kwargs={"torch_dtype":torch.bfloat16})

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        prompt_length = len(prompt)
        response = self.pipeline(prompt, max_new_tokens=num_output)[0]["generated_text"]

        # only return newly generated tokens
        return response[prompt_length:]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"

Find answers from the community

when you use Huggingface embeddings and

my prompt template just ends in `return

Conda

mrm8488/longformer-base-4096-finetuned-s...

I'm just trying to pass text directly to

Starter pack

oh ok so it is unrelated to my data

Llm differences

Costs

what if the model in question, say a

some finetuned LLMs will not always

gettin this error with 6 10 0

Loading model

Yeah I see those but where do I plug

Post files

Query engine

Json prompting

Use case

what is the link to the FAQ again Logan

Llm pipeline

Custom llm