Find answers from the community

davidp
Offline, last seen 3 months ago
Joined September 25, 2024
davidp

Delete

Hi, is there any way to delete documents from an index, but not by specifying the documentId? What I'd need instead is to delete all the documents that came from a given file, for instance to say: delete all documents with file_name = "Top Screwups...."
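
Something like this is what I have in mind, assuming the file name ends up in each document's metadata (I'm not sure ref_doc_info is the right place to look):

Plain Text
# 'index' is the loaded index; assumes every document was ingested
# with a "file_name" entry in its metadata
target_file = "Top Screwups...."  # the (truncated) name from my example

for ref_doc_id, info in index.ref_doc_info.items():
    if info.metadata.get("file_name") == target_file:
        index.delete_ref_doc(ref_doc_id, delete_from_docstore=True)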
1 comment
davidp

Bedrock

Hi, has anybody tried AWS Bedrock with LlamaIndex? I have tried it and it doesn't give any error, but it doesn't pick up the prompt template, nor does it interact with the results:

This is a piece of code showing how I set it up:

Plain Text
from llama_index import ServiceContext
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.llms.bedrock import Bedrock

# Bedrock LLM, using the AWS profile that has access to the model
llm = Bedrock(model="meta.llama2-13b-chat-v1", profile_name="machineuser1")

# local embedding model, so only generation goes through Bedrock
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    chunk_size=256,
)
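
For reference, this is roughly how I'm trying to pass the prompt template in; the prompt text and the index built here are simplified placeholders, not my real ones:

Plain Text
from llama_index import VectorStoreIndex
from llama_index.prompts import PromptTemplate

# placeholder prompt; my real template is longer
qa_prompt = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the query using only the context above.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# 'documents' is whatever I've loaded; service_context carries the Bedrock llm
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(text_qa_template=qa_prompt)
print(query_engine.query("What is this collection about?"))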
15 comments
Hi, does somebody know if I can use LlamaIndex with Ollama together with the TruLens recorder? I'm trying to evaluate the RAG pipeline but I get an error:

Plain Text
import numpy as np
import litellm
from litellm import completion

from llama_index.llms import Ollama

from trulens_eval import Tru, Feedback, Select, TruLlama, FeedbackMode
from trulens_eval.feedback import Groundedness
from trulens_eval.feedback.provider.litellm import LiteLLM

tru = Tru()
tru.reset_database()

# LlamaIndex LLM pointing at the local Ollama server
llm = Ollama(model="wizard-vicuna-uncensored", base_url="http://192.168.1.232:11435")

# quick sanity check that LiteLLM can reach Ollama
response = completion(
    model="ollama/wizard-vicuna-uncensored",
    messages=[{"content": "respond in 20 words. who are you?", "role": "user"}],
    api_base="http://192.168.1.232:11435",
)

litellm.set_verbose = True  # enable LiteLLM debug logging

# TruLens feedback provider backed by the same Ollama model
litellm_provider = LiteLLM(model_engine="ollama/wizard-vicuna-uncensored", endpoint="http://192.168.1.232:11435")

grounded = Groundedness(groundedness_provider=litellm_provider)

f_groundedness = (
    Feedback(grounded.groundedness_measure_with_cot_reasons, name="Groundedness")
    .on(Select.RecordCalls.retrieve.rets.collect())
    .on_output()
    .aggregate(grounded.grounded_statements_aggregator)
)

f_qa_relevance = (
    Feedback(litellm_provider.relevance_with_cot_reasons, name="Answer Relevance")
    .on(Select.RecordCalls.retrieve.args.query)
    .on_output()
)

f_context_relevance = (
    Feedback(litellm_provider.qs_relevance_with_cot_reasons, name="Context Relevance")
    .on(Select.RecordCalls.retrieve.args.query)
    .on(Select.RecordCalls.retrieve.rets.collect())
    .aggregate(np.mean)
)

# query_engine3 is my LlamaIndex query engine, built elsewhere with the Ollama llm
tru_recorder = TruLlama(
    query_engine3,
    app_id="App_1",
    feedbacks=[
        f_qa_relevance,
        f_context_relevance,
        f_groundedness,
    ],
)

# eval_questions is my list of evaluation questions
for question in eval_questions:
    with tru_recorder as recording:
        print(question)
        query_engine3.query(question)
I'm adapting this from the example in the course:
https://learn.deeplearning.ai/building-evaluating-advanced-rag/lesson/3/rag-triad-of-metrics
where I want to use Ollama instead of the OpenAI API.
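
For context, query_engine3 is built roughly like this (the directory path and top_k are placeholders for my real setup):

Plain Text
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings import HuggingFaceEmbedding

# reuse the Ollama llm defined above
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

documents = SimpleDirectoryReader("./eval_docs").load_data()  # placeholder directory
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine3 = index.as_query_engine(similarity_top_k=3)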
1 comment
Hi, is there any way to make loading the index into RAM faster? I'm using this call:

index_finance = load_index_from_storage(storage_context)

but for a file of around 5 GB it takes too long. I think it only uses one core, and it changes core for each node loaded.
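
What I'm considering as an alternative is keeping the vectors in Chroma, so the embeddings stay in the vector store and I don't have to pull everything through load_index_from_storage (the path and collection name are placeholders):

Plain Text
import chromadb
from llama_index import VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

# embeddings were persisted into this collection at build time
chroma_client = chromadb.PersistentClient(path="./chroma_finance")
collection = chroma_client.get_or_create_collection("finance")

vector_store = ChromaVectorStore(chroma_collection=collection)
# service_context is the same one used to build the index
index_finance = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)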
17 comments
davidp

Top_k

Hi, I'd like to know if it's possible to set a custom index for a chat_engine. I'd need to retrieve more than the default 2 documents for each interaction, but it seems it can't be done...
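
What I've been trying is something along these lines (the chat mode and top_k value are just what I'm experimenting with):

Plain Text
# 'index' is my existing VectorStoreIndex
chat_engine = index.as_chat_engine(
    chat_mode="context",   # retrieval-backed chat
    similarity_top_k=5,    # retrieve 5 nodes instead of the default 2
)
response = chat_engine.chat("What does the author say about risk?")
print(response)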
48 comments
Hi, I'm trying to use Chroma with llama-index. I'm loading some JSON documents into a documents object. The issue comes when I call:
index_finance = VectorStoreIndex.from_documents( documents, storage_context=storage_context, service_context=service_context )
Any idea what I'm missing for the JSONs? If I do the same with Chroma but with a SimpleDirectoryReader, it works:
documentsNassim = SimpleDirectoryReader("/mnt/nasmixprojects/books/nassimTalebDemo").load_data()
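
For reference, the Chroma side is set up roughly like this (the path and collection name are placeholders):

Plain Text
import chromadb
from llama_index import StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection("finance")

vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# 'documents' and 'service_context' are the same ones from above
index_finance = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
)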
23 comments
Hi, I'd like to expose the chat that I use with the REPL (chat_engine.chat_repl()) to the Internet. The idea is to have a React app for the frontend and a Node.js app in the middle that would make the requests to the LlamaIndex Python code. What's the best way to do it? I'd say there has to be some socket to support the interaction with the chatbot, but I'm lost on how to serve the Python code to the Node.js app, or directly over the Internet.
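
What I'm picturing on the Python side is a small HTTP wrapper that the Node.js app could call; a minimal sketch with FastAPI (which is just my assumption for the framework) would be:

Plain Text
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

# 'chat_engine' is the same engine I use with chat_repl()
@app.post("/chat")
def chat(req: ChatRequest):
    response = chat_engine.chat(req.message)
    return {"answer": str(response)}

# run with: uvicorn server:app --host 0.0.0.0 --port 8000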
6 comments
Any idea how Bing makes its autogenerated next-question proposals? Is there any way to do it with LlamaIndex?
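
The only approach I can think of is to ask the LLM itself for follow-ups after each answer, something like this rough sketch (the prompt wording is just a guess):

Plain Text
# 'llm' is the LLM already configured for the index
def suggest_next_questions(question: str, answer: str, n: int = 3) -> list[str]:
    prompt = (
        f"A user asked: {question}\n"
        f"The assistant answered: {answer}\n"
        f"Suggest {n} short follow-up questions the user might ask next, one per line."
    )
    completion = llm.complete(prompt)
    return [line.strip("- ").strip() for line in str(completion).splitlines() if line.strip()]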
1 comment
Hi, is there any way to load JSON documents from a directory, for example with:
documents = SimpleDirectoryReader("./transcriptions_test_json/").load_data()
and then say that I only want to vectorize/create the index from just one of the fields of each JSON?

For example, a JSON entry is:

"c49c7a9b-6a12-5f1f-ba76-b81d986e5bc7": {
    "video_name": "videoplayback2.mp4",
    "video_path": "/mnt/nas/videos/0-ops/videoplayback2.mp4",
    "original_text": " Good evening and welcome to T...",
    "length_characters": 7585,
    "original_lang": "en",
    "video_section": "0-ops"
}

and I'd only like to vectorize the original_text field, but when I retrieve with the query, before generating the final answer, I'd like to use the rest of the fields, potentially for statistics.

The SimpleDirectoryReader can ingest the JSON, and I can access each of the ingested JSONs inside the documents that were read, but it's getting the whole JSON as a string...

print(documents)
print("\n")
for doc in documents:
    print(doc.text)
    print("\n")
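
What I'm leaning towards is skipping SimpleDirectoryReader for these files and building the Document objects myself, something like this sketch (the directory path is the one from my example):

Plain Text
import json
from pathlib import Path

from llama_index import Document

documents = []
for path in Path("./transcriptions_test_json/").glob("*.json"):
    data = json.loads(path.read_text())
    for doc_id, entry in data.items():
        metadata = {k: v for k, v in entry.items() if k != "original_text"}
        doc = Document(
            text=entry["original_text"],  # only this field gets embedded
            metadata=metadata,            # kept around for filtering/statistics at query time
            doc_id=doc_id,
        )
        # keep the metadata out of the text that is embedded
        doc.excluded_embed_metadata_keys = list(metadata.keys())
        documents.append(doc)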
78 comments
Hi, I'm following the LlamaCPP example in the documentation, but I get an error when trying to use a Hugging Face model. I'm running on an Intel CPU.

https://gpt-index.readthedocs.io/en/v0.9.2/examples/llm/llama_2_llama_cpp.html

model_url = "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin"

llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU; I put this to 0 as I don't have a GPU
    model_kwargs={"n_gpu_layers": 0},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)

gguf_init_from_file: invalid magic characters tjgg.
error loading model: llama_model_loader: failed to load model from /tmp/llama_index/models/llama-2-13b-chat.ggmlv3.q4_0.bin
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |

Does anybody know if I should change the version of the model or of the llama-cpp-python package?
I've tried with this version, for instance, but it also didn't work:
!pip install llama-cpp-python==0.1.78
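
My guess from the "invalid magic characters" line is that my llama-cpp-python build expects GGUF rather than GGML, so what I'd try next is pointing model_url at a GGUF quantization instead (assuming I have the file name right):

Plain Text
# GGUF build of the same model; the exact file name is my assumption
model_url = "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_0.gguf"

llm = LlamaCPP(
    model_url=model_url,
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    model_kwargs={"n_gpu_layers": 0},  # CPU only
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)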
20 comments