Fran Piantoni
Offline, last seen 2 months ago
Joined September 25, 2024
Is there a way to accomplish what the new OpenAI Swarm multi-agent orchestration framework does inside of llama-index? The most similar thing I found was Workflows.
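For reference, a minimal sketch of a Swarm-style hand-off between two steps using llama-index Workflows (assuming a recent llama-index version that ships llama_index.core.workflow; the event and step names here are illustrative):

Python
from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)


class HandoffEvent(Event):
    """Carries the request from the triage step to a specialist step."""
    query: str


class SwarmLikeWorkflow(Workflow):
    @step
    async def triage(self, ev: StartEvent) -> HandoffEvent:
        # Decide which "agent" should handle the request, then hand off.
        return HandoffEvent(query=ev.query)

    @step
    async def specialist(self, ev: HandoffEvent) -> StopEvent:
        # A real implementation would call an LLM or agent here.
        return StopEvent(result=f"handled: {ev.query}")


# Usage: result = await SwarmLikeWorkflow(timeout=60).run(query="...")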
3 comments
Hi, I am having some problems trying to run my application. I am using Python 3.9, and for some time I have been unable to run the code because of this error:

Plain Text
File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/app/main.py", line 1, in <module>
    __import__('pysqlite3')
ModuleNotFoundError: No module named 'pysqlite3'


While trying to install the module, I get the following error:

Plain Text
ERROR: Could not find a version that satisfies the requirement pysqlite3-binary==0.5.2 (from versions: none)
ERROR: No matching distribution found for pysqlite3-binary==0.5.2


Do you recommend a version of Python where I can stop having this kind of problem?
Somehow, running on macOS always brings me these kinds of errors.
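For context, pysqlite3-binary only publishes Linux wheels, which is why the install fails on macOS with "from versions: none". A common workaround (assuming the pysqlite3 swap is only needed for libraries like Chroma that require a newer sqlite3) is to make the swap conditional:

Python
import sys

try:
    # On Linux, replace the stdlib sqlite3 with the newer pysqlite3 build.
    __import__("pysqlite3")
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")
except ImportError:
    # On macOS, pysqlite3-binary is unavailable; fall back to the stdlib
    # sqlite3, which is usually recent enough in Homebrew/python.org builds.
    pass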
5 comments
Hello, I am having the following error. I know I am not making that many requests per minute; it was working perfectly until about a day ago, and now it has started failing:

Plain Text
2024-05-08 13:15:21,369 - openai._base_client - INFO - Retrying request to /embeddings in 6.947019 seconds (_base_client.py:927)
2024-05-08 13:15:28,721 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:28,723 - openai._base_client - INFO - Retrying request to /embeddings in 7.414641 seconds (_base_client.py:927)
2024-05-08 13:15:36,494 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:36,495 - openai._base_client - INFO - Retrying request to /embeddings in 6.268730 seconds (_base_client.py:927)
^C^C
2024-05-08 13:15:43,168 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:43,169 - openai._base_client - INFO - Retrying request to /embeddings in 6.461716 seconds (_base_client.py:927)
2024-05-08 13:15:50,020 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:50,021 - llama_index.llms.openai_utils - WARNING - Retrying llama_index.embeddings.openai.get_embeddings in 0.20855130440353753 seconds as it raised RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}. (before_sleep.py:65)
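Note that the last log line reports insufficient_quota, which is a billing/quota limit rather than request-rate throttling, so retrying cannot succeed. To fail fast instead of looping on retries, the retry count can be capped; a sketch assuming a recent llama-index OpenAI embedding integration:

Python
from llama_index.embeddings.openai import OpenAIEmbedding

# Give up after one retry so quota errors surface immediately.
embed_model = OpenAIEmbedding(max_retries=1)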
10 comments
How can I limit the number of characters in the model's response?
I am using an OpenAI model.
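For reference, OpenAI models limit output in tokens rather than characters, so capping max_tokens is the usual lever (a sketch, assuming a recent llama-index OpenAI integration; roughly 4 English characters per token):

Python
from llama_index.llms.openai import OpenAI

# max_tokens caps the completion length; ~256 tokens is ~1000 characters.
llm = OpenAI(model="gpt-3.5-turbo", max_tokens=256)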
1 comment
Hello, I wanted to know why, in some cases, the chat does not respond with the information from the vector store, even when I know that the information I am looking for is in the vector store. Only in some cases is it like it doesn't find that information and doesn't pass it on to the chat to give me a correct answer.

Thanks in advance
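A common way to diagnose this is to raise similarity_top_k and print the retrieved chunks, to see whether the right passage is being fetched at all; a sketch assuming an existing VectorStoreIndex named index:

Python
# Retrieve more candidate chunks, then inspect what actually came back.
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("your question here")
for node in response.source_nodes:
    print(node.score, node.node.get_content()[:200])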
3 comments
How can I prevent llama_index from making up answers? I want to force it to only answer from the local LLM.
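One common approach is to tighten the QA prompt so the model declines when the retrieved context lacks the answer; prompt-based grounding is best-effort, not a hard guarantee. A sketch, assuming a recent llama-index and an existing index:

Python
from llama_index.core import PromptTemplate

qa_prompt = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the query using ONLY the context above. If the answer is not "
    "in the context, reply exactly: 'I don't know.'\n"
    "Query: {query_str}\n"
    "Answer: "
)
# Route every query through the stricter template.
query_engine = index.as_query_engine(text_qa_template=qa_prompt)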
7 comments
Can the llama_index chat_engine handle multiple requests at the same time?
I have found that if I make two requests at the exact same time, the responses get messy. It seems like it is returning pieces of text from the different requests.

If not, what would be the best possible solution to address this?

Thanks for your time.
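For what it's worth, a chat engine holds conversational state, so sharing one instance across concurrent requests interleaves histories. A sketch of one engine per session (the session dict and helper below are illustrative, not llama-index APIs; assumes an existing index):

Python
chat_engines = {}  # session_id -> dedicated chat engine

def get_chat_engine(session_id: str):
    # Lazily create one engine per session so histories never mix.
    if session_id not in chat_engines:
        chat_engines[session_id] = index.as_chat_engine()
    return chat_engines[session_id]

# Each request then uses its own engine:
# response = get_chat_engine(session_id).chat(user_message)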
6 comments
I had response streaming working perfectly before I updated my code to stop using SimpleVectorStore and start using ChromaVectorStore. Is there an extra config?
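For reference, the vector store itself should not affect streaming; these are the settings worth re-checking after the migration (a sketch, assuming index is now built over ChromaVectorStore):

Python
# streaming=True must still be set on the query engine after the migration.
query_engine = index.as_query_engine(streaming=True)
streaming_response = query_engine.query("your question")
for token in streaming_response.response_gen:
    print(token, end="", flush=True)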
4 comments
I am getting the following error (indicating that I am exceeding token limits):
Plain Text
INFO:openai:error_code=None error_message="This model's maximum context length is 4097 tokens, however you requested 4358 tokens (3334 in your prompt; 1024 for the completion). Please reduce your prompt; or completion length." error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False


I thought I had set the LLM to use a model with more available tokens (gpt-3.5-turbo-16k). This is how I initialized it:

Plain Text
num_outputs = 1024
llm_predictor = LLMPredictor(
    llm=OpenAI(
        temperature=0.1,
        model_name="gpt-3.5-turbo-16k",
        max_tokens=num_outputs,
        streaming=True,
    )
)


storage_context = StorageContext.from_defaults(persist_dir="indexstore/newnew")
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
# print(storage_context)
index = load_index_from_storage(
    storage_context, service_context=service_context
)  # Load the index


Am I initializing it wrong? I only get this error sometimes, not all the time.
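One thing worth checking (a sketch in the legacy, pre-0.10 llama_index API): when the LLM is passed via llama_index's own OpenAI wrapper instead of a wrapped LangChain LLM, the 16k context window is detected from the model name, whereas prompt packing for a wrapped LLM may still assume the default 4k window:

Python
from llama_index import ServiceContext
from llama_index.llms import OpenAI

llm = OpenAI(
    temperature=0.1,
    model="gpt-3.5-turbo-16k",  # note: `model`, not `model_name`, here
    max_tokens=1024,
)
service_context = ServiceContext.from_defaults(llm=llm)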
2 comments
From time to time, the chat performs a query using the previously asked question. I want it to only answer the current question, without any prior context.
Could someone help me see why it sometimes uses what was asked last time?

Plain Text
custom_prompt = Prompt(
"""\
Given a conversation (between a Lawyer and Legal Assistant) and a follow-up message from a Human, \
rewrite the message to make it an independent question. You must not make up information.


<Chat History>
{chat_history}


<Follow Up Message>
{question}

<Standalone question>
"""
)
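If the bot should never use prior turns, two options (a sketch, assuming an existing index; question holds the current user question): skip the chat engine entirely and use a stateless query engine, or clear the chat engine's history between questions:

Python
# Option 1: a query engine keeps no chat history at all.
query_engine = index.as_query_engine()
response = query_engine.query(question)

# Option 2: keep the chat engine but drop accumulated history each turn.
# chat_engine.reset()
# response = chat_engine.chat(question)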
4 comments
Hi guys, I am really new to this type of technology. I have managed to create a VectorStore index with my own data using the OpenAI API. I am really interested in learning how I can create larger VectorStore indexes. The reason is that I now have a lot of files, and when I try to build the index, I get an OpenAI token limit error. So I was wondering how I can merge/load different VectorStores, or how I can load a lot more files.

This is how I am loading the files (this code fails due to the token limit):

Plain Text
def construct_index(directory_path):
    num_outputs = 1024

    llm_predictor = LLMPredictor(
        llm=OpenAI(
            temperature=0.1, model_name="text-davinci-003", max_tokens=num_outputs
        )
    )

    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

    docs = SimpleDirectoryReader(directory_path).load_data()

    index = GPTVectorStoreIndex(nodes=docs, service_context=service_context)

    index.storage_context.persist(persist_dir="index")

    return index


Thanks in advance for the help 😉
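One pattern that avoids a single huge request is building the index incrementally: create an empty index, then insert documents one at a time so each embedding call stays small. A sketch in the same legacy (pre-0.10) API as the snippet above:

Python
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader(directory_path).load_data()

index = GPTVectorStoreIndex.from_documents([])  # start from an empty index
for doc in docs:
    index.insert(doc)  # embed and add one document at a time

index.storage_context.persist(persist_dir="index")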
12 comments
What is the best way to tell the index to bring me specific definitions as cited in the documents it was built with? I am having problems when it gives me definitions of specific articles within the texts. For example, I ask it what article 19 says, and it brings me a totally different article.
1 comment
Are responses using the latest version of llama_index going to be more accurate than responses from, say, version 0.6.35?
1 comment
How can I know the exact names of the models I can use from OpenAI? I am updating the model from gpt-3.5-turbo and want to change it to GPT-4. How can I see the model names for GPT-4?

Do I need to update to the latest version of llama-index to use newer model versions?
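For reference, the model IDs available to an API key can be listed straight from the OpenAI API (a sketch using the OpenAI Python SDK v1.x interface):

Python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
for model in client.models.list():
    print(model.id)  # e.g. "gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", ...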
8 comments
How can I ensure that the prompt does not exceed the model's maximum context length when considering chat history?
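A sketch of one way to do this with a token-limited memory buffer, assuming a recent llama-index and an existing index:

Python
from llama_index.core.memory import ChatMemoryBuffer

# Old turns are trimmed so history never exceeds ~3000 tokens of the prompt.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
chat_engine = index.as_chat_engine(memory=memory)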
20 comments
Is there another, more complex example of function calling with OpenAI?
I only found this: https://gpt-index.readthedocs.io/en/stable/examples/llm/openai.html#function-calling
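A slightly richer pattern than that page is wrapping plain Python functions as tools for an OpenAI-backed agent (a sketch, assuming a recent llama-index with the OpenAI integrations installed; the functions are illustrative stubs):

Python
from llama_index.core.tools import FunctionTool
from llama_index.agent.openai import OpenAIAgent
from llama_index.llms.openai import OpenAI

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def lookup_order(order_id: str) -> str:
    """Return the status of an order (illustrative stub)."""
    return f"Order {order_id}: shipped"

agent = OpenAIAgent.from_tools(
    [FunctionTool.from_defaults(fn=multiply),
     FunctionTool.from_defaults(fn=lookup_order)],
    llm=OpenAI(model="gpt-4"),
)
# response = agent.chat("What's 7 times 6, and where is order 123?")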
6 comments
Are there any docs or examples of how chat_history works? For example, how does it work if different users use a chat_engine?
Is the chat_history going to be the same for both?
Is the chat history going to be separated by ID?

I can't find any relevant docs on that.

And another thing I was curious about: is there a way to add citations to a chat_engine?
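On the citation side, there is a CitationQueryEngine (a query engine rather than a chat engine); per-user history generally means one chat engine or memory object per user ID, as in the concurrency sketch above. A sketch assuming a recent llama-index and an existing index:

Python
from llama_index.core.query_engine import CitationQueryEngine

citation_engine = CitationQueryEngine.from_args(index, citation_chunk_size=512)
response = citation_engine.query("your question")
print(response)                     # answer text with [1], [2] markers
for node in response.source_nodes:  # the chunks those citations point to
    print(node.node.get_text()[:200])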
51 comments