Fran Piantoni
Offline, last seen 2 months ago
Joined September 25, 2024
Is there a way to accomplish what the new OpenAI Swarm multi-agent orchestration framework does inside of llama-index? The most similar thing I found was Workflows.
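For reference, a minimal sketch of a Swarm-style hand-off between two steps using llama-index Workflows (assuming a recent llama-index version that ships llama_index.core.workflow; the event and step names here are illustrative):

Python
from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)


class HandoffEvent(Event):
    """Carries the request from the triage step to a specialist step."""
    query: str


class SwarmLikeWorkflow(Workflow):
    @step
    async def triage(self, ev: StartEvent) -> HandoffEvent:
        # Decide which "agent" should handle the request, then hand off.
        return HandoffEvent(query=ev.query)

    @step
    async def specialist(self, ev: HandoffEvent) -> StopEvent:
        # A real implementation would call an LLM or agent here.
        return StopEvent(result=f"handled: {ev.query}")


# Usage: result = await SwarmLikeWorkflow(timeout=60).run(query="...")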
3 comments
Hi, I am having some problems trying to run my application. I am using Python 3.9, and for some time I have been unable to run the code because of this error:

Plain Text
File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/app/main.py", line 1, in <module>
    __import__('pysqlite3')
ModuleNotFoundError: No module named 'pysqlite3'


While trying to install the module, I get the following error:

Plain Text
ERROR: Could not find a version that satisfies the requirement pysqlite3-binary==0.5.2 (from versions: none)
ERROR: No matching distribution found for pysqlite3-binary==0.5.2


Do you recommend a version of Python where I can stop having this kind of problem?
Somehow, running on macOS always brings me these kinds of errors.
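For context, pysqlite3-binary only publishes Linux wheels, which is why the install fails on macOS with "from versions: none". A common workaround (assuming the pysqlite3 swap is only needed for libraries like Chroma that require a newer sqlite3) is to make the swap conditional:

Python
import sys

try:
    # On Linux, replace the stdlib sqlite3 with the newer pysqlite3 build.
    __import__("pysqlite3")
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")
except ImportError:
    # On macOS, pysqlite3-binary is unavailable; fall back to the stdlib
    # sqlite3, which is usually recent enough in Homebrew/python.org builds.
    pass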
5 comments
Hello, I am having the following error. I know I am not making that many requests per minute; it was working perfectly until about a day ago, and now it has started failing:

Plain Text
2024-05-08 13:15:21,369 - openai._base_client - INFO - Retrying request to /embeddings in 6.947019 seconds (_base_client.py:927)
2024-05-08 13:15:28,721 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:28,723 - openai._base_client - INFO - Retrying request to /embeddings in 7.414641 seconds (_base_client.py:927)
2024-05-08 13:15:36,494 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:36,495 - openai._base_client - INFO - Retrying request to /embeddings in 6.268730 seconds (_base_client.py:927)
^C^C
2024-05-08 13:15:43,168 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:43,169 - openai._base_client - INFO - Retrying request to /embeddings in 6.461716 seconds (_base_client.py:927)
2024-05-08 13:15:50,020 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests" (_client.py:1027)
2024-05-08 13:15:50,021 - llama_index.llms.openai_utils - WARNING - Retrying llama_index.embeddings.openai.get_embeddings in 0.20855130440353753 seconds as it raised RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}. (before_sleep.py:65)
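Note that the last log line reports insufficient_quota, which is a billing/quota limit rather than request-rate throttling, so retrying cannot succeed. To fail fast instead of looping on retries, the retry count can be capped; a sketch assuming a recent llama-index OpenAI embedding integration:

Python
from llama_index.embeddings.openai import OpenAIEmbedding

# Give up after one retry so quota errors surface immediately.
embed_model = OpenAIEmbedding(max_retries=1)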
10 comments
How can I limit the number of characters in the model's response?
I am using an OpenAI model.
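For reference, OpenAI models limit output in tokens rather than characters, so capping max_tokens is the usual lever (a sketch, assuming a recent llama-index OpenAI integration; roughly 4 English characters per token):

Python
from llama_index.llms.openai import OpenAI

# max_tokens caps the completion length; ~256 tokens is ~1000 characters.
llm = OpenAI(model="gpt-3.5-turbo", max_tokens=256)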
1 comment
Hello, I wanted to know why, in some cases, the chat does not respond with the information from the vector store, even when I know that the information I am looking for is in the vector store. Only in some cases is it like it doesn't find that information and doesn't pass it on to the chat to give me a correct answer.

Thanks in advance
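A common way to diagnose this is to raise similarity_top_k and print the retrieved chunks, to see whether the right passage is being fetched at all; a sketch assuming an existing VectorStoreIndex named index:

Python
# Retrieve more candidate chunks, then inspect what actually came back.
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("your question here")
for node in response.source_nodes:
    print(node.score, node.node.get_content()[:200])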
3 comments
How can I prevent llama_index from making up answers? I want to force it to only answer from the local LLM.
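One common approach is to tighten the QA prompt so the model declines when the retrieved context lacks the answer; prompt-based grounding is best-effort, not a hard guarantee. A sketch, assuming a recent llama-index and an existing index:

Python
from llama_index.core import PromptTemplate

qa_prompt = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the query using ONLY the context above. If the answer is not "
    "in the context, reply exactly: 'I don't know.'\n"
    "Query: {query_str}\n"
    "Answer: "
)
# Route every query through the stricter template.
query_engine = index.as_query_engine(text_qa_template=qa_prompt)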
7 comments
Can the llama_index chat_engine handle multiple requests at the same time?
I have found that if I make two requests at the exact same time, the responses get messy. It seems like it is returning pieces of text from the different requests.

If not, what would be the best possible solution to address this?

Thanks for your time.
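For what it's worth, a chat engine holds conversational state, so sharing one instance across concurrent requests interleaves histories. A sketch of one engine per session (the session dict and helper below are illustrative, not llama-index APIs; assumes an existing index):

Python
chat_engines = {}  # session_id -> dedicated chat engine

def get_chat_engine(session_id: str):
    # Lazily create one engine per session so histories never mix.
    if session_id not in chat_engines:
        chat_engines[session_id] = index.as_chat_engine()
    return chat_engines[session_id]

# Each request then uses its own engine:
# response = get_chat_engine(session_id).chat(user_message)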
6 comments
I had response streaming working perfectly before I updated my code to stop using SimpleVectorStore and start using ChromaVectorStore. Is there an extra config?
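For reference, the vector store itself should not affect streaming; these are the settings worth re-checking after the migration (a sketch, assuming index is now built over ChromaVectorStore):

Python
# streaming=True must still be set on the query engine after the migration.
query_engine = index.as_query_engine(streaming=True)
streaming_response = query_engine.query("your question")
for token in streaming_response.response_gen:
    print(token, end="", flush=True)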
4 comments
I am getting the following error (indicating that I am exceeding token limits):
Plain Text
INFO:openai:error_code=None error_message="This model's maximum context length is 4097 tokens, however you requested 4358 tokens (3334 in your prompt; 1024 for the completion). Please reduce your prompt; or completion length." error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False


I thought I had set the LLM to use a model with more available tokens (gpt-3.5-turbo-16k). This is how I initialized it:

Plain Text
num_outputs = 1024
llm_predictor = LLMPredictor(
    llm=OpenAI(
        temperature=0.1,
        model_name="gpt-3.5-turbo-16k",
        max_tokens=num_outputs,
        streaming=True,
    )
)


storage_context = StorageContext.from_defaults(persist_dir="indexstore/newnew")
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
# print(storage_context)
index = load_index_from_storage(
    storage_context, service_context=service_context
)  # Load the index


Am I initializing it wrong? I only get this error sometimes, not all the time.
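One thing worth checking (a sketch in the legacy, pre-0.10 llama_index API): when the LLM is passed via llama_index's own OpenAI wrapper instead of a wrapped LangChain LLM, the 16k context window is detected from the model name, whereas prompt packing for a wrapped LLM may still assume the default 4k window:

Python
from llama_index import ServiceContext
from llama_index.llms import OpenAI

llm = OpenAI(
    temperature=0.1,
    model="gpt-3.5-turbo-16k",  # note: `model`, not `model_name`, here
    max_tokens=1024,
)
service_context = ServiceContext.from_defaults(llm=llm)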
2 comments
From time to time, the chat performs a query using the previously asked question. I want it to only answer the current question, without any prior context.
Could someone help me see why it sometimes uses what was asked last time?

Plain Text
custom_prompt = Prompt(
"""\
Given a conversation (between a Lawyer and Legal Assistant) and a follow-up message from a Human, \
rewrite the message to make it an independent question. You must not make up information.


<Chat History>
{chat_history}


<Follow Up Message>
{question}

<Standalone question>
"""
)
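If the bot should never use prior turns, two options (a sketch, assuming an existing index; question holds the current user question): skip the chat engine entirely and use a stateless query engine, or clear the chat engine's history between questions:

Python
# Option 1: a query engine keeps no chat history at all.
query_engine = index.as_query_engine()
response = query_engine.query(question)

# Option 2: keep the chat engine but drop accumulated history each turn.
# chat_engine.reset()
# response = chat_engine.chat(question)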
4 comments
Hi guys, I am really new to this type of technology. I have managed to create a VectorStore index with my own data using the OpenAI API. I am really interested in learning how I can create larger VectorStore indexes. The reason is that I now have a lot of files, and when I try to build the index, I get an OpenAI token limit error. So I was wondering how I can merge/load different VectorStores, or how I can load a lot more files.

This is how I am loading the files (this code fails due to the token limit):

Plain Text
def construct_index(directory_path):
    num_outputs = 1024

    llm_predictor = LLMPredictor(
        llm=OpenAI(
            temperature=0.1, model_name="text-davinci-003", max_tokens=num_outputs
        )
    )

    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

    docs = SimpleDirectoryReader(directory_path).load_data()

    index = GPTVectorStoreIndex(nodes=docs, service_context=service_context)

    index.storage_context.persist(persist_dir="index")

    return index


Thanks in advance for the help 😉
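One pattern that avoids a single huge request is building the index incrementally: create an empty index, then insert documents one at a time so each embedding call stays small. A sketch in the same legacy (pre-0.10) API as the snippet above:

Python
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader(directory_path).load_data()

index = GPTVectorStoreIndex.from_documents([])  # start from an empty index
for doc in docs:
    index.insert(doc)  # embed and add one document at a time

index.storage_context.persist(persist_dir="index")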
12 comments
What is the best way to tell the index to bring me specific definitions as cited in the documents it was built with? I am having problems when it gives me definitions of specific articles within the texts. For example, I ask it what article 19 says, and it brings me a totally different article.
1 comment
Are responses using the latest version of llama_index going to be more accurate than responses from, say, version 0.6.35?
1 comment
How can I know the exact names of the models I can use from OpenAI? I am updating the model from gpt-3.5-turbo and want to change it to GPT-4. How can I see the model names for GPT-4?

Do I need to update to the latest version of llama-index to use newer model versions?
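For reference, the model IDs available to an API key can be listed straight from the OpenAI API (a sketch using the OpenAI Python SDK v1.x interface):

Python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
for model in client.models.list():
    print(model.id)  # e.g. "gpt-4", "gpt-4-turbo", "gpt-3.5-turbo", ...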
8 comments
How can I ensure that the prompt does not exceed the model's maximum context length when considering chat history?
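A sketch of one way to do this with a token-limited memory buffer, assuming a recent llama-index and an existing index:

Python
from llama_index.core.memory import ChatMemoryBuffer

# Old turns are trimmed so history never exceeds ~3000 tokens of the prompt.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
chat_engine = index.as_chat_engine(memory=memory)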
20 comments
Is there another, more complex example of function calling with OpenAI?
I only found this: https://gpt-index.readthedocs.io/en/stable/examples/llm/openai.html#function-calling
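A slightly richer pattern than that page is wrapping plain Python functions as tools for an OpenAI-backed agent (a sketch, assuming a recent llama-index with the OpenAI integrations installed; the functions are illustrative stubs):

Python
from llama_index.core.tools import FunctionTool
from llama_index.agent.openai import OpenAIAgent
from llama_index.llms.openai import OpenAI

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def lookup_order(order_id: str) -> str:
    """Return the status of an order (illustrative stub)."""
    return f"Order {order_id}: shipped"

agent = OpenAIAgent.from_tools(
    [FunctionTool.from_defaults(fn=multiply),
     FunctionTool.from_defaults(fn=lookup_order)],
    llm=OpenAI(model="gpt-4"),
)
# response = agent.chat("What's 7 times 6, and where is order 123?")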
6 comments
Are there any docs or examples of how chat_history works? For example, how does it work if different users use a chat_engine?
Is the chat_history going to be the same for both?
Is the chat history going to be separated by ID?

I can't find any relevant docs on that.

And another thing I was curious about: is there a way to add citations to a chat_engine?
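On the citation side, there is a CitationQueryEngine (a query engine rather than a chat engine); per-user history generally means one chat engine or memory object per user ID, as in the concurrency sketch above. A sketch assuming a recent llama-index and an existing index:

Python
from llama_index.core.query_engine import CitationQueryEngine

citation_engine = CitationQueryEngine.from_args(index, citation_chunk_size=512)
response = citation_engine.query("your question")
print(response)                     # answer text with [1], [2] markers
for node in response.source_nodes:  # the chunks those citations point to
    print(node.node.get_text()[:200])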
51 comments