Vimwork1

Performance

@Logan M Extracting metadata takes about 10 seconds per document when I use a private OpenAI deployment. Is there a way to speed up the process? I have 76,000 business documents downloaded from Confluence, and I would like to use llama_index to get the most accurate answers.
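For scale: 76,000 documents at roughly 10 seconds each is about 211 hours (close to 9 days) if processed one at a time. A minimal sketch of one common way to cut that down is to run several extractions concurrently with a bounded number in flight. Note that `extract_metadata` below is a hypothetical stand-in, not a llama_index function, and the achievable concurrency is an assumption that depends on the rate limits of the deployment.

```python
import asyncio

DOCS = 76_000          # document count from the post
SECONDS_PER_DOC = 10   # approximate per-document latency from the post


def estimated_hours(concurrency: int) -> float:
    """Idealized wall-clock estimate, assuming perfect scaling with concurrency."""
    return DOCS * SECONDS_PER_DOC / concurrency / 3600


async def extract_metadata(doc: str) -> dict:
    # Hypothetical stand-in for the real (slow) LLM-backed extraction call.
    await asyncio.sleep(0.01)
    return {"doc": doc, "title": doc.upper()}


async def extract_all(docs, concurrency: int = 8):
    """Run extract_metadata over docs with at most `concurrency` calls in flight."""
    sem = asyncio.Semaphore(concurrency)

    async def bounded(doc):
        async with sem:
            return await extract_metadata(doc)

    # gather preserves the input order of the results
    return await asyncio.gather(*(bounded(d) for d in docs))


if __name__ == "__main__":
    print(f"sequential: {estimated_hours(1):.1f} h")
    print(f"16 at once: {estimated_hours(16):.1f} h")
    print(asyncio.run(extract_all(["a", "b", "c"], concurrency=2)))
```

The semaphore keeps the request rate under control, so the concurrency value can be tuned down if the endpoint starts returning rate-limit errors.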

2 comments

Vimwork1

Logan M could you write example of

@Logan M could you write an example of using Azure OpenAI with the metadata extractor? In this case it is necessary to provide the model.
I tried this:
os.environ['OPENAI_API_KEY'] = ""
openai.api_type = "azure"
openai.deployment_id = "gpt-4"
openai.api_base = " "
openai.api_version = "2023-07-01-preview"

from llama_index.llms import AzureOpenAI
llm = AzureOpenAI(engine="GPT48K", model="gpt-4", temperature=0.5)

but I still got an error that deployment_id is missing.

16 comments

Vimwork1

What is the best document processing strategy?

@kapa.ai What is the best document processing strategy?

14 comments

Vimwork1

While running examples from https gpt

While running the example from https://gpt-index.readthedocs.io/en/stable/examples/evaluation/TestNYC-Evaluation.html
I called evaluate_query_engine(query_engine, questions) like this:
correct, total = evaluate_query_engine(vector_query_engine, eval_questions[:5])

and got this error:
---------------
RuntimeError                              Traceback (most recent call last)
Cell In[71], line 2
      1 vector_query_engine = vector_index.as_query_engine()
----> 2 correct, total = evaluate_query_engine(vector_query_engine, questions)
      4 print(f"score: {correct}/{total}")

Cell In[69], line 6, in evaluate_query_engine(query_engine, questions)
      4 def evaluate_query_engine(query_engine, questions):
      5     c = [query_engine.aquery(q) for q in questions]
----> 6     results = asyncio.run(asyncio.gather(*c))
      7     print("finished query")
      9     total_correct = 0

File /usr/local/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py:186, in run(main, debug)
    161 """Execute the coroutine and return the result.
    162 
    163 This function runs the passed coroutine, taking care of
   (...)
    182     asyncio.run(main())
    183 """
    184 if events._get_running_loop() is not None:
    185     # fail fast with short traceback
--> 186     raise RuntimeError(
    187         "asyncio.run() cannot be called from a running event loop")
    189 with Runner(debug=debug) as runner:
    190     return runner.run(main)

RuntimeError: asyncio.run() cannot be called from a running event loop
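
The traceback shows `asyncio.run()` being called while an event loop is already running, which is the usual situation inside a Jupyter notebook cell. In a notebook, awaiting directly (`results = await asyncio.gather(*c)`) is often enough, and the third-party `nest_asyncio` package is another commonly used workaround. Below is a minimal stdlib-only sketch of a third option: run the coroutine in a fresh loop on a worker thread when a loop is already active. `fake_aquery`, `gather_all`, and `run_coro_blocking` are hypothetical names for illustration; `fake_aquery` stands in for `query_engine.aquery`, and whether a real query engine tolerates being driven from another thread is an assumption.

```python
import asyncio
import threading


async def fake_aquery(question: str) -> str:
    # Hypothetical stand-in for query_engine.aquery(question).
    await asyncio.sleep(0)
    return f"answer to {question}"


async def gather_all(questions):
    # Wrap gather in a coroutine so it can be handed to asyncio.run().
    return await asyncio.gather(*(fake_aquery(q) for q in questions))


def run_coro_blocking(coro):
    """Run `coro` to completion whether or not an event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(coro)

    # A loop is already running (e.g. a notebook): use a new loop on a worker thread.
    outcome = {}

    def worker():
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


results = run_coro_blocking(gather_all(["q1", "q2"]))
print(results)
```

The call blocks until all queries finish, so it can replace the `asyncio.run(asyncio.gather(*c))` line both in a plain script and inside a running loop.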

33 comments
