I would like to use the Evaluator modules that LlamaIndex provides for both options. This works for the RAG option (just create a query engine from the vector store and then perform a query). With the no-RAG option this is not possible, since we don't have a VectorStore. Is there any other way to directly compare the two methods using the same benchmark in LlamaIndex?

Here is the evaluation code I have so far:

```python
from llama_index.llms.openai import OpenAI
from llama_index.core.evaluation import FaithfulnessEvaluator

# create llm
llm = OpenAI(model="gpt-4", temperature=0.0)

# define evaluator
evaluator = FaithfulnessEvaluator(llm=llm)

# response_str is the raw LLM answer; TEXT_1 and TEXT_2 are the
# context passages supplied by hand
eval_result = evaluator.evaluate(
    response=response_str,
    contexts=[TEXT_1, TEXT_2],
)
```
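For reference, the RAG-side evaluation mentioned above could look like the following minimal sketch. The `documents` variable and the query string are placeholders; `evaluate_response` is the convenience variant of `evaluate` that reads the retrieved source nodes off the `Response` object, so no contexts need to be passed by hand:

```python
from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI
from llama_index.core.evaluation import FaithfulnessEvaluator

llm = OpenAI(model="gpt-4", temperature=0.0)
evaluator = FaithfulnessEvaluator(llm=llm)

# build a query engine on top of the vector store and run a query
index = VectorStoreIndex.from_documents(documents)  # placeholder: documents loaded elsewhere
query_engine = index.as_query_engine()
response = query_engine.query("What does the text say about X?")  # placeholder query

# evaluate_response pulls the retrieved source nodes in as contexts
eval_result = evaluator.evaluate_response(response=response)
print(eval_result.passing, eval_result.score)
```

Since both paths end in the same `EvaluationResult`, the faithfulness scores from the RAG run and from the no-RAG run (with hand-supplied contexts, as in the snippet above) should be directly comparable.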