Updated 3 months ago

Embeddings

Are the new openai embedding models supported in llamaindex query engine?
Plain Text
ValueError: shapes (1536,) and (3072,) not aligned: 1536 (dim 0) != 3072 (dim 0)
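For context, this error is NumPy refusing to take an inner product between vectors of different lengths, which is what happens when the query and the stored documents were embedded at different dimensions (1536 vs. 3072 here). A minimal repro, with arbitrary placeholder vectors:

```python
import numpy as np

# Query embedded at 1536 dims, documents at 3072 dims (or vice versa):
# the similarity dot product cannot be computed.
query_vec = np.ones(1536)
doc_vec = np.ones(3072)

try:
    np.dot(query_vec, doc_vec)
except ValueError as e:
    print(e)  # shapes (1536,) and (3072,) not aligned
```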
15 comments
They are.

The new large model has 3072 dimensions.

You cannot switch embedding models, though, without first re-embedding all your data with the new model.
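A toy, library-free illustration of why: the `fake_embed_*` helpers below are hypothetical stand-ins for embedding calls, not real API functions.

```python
def fake_embed_small(text):
    # stand-in for a 1536-dim embedding model
    return [float(len(text))] * 1536

def fake_embed_large(text):
    # stand-in for a 3072-dim embedding model
    return [float(len(text))] * 3072

def dot(a, b):
    # similarity requires equal-length vectors
    if len(a) != len(b):
        raise ValueError(f"shapes ({len(a)},) and ({len(b)},) not aligned")
    return sum(x * y for x, y in zip(a, b))

# Index built with the old model...
stored = [fake_embed_small(doc) for doc in ["doc one", "doc two"]]
# ...queried with the new model: dot(query, stored[0]) would raise ValueError.
query = fake_embed_large("my question")

# Re-embedding the stored documents with the new model fixes it.
stored = [fake_embed_large(doc) for doc in ["doc one", "doc two"]]
score = dot(query, stored[0])
```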
I have created new embeddings. I still face the issue.
I have used text-embedding-3-small for my documents. It still throws this error.
I even passed the same embedding model in service_context to embed the query.
Can you share the code?
I'm not able to reproduce this.
Sure. I'll get back to you
Here's the code.
index = indexgenerator(indexPath, documentsPath)
I'm not sure what this function does, but it should also be using a service context.
Plain Text
def indexgenerator(indexPath, documentsPath):
    # check if storage already exists
    if not os.path.exists(indexPath):
        print("Not existing")
        # load the documents and create the index
        entity_extractor = EntityExtractor(prediction_threshold=0.2, label_entities=False, device="cpu")
        node_parser = SentenceSplitter(chunk_overlap=200, chunk_size=2000)
        transformations = [node_parser, entity_extractor]

        documents = SimpleDirectoryReader(input_dir=r"Text_Files").load_data()
        pipeline = IngestionPipeline(transformations=transformations)
        nodes = pipeline.run(documents=documents)

        service_context = ServiceContext.from_defaults(
            llm=OpenAI(model="gpt-3.5-turbo", temperature=0),
            embed_model=embed_model,
        )
        index = VectorStoreIndex(nodes, service_context=service_context)

        # store it for later
        index.storage_context.persist(indexPath)
    else:
        # load existing index
        print("Existing")
        storage_context = StorageContext.from_defaults(persist_dir=indexPath)
        index = load_index_from_storage(storage_context)

    return index
Did you mean it should use a service context even while loading from storage?
Yes, you need it even when loading

load_index_from_storage(storage_context, service_context=service_context)
Thanks @Logan M