Amardeep
Hi, when I used the db.save_local("faiss_index") function to save my vectors, it deleted my old vectors and stored only the newly created ones. I want to save the new vectors in append mode so that I can use both my old and my new vectors. Can someone assist me with this?
Code (using LangChain):

import sys
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

def StoreNewData():
    all_text_data = extract_text_from_documents_in_directory(UPLOAD_DIR)
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text=all_text_data)

    embeddings = OpenAIEmbeddings()
    db = FAISS.from_texts(chunks, embedding=embeddings)
    db.save_local("faiss_index")  # overwrites any index already saved at this path

StoreNewData()
sys.exit()
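One way to get append behavior (a sketch, assuming a LangChain version where FAISS.load_local and merge_from are available; newer releases may also require allow_dangerous_deserialization=True when loading): build the index for the new chunks as before, then merge it into the previously saved index before saving.

import os
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

def append_to_index(chunks, index_path="faiss_index"):  # hypothetical helper name
    embeddings = OpenAIEmbeddings()
    new_db = FAISS.from_texts(chunks, embedding=embeddings)
    if os.path.exists(index_path):
        # Load the previously saved index and merge the new vectors into it
        old_db = FAISS.load_local(index_path, embeddings)
        old_db.merge_from(new_db)
        old_db.save_local(index_path)
    else:
        # First run: nothing saved yet, so just save the new index
        new_db.save_local(index_path)

Alternatively, calling add_texts(chunks) on a loaded FAISS store appends the new chunks without building a second index first.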
5 comments
I ran across a problem with the following code:

from llama_index import StorageContext, load_index_from_storage

# Recreate the storage context
storage_context = StorageContext.from_defaults(persist_dir='./storage')

# Load the index
index = load_index_from_storage(storage_context)

When I load a new document and generate a vector index, it creates a brand-new vector index file. I want to build the vector index in append mode instead, so that it keeps the vectors from the previous files and also stores the vectors for the new file.

Can somebody help me with this?
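One approach worth trying (a sketch, assuming a llama_index version that exposes index.insert; new_docs is a placeholder for the documents loaded from the new file): reload the persisted index as above, insert the new documents into it, and persist again instead of rebuilding from scratch.

for doc in new_docs:
    index.insert(doc)  # appends to the existing index instead of replacing it
index.storage_context.persist(persist_dir='./storage')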
1 comment
from llama_index import SimpleDirectoryReader, GPTVectorStoreIndex, LLMPredictor, PromptHelper, ServiceContext, StorageContext, load_index_from_storage
from langchain import OpenAI
import os
import openai
import gradio as gr

os.environ["OPENAI_API_KEY"] = "sjksajdkjadskcj"

openai.api_key = os.environ["OPENAI_API_KEY"]

def create_index(path):
    max_input = 4096
    tokens = 4096
    chunk_size = 600
    max_chunk_overlap = 1

    promptHelper = PromptHelper(max_input, tokens, max_chunk_overlap, chunk_size_limit=chunk_size)
    llmPredictor = LLMPredictor(llm=OpenAI(model_name="text-davinci-002", max_tokens=tokens))

    docs = SimpleDirectoryReader(path).load_data()

    service_context = ServiceContext.from_defaults(llm_predictor=llmPredictor, prompt_helper=promptHelper)

    vectorIndex = GPTVectorStoreIndex.from_documents(documents=docs, service_context=service_context)

    # Issue is on this line: it overwrites the pre-existing vector data. I don't want to
    # re-create vectors for every document when a single new document arrives in my Data directory.
    vectorIndex.storage_context.persist(persist_dir='Store')
    return vectorIndex

create_index('Data')

def answerMe(question):
    storage_context = StorageContext.from_defaults(persist_dir='Store')
    index = load_index_from_storage(storage_context)
    query_engine = index.as_query_engine()
    response = query_engine.query(question)
    return response


answerMe('my question?')
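A possible restructuring of create_index (a sketch using the same legacy llama_index API as above; add_to_index and the os.path.exists check are my additions, and index.insert is assumed to be available in this version): build from scratch only when no persisted index exists, otherwise load it and insert just the new documents.

import os

def add_to_index(path, persist_dir='Store'):  # hypothetical helper
    # `path` should contain only documents that have not been indexed yet,
    # otherwise previously indexed documents get embedded a second time.
    docs = SimpleDirectoryReader(path).load_data()
    if os.path.exists(persist_dir):
        # Reload the persisted index and append the new documents to it
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        index = load_index_from_storage(storage_context)
        for doc in docs:
            index.insert(doc)
    else:
        # First run: build the index from scratch
        index = GPTVectorStoreIndex.from_documents(documents=docs)
    index.storage_context.persist(persist_dir=persist_dir)
    return index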
19 comments