We support tons of local LLMs
You can change the global defaults in settings (you'll need both an LLM and an embed_model)
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Or, you can override these models in local interfaces
VectorStoreIndex.from_documents(..., embed_model=embed_model)
index.as_query_engine(llm=llm)
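For a fully local setup, the two patterns combine roughly like this; a minimal sketch, assuming a HuggingFace embedding model and an OpenAI-compatible local server (the model names and URL below are placeholders, not from this thread, and require llama-index-embeddings-huggingface and llama-index-llms-openai-like to be installed):
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai_like import OpenAILike
# Global defaults: every index and query engine uses these unless overridden locally
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.llm = OpenAILike(model="mymodel", api_base="http://localhost:1234/v1", api_key="not-needed", is_chat_model=True)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)   # uses Settings.embed_model
query_engine = index.as_query_engine()               # uses Settings.llm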
@Logan M do I make the changes in my script or in the package's script? I'm confused about what you meant by "change global defaults", specifically which script to edit.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ServiceContext, Settings
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from InstructorEmbedding.instructor import SentenceTransformer
from llama_index.readers.file import DocxReader
import chromadb
# Prompt user for input
input_dir = input("Enter the directory path to load documents from: ")
collection_name = input("Enter the name of the collection to create: ")
# define embedding function
embed_model = SentenceTransformer("hkunlp/instructor-base")
Settings.embed_model = embed_model
required_exts = [".docx"]
documents = SimpleDirectoryReader(
input_dir=input_dir,
required_exts=required_exts,
recursive=True,
)
docs = documents.load_data()
print(f"Loaded {len(docs)} docs")
# Save to disk
db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection(collection_name)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model,
chunk_size=800,
chunk_overlap=20)
index = VectorStoreIndex.from_documents(
docs, storage_context=storage_context, service_context=service_context
)
this is what the script looks like
So in your case, you are only configuring an embed model, so the LLM is defaulting to OpenAI.
If you remove the service context from your code (which is technically deprecated), and add that settings code to the top, it would work (well, it would work until you create a query engine, which needs an LLM)
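Roughly what that change looks like applied to the script above; a sketch, assuming you also swap the raw SentenceTransformer for a llama-index embedding wrapper (here the llama-index-embeddings-instructor integration, which comes up again below), since Settings.embed_model expects a llama-index embedding object:
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.embeddings.instructor import InstructorEmbedding
import chromadb

# Global defaults replace the ServiceContext; the chunking options move here too
Settings.embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")
Settings.chunk_size = 800
Settings.chunk_overlap = 20

input_dir = input("Enter the directory path to load documents from: ")
collection_name = input("Enter the name of the collection to create: ")

docs = SimpleDirectoryReader(input_dir=input_dir, required_exts=[".docx"], recursive=True).load_data()

db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection(collection_name)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# No service_context argument; the index picks up Settings.embed_model
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)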
So I'm trying to embed and then save the embeddings locally to disk, so wouldn't I need the service context for that? Also, I just put llm=None in the service context and it worked.
But now I'm having problems with the embed model, which is having issues function calling specific things. Do you think I should switch from InstructorEmbedding to HuggingFaceEmbedding?
I'm not sure what you mean by "having issues function calling specific things"
I switched from the standalone InstructorEmbedding Python package to the llama-index package for instructor embeddings and it worked successfully.
But now I'm having issues trying to use LM Studio. This is what the code looks like for pulling my model:
llm = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
But maybe there's another way to do it. I'm just trying to avoid having to use an API key.
Try using openai-like instead
pip install llama-index-llms-openai-like
from llama_index.llms.openai_like import OpenAILike
llm = OpenAILike(model="mymodel", api_base="http://localhost:1234/v1", api_key="not-needed")
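That llm object still has to be wired in somewhere, either as the global default or passed to the query engine directly (same Settings pattern as above; index here is whatever index you've built):
from llama_index.core import Settings
Settings.llm = llm                               # global default for anything that needs an LLM
# or, per query engine:
query_engine = index.as_query_engine(llm=llm)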
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at
https://platform.openai.com/account/api-keys
To disable the LLM entirely, set llm=None.
This is the error I'm getting. However, I might switch to LocalAI if it's more efficient, unless you have a solution?
If not, I can just continue troubleshooting LLM solutions for LM Studio, or like I said, switch.
What's the full traceback? Seems to me like you aren't configuring the LLM/settings/embed model properly
and it's defaulting to OpenAI somewhere
import os
from llama_index.core import VectorStoreIndex, ServiceContext, get_response_synthesizer
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.embeddings.instructor import InstructorEmbedding
from llama_index.llms.openai_like import OpenAILike
import chromadb
llm = OpenAILike(
model="mistral-7b-instruct-v0.2.Q6_K.gguf",
is_chat_model=True
)
embed_model = InstructorEmbedding("hkunlp/instructor-large")
collection_name = input("Enter the name of the collection to pull: ")
question = input("Write a query for the chatbot: ")
# Load from disk
db2 = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db2.get_or_create_collection(collection_name)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_vector_store(
vector_store,
service_context=service_context,
)
# Retrieve documents
retriever = VectorIndexRetriever(
index=index,
similarity_top_k=3,
)
response_synthesizer = get_response_synthesizer()
# Query engine configuration
query_engine = RetrieverQueryEngine(
retriever=retriever,
response_synthesizer=response_synthesizer,
node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)
# Query
response = query_engine.query(question)
print(response)
Pass the service context here
response_synthesizer = get_response_synthesizer(service_context=service_context)
Weird, I'm still getting the same error.
Would it be something else?
It's responding with this error:
DeprecationWarning: Call to deprecated class method from_defaults. (ServiceContext is deprecated, please use llama_index.settings.Settings
instead.) -- Deprecated since version 0.10.0.
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
Empty Response
That's not an error, that's a warning.
Your similarity cutoff probably removed all nodes.
Try removing it or lowering it.
(Service context is deprecated technically)
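Putting both suggestions together, a sketch of the retrieval script using Settings instead of ServiceContext, with the cutoff removed (or lowered); this reuses the llm, embed_model, vector_store, and question objects from your script above:
from llama_index.core import Settings, VectorStoreIndex, get_response_synthesizer
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

Settings.llm = llm
Settings.embed_model = embed_model

index = VectorStoreIndex.from_vector_store(vector_store)

retriever = VectorIndexRetriever(index=index, similarity_top_k=3)
response_synthesizer = get_response_synthesizer()   # picks up Settings.llm

# Either drop node_postprocessors entirely or pass a lower cutoff (e.g. 0.3)
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)
print(query_engine.query(question))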
Do I need to change how my response prompt is formatted? I removed the node postprocessors so the cutoff doesn't exist, but I'm still getting the same empty response error.
Okay, never mind, I had the wrong vector store chosen. Now I'm running into a new error: the terminal repeatedly shoots out an invalid OpenAI key error.
You still need an API key, even if it's fake
And a (real) api base
llm = OpenAILike(
    model="mistral-7b-instruct-v0.2.Q6_K.gguf",
    api_key="fake",
    api_base="123",
    is_chat_model=True,
)
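For LM Studio specifically, the api_base would be the local server URL from earlier in the thread rather than a placeholder; something like:
llm = OpenAILike(
    model="mistral-7b-instruct-v0.2.Q6_K.gguf",
    api_key="fake",                          # any non-empty string works, it isn't checked locally
    api_base="http://localhost:1234/v1",     # LM Studio's local server
    is_chat_model=True,
)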