WhiteFang_Jr
Joined September 25, 2024
Do we have any example of a fine-tuned GPT-3 model being used to generate responses?
I'm trying with a fine-tuned model but getting this error:

Plain Text
Unknown model: davinci:ft-finetuned-2023-07-08-13-19-29. Please provide a valid OpenAI model name.Known models are: gpt-4, gpt-4-0314, gpt-4-32k, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-0301, text-ada-001, ada, text-babbage-001, babbage, text-curie-001, curie, davinci, text-davinci-003, text-davinci-002, code-davinci-002, code-davinci-001, code-cushman-002, code-cushman-001


This occurs in modelname_to_contextsize in the LangChain OpenAI class. I'm using LlamaIndex 0.6.15. I know the version is old. lol 😅
Do newer versions of LlamaIndex allow a fine-tuned model to be used?
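For reference, a sketch of what I'd expect on newer LlamaIndex versions (0.7+) with the native OpenAI LLM class, assuming it resolves `ft:` model names to their base model for the context window (that resolution is an assumption on my part, and the model string and data path are placeholders):
Plain Text
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import OpenAI

# Placeholder fine-tune ID; the base model ("gpt-3.5-turbo-0613") is
# assumed to determine the context window.
llm = OpenAI(model="ft:gpt-3.5-turbo-0613:my-org::abc123", temperature=0)
service_context = ServiceContext.from_defaults(llm=llm)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)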
1 comment
I'm using HuggingFaceLLMPredictor to run StabilityAI/stablelm-tuned-alpha-3b.
I'm currently using this prompt, as StableLM requires this format.
Plain Text
query_wrapper_prompt = SimpleInputPrompt(
    "<|SYSTEM|>Below is an instruction that describes a task."
    "Write a response that adequately completes the request.\n\n"
    "<|USER|>{query_str}\n<|ASSISTANT|>"
)


My question is: can we have more inputs than just query_str here, like providing the context separately and then the user query?
Also, if I'm using HuggingFaceLLMPredictor and I pass the text_qa_template while creating the query_engine instance, will it make any difference?
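For what it's worth, a sketch of a two-variable template under the 0.6.x API, assuming QuestionAnswerPrompt; the response synthesizer fills in {context_str} and {query_str} itself, and as far as I understand the text_qa_template applies regardless of which LLM predictor is in use (`index` below is an existing index):
Plain Text
from llama_index import QuestionAnswerPrompt

# Wrap both the retrieved context and the user query in StableLM's tokens.
stablelm_qa_template = QuestionAnswerPrompt(
    "<|SYSTEM|>Below is an instruction that describes a task. "
    "Write a response that adequately completes the request.\n\n"
    "<|USER|>Context information is below.\n"
    "---------------------\n{context_str}\n---------------------\n"
    "Given the context, answer the question: {query_str}\n<|ASSISTANT|>"
)
query_engine = index.as_query_engine(text_qa_template=stablelm_qa_template)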
1 comment
I'm trying it but getting this error:
Plain Text
ValueError: shapes (1536,) and (768,) not aligned: 1536 (dim 0) != 768 (dim 0)
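
My guess is the index was built with one embedding model (1536 dims, e.g. OpenAI ada-002) and queried with another (768 dims). A sketch of pinning one embed model on both sides, assuming a 768-dim sentence-transformers model; the same service_context has to be used when building and when loading/querying the index:
Plain Text
from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding, ServiceContext

# all-mpnet-base-v2 produces 768-dim vectors; reuse this service_context
# for both index construction and querying so dimensions match.
embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
)
service_context = ServiceContext.from_defaults(embed_model=embed_model)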
27 comments
Hi, need feedback on the following:
If I were to use any open-source LLM for generating responses, the GPU size constraint for the LLM is 24GB. I'm looking into using an open-source LLM in place of OpenAI.
Tried the following:
  • Camel 5B
  • Stable LM 3B
  • dolly-v2-3B
What do you guys suggest?
Feedback highly appreciated! Thanks!
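For context, here's roughly how I'm loading these, a sketch based on the 0.6.x-era HuggingFaceLLMPredictor (dolly-v2-3b shown; the generation settings are illustrative, not tuned values):
Plain Text
import torch
from llama_index.llm_predictor import HuggingFaceLLMPredictor

# Load dolly-v2-3b in fp16 so a 3B model fits well under 24GB.
hf_predictor = HuggingFaceLLMPredictor(
    max_input_size=2048,
    max_new_tokens=256,
    tokenizer_name="databricks/dolly-v2-3b",
    model_name="databricks/dolly-v2-3b",
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16},
)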
8 comments
They don't support recursive crawling yet I think, but @Emanuel Ferreira has built something that can help you with it.
Here's the GitHub repo: https://github.com/EmanuelCampos/monorepo-llama-index
3 comments
Yes, try putting your query to @kapa.ai. And there's Mendable on the official documentation site.
2 comments
I have set the TokenTextSplitter with the following parameters:
Plain Text
text_splitter = TokenTextSplitter(separator=" ", chunk_size=512, chunk_overlap=20)

My service context also contains the same params, chunk_size and chunk_overlap.

Now when I create two document objects using the text splitter and insert them, I check the docstore and find there are three doc objects for the two of them. One of them got chunked one more time.

If the TokenTextSplitter and service_context contain the same values for chunk_size and overlap, then why is an extra doc being created?
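My guess is the service context builds its own default node parser (with its own splitter) when the text splitter isn't passed through explicitly. A sketch of wiring it in, assuming the 0.6.x SimpleNodeParser:
Plain Text
from llama_index import ServiceContext
from llama_index.node_parser import SimpleNodeParser
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter

# Pass the splitter through the node parser so the service context
# doesn't re-split nodes with its own chunk settings.
text_splitter = TokenTextSplitter(separator=" ", chunk_size=512, chunk_overlap=20)
node_parser = SimpleNodeParser(text_splitter=text_splitter)
service_context = ServiceContext.from_defaults(node_parser=node_parser)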
2 comments
You can create a ChatOpenAI object and pass it to the service_context like this:
Plain Text
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext
llm_predictor = LLMPredictor(llm=ChatOpenAI(openai_api_key=OPENAI_API_KEY, temperature=0, max_tokens=1024, model_name="gpt-3.5-turbo"))


Then pass this into:
Plain Text
service_context = ServiceContext.from_defaults(chunk_size_limit=512, llm_predictor=llm_predictor)
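
Then build the index with that service context, e.g. (assuming a vector index and already-loaded documents):
Plain Text
from llama_index import GPTVectorStoreIndex
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)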
6 comments
I have set my chunk size to 1024 in the service context step, but when I'm querying I'm getting this error:
Token indices sequence length is longer than the specified maximum sequence length for this model (1043 > 512). Running this sequence through the model will result in indexing errors

I printed the chunk size limit as well while starting the server:
Plain Text
print(service_context.chunk_size_limit)

Output: 1024


It looks like the chunk size limit is getting overridden with some default value, but the value set for chunk size is 1024.
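Worth noting that this message is the standard HuggingFace tokenizer warning, so I'm checking the actual stored node sizes rather than trusting it; a quick sketch (tiktoken used purely for counting, and the docstore layout assumed is the 0.6.x dict of nodes):
Plain Text
import tiktoken

enc = tiktoken.get_encoding("gpt2")
# Count tokens per stored node to see whether any chunk really exceeds 1024.
for node_id, node in index.docstore.docs.items():
    print(node_id, len(enc.encode(node.get_text())))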
15 comments
You can do it directly now:
Plain Text
chat_engine = index.as_chat_engine()

The default mode is CondenseQuestion.
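A minimal usage sketch (the question text is illustrative):
Plain Text
chat_engine = index.as_chat_engine()
response = chat_engine.chat("What does the document say about X?")
print(response)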
28 comments
I was checking index.as_chat_engine() and had some doubts.

What will happen if the history context gets bigger? Will it remove some of the previous conversation so that OpenAI is able to predict on it?

If not, I'd be happy to work on it and create a PR.
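For reference, later LlamaIndex versions expose a token-limited memory buffer that trims old turns; a sketch assuming ChatMemoryBuffer is available and that as_chat_engine forwards the memory kwarg:
Plain Text
from llama_index.memory import ChatMemoryBuffer

# Keep only the most recent turns that fit within ~1500 tokens.
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
chat_engine = index.as_chat_engine(chat_mode="condense_question", memory=memory)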
7 comments