The community member tried to use a custom LLM API and embedding API but encountered a ConnectionError caused by tiktoken in ChatMemoryBuffer. Other community members suggested writing a custom tokenizer function and setting it on the memory buffer. However, the community member then hit a bootstrapping problem: setting the tokenizer requires importing ChatMemoryBuffer first, and their development environment cannot connect to OpenAI, so the import itself fails. The community member has an LLM service, an embedding service, and Milvus deployed in the cloud, and is looking for suggestions on how to use llama_index with their own resources.
I tried to use a custom LLM API and embedding API (not via OpenAI or Hugging Face). I implemented a custom LLM class and a custom embedding class. However, when I tried to use them, I got a ConnectionError caused by tiktoken in ChatMemoryBuffer.
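For reference, a minimal sketch of the suggested workaround: ChatMemoryBuffer.from_defaults accepts a tokenizer_fn callable, so any locally available tokenizer can replace the tiktoken default. The tokenizer path and token limit below are placeholders, and depending on your llama_index version the import may be llama_index.memory rather than llama_index.core.memory.

```python
from llama_index.core.memory import ChatMemoryBuffer
from transformers import AutoTokenizer

# Load a tokenizer that is available locally (path is a placeholder);
# any callable that maps a string to a list of tokens will work.
local_tokenizer = AutoTokenizer.from_pretrained("/path/to/local/tokenizer")

# Pass it as tokenizer_fn so the buffer never falls back to tiktoken,
# which would try to download its encoding files from the internet.
memory = ChatMemoryBuffer.from_defaults(
    token_limit=3000,
    tokenizer_fn=local_tokenizer.encode,
)
```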
I found the issue! I need to import ChatMemoryBuffer before I can set the tokenizer on it. But my development environment cannot connect to OpenAI, so I receive an error immediately after importing llama_index.
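Assuming the import-time failure comes from tiktoken trying to download its encoding files, one sketch of a workaround is to point tiktoken at a pre-populated local cache via its TIKTOKEN_CACHE_DIR environment variable before the import, and then replace the global tokenizer entirely. The cache and tokenizer paths are placeholders.

```python
import os

# Point tiktoken at a local cache *before* importing llama_index,
# so nothing tries to reach the internet at import time.
# (Copy the tiktoken encoding files into this directory beforehand.)
os.environ["TIKTOKEN_CACHE_DIR"] = "/path/to/tiktoken_cache"

from llama_index import set_global_tokenizer  # llama_index.core on newer versions
from transformers import AutoTokenizer

# Replace the global tiktoken default with a locally available tokenizer.
set_global_tokenizer(
    AutoTokenizer.from_pretrained("/path/to/local/tokenizer").encode
)
```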
My current situation is that I have set up an LLM service and an embedding service in the cloud, and I have also deployed Milvus in the cloud. Now I want to use llama_index with my own resources. Do you have any suggestions?
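One way to wire this together, sketched below assuming llama_index 0.10+ with the Milvus integration (llama-index-vector-stores-milvus) installed: route all LLM and embedding calls through the custom classes via Settings, and use MilvusVectorStore for storage. MyCloudLLM and MyCloudEmbedding stand in for the custom classes already written, and the endpoints, collection name, and embedding dimension are placeholders.

```python
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.vector_stores.milvus import MilvusVectorStore

# Route all LLM and embedding calls to the custom cloud services
# instead of OpenAI (class names are placeholders for your own).
Settings.llm = MyCloudLLM(api_base="https://your-llm-endpoint")
Settings.embed_model = MyCloudEmbedding(api_base="https://your-embed-endpoint")

# Use the Milvus deployment in the cloud as the vector store.
vector_store = MilvusVectorStore(
    uri="http://your-milvus-host:19530",
    collection_name="my_collection",
    dim=1024,  # must match the custom embedding dimension
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build the index and query it without ever touching OpenAI.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
print(index.as_query_engine().query("Hello from my own stack"))
```

On versions before 0.10, the same wiring goes through ServiceContext instead of Settings, but the idea is identical: as long as the custom tokenizer is set as shown earlier, nothing in this path should require a connection to OpenAI.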