For some reason Gemini just didn't like the Unicode characters. Claude/OpenAI were able to read around them, but Gemini seemed to short-circuit. It also seemed to prioritize short entries for some reason (it was pulling back things like 4-character page number entries). I had to clear out both of those (the Unicode characters and the short entries) before it started pulling back relevant chunks. Even then it seemed to be doing a relatively poor job, so I just went back to Claude/OpenAI embeddings.
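For anyone wanting to try the same cleanup, here is a rough sketch of the kind of pre-filter described above: strip non-ASCII characters and drop very short entries before sending chunks off to be embedded. The function names and the 20-character cutoff are my own placeholders, not anything from the Gemini API, and dropping all non-ASCII text is a blunt instrument that will hurt non-English content.

```python
import unicodedata

# Hypothetical cutoff: entries shorter than this are treated as noise
# (page numbers and similar). Tune it for your own data.
MIN_CHARS = 20


def clean_chunk(text: str) -> str:
    """Fold accented letters to ASCII, drop other non-ASCII characters,
    and collapse runs of whitespace."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_only.split())


def prepare_chunks(chunks, min_chars=MIN_CHARS):
    """Clean every chunk, then keep only those long enough to be useful."""
    cleaned = (clean_chunk(c) for c in chunks)
    return [c for c in cleaned if len(c) >= min_chars]
```

You would run `prepare_chunks(raw_chunks)` once before building the index, so the short page-number entries never get embedded in the first place.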
Also, Gemini was "soft" limiting me to around 16,000 tokens (even on Gemini 1.5 Pro, which I was very excited to get access to in the API), which is annoying since it advertises a 1 million token context window. It just returned a 500 server error if I sent anything over 16,000 tokens. I suspect it's still in testing.
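If you hit the same wall, one workaround is to split the input into pieces that stay under the limit before sending each one. This is only a sketch: it does not use a real tokenizer, and the 4-characters-per-token ratio is a rough guess of mine, so leave yourself some headroom. The function name and parameters are made up for illustration.

```python
def split_by_token_budget(text, max_tokens=16000, chars_per_token=4):
    """Split text on whitespace into pieces that stay under an
    approximate token budget (estimated from character count)."""
    max_chars = max_tokens * chars_per_token
    pieces, current, current_len = [], [], 0
    for word in text.split():
        # Account for the joining space when the piece is non-empty.
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            pieces.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        pieces.append(" ".join(current))
    return pieces
```

Note that a single word longer than the budget still comes back as its own oversized piece, so this is not a hard guarantee.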