You can remove the symbols and the small nodes by iterating over the nodes, cleaning or dropping them as needed, and then inserting the remaining nodes into your index.
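# parse nodes
# import path for llama_index >= 0.10; older versions use llama_index.node_parser
from llama_index.core.node_parser import SentenceSplitter

parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)  # documents: your already-loaded Document list
for node in nodes:
    # perform the operation of removing here...
    ...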
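To make the removal step in that loop a bit more concrete, here is a minimal sketch of one way to do it. It works on plain strings so it runs without LlamaIndex installed. The names `clean_text`, `clean_and_filter` and the `MIN_CHARS` threshold are made up for illustration, and which characters count as "symbols" is an assumption you will want to adjust.

```python
import re

MIN_CHARS = 20  # hypothetical threshold for a "small" node; tune for your data


def clean_text(text: str) -> str:
    """Replace anything that is not a word character, whitespace, or basic
    punctuation with a space, then collapse repeated whitespace."""
    text = re.sub(r"[^\w\s.,;:!?()'-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def clean_and_filter(texts: list[str], min_chars: int = MIN_CHARS) -> list[str]:
    """Clean every text and keep only those that are still long enough."""
    cleaned = (clean_text(t) for t in texts)
    return [t for t in cleaned if len(t) >= min_chars]


texts = [
    "★★ Hello,   world! ★★ This is a real sentence.",
    "§§ ok",
    "### ###",
]
kept = clean_and_filter(texts)
print(kept)
```

With real nodes you would presumably read the text from `node.text`, write the cleaned text back, and keep only the nodes that pass the length check before building the index from that list (for example with `VectorStoreIndex(nodes)`). Treat that part as an assumption and check it against your installed version.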
For similar nodes, you'll have to do it manually as well, by comparing the nodes against each other. I'm not sure there is a built-in method for this!
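As a rough sketch of what "comparing manually" could look like, the snippet below drops texts that are nearly identical to one already kept, using `difflib.SequenceMatcher` from the Python standard library. This is just one possible approach, not something LlamaIndex provides. The function name `drop_near_duplicates` and the `SIMILARITY_THRESHOLD` value are assumptions for illustration.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.9  # hypothetical cutoff; 1.0 means identical strings


def drop_near_duplicates(texts, threshold=SIMILARITY_THRESHOLD):
    """Keep a text only if it is not too similar to any text kept so far."""
    kept = []
    for text in texts:
        is_new = all(
            SequenceMatcher(None, text, other).ratio() < threshold
            for other in kept
        )
        if is_new:
            kept.append(text)
    return kept


texts = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumps over the lazy dog!",
    "A completely different sentence about indexing.",
]
unique = drop_near_duplicates(texts)
print(unique)
```

This compares every text against every kept text, so it gets slow with many nodes. For larger collections, comparing embeddings instead of raw strings is probably the better route.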