I have a dataframe with a column of strings that have weird symbols mixed in. How can I use regex to preprocess them so that I keep only the English words and remove all the symbols?
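To make the goal concrete, here is a minimal sketch of one possible approach. It assumes that "English words" means runs of the ASCII letters a-z and A-Z, so digits, punctuation, and any other symbols are treated as noise. The function name `keep_english_words` and the sample strings are made up for illustration.

```python
import re

# Assumption: an "English word" is a run of ASCII letters only.
# Anything else (digits, punctuation, other symbols) counts as noise.
_NON_LETTERS = re.compile(r"[^A-Za-z]+")


def keep_english_words(text: str) -> str:
    """Replace each run of non-letter characters with a single space."""
    return _NON_LETTERS.sub(" ", text).strip()


# Hypothetical sample values standing in for the dataframe column.
samples = ["hello ¥§ world!!", "data★science 101", "©®™"]
cleaned = [keep_english_words(s) for s in samples]
print(cleaned)
```

With pandas, the same function could be applied to a column with something like `df["text"].apply(keep_english_words)`, or the pattern could be passed to `df["text"].str.replace(r"[^A-Za-z]+", " ", regex=True)`, where `df` and the column name `text` are placeholders. Note that this does not check whether a word is actually English, only that it is made of plain letters.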