I'm trying to do some text classification using a dataset of my own, but the vectorization tool has issues with some characters that I can't fix. I'm pretty much followin