Currently I'm training with single words only, which are converted to a numerical vector. The problem is that the classification is based entirely on the occurrence of single wor