I'm using NLTK's word_tokenize to split a sentence into words.
I want to tokenize this sentence:
في_بيتنا كل شي لما تحت
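Roughly what I'm running (a minimal sketch of my call; word_tokenize needs the NLTK 'punkt' models downloaded):

    from nltk import word_tokenize

    # NLTK's standard tokenizer; requires: nltk.download('punkt')
    sentence = "في_بيتنا كل شي لما تحت"
    print(word_tokenize(sentence))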
I always recommend nltk.tokenize.wordpunct_tokenize. It is purely regex-based (it splits text into runs of word characters and runs of punctuation), so it needs no language-specific models and handles Arabic out of the box. You can try out many of the NLTK tokenizers at http://text-processing.com/demo/tokenize/ and see for yourself.
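For example, on your sentence (a sketch assuming Python 3, where \w in the underlying regex matches Arabic letters):

    from nltk.tokenize import wordpunct_tokenize

    # splits on the pattern \w+|[^\w\s]+ -- no trained models needed
    sentence = "في_بيتنا كل شي لما تحت"
    print(wordpunct_tokenize(sentence))
    # expected: ['في_بيتنا', 'كل', 'شي', 'لما', 'تحت']
    # note: the underscore counts as a word character,
    # so 'في_بيتنا' stays a single token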