Keras Tokenizer num_words doesn't seem to work

日久生厌 2020-12-15 05:17
>>> from keras.preprocessing.text import Tokenizer
>>> t = Tokenizer(num_words=3)
>>> l = ["Hello, World! This is so&#$ fantastic!", "There is no other world like this one"]
>>> t.fit_on_texts(l)
3 Answers
  •  甜味超标
    2020-12-15 06:00

    Just an addition to Marcin's answer ("it will keep the counter of all words - even when it's obvious that it will not use it later").

    The reason it keeps a counter for all words is that you can call fit_on_texts multiple times. Each call updates the internal counters, and when a transformation is applied, the top num_words words are selected from the updated counts.
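    To see why counting every word matters, here is a minimal sketch of the counting behaviour using only collections.Counter. MiniTokenizer is a hypothetical, simplified model of what the real Tokenizer does internally, not Keras code:

    ```python
    from collections import Counter

    class MiniTokenizer:
        """Hypothetical, simplified model of Tokenizer's counting:
        it counts every word and applies num_words only later."""
        def __init__(self, num_words=None):
            self.num_words = num_words
            self.counts = Counter()  # counts *every* word ever seen

        def fit_on_texts(self, texts):
            # Each call updates the same counters, so a later call
            # can change which words make it into the top num_words.
            for text in texts:
                self.counts.update(text.lower().split())

        def top_words(self):
            # num_words is applied only at transformation time.
            ranked = [w for w, _ in self.counts.most_common()]
            return ranked[: self.num_words] if self.num_words else ranked

    t = MiniTokenizer(num_words=3)
    t.fit_on_texts(["hello world hello"])
    t.fit_on_texts(["world world there"])  # second fit shifts the ranking
    print(t.top_words())  # → ['world', 'hello', 'there']
    ```

    The real Tokenizer behaves analogously: word_index lists every word it has ever counted, while methods such as texts_to_sequences respect num_words when producing output.
    
    
    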

    Hope it helps.
