Difference in padding integer and string in keras

Submitted by 自古美人都是妖i on 2019-12-14 02:36:33

Question


I'm trying to pad a text for a seq2seq model.

from keras_preprocessing.sequence import pad_sequences

x=[["Hello, I'm Bhaskar", "This is Keras"], ["This is an", "experiment"]]
pad_sequences(sequences=x, maxlen=5, dtype='object', padding='pre', value="<PAD>")

I get the following error:

ValueError: `dtype` object is not compatible with `value`'s type: <class 'str'>
You should set `dtype=object` for variable length strings.

However, when I do the same with integers, it works fine.

x=[[1, 2, 3], [4, 5, 6]]
pad_sequences(sequences=x, maxlen=5, padding='pre', value=0)

Output:
array([[0, 0, 1, 2, 3],
       [0, 0, 4, 5, 6]], dtype=int32)

I would like to get this output:

[["<PAD>", "<PAD>", "<PAD>", "Hello, I'm Bhaskar", "This is Keras"], ["<PAD>", "<PAD>","<PAD>", "This is an", "experiment"]]

Answer 1:


As the error message suggests, set dtype to object, meaning the built-in Python type itself and not the string 'object'. That will do the job.
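A small sketch of why the quotes matter. It assumes, without confirmation from the error message itself, that the library's check compares `dtype` against the Python type `object` directly, so the string `'object'` fails that comparison. The variable names below are illustrative only.

```python
# The string 'object' and the built-in type object are different Python values,
# so a direct comparison between them is False.
dtype_as_string = 'object'
dtype_as_type = object

string_matches_type = (dtype_as_string == dtype_as_type)
type_matches_type = (dtype_as_type == object)

print(string_matches_type)  # False
print(type_matches_type)    # True
```

Under that assumption, passing `dtype='object'` looks like an incompatible dtype to the check, while `dtype=object` passes it.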

from keras.preprocessing.sequence import pad_sequences

x=[["Hello, I'm Bhaskar", "This is Keras"], ["This is an", "experiment"]]
pad_sequences(sequences=x, maxlen=5, dtype=object, padding='pre', value="<PAD>")

Output

array([['<PAD>', '<PAD>', '<PAD>', "Hello, I'm Bhaskar", 'This is Keras'],
       ['<PAD>', '<PAD>', '<PAD>', 'This is an', 'experiment']],
      dtype=object)
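If a plain list of lists is enough, as in the output the question asks for, the same pre-padding can be sketched without Keras. `pad_pre` is a hypothetical helper, not part of any library. It assumes that over-long sequences should keep their last `maxlen` items, and it does not reproduce every option of `pad_sequences`.

```python
def pad_pre(sequences, maxlen, value):
    """Left-pad each sequence with `value` up to `maxlen` items.

    Hypothetical helper, not part of Keras. Sequences longer than
    `maxlen` keep only their last `maxlen` items.
    """
    if maxlen < 1:
        raise ValueError("maxlen must be at least 1")
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # drop items from the front if too long
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded


x = [["Hello, I'm Bhaskar", "This is Keras"], ["This is an", "experiment"]]
result = pad_pre(x, maxlen=5, value="<PAD>")
print(result)
```

This returns ordinary Python lists rather than a NumPy array, so it suits cases where the padded strings are handled before any numeric encoding.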


Source: https://stackoverflow.com/questions/55220072/difference-in-padding-integer-and-string-in-keras
