Deprecation warning in sklearn about an empty array, without any empty array in my code

Anonymous (unverified), submitted on 2019-12-03 08:46:08

Question:

I am just playing around with encoding and decoding labels, but I get this warning from sklearn:

    Warning (from warnings module):
      File "C:\Python36\lib\site-packages\sklearn\preprocessing\label.py", line 151
        if diff:
    DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use array.size > 0 to check that an array is not empty.

Here is the full code; you can run it yourself in Python 3+.

My question is: why does it say I am using an empty array when I clearly don't have one in my code? Thanks for taking the time to answer.

    ### label encoding ###

    import numpy as np
    from sklearn import preprocessing

    # Sample input labels
    input_labels = ["red", "black", "red", "green",
                    "black", "yellow", "white"]

    # Create the label encoder and fit it to the labels
    encoder = preprocessing.LabelEncoder()
    encoder.fit(input_labels)

    # Print the mapping
    print("\nLabel mapping:")
    for i, item in enumerate(encoder.classes_):
        print(item, "-->", i)

    # Encode a set of labels using the encoder
    test_labels = ["green", "red", "black"]
    encoded_values = encoder.transform(test_labels)
    print("\nLabels =", test_labels)
    print("Encoded values =", list(encoded_values))

    # Decode a set of values using the encoder
    encoded_values = [3, 0, 4, 1]
    decoded_list = encoder.inverse_transform(encoded_values)
    print("\nEncoded values =", encoded_values)
    print("Decoded labels =", list(decoded_list))

Answer 1:

TLDR: You can ignore the warning. It is caused by sklearn doing something internally that is not quite ideal.
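If the message is noisy, one option is to filter it with the standard warnings module. The following is only a minimal sketch: noisy_transform is a made-up stand-in for the library call that emits the warning (it is not an sklearn function), and the message pattern is an assumption based on the warning text quoted in the question.

```python
import warnings


def noisy_transform(values):
    # Hypothetical stand-in for a library call that emits the warning.
    warnings.warn(
        "The truth value of an empty array is ambiguous.",
        DeprecationWarning,
    )
    return list(values)


# record=True collects any warning that gets past the filters,
# so we can see whether the "ignore" rule actually matched.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Filters added later take precedence, so this rule wins over "always".
    warnings.filterwarnings(
        "ignore",
        message="The truth value of an empty array",
        category=DeprecationWarning,
    )
    result = noisy_transform([3, 0, 4, 1])

print("Result:", result)
print("Warnings that got through:", len(caught))
```

In real code the same filterwarnings call would wrap the encoder call; the filter is scoped to the with block, so other deprecation warnings stay visible elsewhere.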


The warning is actually caused by numpy, which deprecated truth testing on empty arrays:

The long and short is that truth-testing on empty arrays is dangerous, misleading, and not in any way useful, and should be deprecated.

This means one is not supposed to write something like if array: to test whether an array is empty. However, sklearn does exactly that in the 0.19.1 release:

    diff = np.setdiff1d(y, np.arange(len(self.classes_)))
    if diff:
        raise ValueError("y contains new labels: %s" % str(diff))
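For comparison, here is a rough sketch of the explicit check that the warning message recommends, using array.size instead of a truth test. The helper names find_unseen_labels and check_encoded are invented for illustration and are not part of sklearn, and this is not necessarily how the actual fix in sklearn was written.

```python
import numpy as np


def find_unseen_labels(y, n_classes):
    """Return the values in y that fall outside 0 .. n_classes - 1."""
    return np.setdiff1d(y, np.arange(n_classes))


def check_encoded(y, n_classes):
    diff = find_unseen_labels(y, n_classes)
    # Explicit size check: unambiguous even when diff is empty.
    if diff.size > 0:
        raise ValueError("y contains new labels: %s" % str(diff))
    return True


# All values are valid class indices, so diff is empty and no error is raised.
ok = check_encoded([3, 0, 4, 1], 5)

# 7 is not a valid index for 5 classes, so the check rejects it.
try:
    check_encoded([3, 7], 5)
    rejected = False
except ValueError:
    rejected = True

print("Valid input accepted:", ok)
print("Invalid input rejected:", rejected)
```

This also shows why the warning appears with perfectly valid input: when every value is a known class index, the difference array is empty, and that empty array is what gets truth-tested.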

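Because your version of numpy is recent enough to include this deprecation, it issues a warning whenever that check runs on an empty diff, which is exactly what happens when all of your labels are valid.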

The problem has been fixed in sklearn's current master branch, so I'd expect the fix to be included in the next release.


