Question
I want to scatter some data points.
I wrote:
sumy = sum(np.unique(y_train))+1
yy = y_train/sumy
plt.scatter(
X_lda.iloc[:,0],
X_lda.iloc[:,1],
c=yy,
cmap='rainbow')
/usr/local/lib/python3.6/dist-packages/matplotlib/colors.py in to_rgba_array(c, alpha)
277 result[mask] = 0
278 if np.any((result < 0) | (result > 1)):
--> 279 raise ValueError("RGBA values should be within 0-1 range")
280 return result
281 # Handle single values.
ValueError: RGBA values should be within 0-1 range
y_train holds integer class labels from 0 to 9 (stored as 0.0 to 9.0), one per data point, and I want to use the class as the color of each point. As you can see, I even tried to normalize the values to the 0-1 range as requested, but it still throws an error.
Answer 1:
If you divide by the sum, in general you get very small values, much smaller than 1. Also, dividing by the sum doesn't work when there are negative values involved.
A better approach is to subtract the minimum, which maps the lowest value (negative or positive) to zero, and then divide by (maximum - minimum + some small epsilon). So, yy = (ytrain - np.min(ytrain)) / (np.max(ytrain) - np.min(ytrain) + 1e-6).
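As a minimal sketch of the min-max rescaling described above, assuming the labels fit in a NumPy array (the helper name min_max_scale and the sample labels are made up for illustration):

```python
import numpy as np

def min_max_scale(values, eps=1e-6):
    """Subtract the minimum, then divide by (max - min + eps)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo + eps)

labels = np.array([0, 3, 9, 9, 5])        # hypothetical class labels in 0..9
scaled = min_max_scale(labels)            # smallest label maps to exactly 0

signed = min_max_scale([-2.0, 0.0, 2.0])  # negative inputs are handled too
```

The epsilon only guards against division by zero when all values are equal; it also keeps the largest value marginally below 1.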
matplotlib provides mpl.colors.Normalize(vmin, vmax), which takes care of this. The resulting norm can be passed as an extra parameter to most plotting functions that work with a colormap, so you don't need to create a separate array.
Usage:
import matplotlib as mpl
norm = mpl.colors.Normalize(np.min(ytrain), np.max(ytrain))
plt.scatter(x, y, c=ytrain, norm=norm, cmap='rainbow')
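To see what the norm does to the values before they reach the colormap, it can be called directly on an array. This is only a sketch with made-up labels; no figure is drawn:

```python
import numpy as np
from matplotlib.colors import Normalize

labels = np.array([0, 3, 9, 9, 5])  # hypothetical class labels
norm = Normalize(vmin=labels.min(), vmax=labels.max())

# Calling the norm maps vmin to 0 and vmax to 1, linearly in between.
mapped = norm(labels)
```

Passing norm=norm to plt.scatter applies the same mapping internally, so c can stay as the raw label array.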
Answer 2:
The following solved the problem:
matplotlib can't handle categorical variables directly. Thus, we encode every class as an integer so that the class labels can be passed to the plot as colors.
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(y_train)
plt.xlabel('LD1')
plt.ylabel('LD2')
plt.scatter(
X_lda.iloc[:,0].real,
X_lda.iloc[:,1].real,
c=y,
cmap='rainbow',
alpha=0.7)
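If scikit-learn is not available, a similar integer encoding can be sketched with NumPy alone. This is a swapped-in alternative, not the answer's method: np.unique with return_inverse=True assigns codes in sorted order of the unique values, as LabelEncoder does. The string labels below are hypothetical:

```python
import numpy as np

raw_labels = np.array(["cat", "dog", "cat", "bird"])  # hypothetical labels

# classes: sorted unique labels; codes: index of each label within classes
classes, codes = np.unique(raw_labels, return_inverse=True)
```

The codes array can then be passed as c= to plt.scatter, and classes[codes] recovers the original labels.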
Source: https://stackoverflow.com/questions/59466325/matplotlib-rgba-values-should-be-within-0-1-range