I am new to TensorFlow and I want to adapt the MNIST tutorial https://www.tensorflow.org/tutorials/layers to my own data (images of 40x40). This is my model function:
I already had this problem the first time I used TensorFlow, and I figured out that my problem was forgetting to add the attribute class_mode='sparse'
/ class_mode='binary'
to the function that loads the training data and validation data:
So check the class_mode option:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

image_gen_val = ImageDataGenerator(rescale=1./255)
val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                 directory=val_dir,
                                                 target_size=(IMG_SHAPE, IMG_SHAPE),
                                                 class_mode='sparse')
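The training generator needs the same setting. A sketch of the matching call (train_dir and the absence of augmentation are assumptions here; the point is the identical class_mode on both generators):

image_gen_train = ImageDataGenerator(rescale=1./255)
train_data_gen = image_gen_train.flow_from_directory(batch_size=batch_size,
                                                     directory=train_dir,
                                                     target_size=(IMG_SHAPE, IMG_SHAPE),
                                                     class_mode='sparse')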
I had a similar problem, and it turned out that one pooling layer was not reshaped correctly. In my case I was incorrectly using tf.reshape(pool, shape=[-1, 64 * 7 * 7])
instead of tf.reshape(pool, shape=[-1, 64 * 14 * 14])
, which led to a similar error message about logits and labels. Altering the factors, e.g. tf.reshape(pool, shape=[-1, 64 * 12 * 12])
, led to a completely different, less misleading error message.
Perhaps this is also the case here. I recommend going through the code and checking the shapes of the nodes, just in case.
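For example, a minimal sketch of checking the static shapes at graph-build time, in the TF1 layers style of the tutorial (the 40x40 input follows the question; the filter counts and kernel sizes are assumptions for illustration):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 40, 40, 1])
conv1 = tf.layers.conv2d(inputs, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)  # -> (?, 20, 20, 32)
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)  # -> (?, 10, 10, 64)

print(pool2.shape)  # the reshape factors below must match this static shape
flat = tf.reshape(pool2, [-1, 10 * 10 * 64])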
The problem is in your target shape and is related to the correct choice of an appropriate loss function. You have two possibilities:
1st possibility: if you have a 1D integer-encoded target, you can use sparse_categorical_crossentropy
as the loss function
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

n_class = 3
n_features = 100
n_sample = 1000

X = np.random.randint(0, 10, (n_sample, n_features))
y = np.random.randint(0, n_class, n_sample)  # integer labels, shape (n_sample,)

inp = Input((n_features,))
x = Dense(128, activation='relu')(inp)
out = Dense(n_class, activation='softmax')(x)

model = Model(inp, out)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
2nd possibility: if you have one-hot encoded your target so that it has 2D shape (n_samples, n_class), you can use categorical_crossentropy
import numpy as np
import pandas as pd
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

n_class = 3
n_features = 100
n_sample = 1000

X = np.random.randint(0, 10, (n_sample, n_features))
y = pd.get_dummies(np.random.randint(0, n_class, n_sample)).values  # one-hot labels, shape (n_sample, n_class)

inp = Input((n_features,))
x = Dense(128, activation='relu')(inp)
out = Dense(n_class, activation='softmax')(x)

model = Model(inp, out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
Your logits shape looks right: batch size of 3, and an output layer of size 2, which is what you defined as your output layer. Your labels should then be of shape [3, 2] as well: a batch of 3, where each sample is either [1, 0] or [0, 1].
Also note that when you have a binary classification output you shouldn't have 2 neurons on the output/logits layer. You can just output a single value that takes on 0 or 1; you can probably see how the two outputs [1, 0] and [0, 1] are redundant and can be expressed as a single 0/1 value. You also tend to get better results this way.
Thus, your logits should end up being [3, 1] and your labels should be an array of 3 values, one for each of the samples in your batch.
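As a minimal sketch of that single-output setup (the feature count and hidden layer size here are made up for illustration):

import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

X = np.random.rand(3, 100)  # batch of 3 samples
y = np.array([0, 1, 1])     # one 0/1 label per sample, shape (3,)

inp = Input((100,))
x = Dense(128, activation='relu')(inp)
out = Dense(1, activation='sigmoid')(x)  # a single output neuron instead of two

model = Model(inp, out)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=3)  # predictions have shape (3, 1), labels shape (3,)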
I resolved it by changing from sparse_categorical_crossentropy
to categorical_crossentropy
, and it is now running fine.