I am training a CNN model to recognize a single keyword, for example, "Hi, Foo".
I currently have around 2000 wave files as my training dataset, which is fed into a tiny