Generating adversarial data from cleverhans attack models

若如初见. Submitted on 2019-12-24 11:56:21

Question


I am looking for a code example that shows how to generate training data from CleverHans adversarial attacks.

adv_x = fgsm.generate_np(X_test, **fgsm_params)

This generates the adversarial x data, but how can I get the corresponding y?

adv_pred = model.predict_classes(adv_x)

And this will give the "fooled" results, right?

What I want is to correctly show the generated x, the y, and the fooled y (by which I mean the model's predictions, which may be wrong because of the attack). I'm using MNIST, by the way, if that helps.


Answer 1:


Based on the code snippets you shared, I would make two suggestions:

  • It is generally not a good idea to train the model on test data if you are going to use that same test data to evaluate its performance afterwards, so I would replace X_test with X_train in your first line.

  • To get the labels for your adversarial examples, you can use either the original labels of the training data or the model's predictions on the original (unperturbed) training data, model.predict_classes(X_train). Both options assume that the perturbation is small enough that it does not change the true label of the input.
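The two suggestions above can be sketched end to end as follows. This is only a minimal illustration, not the CleverHans implementation: it uses NumPy alone, random data in place of MNIST, a hypothetical linear softmax classifier in place of the Keras model, and a hand-written `fgsm_np` helper standing in for `fgsm.generate_np`. The helper takes the true labels explicitly, which is an assumption of this sketch. The point is how the three arrays (adversarial x, true y, fooled y) line up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: random stand-in for MNIST (200 flattened 28x28 images in [0, 1]).
X_train = rng.random((200, 784)).astype(np.float32)
y_train = rng.integers(0, 10, size=200)

# Assumption: a toy linear softmax classifier instead of the real Keras model.
W = rng.normal(scale=0.1, size=(784, 10)).astype(np.float32)
b = np.zeros(10, dtype=np.float32)


def predict_proba(x):
    """Softmax probabilities of the toy model."""
    logits = x @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


def predict_classes(x):
    """Plays the role of model.predict_classes in the question."""
    return predict_proba(x).argmax(axis=1)


def fgsm_np(x, y, eps=0.3, clip_min=0.0, clip_max=1.0):
    """Hypothetical stand-in for fgsm.generate_np, valid only for the toy model.

    For a linear softmax model with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - onehot(y)) @ W.T.
    """
    p = predict_proba(x)
    onehot = np.eye(10, dtype=np.float32)[y]
    grad_x = (p - onehot) @ W.T
    return np.clip(x + eps * np.sign(grad_x), clip_min, clip_max)


# Adversarial inputs are built from the training split, not the test split.
adv_x = fgsm_np(X_train, y_train)

# The label of each adversarial example is the label of the input it came from.
y_true = y_train

# The "fooled" labels are the model's predictions on the adversarial inputs.
y_fooled = predict_classes(adv_x)

# Rows where the attack changed the outcome relative to the true label.
fooled_mask = y_fooled != y_true
print("fraction misclassified after attack:", fooled_mask.mean())
```

With the real setup from the question, `adv_x` would come from `fgsm.generate_np(X_train, **fgsm_params)` and `y_fooled` from `model.predict_classes(adv_x)`, while `y_true` stays the original training labels. The pairs `(adv_x, y_true)` are what you would add to the training set, and `y_fooled` is only for inspecting how the model is misled.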



Source: https://stackoverflow.com/questions/53827086/generating-adversarial-data-from-cleverhans-attack-models
