What are common sources of randomness in Machine Learning projects with Keras?

Submitted by 跟風遠走 on 2019-12-13 08:33:38

Question


Reproducibility is important, but it is hard to achieve in the closed-source machine learning project I'm currently working on. Which parts should I look at?


Answer 1:


Setting seeds

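Computers use pseudo-random number generators, which are initialized with a value called the seed. Given the same seed, a generator produces the same sequence of numbers. For machine learning with Keras on the TensorFlow backend, you might need to seed every library that keeps its own generator: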

# I've heard the order here is important
import random
random.seed(0)  # Python's built-in generator

import numpy as np
np.random.seed(0)  # NumPy's global generator

import tensorflow as tf
tf.set_random_seed(0)  # TensorFlow 1.x API (tf.compat.v1.set_random_seed in 2.x)
# run single-threaded: multiple threads are a potential source of non-reproducible results
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)

from keras import backend as K
K.set_session(sess)  # tell keras about the seeded session

# now import keras stuff

See also: Keras FAQ: How can I obtain reproducible results using Keras during development?
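
To see what a seed actually does, here is a minimal sketch that uses only Python's standard `random` module. The helper name `draw` is made up for illustration; it is not part of Keras or TensorFlow.

```python
import random


def draw(seed, n=5):
    """Return n pseudo-random integers from a generator seeded with `seed`."""
    # A private generator, so the global random state is left untouched.
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]


first = draw(0)
second = draw(0)

# The same seed yields the same sequence on every call.
print(first == second)
```

A different seed will normally give a different sequence, which is why every library that keeps its own generator needs its own seed call.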

sklearn

sklearn.model_selection.train_test_split has a random_state parameter. Pass a fixed integer to get the same split on every run.
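
As a small sketch of the effect, assuming scikit-learn is installed, two calls with the same `random_state` should return identical splits. The `samples` list is placeholder data, not from the original project.

```python
from sklearn.model_selection import train_test_split

samples = list(range(10))  # placeholder data, for illustration only

# The same random_state gives the same shuffling, so both calls agree.
train_a, test_a = train_test_split(samples, test_size=0.3, random_state=0)
train_b, test_b = train_test_split(samples, test_size=0.3, random_state=0)

print(train_a == train_b and test_a == test_b)
```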

What to check

  1. Do I load the data in the same order every time?
  2. Do I initialize the model the same way?
  3. Do I use external data that might change?
  4. Do I use external state that might change (e.g. datetime.now)?
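
Two of these checks can be sketched with the standard library alone. Both helpers below (`list_data_files`, `make_run_name`) are hypothetical names for illustration: the first sorts a directory listing, because `os.listdir` returns entries in arbitrary order, and the second takes the timestamp as an argument instead of calling `datetime.now` internally.

```python
import datetime
import os
import tempfile


def list_data_files(directory):
    """Return the file names in a fixed (sorted) order."""
    # os.listdir makes no promise about ordering, so sort explicitly.
    return sorted(os.listdir(directory))


def make_run_name(now):
    """Build a run name from a timestamp supplied by the caller."""
    return "run-" + now.strftime("%Y%m%d-%H%M%S")


with tempfile.TemporaryDirectory() as tmp:
    # Create the files out of order on purpose.
    for name in ["b.csv", "c.csv", "a.csv"]:
        open(os.path.join(tmp, name), "w").close()
    files = list_data_files(tmp)

# A fixed timestamp makes the derived name repeatable.
fixed_time = datetime.datetime(2019, 1, 1, 12, 0, 0)
run_name = make_run_name(fixed_time)

print(files)
print(run_name)
```

Passing the time in from outside keeps the changing state at the edge of the program, where it is easy to pin down for a reproducible run.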


Source: https://stackoverflow.com/questions/51715573/what-are-common-sources-of-randomness-in-machine-learning-projects-with-keras
