How does asynchronous training work in distributed Tensorflow?

孤独总比滥情好 · 2020-12-12 19:43

I've read the Distributed TensorFlow doc, and it mentions that in asynchronous training,

each replica of the graph has an independent training loop that executes without coordination.

3 Answers
  •  清歌不尽
     2020-12-12 20:25

    In asynchronous training there is no synchronization of weights among the workers. The weights are stored on the parameter server(s). Each worker loads and updates the shared weights independently of the others: as soon as a worker finishes an iteration, it applies its gradients to the shared weights and proceeds to the next iteration without waiting for anyone else. The workers only interact with the shared parameter server and never with each other.
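
    A minimal sketch of this setup with the TF 1.x between-graph replication API from the Distributed TensorFlow doc (the cluster addresses, model shape and random batches below are made up for illustration):

    ```python
    import numpy as np
    import tensorflow as tf

    # Hypothetical cluster; real addresses, job name and task index come from your launcher.
    cluster = tf.train.ClusterSpec({
        "ps": ["ps0.example.com:2222"],
        "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
    })
    job_name, task_index = "worker", 0
    server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

    if job_name == "ps":
        server.join()  # parameter servers just host the shared variables
    else:
        # replica_device_setter places variables on the ps job and ops on this worker,
        # so every worker reads and updates the same shared weights.
        with tf.device(tf.train.replica_device_setter(
                worker_device="/job:worker/task:%d" % task_index, cluster=cluster)):
            x = tf.placeholder(tf.float32, [None, 10])
            y = tf.placeholder(tf.float32, [None, 1])
            w = tf.get_variable("w", [10, 1])
            loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))
            global_step = tf.train.get_or_create_global_step()
            # A plain optimizer gives asynchronous updates: each worker applies its
            # gradients to the ps variables as soon as they are computed.
            train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
                loss, global_step=global_step)

        with tf.train.MonitoredTrainingSession(
                master=server.target,
                is_chief=(task_index == 0),
                hooks=[tf.train.StopAtStepHook(last_step=1000)]) as sess:
            while not sess.should_stop():
                # This loop never waits for the other workers.
                sess.run(train_op, feed_dict={x: np.random.randn(32, 10),
                                              y: np.random.randn(32, 1)})
    ```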

    Overall, depending on the task, this can speed up the computation significantly. However, the results are sometimes worse than those obtained with the slower synchronous updates, because a worker may compute its gradients against weights that another worker has already changed in the meantime.
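
    For contrast, a common way to get the slower synchronous behaviour in the same TF 1.x graph (a sketch, reusing the names from the snippet above) is to wrap the optimizer in tf.train.SyncReplicasOptimizer, which aggregates gradients from the workers before applying a single update:

    ```python
    # Synchronous variant (sketch): gradients from 2 workers are averaged
    # before the shared weights are updated once.
    opt = tf.train.SyncReplicasOptimizer(
        tf.train.GradientDescentOptimizer(0.01),
        replicas_to_aggregate=2, total_num_replicas=2)
    train_op = opt.minimize(loss, global_step=global_step)
    sync_hook = opt.make_session_run_hook(is_chief=(task_index == 0))
    # pass hooks=[sync_hook] to MonitoredTrainingSession
    ```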
