The standard is float32, but I'm wondering under what conditions it's OK to use float16 instead.
I've compared running the same convnet with both datatypes and haven't noticed any difference.
According to this study:
Gupta, S., Agrawal, A., Gopalakrishnan, K., & Narayanan, P. (2015, June). Deep learning with limited numerical precision. In International Conference on Machine Learning (pp. 1737-1746). At: https://arxiv.org/pdf/1502.02551.pdf
stochastic rounding was required to obtain convergence when training with 16-bit precision (note that the paper works with 16-bit fixed-point arithmetic rather than float16, but the limited-precision issue is the same); however, when that rounding technique was used, they reported very good results.
Here's a relevant quotation from that paper:
"A recent work (Chen et al., 2014) presents a hardware accelerator for deep neural network training that employs fixed-point computation units, but finds it necessary to use 32-bit fixed-point representation to achieve convergence while training a convolutional neural network on the MNIST dataset. In contrast, our results show that it is possible to train these networks using only 16-bit fixed-point numbers, so long as stochastic rounding is used during fixed-point computations."
For reference, here's the citation for Chen et al., 2014:
Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., ... & Temam, O. (2014, December). DaDianNao: A machine-learning supercomputer. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 609-622). IEEE Computer Society. At: http://ieeexplore.ieee.org/document/7011421/?part=1
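
To make the stochastic rounding idea concrete, here's a minimal NumPy sketch of rounding onto a fixed-point grid. This is my own illustration, not the paper's implementation; the function name and the step size are arbitrary choices for the example.

```python
import numpy as np

def stochastic_round(x, step=2.0**-10, rng=None):
    """Round x to a fixed-point grid with spacing `step` using stochastic rounding.

    Each value is rounded down or up to the nearest multiple of `step`, with the
    probability of rounding up equal to the fractional distance to the upper
    neighbour, so the rounding is unbiased in expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(x, dtype=np.float64) / step
    floor = np.floor(scaled)
    frac = scaled - floor                          # distance to the lower grid point, in [0, 1)
    round_up = rng.random(size=np.shape(scaled)) < frac
    return (floor + round_up) * step

# Example: the mean of many stochastically rounded copies recovers the original
# value, even though each individual sample lies on the coarse grid.
w = 0.12345
samples = stochastic_round(np.full(100000, w))
print(samples.mean())   # ~0.12345
```

The key property is that the rounding is unbiased in expectation, so small gradient updates aren't systematically rounded away to zero the way they would be with plain round-to-nearest at low precision, which is why the paper finds it necessary for convergence.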