My tests with your f16 and f32 arrays show that f16 is 5-10x slower for all calculations. Only in byte-level operations like an array copy does the more compact nature of float16 show any speed advantage.
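A minimal sketch of how such a comparison could be run, not the exact test I used. The function name compare_dtypes, the array size, and the repeat count are all assumptions chosen for illustration. Actual ratios will depend on your processor and numpy build.

```python
import timeit

import numpy as np


def compare_dtypes(n=100_000, repeat=20):
    """Time an arithmetic op and a copy for float16 vs float32 arrays.

    Hypothetical helper: n and repeat are arbitrary illustration values.
    """
    rng = np.random.default_rng(0)
    f32 = rng.random(n, dtype=np.float32)
    f16 = f32.astype(np.float16)  # same values, half the bytes

    results = {}
    for name, arr in (("float16", f16), ("float32", f32)):
        results[name] = {
            # Element-wise arithmetic: the calculation case
            "multiply": timeit.timeit(lambda: arr * arr, number=repeat),
            # Plain copy: a byte-level operation
            "copy": timeit.timeit(lambda: arr.copy(), number=repeat),
            "nbytes": arr.nbytes,
        }
    return results


if __name__ == "__main__":
    for name, r in compare_dtypes().items():
        print(f"{name}: multiply={r['multiply']:.4f}s "
              f"copy={r['copy']:.4f}s nbytes={r['nbytes']}")
```

Comparing the multiply and copy timings side by side separates the cost of the arithmetic from the cost of just moving the bytes.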
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html is the section in the gcc docs about half floats (fp16). With the right processor and the right compiler switches, it may be possible to install numpy in a way that speeds up these calculations. We'd also have to check whether numpy's .h files have any provision for special handling of half floats.
Earlier questions that may be good enough to serve as duplicate references:
Python Numpy Data Types Performance
Python numpy float16 datatype operations, and float8?