I am new to programming in general, so please keep that in mind when you answer my question.
I have a program that takes a large 3D array (1 billion elements) and sums up
Absolutely. At a minimum, running one thread per core so that all cores work on your problem concurrently will help. It's not clear whether using more threads than cores would help further, but it's possible.
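
Since you're new to this, here is a rough sketch of the idea in Python: cut the array into chunks, give each worker thread one chunk to sum, then add up the partial sums. The names `sum_slabs` and `parallel_sum_3d` are made up for illustration, and the array is a tiny nested list rather than your billion-element one. One caveat: in standard CPython builds, threads running pure-Python loops like this one do not actually execute in parallel because of the global interpreter lock, so treat this as a picture of the chunking pattern rather than a guaranteed speedup. The same split-and-combine structure carries over to processes or to threads in a compiled language.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sum_slabs(slabs):
    """Sum every element in a list of 2D slabs (each slab is a list of rows)."""
    return sum(sum(row) for slab in slabs for row in slab)


def parallel_sum_3d(array3d, num_workers=None):
    """Split a 3D nested list along its first axis and sum the pieces concurrently.

    Hypothetical helper for illustration; assumes array3d is a list of
    2D slabs, each a list of rows of numbers.
    """
    if not array3d:
        return 0
    if num_workers is None:
        # Default to one worker per core reported by the OS.
        num_workers = os.cpu_count() or 1
    # Never use more workers than there are slabs to hand out.
    num_workers = max(1, min(num_workers, len(array3d)))

    # Ceiling division so every slab lands in exactly one chunk.
    chunk_size = -(-len(array3d) // num_workers)
    chunks = [array3d[i:i + chunk_size]
              for i in range(0, len(array3d), chunk_size)]

    # Each worker sums its own chunk; the partial sums are combined at the end.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(sum_slabs, chunks))
    return sum(partial_sums)


# Small stand-in array: 8 slabs of 3 rows of 4 ones each.
small_array = [[[1] * 4 for _ in range(3)] for _ in range(8)]
total = parallel_sum_3d(small_array, num_workers=4)
print(total)
```

Each worker only reads its own chunk and only returns a number, so no locking is needed. The only shared step is the final `sum` over the partial results.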