Fast bignum square computation
Question: To speed up my bignum divisions, I need to speed up the operation y = x^2 for bigints, which are represented as dynamic arrays of unsigned DWORDs. To be clear:

    DWORD x[n+1] = { LSW, ......, MSW };

where n+1 is the number of DWORDs in use, so the value of the number is x = x[0] + (x[1]<<32) + ... + (x[n]<<(32*n)).

The question is: how do I compute y = x^2 as fast as possible without precision loss, using C++ with integer arithmetic (32-bit with carry) at my disposal?

My current approach is applying multiplication y = x*x
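To make the setup concrete, here is a minimal sketch of the plain y = x*x baseline next to one common squaring shortcut: each cross product x[i]*x[j] with i < j appears twice in the square, so it can be computed once and doubled, and the diagonal terms x[i]^2 are added afterwards. The function names `mul_schoolbook` and `sqr_symmetric` are hypothetical. The sketch also assumes a 64-bit intermediate type (`std::uint64_t`) stands in for the 32-bit multiply with carry, and it uses `std::vector` in place of the raw dynamic array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using DWORD = std::uint32_t;   // 32-bit limb, least significant word first

// Baseline: schoolbook multiplication, O(n^2) limb products.
std::vector<DWORD> mul_schoolbook(const std::vector<DWORD>& a,
                                  const std::vector<DWORD>& b) {
    std::vector<DWORD> y(a.size() + b.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = 0; j < b.size(); ++j) {
            // Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1.
            std::uint64_t t = std::uint64_t(a[i]) * b[j] + y[i + j] + carry;
            y[i + j] = DWORD(t);
            carry = t >> 32;
        }
        y[i + b.size()] = DWORD(carry);
    }
    return y;
}

// Squaring that exploits symmetry: roughly half the limb products.
std::vector<DWORD> sqr_symmetric(const std::vector<DWORD>& x) {
    const std::size_t n = x.size();
    std::vector<DWORD> y(2 * n, 0);

    // Step 1: accumulate the cross products x[i]*x[j] for i < j only.
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = i + 1; j < n; ++j) {
            std::uint64_t t = std::uint64_t(x[i]) * x[j] + y[i + j] + carry;
            y[i + j] = DWORD(t);
            carry = t >> 32;
        }
        y[i + n] = DWORD(carry);   // this limb has not been written before
    }

    // Step 2: double the accumulated sum (shift left by one bit).
    DWORD top = 0;
    for (std::size_t k = 0; k < 2 * n; ++k) {
        DWORD next = y[k] >> 31;
        y[k] = (y[k] << 1) | top;
        top = next;
    }

    // Step 3: add the diagonal terms x[i]^2 at limb position 2*i.
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t s = std::uint64_t(x[i]) * x[i];
        std::uint64_t t = std::uint64_t(y[2 * i]) + DWORD(s) + carry;
        y[2 * i] = DWORD(t);
        t = std::uint64_t(y[2 * i + 1]) + (s >> 32) + (t >> 32);
        y[2 * i + 1] = DWORD(t);
        carry = t >> 32;
    }
    assert(carry == 0);            // x^2 always fits in 2*n limbs
    return y;
}
```

Both functions return 2*n limbs (for an n-limb input), least significant word first, so `sqr_symmetric(x)` can be checked directly against `mul_schoolbook(x, x)`. This keeps the O(n^2) complexity and only cuts the constant factor; it is not a claim about the fastest possible method.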