I would like my C function to efficiently compute the high 64 bits of the product of two 64-bit signed integers. I know how to do this in x86-64 assembly, with imulq and pullin
Wait, you already have a perfectly good, optimized assembly solution working for this, and you want to back it out and rewrite it in an environment that doesn't support 128-bit math? I'm not following.
As you're obviously aware, this operation is a single instruction on x86-64, so nothing you do in C is going to make it work any better. If you really want portable C, you'll need to do something like DigitalRoss's code above and hope that your optimizer figures out what you're doing.
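For reference, here is a rough sketch of what the portable route can look like. This is not DigitalRoss's code, just one common way to do it: split each operand into 32-bit halves, compute the unsigned high word, then correct for the signs. The function names `mulhi_u64` and `mulhi_s64` are made up for this example, and the final cast back to `int64_t` assumes a two's-complement machine.

```c
#include <stdint.h>

/* High 64 bits of the unsigned 128-bit product a * b,
   built from four 32x32 -> 64 partial products. */
static uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum of everything that lands in bits 32..95; this cannot overflow. */
    uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* High 64 bits of the signed 128-bit product a * b.
   Reinterpreting a negative operand as unsigned adds 2^64 to it, which
   adds the other operand to the high word, so subtract it back out. */
int64_t mulhi_s64(int64_t a, int64_t b)
{
    uint64_t hi = mulhi_u64((uint64_t)a, (uint64_t)b);
    if (a < 0) hi -= (uint64_t)b;
    if (b < 0) hi -= (uint64_t)a;
    return (int64_t)hi; /* assumes two's complement */
}
```

Whether a given compiler collapses this into a single multiply is not guaranteed, so check the generated assembly if the speed matters.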
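If you need architecture portability but are willing to limit yourself to gcc platforms, gcc provides __int128_t (and __uint128_t) types that will do what you want. They are compiler extensions rather than intrinsics, and they are only available on targets that support 128-bit integers, which in practice means 64-bit ones.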
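A minimal sketch of the gcc-specific version, assuming a 64-bit target where `__int128_t` is available (the function name `mulhi_s64_wide` is just for illustration):

```c
#include <stdint.h>

/* High 64 bits of the signed 128-bit product, using the gcc extension type.
   Right-shifting a negative value is implementation-defined in C; gcc
   documents it as an arithmetic (sign-extending) shift. */
int64_t mulhi_s64_wide(int64_t a, int64_t b)
{
    __int128_t product = (__int128_t)a * b; /* widen before multiplying */
    return (int64_t)(product >> 64);
}
```

The cast on `a` is the important part: without it the multiplication is done in 64 bits and the high half is lost before the shift.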