Question
I'm trying to implement _mm_add_epi32 in Go assembly, optionally with the help of avo. But I know little about assembly and do not even know where to start. Can you give me some hints or example code? Thank you all.
Here's the equivalent, slower Go version:
func add(x, y []uint32) []uint32 {
	if len(x) != len(y) {
		return nil
	}
	result := make([]uint32, len(x))
	for i := 0; i < len(x); i++ {
		result[i] = x[i] + y[i]
	}
	return result
}
I know that the instruction paddq xmm, xmm is what we need, but I do not know how to load a slice of []byte into the 256-bit register YMM.
Answer 1:
Here's an example for such an addition function:
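// func add(x, y [8]int32) [8]int32
// q = x + y
TEXT ·add(SB),0,$0
	VMOVDQU x+0(FP), Y0      // Y0 = x (eight 32-bit lanes, unaligned load)
	VPADDD  y+32(FP), Y0, Y0 // Y0 = Y0 + y, lane by lane
	VMOVDQU Y0, q+64(FP)     // store the sum into the return-value slot
	VZEROUPPER               // clear the upper YMM halves before returning
	RET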
Before reading this code, familiarise yourself with this document. Unfortunately, Go-style assembly (aka Plan 9-style assembly) is poorly documented.
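Arrays are passed on the stack by value. A return value is passed as an extra rightmost argument that the caller reads back. Use (FP), as described in the document I linked to, to access function arguments. Here each [8]int32 occupies 32 bytes, so x sits at offset 0, y at offset 32, and the return value at offset 64.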
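The offsets used with (FP) in the routine above follow directly from the size of the arguments. The following is only a minimal sketch of that arithmetic, assuming amd64 and the stack-based convention that assembly functions see; `frameOffsets` is a hypothetical helper introduced for illustration and is not part of the answer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// frameOffsets is a hypothetical helper. It computes where x, y and the
// result of func add(x, y [8]int32) [8]int32 would sit relative to FP,
// assuming arguments are laid out back to back in declaration order and
// the result follows them.
func frameOffsets() (x, y, ret uintptr) {
	var a [8]int32
	size := unsafe.Sizeof(a) // 8 lanes * 4 bytes = 32 bytes
	return 0, size, 2 * size
}

func main() {
	x, y, ret := frameOffsets()
	fmt.Println(x, y, ret) // prints: 0 32 64
}
```

These values match the `x+0(FP)`, `y+32(FP)` and `q+64(FP)` operands in the assembly.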
Apart from that, it's pretty straightforward. The syntax is similar, but not identical, to AT&T syntax. Note that the register names are different and that giving a size suffix is mandatory.
As you can see, writing an assembly function for a single operation is pretty pointless. It's probably going to work a lot better to take the algorithm you need and write it completely in assembly.
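To make the overhead concrete, here is a rough sketch of what a slice-based wrapper around an eight-lane kernel might look like. Everything in it is an assumption for illustration: `add8` is a pure-Go stand-in for the assembly routine, and `addSlices` is a hypothetical wrapper that mirrors the slice function from the question. Each block of eight elements costs two copies in, one call, and one copy out, which is the kind of per-call work that moving the whole loop into assembly would avoid.

```go
package main

import "fmt"

// add8 is a pure-Go stand-in for the assembly kernel (hypothetical name).
// It adds eight 32-bit lanes independently, wrapping on overflow.
func add8(x, y [8]uint32) [8]uint32 {
	var q [8]uint32
	for i := range x {
		q[i] = x[i] + y[i]
	}
	return q
}

// addSlices is a hypothetical wrapper with the same contract as the
// question's add: it returns nil when the lengths differ.
func addSlices(x, y []uint32) []uint32 {
	if len(x) != len(y) {
		return nil
	}
	result := make([]uint32, len(x))
	i := 0
	// Full blocks of eight go through the kernel, one call per block.
	for ; i+8 <= len(x); i += 8 {
		var a, b [8]uint32
		copy(a[:], x[i:i+8])
		copy(b[:], y[i:i+8])
		q := add8(a, b)
		copy(result[i:], q[:])
	}
	// The remaining tail (fewer than eight elements) is added one by one.
	for ; i < len(x); i++ {
		result[i] = x[i] + y[i]
	}
	return result
}

func main() {
	x := []uint32{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
	y := []uint32{10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110}
	fmt.Println(addSlices(x, y))
}
```

The stand-in kernel can also serve as a reference when checking an assembly version against known inputs.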
Source: https://stackoverflow.com/questions/63242918/golang-assembly-implement-of-mm-add-epi32