arm

GCC ARM Performance drop

你离开我真会死。 submitted on 2021-02-11 18:19:25
Question: I stumbled upon a very strange issue with GCC: a 25% drop in performance. Here is the story. I have a piece of software that is fp32 compute-intensive (neural networks compiled with TVM). I compile it for ARM (an rk3399 device); here is the info: gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/5/lto-wrapper Target: arm-linux-gnueabihf Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12' --with-bugurl

Convert DWT cycle count to time using STM32 and HAL

醉酒当歌 submitted on 2021-02-11 17:01:58
Question: I am developing on an STM32F302R8 under FreeRTOS. I am using the following DWT code from here to profile execution time. My DWT cycle count seems to be working, but I am unsure how to convert it into seconds. From what I gathered online, the cycle count is based on the CPU frequency. Which HAL function will return the correct CPU frequency for me? I think it is one of the following: uint32_t HAL_RCC_GetSysClockFreq(void); uint32_t HAL_RCC_GetHCLKFreq(void); uint32_t HAL_RCC

Selected processor does not support `dmb ish' in ARM mode

让人想犯罪 __ submitted on 2021-02-11 14:34:22
Question: I am building an embedded Linux distro on a BeagleBone Black (AM335x chip, Cortex-A8, ARMv7 instruction set) using crosstool-NG, U-Boot, the kernel (5.5.5) and Buildroot. When compiling the kernel I get this error message: /tmp/ccxFZlyN.s: Assembler messages: /tmp/ccxFZlyN.s:39: Error: selected processor does not support `isb ' in ARM mode /tmp/ccxFZlyN.s:90: Error: selected processor does not support `isb ' in ARM mode /tmp/ccxFZlyN.s:371: Error: selected processor does not support `isb '

Understanding ARM relocation (example: str x0, [tmp, #:lo12:zbi_paddr])

左心房为你撑大大i submitted on 2021-02-11 13:21:58
Question: I found this line of assembly in the zircon kernel's start.S for ARM64: str x0, [tmp, #:lo12:zbi_paddr]. I also found that zbi_paddr is declared in C++: extern paddr_t zbi_paddr; So I started looking into what #:lo12: means. I found https://stackoverflow.com/a/38608738/6655884, which looks like a great explanation, but it does not explain the very basics: what a relocation is and why it is needed. I guess that since zbi_paddr is defined in start.S and used in C++ code, since start.S

Execution freezes when I try to allocate Array in Armv8 assembly

你。 submitted on 2021-02-11 13:14:55
Question: I am programming in assembly; this is just simple code so I can learn how to allocate arrays in order to use them in NEON programming later. ASM_FUNC(FPE) .data .balign 8 array: .skip 80 array1: .word 10,20,30,40 .text ldr x0,=array mov x1,#10 check: cmp x1,#1 bne loop b exit loop: str x1,[x0],#8 //Stores the value in x1 at the address in x0, then advances x0 by 8 bytes sub x1,x1,#1 //x1-- b check exit: mov x0,#11 ret So, some parts are commented so I could try to find where the code is breaking (I
