instructions

What is the meaning of parentheses in opcodes in a NASM generated listing file?

本秂侑毒 submitted on 2019-12-01 21:45:12
When looking at a listing file generated by NASM, I see three kinds of opcodes: without parentheses, with round parentheses, and with square brackets. What do they mean, and when is each of them used? This is an example of a listing file that demonstrates all of the above:

     1                          section .text
     2                          extern printf
     3                          extern fgets
     4 00000000 313233          str3: db "123"
     5                          main:
     6 00000003 68[00000000]    push str1
     7 00000008 68[09000000]    push str2
     8 0000000D 68[00000000]    push str3
     9 00000012 E8(00000000)    call func1
    10 00000017 E8(04000000)    call func2
    11 0000001C E80B000000      call func3
    12 00000021 E8

Trying to understand this short assembler instruction, but I don't understand it

别等时光非礼了梦想. submitted on 2019-12-01 18:11:35
Question: We were given a task with an assembler instruction for a two-address machine: mov 202, 100[r1+] Write down a minimal assembler instruction sequence that replaces this instruction, where n[rx+] is register-indexed addressing with post-increment (n is the index value and rx is register x) and a single numeric value is directly addressed / stored. The addressing modes we are supposed to use are: rx - register direct addressing; [rx] - register indirect addressing; #n - direct addressing. And we are only allowed to use

Do modern CPUs skip multiplications by zero?

試著忘記壹切 submitted on 2019-12-01 18:08:27
I would like to know whether current CPUs avoid multiplying two numbers when at least one of them is zero. Thanks. Modern CPUs: what do you mean by that? Do you mean the most commonly used ones (like x86, AMD64, ARM) or the most recently developed? Every processor architecture has its own properties. Moreover, each company (like Intel or AMD) can build its processors in a different way (which is usually a company secret). As for your question, I doubt it. Even checking twice whether a number equals zero before EVERY multiplication is too much overhead, if you account for how low a percentage of multiply operations

Do modern CPUs skip multiplications by zero?

穿精又带淫゛_ submitted on 2019-12-01 17:11:20
Question: I would like to know whether current CPUs avoid multiplying two numbers when at least one of them is zero. Thanks. Answer 1: Modern CPUs: what do you mean by that? Do you mean the most commonly used ones (like x86, AMD64, ARM) or the most recently developed? Every processor architecture has its own properties. Moreover, each company (like Intel or AMD) can build its processors in a different way (which is usually a company secret). As for your question, I doubt it. Even checking if number is equal to zero twice

Measure time to execute a single instruction

放肆的年华 submitted on 2019-12-01 10:45:50
Is there a way, using C or assembler or maybe even C#, to get an accurate measure of how long it takes to execute an ADD instruction? Yes, sort of, but it's non-trivial and produces results that are almost meaningless, at least on most reasonably modern processors. On relatively slow processors (e.g., up through the original Pentium in the Intel line, and still true on most small embedded processors) you can just look in the processor's data sheet and it will (normally) tell you how many clock ticks to expect. Quick, simple, and easy. On a modern desktop machine (e.g., Pentium Pro or newer), life isn't

How does mtune actually work?

孤街浪徒 submitted on 2019-12-01 03:21:10
There's this related question: GCC: how is march different from mtune? However, the existing answers don't go much further than the GCC manual itself. At most, we get: "If you use -mtune, then the compiler will generate code that works on any of them, but will favour instruction sequences that run fastest on the specific CPU you indicated." and "The -mtune=Y option tunes the generated code to run faster on Y than on other CPUs it might run on." But exactly how does GCC favor one specific architecture when building, while still being capable of running the build on other (usually older)

How does mtune actually work?

五迷三道 submitted on 2019-11-30 23:50:30
Question: There's this related question: GCC: how is march different from mtune? However, the existing answers don't go much further than the GCC manual itself. At most, we get: "If you use -mtune, then the compiler will generate code that works on any of them, but will favour instruction sequences that run fastest on the specific CPU you indicated." and "The -mtune=Y option tunes the generated code to run faster on Y than on other CPUs it might run on." But exactly how does GCC favor one specific

How do ASCII Adjust and Decimal Adjust instructions work?

ε祈祈猫儿з submitted on 2019-11-30 21:18:50
I've been struggling to understand the ASCII adjust instructions of x86 assembly language. I find information all over the internet telling me different things, but I guess it's just the same thing explained in different forms that I still don't get. Can anyone explain why, in the pseudo-code of AAA and AAS, we have to add or subtract 6 from the low-order nibble in AL? And can someone also explain the pseudo-code for AAM, AAD and the decimal adjust instructions in the Intel instruction set manuals: why are they like that, and what's the logic behind them? Finally, can someone give examples when these
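The "add 6" part of the question can be checked with a small software model. The following is a minimal sketch in C, assuming the AAA behaviour described in the Intel manual's pseudo-code (if the low nibble of AL is above 9 or AF is set, add 106H to AX and set AF and CF; then mask AL to its low nibble). The names cpu_state, add_al, aaa and add_digits are hypothetical and exist only for this illustration. A nibble has 16 codes but a decimal digit only uses 10, so adding 6 skips the unused codes A to F and pushes the carry into the next decimal digit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state AAA touches; not a real API. */
typedef struct {
    uint8_t al, ah;
    bool af, cf;
} cpu_state;

/* Models ADD AL, src: AF is the carry out of bit 3, CF the carry out of bit 7. */
static void add_al(cpu_state *s, uint8_t src) {
    assert(s);
    unsigned low = (s->al & 0x0Fu) + (src & 0x0Fu);
    unsigned sum = (unsigned)s->al + src;
    s->af = low > 0x0Fu;
    s->cf = sum > 0xFFu;
    s->al = (uint8_t)sum;
}

/* Models AAA as assumed above: adjust AX by 0x106 when the low nibble
   is not a valid decimal digit or a nibble carry occurred. */
static void aaa(cpu_state *s) {
    assert(s);
    if ((s->al & 0x0Fu) > 9 || s->af) {
        unsigned ax = (((unsigned)s->ah << 8) | s->al) + 0x106u;
        s->al = (uint8_t)ax;        /* low nibble gets +6 */
        s->ah = (uint8_t)(ax >> 8); /* next decimal digit gets +1 */
        s->af = s->cf = true;
    } else {
        s->af = s->cf = false;
    }
    s->al &= 0x0Fu; /* keep only the decimal digit */
}

/* Adds two unpacked BCD (or ASCII) digits and returns AH*10 + AL. */
static unsigned add_digits(uint8_t a, uint8_t b) {
    cpu_state s = { .al = a, .ah = 0, .af = false, .cf = false };
    add_al(&s, b);
    aaa(&s);
    return s.ah * 10u + s.al;
}
```

For example, add_digits(9, 8) first produces AL = 0x11 with AF set; the adjustment turns AX into 0x0117, and the mask leaves AH = 1, AL = 7, i.e. 17. The same result comes out for the ASCII characters '9' and '8', because only the low nibbles and AF take part in the decision.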

Assembly programming - WinAsm vs Visual Studio 2017

徘徊边缘 submitted on 2019-11-30 15:59:24
I'd like to ask about VS2017. In the past I used WinAsm for MASM and never had problems with it. However, whenever I try to do something with MASM in VS2017, I run into problems. I've searched the whole internet for "how to set up VS for MASM", but nothing has helped me. Is there any way to use Visual Studio 2017 for MASM32/64-bit without any kind of headache? Can someone give me the ultimate guide to setting up VS2017 for assembly programming? Thank you very much, and sorry for my weak English. How to build a x64/x86

How to process a 24-bit 3 channel color image with SSE2/SSE3/SSE4?

不羁的心 submitted on 2019-11-30 10:35:46
I just started to use SSE2 optimization for image processing, but I have no idea how to handle 3-channel 24-bit color images. My pixel data is arranged as BGR BGR BGR ..., unsigned char, 8 bits per channel. If I want to implement Color2Gray as a C/C++ function with SSE2/SSE3/SSE4 instructions, how would I do it? Does my pixel data need to be aligned (4/8/16)? I have read this article: http://supercomputingblog.com/windows/image-processing-with-sse/ But it covers ARGB 4-channel 32-bit color, processing exactly 4 pixels at a time. Thanks! //Assume the original pixel: unsigned char* pDataColor=(unsigned char*)malloc(src.width*src.height*3
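Before vectorizing, it may help to have a scalar reference to compare any SSE version against. The sketch below is plain C, not SSE. The function name bgr_to_gray is hypothetical, and the weights 77/150/29 are an assumption (a common fixed-point approximation of 0.299 R + 0.587 G + 0.114 B scaled by 256), since the question does not say which gray formula is wanted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar baseline: converts packed 8-bit BGR pixels to 8-bit gray.
   The weights are an assumed fixed-point luma approximation and sum to 256. */
static void bgr_to_gray(const uint8_t *bgr, uint8_t *gray, size_t pixel_count) {
    assert(bgr && gray);
    for (size_t i = 0; i < pixel_count; ++i) {
        unsigned b = bgr[3 * i];
        unsigned g = bgr[3 * i + 1];
        unsigned r = bgr[3 * i + 2];
        /* +128 rounds to nearest before the shift divides by 256. */
        gray[i] = (uint8_t)((77u * r + 150u * g + 29u * b + 128u) >> 8);
    }
}
```

Because a pixel is 3 bytes, a 16-byte vector never ends on a pixel boundary; 48 bytes (16 pixels) is the smallest block that does, which is why SIMD versions for packed 24-bit data usually work on three vectors at a time and finish the remainder with a scalar loop like this one.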