assembly

How does the Windows thread stack guard page mechanism work with uninitialized local variables?

≯℡__Kan透↙ posted on 2020-02-27 07:51:35
Question: On Windows for the x86-32/x86-64 architectures, a thread's stack virtual memory consists of a "Reserved Part", a "Commit Part", a "Guard Page" and a "Reserved Page". My question: imagine that I have 1 page of committed memory and 1 MB of reserved memory for the thread stack. I allocate K pages of memory on the stack without initializing it, where K is, for example, 10. It seems that at the start of the stack frame the memory will be allocated by user-space code like this:

    sub esp, K*4096

Guard Page mechanism
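
A note on why the bare `sub esp, K*4096` matters: the guard page only helps if the page directly below the committed region is the next one touched, so compilers targeting Windows generally emit a stack probe (MSVC's `__chkstk` helper) for large frames instead of moving the stack pointer in one jump. The C sketch below only imitates that page-by-page walk on an ordinary buffer. `probe_pages` and `PROBE_PAGE_SIZE` are hypothetical names, and the 4 KiB page size is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86/x86-64 Windows. */
#define PROBE_PAGE_SIZE 4096u

/* Hypothetical helper: touch one byte in every page-sized step of
   [base, base + size), starting at the highest address and walking down,
   the way a stack probe walks a downward-growing stack.
   Returns the number of touches performed. */
size_t probe_pages(volatile unsigned char *base, size_t size)
{
    size_t touched = 0;
    size_t offset = size;

    while (offset > 0) {
        size_t step = offset >= PROBE_PAGE_SIZE ? PROBE_PAGE_SIZE : offset;
        offset -= step;
        base[offset] = 0;   /* on a real stack, this write would hit the guard page */
        touched++;
    }
    return touched;
}
```

For K = 10 the walk performs ten touches, one per page, so each access would land on the current guard page in turn rather than skipping past it into reserved memory.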

Get MSB / LSB in register

无人久伴 posted on 2020-02-25 22:36:05
Question: This might seem like a stupid question, but I can't really tell what I'm missing. On ARM 7: I have an 8-digit number in register 0, say 10110111. I want to 'loop' through this and do something with the current bit until the 8 bits are up, but I'm having a lot of trouble with this simple issue. My logic is:
- get MSB / LSB of number in r0
- shift it to r1
- lsl / lsr r0
But from this logic, I don't know how you would get the MSB / LSB. Could anyone help me out? Or is there a better way of looping
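
The idea in the question (isolate one bit, then shift) can be sketched in C as a rough illustration. The function names are hypothetical; the AND with 1 corresponds to masking the low bit, and the shifts play the role of `lsr` / `lsl`.

```c
#include <stdint.h>

/* Walk an 8-bit value from LSB to MSB, storing each bit in out[0..7]. */
void bits_lsb_first(uint8_t value, uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        out[i] = value & 1u;   /* isolate the current lowest bit */
        value >>= 1;           /* logical shift right, like LSR */
    }
}

/* Walk an 8-bit value from MSB to LSB, storing each bit in out[0..7]. */
void bits_msb_first(uint8_t value, uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        out[i] = (value >> 7) & 1u;      /* isolate the current highest bit */
        value = (uint8_t)(value << 1);   /* shift left, like LSL, keeping 8 bits */
    }
}
```

With the value 10110111 from the question, `bits_msb_first` yields the digits in the order they are written, and `bits_lsb_first` yields them reversed.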

How does this recursive function work?

做~自己de王妃 posted on 2020-02-25 17:07:26
Question: I'm new to programming, and I'm starting to read a book about it to understand the fundamentals. I couldn't understand how the following assembly code works: it calculates the factorial of a number. I have added comments to the instructions that I can understand - clearly I'm missing something.

    .section .data
    .section .text
    .globl _start
    .globl factorial
    _start:
        pushl $4
        call factorial
        popl %ebx
        movl %eax, %ebx
        movl $1, %eax
        int $0x80
    factorial:
        pushl %ebp # push the base pointer
        movl %esp,
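
For orientation, here is a minimal C sketch of a recursive factorial. It is not a translation of the truncated listing above, and the `n <= 1` base case is an assumption. Each call gets its own stack frame, which is what the `pushl %ebp` prologue sets up in the assembly version.

```c
/* Recursive factorial: each call waits for the result of the call below it. */
unsigned int factorial(unsigned int n)
{
    if (n <= 1) {
        return 1;                   /* base case: stop recursing */
    }
    return n * factorial(n - 1);    /* multiply on the way back up */
}
```

Calling `factorial(4)` mirrors the `pushl $4` / `call factorial` pair in `_start` and returns 24.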

How does a bitwise AND (or TEST) with 16 test the 5th bit?

ぐ巨炮叔叔 posted on 2020-02-25 06:08:59
Question: In my college's documentation on 8086 assembly is the following example:

    TEST AL, 16 ; tests the 5th bit's state

Is this at all correct given what the TEST instruction does? It sets flags based on AL & 16. How does that test the 5th bit? NOTE: there's no previously mentioned value for AL, just exactly what's shown here, so I assume this has to work in the general case.
Answer 1: 16 in decimal is 10000 in binary. Notice that the fifth bit from the right is set, and that it is the only one. TEST is
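
The same masking can be written in C as a small sketch. The function name is hypothetical, and "5th bit" is taken to mean bit index 4, counting from 0 at the right. The AND result is nonzero exactly when that bit is set, which corresponds to TEST leaving the zero flag clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* 16 is binary 00010000, so the AND keeps only bit index 4 of al. */
bool fifth_bit_is_set(uint8_t al)
{
    return (al & 16u) != 0;   /* nonzero result <=> ZF would be 0 after TEST */
}
```

Every other bit of `al` is masked out, so the result does not depend on what else the register holds.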

Venus RISC-V how to loop, compare, and print?

不问归期 posted on 2020-02-25 04:03:48
Question: I am trying to loop through an array and, if the number is larger than X, print it. I've tried to find tutorials online but I'm just stuck on why it is not working or outputting anything. My comments kind of explain what I tried to do.

    .data
    arrayOfNums:
        .word 0
        .word 1
        .word 122
        .word 1112
        .word 4294967295
        .word 22
        .word 234234
        .word 23332
        .word 42
        .word 23423
    K: .word 2237
    .text
    .globl main
    main:
        #### *** vv My problem starts here vv *** ####
        la t0 K #set t0 to K
        la t1 arrayOfNums #set t1
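
As a reference for the intended behaviour, here is a hedged C sketch of the loop. The function name is hypothetical, and a signed comparison is assumed, under which the word 4294967295 (bit pattern 0xFFFFFFFF) reads as -1. In the visible part of the snippet, `la` loads an address rather than a value, so reading K itself would also need a load such as `lw`.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Print every element greater than k (signed compare) and
   return how many elements were printed. */
size_t print_greater_than(const int32_t *nums, size_t count, int32_t k)
{
    size_t printed = 0;

    for (size_t i = 0; i < count; i++) {
        if (nums[i] > k) {
            printf("%" PRId32 "\n", nums[i]);
            printed++;
        }
    }
    return printed;
}
```

With the array and K = 2237 from the question, the signed version prints 234234, 23332 and 23423. An unsigned comparison would also print 4294967295.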

Finding an efficient shift/add/LEA instruction sequence to multiply by a given constant (avoiding MUL/IMUL)

假装没事ソ posted on 2020-02-24 09:59:45
Question: I'm trying to write a C program mult.c whose main function receives 1 int argument (parsed with atoi(argv[1])), which is some constant k we want to multiply by. This program will generate an assembly file mult.s that implements int mult(int x) { return x * k; } for that constant k. (This is a follow-up to Efficient Assembly multiplication.) For example: if main() in mult.c gets 14 as its argument, it may generate (though it is not minimal, as emphasized later):

    .section .text
    .globl mult
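
To make the shift/add idea concrete, here is a small C sketch. It uses the naive binary decomposition of k (one shift and add per set bit), which is not a minimal sequence and is not the generator the question asks for. The function names are hypothetical, and unsigned arithmetic is used so that shifts and wraparound stay well defined.

```c
#include <stdint.h>

/* Multiply x by k using only shifts and adds: add (x << bit) for
   every set bit of k. Results wrap modulo 2^32, like a 32-bit IMUL. */
uint32_t mult_shift_add(uint32_t x, uint32_t k)
{
    uint32_t result = 0;

    for (unsigned bit = 0; k != 0; bit++, k >>= 1) {
        if (k & 1u) {
            result += x << bit;
        }
    }
    return result;
}

/* Hand-picked shorter sequence for k = 14, using 14 = 16 - 2. */
uint32_t mult14(uint32_t x)
{
    return (x << 4) - (x << 1);
}
```

The binary method needs three shift/add terms for 14 (8 + 4 + 2), while `mult14` needs two shifts and one subtraction, which is the kind of saving a search for a minimal sequence is after.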

Efficiently check an FP bit-pattern for being a whole integer. Faster to branch once on a combination of conditions?

吃可爱长大的小学妹 posted on 2020-02-24 09:05:06
Question: I have the following ASM code:

    mov r10, 9007199254740990 ; mask
    mov r8, rax
    shr r8, 53
    sub r8, 1023
    cmp r8, 52 ; r8 - 52 < 0
    setnb ch
    shrx r11, r10, r8
    and r11, rax
    setne cl ; r11 == 0
    test rcx, rcx
    jz @C_2
    ret
    @C_2: ; integer
    ret

Well, here we have only one branch instruction. We can rewrite this code by replacing the SETcc instructions with the corresponding jump instructions, and thus we'll get two branch instructions in the code above. My question is, which code will run faster in common
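
For readers who want the underlying test spelled out, here is a hedged C sketch of checking a double's bit pattern for being a whole number. It assumes IEEE 754 binary64 and is a plain branchy reference version, not a reproduction of the instruction sequence above, which depends on register state the excerpt does not show. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return true if d holds a whole number (no fractional part).
   Assumes IEEE 754 binary64: 1 sign bit, 11 exponent bits, 52 fraction bits. */
bool is_whole_number(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);                      /* safe type punning */

    uint64_t magnitude = bits & 0x7FFFFFFFFFFFFFFFULL;   /* drop the sign bit */
    if (magnitude == 0) {
        return true;                                     /* +0.0 or -0.0 */
    }

    int exponent = (int)(magnitude >> 52) - 1023;        /* unbiased exponent */
    if (exponent < 0) {
        return false;                                    /* 0 < |d| < 1, incl. subnormals */
    }
    if (exponent == 1024) {
        return false;                                    /* infinity or NaN */
    }
    if (exponent >= 52) {
        return true;                                     /* no fraction bits remain */
    }

    /* The low (52 - exponent) bits hold the fractional part. */
    uint64_t fraction_mask = (1ULL << (52 - exponent)) - 1;
    return (magnitude & fraction_mask) == 0;
}
```

The range check on the exponent and the mask test are the two conditions that the assembly combines with SETcc so that only one branch is needed.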

MITE (legacy pipeline) used instead of DSB (uops cache) when jump is not quite aligned on 32 bytes

强颜欢笑 posted on 2020-02-24 03:59:12
Question: This question used to be a part of this (now updated) question, but it seems like it should be another question, since it didn't help to get an answer to the other one. My starting point is a loop doing 3 independent additions:

    for (unsigned long i = 0; i < 2000000000; i++) {
        asm volatile("" : "+r" (a), "+r" (b), "+r" (c), "+r" (d)); // prevents C compiler from optimizing out adds
        a = a + d;
        b = b + d;
        c = c + d;
    }

When this loop is not unrolled, it executes in 1 cycle (which is to be

FAT 12 Implementation

寵の児 posted on 2020-02-23 04:12:08
Question: I have been following the Operating System development tutorial on http://www.brokenthorn.com. Right now I'm trying to set up the BIOS parameter block with this code:

    jmp loader
    bpbName db "NubOS",0,0,0
    bpbBytesPerSector: DW 512
    bpbSectorsPerCluster: DB 1
    bpbReservedSectors: DW 1
    bpbNumberOfFATs: DB 2
    bpbRootEntries: DW 224
    bpbTotalSectors: DW 2880
    bpbMedia: DB 0xF0
    bpbSectorsPerFAT: DW 9
    bpbSectorsPerTrack: DW 18
    bpbHeadsPerCylinder: DW 2
    bpbHiddenSectors: DD 0
    bpbTotalSectorsBig: DD 0
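
As a sanity check on these values, the following C sketch derives the disk layout that the BPB fields imply. The struct and function names are hypothetical and only mirror a subset of the labels above. The 32-byte directory entry size is standard for FAT. In the standard boot sector layout the OEM name begins at byte offset 3, so the leading jump needs to occupy exactly 3 bytes.

```c
#include <stdint.h>

/* Hypothetical in-memory copy of the BPB fields used below
   (not the packed on-disk layout). */
struct bpb_values {
    uint16_t bytes_per_sector;
    uint16_t reserved_sectors;
    uint8_t  number_of_fats;
    uint16_t root_entries;
    uint16_t total_sectors;
    uint16_t sectors_per_fat;
};

/* Sectors occupied by the root directory: 32 bytes per entry, rounded up. */
uint32_t root_dir_sectors(const struct bpb_values *b)
{
    return ((uint32_t)b->root_entries * 32u + b->bytes_per_sector - 1u)
           / b->bytes_per_sector;
}

/* First sector of the data area: reserved sectors, then all FATs,
   then the root directory. */
uint32_t first_data_sector(const struct bpb_values *b)
{
    return (uint32_t)b->reserved_sectors
         + (uint32_t)b->number_of_fats * b->sectors_per_fat
         + root_dir_sectors(b);
}

/* Total capacity in bytes described by the 16-bit sector count. */
uint32_t total_bytes(const struct bpb_values *b)
{
    return (uint32_t)b->total_sectors * b->bytes_per_sector;
}
```

With the values from the listing (512, 1, 2, 224, 2880, 9) this gives a 14-sector root directory, a data area starting at sector 33, and 1,474,560 bytes in total, which matches a 1.44 MB floppy.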