Creating an x86 assembler program that converts an integer to a 16-bit binary string of 0's and 1's

逝去的感伤 2020-12-12 03:52

As the question suggests, I have to write a MASM program to convert an integer to binary. I have tried many different approaches, but none of them helped me at all. The fina

2 Answers
  • 2020-12-12 04:46

    This is a high-level answer that explains some terms.

    Part 1 - about integer numbers and their encoding in the computer

    An integer value is just an integer value; in math it is a purely abstract thing. The number "5" is not what you see on the monitor. That is the digit 5 (a graphical image or "glyph") representing the value 5 in base-10 (decimal) format for humans (and some trained animals) who can recognize that glyph pattern; the value 5 itself is purely abstract.

    When you use int in C++, it is not completely abstract; it is a lot more hard-wired into the metal. It is a 32-bit integer value (on most platforms).

    But that abstract description is still much closer to the truth than imagining the value in its human decimal format.

    int a = 12345; // decimal number
    

    Here a contains the value 12345, not the format. It is not aware that it was entered as a decimal string in the source code.

    int a = 0x3039; // hexadecimal number
    

    will compile into exactly the same machine code; for the CPU it is the same thing, still (a == 12345). And finally:

    int a = 0b0011000000111001; // binary number
    

    is again the same thing. It is still the same value 12345, just written in a different format.
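    To make the "same value, different spelling" point checkable, here is a minimal sketch that lets the compiler compare the three literals. The constant names are made up for illustration, and binary literals assume C++14 or later.

```cpp
#include <cstdint>

// Three spellings of one value; the names are illustrative only.
// Binary literals (0b...) need C++14 or later.
constexpr std::int32_t fromDecimal = 12345;
constexpr std::int32_t fromHex     = 0x3039;
constexpr std::int32_t fromBinary  = 0b0011000000111001;

// The compiler sees a single integer value, so both checks hold at compile time.
static_assert(fromDecimal == fromHex, "hex literal encodes the same value");
static_assert(fromDecimal == fromBinary, "binary literal encodes the same value");
```

    If any of the literals encoded a different value, the build would fail on the corresponding static_assert.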

    The last, binary form is closest to what the CPU uses to store the value. It is stored in 32 bits (low/high voltage cells/wires), so if you measured the voltage on a particular cell/wire, you would see the "0" voltage level on the top 18 bits, then 2 bits with the "1" voltage level, and then the rest as in the binary format above, with the two least significant bits being "0" and "1".

    Also, most of the CPU circuitry is not aware of the particular value of a particular bit; that is again an "interpretation" of that 0/1, done by the code. Many CPU operations like add or sub work "from right to left" over all bits, unaware that the bit currently being processed represents, for example, the value 2^13 (8192) in the final integer (that is the 14th least significant bit).
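    As a small illustration of bit weights, the sketch below computes the weight of one bit position and rebuilds a value by summing the weights of its set bits. Both function names are hypothetical helpers, not part of any library.

```cpp
#include <cstdint>

// Weight of the bit at zero-based index `bitIndex`
// (index 0 is the least significant bit, so index 13 is the 14th bit).
constexpr std::uint32_t bitWeight(unsigned bitIndex) {
    return std::uint32_t{1} << bitIndex;
}

// Adds up the weights of all set bits; the sum equals the original value.
inline std::uint32_t sumOfSetBitWeights(std::uint32_t value) {
    std::uint32_t sum = 0;
    for (unsigned i = 0; i < 32; ++i) {
        if ((value >> i) & 1u) {
            sum += bitWeight(i);
        }
    }
    return sum;
}
```

    The weights exist only in this interpretation step; the stored bits themselves are just 0s and 1s.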

    It is only when you take those bits and calculate a string with their decimal/hexadecimal/binary representation that you give those "1"s their value. That is when it becomes the text "12345".

    If you treat those 32 bits in a different way, for example as ON/OFF states for the lights of an LED display panel, then that is what they are: once you send them from the CPU to the display, the panel will turn on the corresponding LEDs, not caring that those bits also form the value 12345 when treated as int.

    Only very few CPU instructions need to be aware of the particular value of a particular bit.

    Part 2 - about input, output and arguments of C/C++ functions

    You want to "convert decimal integer (input) to binary."

    So let's reason about what is input and what is output. Input is taken from std::cin, so the user will enter a string.

    Yet if you do:

    int inputNum;
    std::cin >> inputNum;
    

    You will end up with an already converted integer value (32 bits, see above), or with std::cin in an invalid state if the user does not enter a correct number (handling that is probably not part of your task).

    If you have the number in an int, the conversion to binary was already done by the standard library when it encoded the user's input string as a 32-bit integer.

    Now you can create an asm function with the C prototype:

    void formatToBinary(uint16_t value, char result[17]);
    

    That means you give it a uint16_t (unsigned 16-bit) integer value and a pointer to 17 reserved bytes in memory, where you write the '0' and '1' ASCII characters and terminate them with another byte of value 0 (for a rough description of this, follow my first link in the comments under your question).

    If you must take the input as a string, i.e.

    char str[17];
    std::cin >> str;
    

    Then (after the input "12345") str will contain bytes with the values '1' (49 in decimal), '2', '3', '4', '5', 0. (Note that the last one is zero, NOT the ASCII digit '0', which has value 48.)

    You first need to convert these ASCII bytes into an integer value (in C++ atoi may help, or one of a few other conversion/formatting functions). For ASM, search SO for questions on "how to enter integer".
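    For reference, the digit-by-digit conversion can be sketched by hand as follows. parseDecimal is a hypothetical helper, not a library function, and it assumes an unsigned value that fits in 32 bits, with no sign or overflow handling.

```cpp
#include <cstdint>

// Hypothetical helper: converts a run of ASCII decimal digits to its value.
// Stops at the first non-digit byte (including the terminating zero).
// Assumes the value fits in 32 bits; no sign or overflow handling.
std::uint32_t parseDecimal(const char* text) {
    std::uint32_t value = 0;
    while (*text >= '0' && *text <= '9') {
        // Shift the digits collected so far one decimal place left,
        // then add the new digit ('0' is 48, so '7' - '0' is 7).
        value = value * 10 + static_cast<std::uint32_t>(*text - '0');
        ++text;
    }
    return value;
}
```

    The same multiply-by-ten-and-add loop translates almost line for line into assembly.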

    Once you have converted it to an integer value, you can proceed the same way as described above (at that point it is already encoded in 16 or 32 bits, so outputting a string representation of it should be easy).

    You may still run into some tricky parts, for example if you don't want to output leading zeroes, but all of that should be easy if you understand how this works.

    In this case your ASM function prototype may be just void convertToBinary(char*); reusing the string pointer as both input and output.

    Your int intToBin(char*); looks weird, because it means the ASM will return an int, but why? That is an integer value, not bound to any particular format, so it is binary/octal/decimal/hexadecimal all at the same time; it depends on how you display it. So you don't need it. You need only the string representing the value in binary form, and that is the char*. And you don't give it the number you entered (unless it takes it from the string).


    From the task description and your skill level, I think you are allowed to convert the input into an int right in C++ (i.e. std::cin >> int_variable;).


    BTW, if you fully understand what happens to values in the computer and how CPU instructions work on them, you can often come up with many different ways to achieve a result. For example, Jose's conversion to binary is written in the simple way an Assembly newcomer would write it (he wrote it like that to make it easier for you to understand):

               mov eax, num   // ◄■■ THE NUMBER.
               lea edi, bin   // ◄■■ POINT TO VARIABLE "BIN".
               mov ecx, 32    // ◄■■ NUMBER IS 32 BITS.
            conversion:
                shl eax, 1     // ◄■■ GET LEFTMOST BIT.
                jc  bit1       // ◄■■ IF EXTRACTED BIT == 1
                mov byte ptr [edi], '0'
                jmp skip
            bit1:
                mov byte ptr [edi], '1'
            skip :
                inc edi   // ◄■■ NEXT POSITION IN "BIN".
                loop conversion
    

    It's still a bit fragile. For example, he initializes "bin" so that it contains 32 spaces and the 33rd value is zero (the null terminator of a C string). The code then modifies exactly 32 bytes, so the 33rd zero is still there and working. If you adjusted his code to skip leading zeroes, it would "break" by displaying the remaining part of the buffer, as he doesn't set the null terminator explicitly.

    This is a common way to code in Assembly for performance: being aware of exactly everything that happens, and not setting values which are already set. While you are learning, I would suggest working in a "defensive" way instead, doing some wasteful things that act as a safety net in case of a mistake, so I would add mov byte ptr [edi],0 after the loop to set the terminator explicitly again.

    But it is actually not very fast, because it branches. The CPU doesn't like that: decoding new instructions is costly, and when it is not sure which instructions will execute next, it guesses one path and decodes ahead along it. If the guess is wrong, it throws that work away and decodes the correct path, which means a pause of several cycles until the first instruction of the new path is decoded and ready for execution.

    So when coding for performance, you want to avoid hard-to-predict branches (the final loop is easy for the CPU to predict, as it always loops until the single exit when ecx reaches 0). One of many possible ways in this case is:

       mov edx, num
       lea edi, bin
       mov ah,'0'/2   // for fast init of al later
       // '0' is 48 (even), '0'/2 will work (24)
       mov ecx, 32    // countdown counter
    conversion:
       mov al,ah      // al = '0'/2
       shl edx, 1     // most significant bit into CF
       adc al,al      // al = '0'/2 + '0'/2 + CF = '0' or '1'
       stosb          // store the '0' or '1' to [edi++]
       dec ecx        // manually written "loop"
       jnz conversion // (it is faster on modern CPUs)
       mov [edi],ch   // explicit set of null-terminator
           // (ch == 0, because here ecx == 0)
    

    As you can see, now there is no branching except the loop, so CPU branch prediction will handle this much more smoothly, and the performance should be considerably better.


    A dword variant for discussion with Cody (NASM syntax, 32-bit target):

    ; .data
    binNumber   times 36 db 0
    
    ; .text
    numberToBin:
        mov     edx,0x12345678
        lea     edi,[binNumber]
        mov     ecx, 32/4       ; countdown counter
    n2b_conversion:
        mov     eax,0b11000000110000001100000011000
          ; ^ will become '0'/'1' for each of four bits
        shl     edx,1
        rcr     eax,8
        shl     edx,1
        rcr     eax,8
        shl     edx,1
        rcr     eax,8
        shl     edx,1
        rcr     eax,8
          ; here was "or eax,'0000'" => no more needed.
        stosd
        dec     ecx
        jnz     n2b_conversion
        mov     [edi],dl        ; null terminator
        ret
    

    I didn't profile it, just verified that it returns the correct result.
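    To see why the dword variant produces the characters in the right order, here is a C++ model of it. It is a sketch that assumes a little-endian target (as x86 is) and replaces the rcr sequence with an explicit per-byte OR, so it shows the data layout rather than the exact instruction trick.

```cpp
#include <cstdint>
#include <cstring>

// Models the dword variant: four bits become four ASCII bytes per store.
// Assumes a little-endian target (as x86 is), so the lowest byte of `chunk`
// lands first in memory, just as `stosd` would store it.
void formatToBinary32Dword(std::uint32_t value, char result[33]) {
    for (int group = 0; group < 8; ++group) {
        std::uint32_t chunk = 0x30303030u;          // four '0' characters
        for (int k = 0; k < 4; ++k) {
            const std::uint32_t bit = value >> 31;  // next most significant bit
            value <<= 1;
            chunk |= bit << (8 * k);                // k-th character of this group
        }
        std::memcpy(result + 4 * group, &chunk, 4); // one 4-byte store
    }
    result[32] = '\0';                              // explicit terminator
}
```

    With the value 0x12345678 from the NASM listing, the buffer ends up holding 00010010001101000101011001111000.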

  • 2020-12-12 04:50

    Next is an example that uses "atoi" to convert the string to a number, then uses assembly to convert the number to binary:

    #include "stdafx.h"
    #include <cstdlib>   // atoi
    #include <iostream>
    using namespace std;
    int _tmain(int argc, _TCHAR* argv[])
    {   char str[6]; // ◄■■ NUMBER IN STRING FORMAT.
        int num;    // ◄■■ NUMBER IN NUMERIC FORMAT.
        char bin[33] = "                                "; // ◄■■ BUFFER FOR ONES AND ZEROES.
        cout << "Enter a number: ";
        cin >> str;  // ◄■■ CAPTURE NUMBER AS STRING.
        num = atoi(str); // ◄■■ CONVERT STRING TO NUMBER.
        __asm { 
               mov eax, num   // ◄■■ THE NUMBER.
               lea edi, bin   // ◄■■ POINT TO VARIABLE "BIN".
               mov ecx, 32    // ◄■■ NUMBER IS 32 BITS.
            conversion:
                shl eax, 1     // ◄■■ GET LEFTMOST BIT.
                jc  bit1       // ◄■■ IF EXTRACTED BIT == 1
                mov byte ptr [edi], '0'
                jmp skip
            bit1:
                mov byte ptr [edi], '1'
            skip :
                inc edi   // ◄■■ NEXT POSITION IN "BIN".
                loop conversion
        }
        cout << bin;
        return 0;
    }
    