How to normalize a mantissa

你的背包 2020-12-09 05:15

I'm trying to convert an int into a custom float, in which the user specifies the number of bits reserved for the exponent and mantissa, but I don't understand how

5 answers

臣服心动 (OP)

2020-12-09 05:54
Tommy -- chux and eigenchris, along with the others, have provided excellent answers, but if I am reading your comments correctly, you still seem to be struggling with the nuts and bolts of "how would I take this info and use it to create a custom float representation where the user specifies the number of bits for the exponent?" Don't feel bad, it is as clear as mud the first dozen times you go through it. I think I can take a stab at clearing it up.

You are familiar with the IEEE-754 single-precision floating-point representation:
```
IEEE-754 Single Precision Floating Point Representation of (13.25)

  0 1 0 0 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -|
 |s|      exp      |                  mantissa                   |
```
That is the 1-bit sign bit, the 8-bit biased exponent (in 8-bit excess-127 notation), and the remaining 23-bit mantissa.
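If you want to check that layout yourself, here is a minimal Python sketch that pulls the three fields out of a real single-precision value. The helper name `float32_fields` is mine, not anything standard; it only relies on the standard-library `struct` module.

```python
import struct

def float32_fields(x):
    """Return (sign, biased exponent, mantissa) of x as bit strings,
    after storing x as an IEEE-754 single-precision value."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret float bits as uint32
    bits = f"{raw:032b}"
    return bits[0], bits[1:9], bits[9:]                 # 1 + 8 + 23 bits

sign, exp, mantissa = float32_fields(13.25)
print(sign, exp, mantissa)  # 0 10000010 10101000000000000000000
```

The exponent field `10000010` is 130, which is 3 + 127.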

When you allow the user to choose the number of bits in the exponent, you are going to have to rework the exponent notation to work with the new user-chosen limit.

What will that change?
- Will it change the sign-bit handling -- No.
- Will it change the mantissa handling -- No (you will still convert the mantissa/significand to "hidden bit" format).
So the only thing you need to focus on is exponent handling.

How would you approach this? Recall, the current 8-bit exponent is in what is called excess-127 notation (where 127 represents the largest value for 7 bits, allowing any bias to be contained and expressed within the current 8-bit limit). If your user chooses 6 bits as the exponent size, then what? You will have to provide a similar method to ensure you have a fixed number to represent your new excess-## notation that will work within the user limit.
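One way to get that fixed number for any exponent width is to follow the same pattern as excess-127, i.e. the largest value that fits in one bit fewer than the field. A small sketch (the function name `excess_bias` is just for illustration):

```python
def excess_bias(exp_bits):
    """Bias for an exp_bits-wide exponent field, following the IEEE-754
    pattern: the largest value representable in exp_bits - 1 bits."""
    return (1 << (exp_bits - 1)) - 1

print(excess_bias(8))  # 127
print(excess_bias(6))  # 31
```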

Take a 6-bit user limit, then a choice for the bias could be tried as 31 (the largest value that can be represented in 5 bits). To that you could apply the same logic (taking the 13.25 example above). Your binary representation for the number is 1101.01, in which you move the binary point 3 positions to the left to get 1.10101, which gives you an unbiased exponent of 3.
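That "move the point until one 1 is left in front of it" step is the normalization itself. As a rough sketch, assuming a positive input, Python's `math.frexp` does most of the work, and for a plain int the exponent is simply the position of the highest set bit. The name `normalize` is mine.

```python
import math

def normalize(value):
    """Return (significand, exponent) such that
    value == significand * 2**exponent and 1 <= significand < 2.
    Assumes value > 0; zero and negatives are not handled here."""
    m, e = math.frexp(value)  # value == m * 2**e with 0.5 <= m < 1
    return m * 2, e - 1       # rescale so the leading 1 sits left of the point

print(normalize(13.25))       # (1.65625, 3), i.e. 1.10101 (binary) * 2**3

# For a positive int, the exponent is just the index of the highest set bit.
n = 13                        # 1101 in binary
print(n.bit_length() - 1)     # 3
```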

In your 6-bit exponent case you would add the bias to the exponent, 3 + 31 = 34, to obtain your excess-31 notation for the exponent: 100010, then put the mantissa in "hidden bit" format (i.e. drop the leading 1 from 1.10101), resulting in your new custom Tommy Precision Representation:
```
IEEE-754 Tommy Precision Floating Point Representation of (13.25)

  0 1 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -|
 |s|    exp    |                    mantissa                     |
```
With a 1-bit sign bit, a 6-bit biased exponent (in 6-bit excess-31 notation), and the remaining 25-bit mantissa.
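Putting the pieces together, an encoder for a user-chosen exponent width could look roughly like the sketch below. It is only an illustration under several assumptions of mine: a 32-bit total width, truncation instead of rounding, and no handling of zero, subnormals, infinities, NaN, or exponents that do not fit the field. The name `encode` is hypothetical.

```python
import math

def encode(value, exp_bits, total_bits=32):
    """Pack value as sign | biased exponent | hidden-bit mantissa (bit string).
    Sketch only: nonzero finite inputs whose exponent fits in exp_bits."""
    man_bits = total_bits - 1 - exp_bits
    bias = (1 << (exp_bits - 1)) - 1
    sign = "1" if value < 0 else "0"
    m, e = math.frexp(abs(value))          # abs(value) == m * 2**e, 0.5 <= m < 1
    significand, exponent = m * 2, e - 1   # now 1 <= significand < 2
    # Drop the hidden leading 1 and keep man_bits fraction bits (truncating).
    fraction = int((significand - 1) * (1 << man_bits))
    return sign + f"{exponent + bias:0{exp_bits}b}" + f"{fraction:0{man_bits}b}"

print(encode(13.25, 6))  # 01000101010100000000000000000000
print(encode(13.25, 8))  # 01000001010101000000000000000000
```

With `exp_bits=8` the output should agree with the ordinary single-precision pattern, which is a handy sanity check.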

The same rules would apply to reversing the process to get your floating point number back from the above notation, just using 31 instead of 127 to back the bias out of the exponent.
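A sketch of that reverse direction, again under my own assumptions (the input is a bit string laid out as sign, exponent, mantissa, and it holds a normal number, so the hidden 1 is always restored). The name `decode` is hypothetical.

```python
def decode(bits, exp_bits):
    """Turn a sign | biased exponent | hidden-bit mantissa bit string
    back into a Python float. Sketch only: normal numbers, no special cases."""
    bias = (1 << (exp_bits - 1)) - 1
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent = int(bits[1:1 + exp_bits], 2) - bias     # back the bias out
    frac_bits = bits[1 + exp_bits:]
    significand = 1 + int(frac_bits, 2) / (1 << len(frac_bits))  # restore hidden 1
    return sign * significand * 2.0 ** exponent

print(decode("01000101010100000000000000000000", 6))  # 13.25
```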

Hopefully this helps in some way. I don't see much else you can do if you are truly going to allow for a user-selected exponent size. Remember, the IEEE-754 standard wasn't something that was guessed at; a lot of good reasoning and trade-offs went into arriving at the 1-8-23 sign-exponent-mantissa layout. However, I think your exercise does a great job at requiring you to firmly understand the standard.

What has not been addressed in this discussion is the effect this would have on the range of numbers that can be represented in this Custom Precision Floating Point Representation. I haven't looked at it in detail, but the primary limitation would seem to be a reduction in the MAX/MIN magnitudes that can be represented, since a narrower exponent field covers fewer powers of two.
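To put rough numbers on that, here is a sketch that estimates the smallest and largest normal magnitudes for a given exponent width. It assumes the custom format copies the IEEE-754 convention of reserving the all-zeros and all-ones exponent patterns, which is my assumption and not something the scheme above requires. The name `normal_range` is hypothetical.

```python
def normal_range(exp_bits, total_bits=32):
    """Return (smallest, largest) positive normal magnitudes, assuming the
    all-zeros and all-ones exponent fields are reserved as in IEEE-754."""
    man_bits = total_bits - 1 - exp_bits
    bias = (1 << (exp_bits - 1)) - 1
    smallest = 2.0 ** (1 - bias)                    # exponent field == 1, fraction == 0
    largest = (2 - 2.0 ** -man_bits) * 2.0 ** bias  # exponent field == all ones minus 1
    return smallest, largest

print(normal_range(8))  # roughly (1.18e-38, 3.40e+38), the usual float range
print(normal_range(6))  # roughly (9.31e-10, 4.29e+09)
```

So with a 6-bit exponent the largest value is only about 4.29 billion, while the two extra mantissa bits buy a little more precision.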