Question
I have a list of floats generated from a data structure which is a list of dictionaries - i.e. I've iterated over the whole list and selected for certain values in the given dictionary. Now, I want to actually do something with these data points, for which I need some reference to the original position. I tried to simply use the data point as a key, but after trying and failing I did some digging and realized that floats aren't precisely represented due to the way computers work.
So, what I need is some way to assign a unique value to each dictionary in the list, e.g.:
list = [...]
vallist = []
index = {}
for i in range(0, len(list)):
    value = i+0.123
    vallist.append(value)
    index[value] = i
Except I evidently need to assign each value a unique item to be able to point back to its position in the list object. I'm imagining I could possibly create a new object called "valuelist" or something and then iterate over that, but this seems like something that probably has an obvious workaround that I'm just too thick to figure out.
To reiterate, what I want is a way to make the values point back to their original position in the list - in my data structure, my list contains a ton of dictionaries, and the way I handle it is somewhat more complicated, so I'm sort of stuck with my possibly impractical structure.
Thanks!
Answer 1:
Firstly, let's address the problems posed by using floating point.
"floats aren't precisely represented due to the way computers work."
Floating point numbers are precisely represented in computers. There are, however, some limitations:
- Resolution is finite. It's impossible to represent an irrational number in finite memory, and a typical 64-bit float carries only about 15 to 17 significant decimal digits.
- Some decimal (base 10) numbers have no exact representation in binary. For example, 0.1 cannot be represented exactly in base 2. Running "{0:.20f}".format(0.1) in Python returns '0.10000000000000000555'.
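To make the consequence for dictionary keys concrete, here is a minimal sketch. The key values are made up for illustration and are not taken from the asker's data:

```python
# Illustrative only: a float that is recomputed may not compare equal to the
# float that was stored, so it can silently miss as a dictionary key.
index = {0.3: "third item"}

computed = 0.1 + 0.2          # mathematically 0.3, but stored as a nearby binary value

print(computed == 0.3)        # False
print(computed in index)      # False
print(repr(computed))         # 0.30000000000000004

# A float that is stored and looked up unchanged still works as a key.
stored = 0.123
lookup = {stored: 7}
print(lookup[stored])         # 7
```

So a float key only fails when the lookup value was produced by a different computation than the stored one.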
Now, depending on the source of your numbers, and the kind of computations you want to perform, there are different possible solutions for indexing them.
For numbers that can be described precisely in base10, you can use a Decimal. This represents numbers in base10 exactly:
>>> from decimal import Decimal
>>> "{0:.20f}".format(Decimal('0.1'))
'0.10000000000000000000'
If you're dealing exclusively with rational numbers (even those without an exact decimal representation), you can use fractions.Fraction from the standard library.
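A short sketch of fractions used as dictionary keys, assuming the values can be built from integers or strings in the first place:

```python
from fractions import Fraction

# Build fractions from integers or strings so that no binary rounding creeps in.
third = Fraction(1, 3)
tenth = Fraction("0.1")

# Arithmetic stays exact: 1/3 + 1/10 is exactly 13/30.
index = {third + tenth: 0}

print(third + tenth)              # 13/30
print(Fraction(13, 30) in index)  # True
```

Equal fractions hash equally, so a value recomputed along a different path still finds its entry.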
Note that if you use decimals or fractions, you'll need to use them as soon as possible in your processing. Converting from a float to a decimal/fraction in the late stages defeats their purpose - you can't get data that isn't there:
>>> "{0:.20f}".format(Decimal('0.1'))
'0.10000000000000000000'
>>> "{0:.20f}".format(Decimal(0.1))
'0.10000000000000000555'
Also, using decimals or fractions comes at a significant performance penalty. For serious number crunching you'll want to always use float, or even integers in their place.
Finally, if your numbers are irrational, or if you're getting indexing mishaps even while using decimals or fractions, your best choice is probably indexing rounded versions of the numbers. Use buckets if necessary. collections.defaultdict may be useful for this.
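One possible way to index rounded values with collections.defaultdict is sketched below. The sample data, the PRECISION constant and the positions_of helper are all hypothetical names chosen for illustration:

```python
from collections import defaultdict

# Hypothetical data standing in for the floats pulled out of the dictionaries.
values = [0.1 + 0.2, 0.3, 1.0 / 3.0, 2.5]

PRECISION = 6  # assumed number of decimal places that matter for this data

# Map each rounded value to every original position that produced it.
buckets = defaultdict(list)
for position, value in enumerate(values):
    buckets[round(value, PRECISION)].append(position)

def positions_of(value):
    """Return the original positions whose value rounds to the same key."""
    return buckets.get(round(value, PRECISION), [])

print(positions_of(0.3))   # [0, 1]
print(positions_of(2.5))   # [3]
```

Two values that lie very close together but on opposite sides of a rounding boundary still land in different buckets, so a robust lookup may also need to check the neighbouring keys.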
You could also keep a tree, or use binary search over a sorted list with a custom comparison function, but then lookup is O(log n) rather than O(1).
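A rough sketch of the binary search variant using the standard bisect module. Here a tolerance check stands in for the custom comparison function, and the data, the find_position helper and the default tolerance are assumptions for illustration:

```python
import bisect

# Hypothetical floats; their list positions are what we want to recover.
values = [2.5, 0.1 + 0.2, 1.0 / 3.0]

# Sort (value, position) pairs once, and keep the sorted values for bisect.
pairs = sorted((value, position) for position, value in enumerate(values))
keys = [value for value, _ in pairs]

def find_position(target, tolerance=1e-9):
    """Return the original position of the value nearest to target, or None."""
    i = bisect.bisect_left(keys, target)
    best = None
    # The nearest value is either just before or at the insertion point.
    for j in (i - 1, i):
        if 0 <= j < len(keys) and abs(keys[j] - target) <= tolerance:
            if best is None or abs(keys[j] - target) < abs(keys[best] - target):
                best = j
    return None if best is None else pairs[best][1]

print(find_position(0.3))  # 1
print(find_position(0.5))  # None
```

Unlike rounding into buckets, this has no boundary problem, at the price of a logarithmic lookup.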
Answer 2:
If I understand correctly, you have generated a list of floats, each one from one of the dicts in the original list. Instead of generating a list of floats, why not generate a list of 2-tuples, being the float and its corresponding dictionary-list-index...
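A minimal sketch of that idea, assuming each dictionary stores its float under a key called "score" (a made-up name, as is the sample data):

```python
# Hypothetical records; "score" is an assumed key name.
records = [{"score": 0.123}, {"score": 4.56}, {"score": 0.123}]

# Pair each float with the index of the dictionary it came from.
pairs = [(record["score"], i) for i, record in enumerate(records)]

# Work with the floats (here: sort by value) without losing the way back.
for value, i in sorted(pairs):
    assert records[i]["score"] == value

smallest_value, smallest_index = min(pairs)
print(smallest_index)  # 0
```

Because the index travels with the value, no float is ever used as a key, and duplicate values (positions 0 and 2 above) remain distinguishable.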
Source: https://stackoverflow.com/questions/21162624/indexing-float-values-in-python