Normalizing a list of numbers in Python

偶尔善良 提交于 2019-11-27 19:21:08
Tony Suffolk 66

Use :

norm = [float(i)/sum(raw) for i in raw]

to normalize against the sum to ensure that the sum is always 1.0 (or as close to as possible).

use

norm = [float(i)/max(raw) for i in raw]

to normalize against the maximum

How long is the list you're going to normalize?

def psum(it):
    "This function makes explicit how many calls to sum() are done."
    print "Another call!"
    return sum(it)

raw = [0.07,0.14,0.07]
print "How many calls to sum()?"
print [ r/psum(raw) for r in raw]

print "\nAnd now?"
s = psum(raw)
print [ r/s for r in raw]

# if one doesn't want auxiliary variables, it can be done inside
# a list comprehension, but in my opinion it's quite Baroque    
print "\nAnd now?"
print [ r/s  for s in [psum(raw)] for r in raw]

Output

# How many calls to sum()?
# Another call!
# Another call!
# Another call!
# [0.25, 0.5, 0.25]
# 
# And now?
# Another call!
# [0.25, 0.5, 0.25]
# 
# And now?
# Another call!
# [0.25, 0.5, 0.25]

try:

normed = [i/sum(raw) for i in raw]

normed
[0.25, 0.5, 0.25]

There isn't any function in the standard library (to my knowledge) that will do it, but there are absolutely modules out there which have such functions. However, its easy enough that you can just write your own function:

def normalize(lst):
    s = sum(lst)
    return map(lambda x: float(x)/s, lst)

Sample output:

>>> normed = normalize(raw)
>>> normed
[0.25, 0.5, 0.25]

if your list has negative numbers, this is how you would normalize it

a = range(-30,31,5)
norm = [(float(i)-min(a))/(max(a)-min(a)) for i in a]

If you consider using numpy, you can get a faster solution.

import random, time
import numpy as np

a = random.sample(range(1, 20000), 10000)
since = time.time(); b = [i/sum(a) for i in a]; print(time.time()-since)
# 0.7956490516662598

since = time.time(); c=np.array(a);d=c/sum(a); print(time.time()-since)
# 0.001413106918334961

Try this :

from __future__ import division

raw = [0.07, 0.14, 0.07]  

def norm(input_list):
    norm_list = list()

    if isinstance(input_list, list):
        sum_list = sum(input_list)

        for value in input_list:
            tmp = value  /sum_list
            norm_list.append(tmp) 

    return norm_list

print norm(raw)

This will do what you asked. But I will suggest to try Min-Max normalization.

min-max normalization :

def min_max_norm(dataset):
    if isinstance(dataset, list):
        norm_list = list()
        min_value = min(dataset)
        max_value = max(dataset)

        for value in dataset:
            tmp = (value - min_value) / (max_value - min_value)
            norm_list.append(tmp)

    return norm_list
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!