Normalizing a list of numbers in Python

北荒 2020-12-05 06:37

I need to normalize a list of values to fit in a probability distribution, i.e. between 0.0 and 1.0.

I understand how to normalize, but was curious if Pytho

9 answers
  • 2020-12-05 06:58

    How long is the list you're going to normalize?

    def psum(it):
        "This function makes explicit how many calls to sum() are done."
        print("Another call!")
        return sum(it)

    raw = [0.07, 0.14, 0.07]
    print("How many calls to sum()?")
    # sum() is called once per element here
    print([r / psum(raw) for r in raw])

    print("\nAnd now?")
    # compute the sum once and reuse it
    s = psum(raw)
    print([r / s for r in raw])

    # if one doesn't want auxiliary variables, it can be done inside
    # a list comprehension, but in my opinion it's quite baroque
    print("\nAnd now?")
    print([r / s for s in [psum(raw)] for r in raw])
    

    Output

    # How many calls to sum()?
    # Another call!
    # Another call!
    # Another call!
    # [0.25, 0.5, 0.25]
    # 
    # And now?
    # Another call!
    # [0.25, 0.5, 0.25]
    # 
    # And now?
    # Another call!
    # [0.25, 0.5, 0.25]
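The point made above, that the sum should be computed once rather than once per element, can be wrapped in a small helper. This is only a minimal sketch: the name `normalize_by_sum` and the zero-total check are assumptions added for illustration and are not part of the answer.

```python
def normalize_by_sum(values):
    """Divide each value by the total, computing the total only once.

    Hypothetical helper; the zero-total check is an added assumption.
    """
    values = list(values)  # accept any iterable, including generators
    total = sum(values)    # single call to sum()
    if total == 0:
        raise ValueError("cannot normalize: the values sum to zero")
    return [v / total for v in values]


raw = [1, 2, 1]
print(normalize_by_sum(raw))  # [0.25, 0.5, 0.25]
```

Copying the input into a list first means the helper also works when it is handed a generator, which could otherwise only be traversed once.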
    
  • 2020-12-05 06:59

    If you are open to using numpy, you can get a faster solution.

    import random, time
    import numpy as np

    a = random.sample(range(1, 20000), 10000)

    # Pure Python: note that sum(a) is recomputed for every element
    since = time.time()
    b = [i / sum(a) for i in a]
    print(time.time() - since)
    # 0.7956490516662598

    # NumPy: the sum is computed once and the division is vectorized
    since = time.time()
    c = np.array(a)
    d = c / sum(a)
    print(time.time() - since)
    # 0.001413106918334961
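For the normalization itself, leaving the timing aside, a NumPy version might look like the sketch below. The helper name `normalize_array` and the zero-total check are assumptions for illustration. It uses the array's own `sum()` method rather than the built-in `sum`.

```python
import numpy as np


def normalize_array(values):
    """Return values divided by their total as a float array (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    total = arr.sum()  # computed once, inside NumPy
    if total == 0:
        raise ValueError("cannot normalize: the values sum to zero")
    return arr / total  # the scalar is broadcast over the whole array


probs = normalize_array([1, 2, 1])
print(probs)
```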
    
  • 2020-12-05 07:00

    For those who want to use scikit-learn, you can use

    from sklearn.preprocessing import normalize
    
    x = [1,2,3,4]
    normalize([x]) # array([[0.18257419, 0.36514837, 0.54772256, 0.73029674]])
    normalize([x], norm="l1") # array([[0.1, 0.2, 0.3, 0.4]])
    normalize([x], norm="max") # array([[0.25, 0.5 , 0.75, 1.]])
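To make clear what the three `norm` options divide by, here is a pure-Python sketch of the arithmetic. It is not scikit-learn's implementation. The function name `normalize_row` is hypothetical, and it assumes that "l1" means the sum of absolute values, "l2" the Euclidean length, and "max" the largest absolute value. Returning an all-zero row unchanged is a choice made for this sketch.

```python
import math


def normalize_row(row, norm="l2"):
    """Scale one row by its l1, l2, or max norm (illustrative only)."""
    if norm == "l1":
        scale = sum(abs(v) for v in row)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in row))
    elif norm == "max":
        scale = max(abs(v) for v in row)
    else:
        raise ValueError("norm must be 'l1', 'l2', or 'max'")
    if scale == 0:
        # assumption for this sketch: leave an all-zero row as it is
        return [float(v) for v in row]
    return [v / scale for v in row]


x = [1, 2, 3, 4]
print(normalize_row(x, "l1"))   # [0.1, 0.2, 0.3, 0.4]
print(normalize_row(x, "max"))  # [0.25, 0.5, 0.75, 1.0]
print(normalize_row(x, "l2"))
```

Only the "l1" variant produces values that sum to 1, which is what the question's probability distribution requires when all values are non-negative.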
    
  • 2020-12-05 07:08

    If you are working with data, pandas is often the simplest option.

    This particular code puts the raw values into one column, then divides each column by its sum. (You could also put the values into a row and normalize across the row instead by swapping the axis values, where 0 refers to the rows and 1 to the columns.)

    import pandas as pd
    
    
    raw = [0.07, 0.14, 0.07]  
    
    raw_df = pd.DataFrame(raw)
    normed_df = raw_df.div(raw_df.sum(axis=0), axis=1)
    normed_df
    

    where normed_df will display like:

        0
    0   0.25
    1   0.50
    2   0.25
    

    and then you can keep working with the data, too!
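For a single list of values, a one-dimensional `Series` avoids the axis bookkeeping altogether. The sketch below assumes pandas is installed. The variable names are illustrative.

```python
import pandas as pd

raw = [0.07, 0.14, 0.07]

s = pd.Series(raw)
normed = s / s.sum()  # divide every element by the total

print(normed.tolist())
```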

  • 2020-12-05 07:11

    Use:

    norm = [float(i)/sum(raw) for i in raw]
    

    to normalize against the sum, which ensures that the result always sums to 1.0 (or as close to it as floating-point arithmetic allows).

    Use:

    norm = [float(i)/max(raw) for i in raw]
    

    to normalize against the maximum, so that the largest value becomes 1.0.
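The difference between the two variants can be checked directly. The following is a small sketch with illustrative variable names. It hoists `sum(raw)` and `max(raw)` out of the comprehensions so each is computed once, and it compares the total with `math.isclose` because floating-point division rarely sums to exactly 1.0.

```python
import math

raw = [0.07, 0.14, 0.07]

total = sum(raw)
largest = max(raw)

by_sum = [v / total for v in raw]    # values add up to 1.0
by_max = [v / largest for v in raw]  # largest value becomes 1.0

print(math.isclose(sum(by_sum), 1.0))  # True
print(max(by_max) == 1.0)              # True
print(math.isclose(sum(by_max), 1.0))  # False: this is not a distribution
```

Only the sum-normalized list is a valid probability distribution. The max-normalized list keeps every value within 0.0 to 1.0, but its total is generally larger than 1.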

  • 2020-12-05 07:14

    If your list has negative numbers, you can rescale it to the range 0.0 to 1.0 with min-max normalization:

    a = range(-30,31,5)
    norm = [(float(i)-min(a))/(max(a)-min(a)) for i in a]
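One edge case in the expression above is a list whose values are all equal: `max(a) - min(a)` is then zero and the division fails. A minimal sketch that guards against this follows. The helper name `min_max_scale` and the choice to raise an error are assumptions for illustration.

```python
def min_max_scale(values):
    """Map values linearly so the minimum becomes 0.0 and the maximum 1.0.

    Hypothetical helper; raising on a constant list is an added assumption.
    """
    values = list(values)
    lo, hi = min(values), max(values)  # each computed once
    span = hi - lo
    if span == 0:
        raise ValueError("cannot rescale: all values are equal")
    return [(v - lo) / span for v in values]


a = range(-30, 31, 5)
scaled = min_max_scale(a)
print(scaled[0], scaled[6], scaled[-1])  # 0.0 0.5 1.0
```

Note that the rescaled values lie between 0.0 and 1.0 but do not generally sum to 1, so a further division by their sum would be needed to obtain a probability distribution.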
    