Normalizing a pandas DataFrame by row

再見小時候 2020-12-08 09:32

What is the most idiomatic way to normalize each row of a pandas DataFrame? Normalizing the columns is easy, so one (very ugly!) option is:

(df.T / df.T.sum(         


        
2 Answers
  • 2020-12-08 10:15

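    Plain division (df / df.sum(axis=1)) aligns the row sums with the DataFrame's column labels rather than its rows. To overcome this broadcasting issue, you can use the div method with axis=0: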

    df.div(df.sum(axis=1), axis=0)
    

    See http://pandas.pydata.org/pandas-docs/stable/basics.html#matching-broadcasting-behavior
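
As a minimal sketch of the difference, assuming a small made-up DataFrame (the column and row labels below are hypothetical), the following compares `div` with `axis=0` against plain `/`:

```python
import pandas as pd

# Hypothetical sample data; any numeric DataFrame works the same way.
df = pd.DataFrame(
    {"a": [1.0, 2.0], "b": [3.0, 2.0], "c": [4.0, 4.0]},
    index=["r1", "r2"],
)

row_sums = df.sum(axis=1)  # one total per row, indexed by row label

# axis=0 aligns row_sums with the row index, so each row is divided
# by its own total.
normalized = df.div(row_sums, axis=0)

# Plain `/` aligns row_sums with the column labels instead; none of them
# match here, so every entry of the result is NaN.
misaligned = df / row_sums
```

After this, every row of `normalized` sums to 1, while `misaligned` contains only NaN values.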

  • 2020-12-08 10:22

    I would suggest using scikit-learn's preprocessing module and transposing your DataFrame as required:

    '''
    Created on 05/11/2015
    
    @author: rafaelcastillo
    '''
    
    import matplotlib.pyplot as plt
    import pandas
    import random
    import numpy as np
    from sklearn import preprocessing
    
    def create_cos(number_graphs, length, amp):
        # Generate noisy cosine curves for testing.
        # number_graphs: number of curves (rows) to generate
        # length: number of points along the x axis
        # amp: scales the range over which the cosine is evaluated,
        #      which gives differently shaped curves
        x = np.arange(length)
        amp = np.pi*amp
        xx = np.linspace(np.pi*0.3*amp, -np.pi*0.3*amp, length)
        for i in range(number_graphs):
            iterable = (2*np.cos(t) + random.random()*0.1 for t in xx)
            # np.float was removed in NumPy 1.24; the builtin float is equivalent
            y = np.fromiter(iterable, float)
            if i == 0:
                yfinal = y
                continue
            yfinal = np.vstack((yfinal, y))
        return x, yfinal
    
    x,y = create_cos(70,24,3)
    data = pandas.DataFrame(y)
    
    x_values = data.columns.values
    num_rows = data.shape[0]
    
    fig, ax = plt.subplots()
    for i in range(num_rows):
        ax.plot(x_values, data.iloc[i])
    ax.set_title('Raw data')
    plt.show() 
    
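    # MinMaxScaler rescales each column to [0, 1], so transpose first
    # to rescale each original row instead, then transpose back
    scaler = preprocessing.MinMaxScaler().fit(data.transpose())
    scaled = scaler.transform(data.transpose())
    data = pandas.DataFrame(np.transpose(scaled))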
    
    
    fig, ax = plt.subplots()
    for i in range(num_rows):
        ax.plot(x_values, data.iloc[i])
    ax.set_title('Data Normalized')
    plt.show()                                   
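
The transpose step in the script above can be shown on its own. This is only a sketch on a small made-up DataFrame (the labels are hypothetical), not the answer author's code. Note that `MinMaxScaler` maps each row onto the range [0, 1], which is not the same as dividing by the row sum. For the latter, scikit-learn's `preprocessing.normalize` with `norm="l1"` and `axis=1` gives the same result as long as no value is negative.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical sample data with non-negative values.
df = pd.DataFrame(
    {"a": [1.0, 2.0], "b": [3.0, 2.0], "c": [4.0, 6.0]},
    index=["r1", "r2"],
)

# MinMaxScaler works column by column, so transpose, scale, transpose back.
# Each original row ends up spanning the range [0, 1].
scaled = pd.DataFrame(
    preprocessing.MinMaxScaler().fit_transform(df.T).T,
    index=df.index,
    columns=df.columns,
)

# normalize() with norm="l1" and axis=1 divides each row by the sum of its
# absolute values, which equals the row sum when no value is negative.
unit_sum = pd.DataFrame(
    preprocessing.normalize(df, norm="l1", axis=1),
    index=df.index,
    columns=df.columns,
)
```

Wrapping the results back into a DataFrame with the original `index` and `columns` keeps the labels, which the bare NumPy arrays returned by scikit-learn would otherwise drop.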
    