Parallel processing in pandas using Dask

Posted by ∥☆過路亽.° on 2019-12-11 17:35:46

Question


I want to reduce the processing time of my pandas code.

I tried shrinking the DataFrame's memory footprint with the .cat accessor and also tried multiprocessing, but the run time did not change.

import multiprocessing
import time

import pandas as pd

# The timer starts at import, so the measurement includes reading the Excel file.
start = time.time()


def square(df1):
    # Copy the invoice type into a new column for the B2B rows.
    df1['M_threading'] = df1['M_Invoice_type']


def multiply(df4):
    # Same operation for the B2BUR rows.
    df4['M_threading'] = df4['M_Invoice_type']


if __name__ == '__main__':
    df = pd.read_excel("C:/Users/Admin/Desktop/schindler purchase Apr-19.xlsx")
    df1 = df.loc[df['M_Invoice_type'] == 'B2B']
    df4 = df.loc[df['M_Invoice_type'] == 'B2BUR']
    p = multiprocessing.Process(target=square, args=(df1,))
    p1 = multiprocessing.Process(target=multiply, args=(df4,))
    p.start()
    p1.start()
    p.join()
    p1.join()
    print("Done")
    end = time.time()
    print(end - start)

Can anyone please help with this?
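
A note on the code above: each multiprocessing.Process works on its own copy of the DataFrame it receives, so the new column created in the child never reaches the parent process. The measured time also includes pd.read_excel, which probably dominates a simple column copy. The sketch below is one possible way to get results back from worker processes, not a definitive fix. It uses the standard-library concurrent.futures.ProcessPoolExecutor in place of raw Process objects. The function names add_threading_column and process_in_parallel, the amount column, and the small in-memory DataFrame are hypothetical stand-ins for the original Excel data.

```python
from concurrent.futures import ProcessPoolExecutor

import pandas as pd


def add_threading_column(part: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame with M_threading copied from M_Invoice_type."""
    # assign() builds a new DataFrame, so the caller's frame is not mutated.
    return part.assign(M_threading=part["M_Invoice_type"])


def process_in_parallel(parts):
    """Run add_threading_column on each part in a separate worker process."""
    with ProcessPoolExecutor(max_workers=len(parts)) as pool:
        # map() pickles each part to a worker and pickles the result back.
        return list(pool.map(add_threading_column, parts))


if __name__ == "__main__":
    # Hypothetical stand-in data; the original reads an Excel file instead.
    df = pd.DataFrame(
        {
            "M_Invoice_type": ["B2B", "B2BUR", "B2B", "B2BUR"],
            "amount": [10, 20, 30, 40],
        }
    )
    parts = [df.loc[df["M_Invoice_type"] == kind] for kind in ("B2B", "B2BUR")]
    results = process_in_parallel(parts)
    # Recombine the processed parts and restore the original row order.
    combined = pd.concat(results).sort_index()
    print(combined)
```

For an operation this cheap, sending the data to the workers and back can cost more than the work itself, so a plain df['M_threading'] = df['M_Invoice_type'] in the parent process may well be faster. Separate processes tend to pay off only when the work done on each part is expensive.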

Source: https://stackoverflow.com/questions/56338653/in-pandas-parallel-processing-using-dask
