Hash each row of pandas dataframe column using apply

Submitted by 。_饼干妹妹 on 2019-12-11 00:15:18

Question


I'm trying to hash each value of a pandas DataFrame column (Python 3.6), applying the following algorithm to the column ORIG:

HK_ORIG = base64.b64encode(hashlib.sha1(str(df.ORIG).encode("UTF-8")).digest())

However, this code does not hash each value of the column: str(df.ORIG) turns the whole column into a single string, so only one hash is produced. To hash each value of ORIG separately, I need to use the apply function. Unfortunately, I haven't managed to get this working.
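To see why the first attempt yields a single hash, the sketch below compares str() on the whole column with a per-row string conversion. The sample values are hypothetical; only the column name ORIG comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name ORIG is from the question.
df = pd.DataFrame({"ORIG": ["abc", "def"]})

# str() on a Series returns ONE string: the printed form of the whole column
# (index labels, values, and the dtype footer).
whole = str(df.ORIG)

# astype(str) instead returns a Series holding one string per row.
per_value = df.ORIG.astype(str)

print(type(whole).__name__)  # a plain Python string
print(len(per_value))        # one entry per row
```

Hashing `whole` therefore hashes the printed representation of the column, not the individual values.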

I imagine it to look like the following code:

df["HK_ORIG"] = str(df['ORIG']).encode("UTF-8")).apply(hashlib.sha1)

I'm looking very much forward to your answers! Many thanks in advance!


Answer 1:


You can either create a named function and apply it, or apply a lambda function. In either case, do as much of the processing as possible within the DataFrame.

A lambda-based solution:

# Requires: import base64, hashlib
df['HK_ORIG'] = df['ORIG'].astype(str).str.encode('UTF-8')\
                          .apply(lambda x: base64.b64encode(hashlib.sha1(x).digest()))

A named function solution:

def hashme(x):
    # x is the UTF-8 encoded bytes of a single cell
    return base64.b64encode(hashlib.sha1(x).digest())

df['HK_ORIG'] = df['ORIG'].astype(str).str.encode('UTF-8')\
                          .apply(hashme)
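For reference, here is a self-contained sketch of the named-function approach, with imports and a small DataFrame so it can be run as-is. The sample data and the helper name hash_value are assumptions made for illustration. It also decodes the Base64 result to text, which the answer above does not do (b64encode returns bytes); drop the decode step if bytes are what you want.

```python
import base64
import hashlib

import pandas as pd


def hash_value(value):
    """Return the Base64-encoded SHA-1 digest of str(value) as text."""
    digest = hashlib.sha1(str(value).encode("UTF-8")).digest()
    # b64encode returns bytes; decode so the column holds ordinary strings.
    return base64.b64encode(digest).decode("ascii")


# Hypothetical sample data; only the column name ORIG is from the question.
df = pd.DataFrame({"ORIG": ["abc", "", 123]})

# apply calls hash_value once per cell, so each row gets its own hash.
df["HK_ORIG"] = df["ORIG"].apply(hash_value)

print(df)
```

Converting with str(value) inside the function means non-string cells (such as the integer 123 here) are hashed by their string form, matching the astype(str) step in the answer.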


Source: https://stackoverflow.com/questions/51178961/hash-each-row-of-pandas-dataframe-column-using-apply
