Dask equivalent to Pandas replace?

自闭症患者 2021-01-02 18:42

I use the .replace operation regularly in pandas, but I can't see how to perform the same operation on a dask DataFrame. Is there an equivalent?

df.r         


        
2 Answers
  • 2021-01-02 19:05

    If anyone would like to know how to replace certain values in a specific column, here's how to do this:

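    import pandas as pd

    def replace(x: pd.DataFrame) -> pd.DataFrame:
        # x is one pandas partition; map 'PASS' -> '0' and 'FAIL' -> '1' in 'a_feature' only
        return x.replace(
          {'a_feature': ['PASS', 'FAIL']},
          {'a_feature': ['0', '1']}
        )

    # df is the dask DataFrame; apply the function to each partition
    df = df.map_partitions(replace)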
    

    Since the function operates on a pandas DataFrame here, please refer to the pandas documentation for DataFrame.replace for further information.

  • 2021-01-02 19:26

    You can use mask:

    df = df.mask(df == 'PASS', '0')
    df = df.mask(df == 'FAIL', '1')
    

    Or, equivalently, chain the mask calls:

    df = df.mask(df == 'PASS', '0').mask(df == 'FAIL', '1')
    