python pandas merge two or more lines of text into one line

筅森魡賤 submitted on 2019-12-07 03:52:28

If the name column consists of unique values:

print(df)

    name          address  number
0    Bob              bob   No.56
1    NaN       @gmail.com     NaN
2  Carly  carly@world.com   No.90
3  Gorge       greg@yahoo     NaN
4    NaN             .com     NaN
5    NaN              NaN  No.100

df['name'] = df['name'].ffill()
print(df.fillna('').groupby(['name'], as_index=False).sum())

    name          address  number
0    Bob    bob@gmail.com   No.56
1  Carly  carly@world.com   No.90
2  Gorge   greg@yahoo.com  No.100
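
For readers who want to reproduce this, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the printed table above, and the explicit "".join aggregation is swapped in for sum() so the result does not depend on how a given pandas version sums string columns.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame shown above.
df = pd.DataFrame({
    "name": ["Bob", np.nan, "Carly", "Gorge", np.nan, np.nan],
    "address": ["bob", "@gmail.com", "carly@world.com",
                "greg@yahoo", ".com", np.nan],
    "number": ["No.56", np.nan, "No.90", np.nan, np.nan, "No.100"],
})

# Carry each name down to the fragment rows that follow it.
df["name"] = df["name"].ffill()

# Replace the remaining NaN with empty strings, then concatenate
# the fragments of each column within every name group.
merged = (
    df.fillna("")
      .groupby("name", as_index=False)
      .agg({"address": "".join, "number": "".join})
)
print(merged)
```

Joining explicitly also makes it easy to switch the separator later, for example to " ".join for free-text columns.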

To extend the code above to messier data, you may need tools such as ffill(), bfill(), reversing with [::-1], .groupby('name').apply(lambda x: ' '.join(x['address'])), and the string methods strip(), lstrip(), rstrip() and replace().
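
As one possible illustration of combining those tools, the sketch below strips stray whitespace from each fragment before joining. The messy frame and the helper name join_fragments are hypothetical, made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical messy input: fragments carry stray spaces.
messy = pd.DataFrame({
    "name": ["Bob", np.nan, "Carly", np.nan],
    "address": [" bob", "@gmail.com ", "carly@world.com", np.nan],
})
messy["name"] = messy["name"].ffill()

def join_fragments(fragments):
    """Drop missing pieces, strip whitespace, and concatenate the rest."""
    return "".join(part.strip() for part in fragments.dropna())

# Apply the helper to the address fragments of each name group.
cleaned = (
    messy.groupby("name")["address"]
         .apply(join_fragments)
         .reset_index()
)
print(cleaned)
```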

Neo X

If you want to collapse a data frame of six rows (with possible NaN entries in each column) into one row per user, there may be no single pandas method for that.

You will need some code to fill in the name column first, so that pandas knows the split rows bob and @gmail.com belong to the same user, Bob.

You can fill each empty entry in the name column with the preceding user's name using the fillna or ffill methods; see the pandas documentation on working with missing data.

df['name'] = df['name'].ffill()

# gives
    name          address  number
0    Bob              bob   No.56
1    Bob       @gmail.com
2  Carly  carly@world.com   No.90
3  Gorge       greg@yahoo
4  Gorge             .com
5  Gorge                   No.100

Then you can use groupby with sum as the aggregation function; on string columns, sum concatenates the values within each group.

df.groupby(['name']).sum().reset_index()

# gives
    name          address  number
0    Bob    bob@gmail.com   No.56
1  Carly  carly@world.com   No.90
2  Gorge   greg@yahoo.com  No.100

You may find it useful to convert between NaN and white space; see "Replacing blank values (white space) with NaN in pandas" and pandas.DataFrame.fillna.
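
A rough sketch of that round trip, assuming blank cells arrive as empty or whitespace-only strings (the raw frame below is hypothetical): a regex replace turns them into NaN so ffill can work, and fillna('') turns the leftovers back into empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical input where missing cells are blank strings, not NaN.
raw = pd.DataFrame({
    "name": ["Bob", "", "Carly", "  "],
    "address": ["bob", "@gmail.com", "carly@world.com", ""],
})

# Empty or whitespace-only cells become NaN.
with_nan = raw.replace(r"^\s*$", np.nan, regex=True)

# Now ffill can propagate names into the formerly blank cells.
with_nan["name"] = with_nan["name"].ffill()

# Convert the remaining NaN back to empty strings before concatenating.
back = with_nan.fillna("")
print(back)
```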
