Question
I have a data frame with text data like the one below,
   name   address          number
1  Bob    bob              No.56
2         @gmail.com
3  Carly  carly@world.com  No.90
4  Gorge  greg@yahoo
5         .com
6                          No.100
and I want to turn it into this frame.
name | address | number
1 Bob bob@gmail.com No.56
2 Carly carly@world.com No.90
3 Gorge greg@yahoo.com No.100
I am using pandas to read the file, but I am not sure how to use merge or concat.
Answer 1:
If the name column consists of unique values,
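print(df)
name address number
0 Bob bob No.56
1 NaN @gmail.com NaN
2 Carly carly@world.com No.90
3 Gorge greg@yahoo NaN
4 NaN .com NaN
5 NaN NaN No.100
# carry each name down to the continuation rows that belong to it
df['name'] = df['name'].ffill()
# replace NaN with '' so that sum() concatenates the string fragments per name
print(df.fillna('').groupby(['name'], as_index=False).sum())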
name address number
0 Bob bob@gmail.com No.56
1 Carly carly@world.com No.90
2 Gorge greg@yahoo.com No.100
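For anyone who wants to run this end to end, here is a minimal sketch. It rebuilds the sample frame by hand (using None for the blank cells is an assumption about how the file was read) and joins the fragments with ''.join instead of sum(). Both concatenate the strings, but the join states the intent more explicitly.

```python
import pandas as pd

# Sample data rebuilt by hand; None marks the cells that were blank in the file.
df = pd.DataFrame({
    "name": ["Bob", None, "Carly", "Gorge", None, None],
    "address": ["bob", "@gmail.com", "carly@world.com", "greg@yahoo", ".com", None],
    "number": ["No.56", None, "No.90", None, None, "No.100"],
})

# Carry each name down to the continuation rows that belong to it.
df["name"] = df["name"].ffill()

# Blank out the remaining NaN cells, then glue the fragments of each group together.
merged = (
    df.fillna("")
      .groupby("name", as_index=False)
      .agg({"address": "".join, "number": "".join})
)

print(merged)
```

This relies on the fragments of each record appearing in the right order in the file, since the join simply concatenates them top to bottom.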
You may need things like ffill(), bfill(), [::-1], .groupby('name').apply(lambda x: ' '.join(x['address'])), strip(), lstrip(), rstrip() and replace() to extend the above code to more complicated data.
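One example of "more complicated data" is a file where the name column is not unique, so grouping on name would merge two different people. A minimal sketch, assuming every record starts on a row where name is filled in: number the records with notna().cumsum() and group on that number instead. The record_id name and the second "Bob" record are hypothetical, made up only for illustration.

```python
import pandas as pd

# Hypothetical data: two different records that share the name "Bob".
df = pd.DataFrame({
    "name": ["Bob", None, "Bob", None],
    "address": ["bob", "@gmail.com", "bob.s", "@world.com"],
    "number": ["No.56", None, None, "No.7"],
})

# A new record starts wherever name is present, so the running count of
# non-null names numbers the records 1, 2, ...
record_id = df["name"].notna().cumsum().rename("record_id")

# Join the fragments of every column within each record.
merged = (
    df.fillna("")
      .groupby(record_id)
      .agg({col: "".join for col in df.columns})
      .reset_index(drop=True)
)

print(merged)
```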
Answer 2:
If you want to convert a data frame of six rows (with a possible NaN entry in each column), there might be no direct pandas method for that.
You will need some code to fill in the values of the name column, so that pandas knows that the split rows bob and @gmail.com belong to the same user, Bob.
You can fill each empty entry in the name column with its preceding user using the fillna or ffill methods, see pandas dataframe missing data.
df['name'] = df['name'].ffill()
# gives
name address number
0 Bob bob No.56
1 Bob @gmail.com
2 Carly carly@world.com No.90
3 Gorge greg@yahoo
4 Gorge .com
5 Gorge No.100
Then you can use groupby with sum as the aggregation function.
df.groupby(['name']).sum().reset_index()
# gives
name address number
0 Bob bob@gmail.com No.56
1 Carly carly@world.com No.90
2 Gorge greg@yahoo.com No.100
You may find converting between NaN and white space useful, see Replacing blank values (white space) with NaN in pandas and pandas.DataFrame.fillna.
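As a rough illustration of that conversion, the sketch below goes in both directions on a small hypothetical frame: blank or whitespace-only cells become NaN (so that ffill() treats them as missing), and NaN becomes an empty string (so that string concatenation works). The regex pattern is one common way to match blank cells, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which a missing name was read as a single space.
df = pd.DataFrame({
    "name": ["Bob", " ", "Carly"],
    "address": ["bob", "@gmail.com", "carly@world.com"],
})

# Blank or whitespace-only cells -> NaN, so that ffill() can see them as missing.
with_nan = df.replace(r"^\s*$", np.nan, regex=True)

# NaN -> empty string, so that joining the fragments does not hit a float NaN.
with_blank = with_nan.fillna("")

print(with_nan)
print(with_blank)
```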
Source: https://stackoverflow.com/questions/42240022/python-pandas-merge-two-or-more-lines-of-text-into-one-line