Converting a pandas DataFrame list column into a string


悲哀的现实

I have a pandas DataFrame. One of the columns contains lists. I want each value in that column to be a single string.

For example my list ['one','two','three'] should sim

4 Answers

情书的邮戳

2020-11-30 10:28

Pandas offers a method for this, Series.str.join.
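
As a minimal sketch of that method (the DataFrame and the column names `A` and `joined` below are made up for illustration), `Series.str.join` takes the separator and joins the list stored in each row:

```python
import pandas as pd

# Hypothetical example data: each cell of column 'A' holds a list of strings.
df = pd.DataFrame({'A': [['one', 'two', 'three'], ['a', 'b']]})

# Series.str.join joins the elements of each list with the given separator.
df['joined'] = df['A'].str.join(', ')

print(df['joined'].tolist())
```

The separator is passed as the argument, so `df['A'].str.join('')` would concatenate the elements with nothing in between.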


春和景丽

2020-11-30 10:35

When you cast col to str with astype, you get a string representation of a Python list, brackets and all. You do not need to do that; just apply join directly:

```
import pandas as pd

df = pd.DataFrame({
    'A': [['a', 'b', 'c'], ['A', 'B', 'C']]
    })

# Out[8]:
#            A
# 0  [a, b, c]
# 1  [A, B, C]

df['Joined'] = df.A.apply(', '.join)

#            A   Joined
# 0  [a, b, c]  a, b, c
# 1  [A, B, C]  A, B, C
```
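
To make the difference concrete, here is a small sketch contrasting the two approaches on the same data. The second column, `N`, and the `map(str, ...)` variant are additions for illustration, assuming the lists might hold non-string items; they are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [['a', 'b', 'c'], ['A', 'B', 'C']],
    'N': [[1, 2], [3]],  # hypothetical column with non-string items
})

# astype(str) keeps the list syntax: brackets, quotes and commas.
as_text = df['A'].astype(str)

# Joining works on the list itself, so only the elements remain.
joined = df['A'].apply(', '.join)

# str.join needs string items; convert each item first if they are not strings.
nums_joined = df['N'].apply(lambda items: ', '.join(map(str, items)))

print(as_text.tolist())
print(joined.tolist())
print(nums_joined.tolist())
```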

鱼传尺愫

2020-11-30 10:36

You could convert your list to str with astype(str) and then remove the ', [ and ] characters. Using @Yakym's example:

```
In [114]: df
Out[114]:
           A
0  [a, b, c]
1  [A, B, C]

In [115]: df.A.astype(str).str.replace(r"[\[\]']", '', regex=True)
Out[115]:
0    a, b, c
1    A, B, C
Name: A, dtype: object
```
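
One caveat with stripping characters from the string representation is that it also touches characters inside the data. The sketch below uses a made-up Series whose first element contains an apostrophe, and compares the result with a plain join:

```python
import pandas as pd

# Hypothetical data: one of the list items contains an apostrophe.
s = pd.Series([["it's", 'b']])

# Strip brackets and single quotes from the string representation.
stripped = s.astype(str).str.replace(r"[\[\]']", '', regex=True)

# Join the list items directly.
joined = s.apply(', '.join)

print(stripped.iloc[0])
print(joined.iloc[0])
```

Here the apostrophe in the data is removed along with the list's own quotes, and the double quotes Python uses for that element's repr are left behind, so the two results differ.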

Timing

```
import pandas as pd
df = pd.DataFrame({'A': [['a', 'b', 'c'], ['A', 'B', 'C']]})
df = pd.concat([df]*1000)

In [2]: timeit df['A'].apply(', '.join)
292 µs ± 10.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [3]: timeit df['A'].str.join(', ')
368 µs ± 24.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: timeit df['A'].apply(lambda x: ', '.join(x))
505 µs ± 5.74 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [5]: timeit df['A'].str.replace('\[|\]|\'', '')
2.43 ms ± 62.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

花落未央

2020-11-30 10:42
You should certainly not convert to string before you transform the list. Try:
```
df['col'].apply(', '.join)
```
Also note that apply applies the function to each element of the Series, so using df['col'] inside the lambda function is probably not what you want.
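
A quick way to see what `apply` hands to the function is to inspect each argument. This is only an illustrative sketch with a made-up `col` column:

```python
import pandas as pd

df = pd.DataFrame({'col': [['a', 'b', 'c'], ['A', 'B']]})

# apply calls the function once per element; here each element is a list.
seen_types = df['col'].apply(lambda x: type(x).__name__)
lengths = df['col'].apply(len)

# Because each call receives one list, a bound str.join can be passed directly.
joined = df['col'].apply(', '.join)

print(seen_types.tolist())
print(lengths.tolist())
print(joined.tolist())
```

Since the function already receives the individual list, there is no reason to refer back to `df['col']` inside it.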

Edit: thanks Yakym for pointing out that there is no need for a lambda function.

Edit: as noted by Anton Protopopov, there is a native .str.join method, but it is (surprisingly) a bit slower than apply.
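
To check the speed difference on your own machine, something like the following sketch could be used. It relies on the standard-library `timeit` module rather than IPython's `timeit` magic, the repeat counts are arbitrary choices, and the numbers will vary with hardware and pandas version:

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': [['a', 'b', 'c'], ['A', 'B', 'C']]})
df = pd.concat([df] * 1000, ignore_index=True)

# The two approaches to compare, wrapped as zero-argument callables.
candidates = {
    "apply(', '.join)": lambda: df['A'].apply(', '.join),
    "str.join(', ')": lambda: df['A'].str.join(', '),
}

results = {}
for label, fn in candidates.items():
    # Best of 5 repeats of 100 calls each, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=100, repeat=5)) / 100
    results[label] = best * 1e6
    print(f"{label}: {results[label]:.0f} µs per call")
```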