Question
When I am programming on Colab, I keep running into this issue:
Here is my df:
0 1
0 [2.7436598593417045e-05, 3.731542193080655e-05]
1 [8.279973504084787e-05, 2.145002145002145e-05]
2 [0.00022534319714215346, 0.0002031172259231674]
3 [3.239841667031943e-05, 2.7771297808289177e-05]
4 [0.00011311134356928321, 9.428422928088026e-05]
I want to get the data from df[1] into a list of lists so I can feed it into my model. To do so, I run:
df[1].to_list()
and I get:
['[2.7436598593417045e-05, 3.731542193080655e-05]',
'[8.279973504084787e-05, 2.145002145002145e-05]',
'[0.00022534319714215346, 0.00020311722592316746]',
'[3.239841667031943e-05, 2.7771297808289177e-05]',
'[0.00011311134356928321, 9.428422928088026e-05]']
This is a list of strings, which I cannot feed into the model. I use this code locally all the time and it works fine, but on Colab I get this result. Any ideas? The result I want is:
[[2.7436598593417045e-05, 3.731542193080655e-05],
[8.279973504084787e-05, 2.145002145002145e-05],
[0.00022534319714215346, 0.00020311722592316746],
[3.239841667031943e-05, 2.7771297808289177e-05],
[0.00011311134356928321, 9.428422928088026e-05]]
Answer 1:
Try ast.literal_eval, which parses each string into the Python list it represents:
from ast import literal_eval
df[1].map(literal_eval).to_list()
[[2.7436598593417045e-05, 3.731542193080655e-05],
[8.279973504084787e-05, 2.145002145002145e-05],
[0.00022534319714215346, 0.00020311722592316746],
[3.239841667031943e-05, 2.7771297808289177e-05],
[0.00011311134356928321, 9.428422928088026e-05]]
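One caveat: literal_eval raises a ValueError when a cell is not a string (for example, if the column already holds real lists, or contains NaN). A minimal sketch of a more defensive variant follows; the helper name parse_cell is hypothetical, and the two sample rows are copied from the question.

```python
from ast import literal_eval

import pandas as pd


def parse_cell(value):
    """Parse a cell only if it is a string; pass anything else through.

    parse_cell is a hypothetical helper name, not part of pandas.
    """
    return literal_eval(value) if isinstance(value, str) else value


# Two rows from the question, stored as strings (as seen on Colab).
df = pd.DataFrame({1: ['[2.7436598593417045e-05, 3.731542193080655e-05]',
                       '[8.279973504084787e-05, 2.145002145002145e-05]']})

rows = df[1].map(parse_cell).to_list()
print(rows)

# Applying the helper to a column that already holds lists is a no-op.
already_lists = pd.Series([[1, 3], [3, 5]]).map(parse_cell).to_list()
print(already_lists)
```

This way the same line works both locally (where the column may already hold lists) and on Colab (where it holds strings).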
Answer 2:
If I make a dataframe with list elements:
In [135]: df = pd.DataFrame([[1,[1,3]],[2,[3,5]]])
In [136]: df
Out[136]:
0 1
0 1 [1, 3]
1 2 [3, 5]
In [137]: df.dtypes
Out[137]:
0 int64
1 object
dtype: object
In [138]: df[1].to_list()
Out[138]: [[1, 3], [3, 5]]
Doing the same with strings of lists:
In [139]: df1 = pd.DataFrame([[1,'[1,3]'],[2,'[3,5]']])
In [140]: df1
Out[140]:
0 1
0 1 [1,3]
1 2 [3,5]
In [141]: df1.dtypes
Out[141]:
0 int64
1 object
dtype: object
In [142]: df1[1].to_list()
Out[142]: ['[1,3]', '[3,5]']
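Since both frames report dtype object for column 1, df.dtypes cannot tell the two cases apart. A quick way to check, sketched here with the same two frames, is to look at the Python type of the elements themselves:

```python
import pandas as pd

df = pd.DataFrame([[1, [1, 3]], [2, [3, 5]]])       # real lists
df1 = pd.DataFrame([[1, '[1,3]'], [2, '[3,5]']])    # strings that look like lists

# Collect the distinct element types found in column 1 of each frame.
types_in_df = {type(v).__name__ for v in df[1]}
types_in_df1 = {type(v).__name__ for v in df1[1]}

print(types_in_df)    # element types in the list-valued column
print(types_in_df1)   # element types in the string-valued column
```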
df1 looks just like df, except the column elements are strings. A df1 type of frame often results from saving df to a csv, and reloading it.
In [143]: df.to_csv('test.csv')
In [144]: cat test.csv
,0,1
0,1,"[1, 3]"
1,2,"[3, 5]"
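To fit the CSV table format, to_csv has to quote the lists, because they contain commas. When the file is read back, each quoted cell comes back as a string.
Source: https://stackoverflow.com/questions/61531470/pd-series-to-list-changing-dtype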
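If the frame really does come from a CSV, the strings can also be parsed at load time. The following is a minimal sketch under assumptions: the column names id and vals are made up for illustration, and an in-memory StringIO buffer stands in for the file. It uses the converters parameter of pd.read_csv together with literal_eval.

```python
from ast import literal_eval
from io import StringIO

import pandas as pd

# Hypothetical frame with a list-valued column; names are illustrative only.
df = pd.DataFrame({'id': [1, 2], 'vals': [[1, 3], [3, 5]]})

# Write to an in-memory buffer instead of a real file.
buffer = StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# A plain reload gives strings in the 'vals' column.
plain = pd.read_csv(StringIO(csv_text))
print(plain['vals'].to_list())

# With a converter, each cell is parsed back into a list while loading.
parsed = pd.read_csv(StringIO(csv_text), converters={'vals': literal_eval})
print(parsed['vals'].to_list())
```

Formats that preserve Python objects, such as pickle, avoid the round trip through text entirely, at the cost of no longer being human-readable.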