Largest element across all lists in a pandas Series

Submitted by 十年热恋 on 2019-12-31 05:04:26

Question


I have a pandas Series, say:

import pandas as pd
a = pd.Series([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 3, 334],
    [333, 4, 5, 3, 4]
])

I want to find the largest element across all the lists, which is 334 here. What is an easy way to do it?


Answer 1:


Option 1
This only works if the elements are actual lists, because sum concatenates lists. It is also likely to be very slow.

max(a.sum())

334
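
To make the "sum concatenates lists" point concrete, here is a minimal sketch that reuses the Series from the question and inspects the intermediate value. The name `flat` is introduced only for illustration.

```python
import pandas as pd

a = pd.Series([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 3, 334],
    [333, 4, 5, 3, 4],
])

# Summing an object Series of lists applies `+` between the elements,
# and `+` on two lists concatenates them.
flat = a.sum()

print(len(flat))  # 15: the three 5-item lists joined end to end
print(max(flat))  # 334
```

Each `+` builds a brand-new list, so the work grows roughly quadratically with the number of lists, which fits the remark that this option is likely to be slow.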

Option 2
A minimal two-tiered application of max.

max(map(max, a))

334

Option 3
This only works if all lists are the same length.

import numpy as np
np.max(a.tolist())

334
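
Because Option 3 depends on equal lengths, one possible guard is to check the lengths first and fall back to the two-tier max otherwise. This is only a sketch: `largest_element` is a hypothetical helper, not part of pandas or numpy, and the `int(...)` conversion assumes the lists hold integers.

```python
import numpy as np
import pandas as pd


def largest_element(series):
    """Return the largest value across all lists in `series` (hypothetical helper)."""
    lengths = series.map(len)
    if lengths.nunique() == 1:
        # Equal lengths: the nested lists form a regular 2-D array.
        return int(np.max(series.tolist()))
    # Ragged lists: fall back to the two-tier max from Option 2.
    return int(max(map(max, series)))


a = pd.Series([[1, 2, 3, 4, 5], [6, 7, 8, 3, 334], [333, 4, 5, 3, 4]])
ragged = pd.Series([[1, 2], [3], [4, 5, 6]])  # illustrative data, not from the question

print(largest_element(a))       # 334
print(largest_element(ragged))  # 6
```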

Option 4
A single application of max over a flattened generator.

max(x for l in a for x in l)

334



Answer 2:


Convert to a DataFrame:

pd.DataFrame(a.values.tolist()).max().max()
Out[200]: 334

Or use numpy.concatenate:

np.concatenate(a.values).max()
Out[201]: 334

Or flatten with the built-in sum:

max(sum(a,[]))
Out[205]: 334



Answer 3:


This is one way:

max(max(i) for i in a)

Functional variant:

max(map(max, a))

An alternative that computes only a single max:

from toolz import concat

max(concat(a))
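
If toolz is not installed, the standard library offers a comparable lazy flattener. The sketch below swaps in `itertools.chain.from_iterable` for `toolz.concat`; that substitution is mine and not part of the original answer.

```python
from itertools import chain

import pandas as pd

a = pd.Series([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 3, 334],
    [333, 4, 5, 3, 4],
])

# chain.from_iterable lazily yields every element of every list in turn,
# so max sees one flat stream and no intermediate list is built.
largest = max(chain.from_iterable(a))

print(largest)  # 334
```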

For fun, here is some benchmarking. The lazy concat function and the map / generator-expression versions do best, followed by the numpy functions; the pandas methods are usually slower, and the clever sum tricks come last.

import numpy as np
from toolz import concat
import pandas as pd

a = pd.Series([list(np.random.randint(0, 10, 100)) for i in range(1000)])

# times in ms
5.92  max(concat(a))
6.29  max(map(max, a))
6.67  max(max(i) for i in a)
17.4  max(x for l in a for x in l)
19.2  np.max(a.tolist())
20.4  np.concatenate(a.values).max()
64.6  pd.DataFrame(a.values.tolist()).max().max()
373   np.max(a.apply(pd.Series).values)
672   max(sum(a,[]))
696   max(a.sum())
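
The answer does not say how these timings were taken. As a rough sketch, a few of the candidates could be re-timed with the standard `timeit` module as shown here. The harness, the `candidates` and `timings_ms` names, and the use of `itertools.chain.from_iterable` in place of `toolz.concat` are all assumptions, and the numbers will differ by machine and library version.

```python
import timeit
from itertools import chain

import numpy as np
import pandas as pd

# Same shape of data as the benchmark above: 1000 lists of 100 small ints.
a = pd.Series([list(np.random.randint(0, 10, 100)) for _ in range(1000)])

# A few of the candidates; chain.from_iterable stands in for toolz.concat.
candidates = {
    "max(chain.from_iterable(a))": lambda: max(chain.from_iterable(a)),
    "max(map(max, a))": lambda: max(map(max, a)),
    "max(x for l in a for x in l)": lambda: max(x for l in a for x in l),
    "np.concatenate(a.values).max()": lambda: np.concatenate(a.values).max(),
}

timings_ms = {}
for label, func in candidates.items():
    # Best of 3 repeats, 5 calls each, converted to milliseconds per call.
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    timings_ms[label] = best * 1000

# Print fastest first, in the same "time  expression" layout as above.
for label, ms in sorted(timings_ms.items(), key=lambda kv: kv[1]):
    print(f"{ms:8.2f}  {label}")
```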



Answer 4:


Yet another answer using np.max:

import numpy as np
np.max(a.apply(pd.Series).values)
Out[175]: 334


Source: https://stackoverflow.com/questions/48554205/largest-element-all-lists-in-panda-series
