Question
I am seeing odd behaviour in a pandas DataFrame. I call .apply(single_seats_comma) on a column whose cells look like (1,2). However, such a cell seems to come back as range(1,3) instead of the string (1,2). Other rows have more than two entries, e.g. (30,31,32). My function splits on , and converts each value in the brackets into a new row, but it breaks on (x,x).
def single_seats_comma(row):
    strlist = str(row).split(',')
    strlist = filter(None, strlist)
    intlist = []
    for el in strlist:
        intlist.append(int(el))
    return intlist
Example for 'apply':
tickets['seats'][:1].apply(single_seats_comma)
The function raises this error:
ValueError: invalid literal for int() with base 10: 'range(1'
Trying to find a solution, I found this:
str(tickets['seats'][:1])
>>'0 (1, 2)\nName: seats, dtype: object'
tickets['seats'][:1].values
>> '[range(1, 3)]'
It works on a column if the values are just 1,2.
Any help is much appreciated!
Answer 1:
Perhaps it would be easier to iterate over the elements of each cell directly instead of converting it to a string and then splitting. This is simple enough to do with a lambda.
tickets['seats'][:1].apply(lambda row: [int(e) for e in row])
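To see why the string route fails while direct iteration works, here is a minimal sketch in plain Python. The variable `cell` is a hypothetical stand-in for the value that tickets['seats'][0] holds in the question.

```python
# Hypothetical cell value, matching what tickets['seats'][0] shows in the question
cell = range(1, 3)

# str() of a range gives its repr, not its elements
as_text = str(cell)            # 'range(1, 3)'
pieces = as_text.split(',')    # ['range(1', ' 3)']; int('range(1') raises ValueError

# Iterating over the range directly yields the integers
seats = [int(e) for e in cell]  # [1, 2]
```

The first piece, 'range(1', is exactly the literal named in the ValueError from the question.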
Answer 2:
I cannot reproduce the range string.
But this function should work for both cases:
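def single_seats_comma(row):
    if type(row) is tuple:
        return list(row)
    elif type(row) is range:
        # Return the first and last value of the range
        res = [row.start]
        end = row.stop - 1
        # Add the last value only if the range holds more than one value,
        # so that range(1, 3) gives [1, 2] and range(3, 4) gives [3]
        if end > row.start:
            res.append(end)
        return res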
Example:
>>> tickets = pd.DataFrame({'seats': [(100, 1022), range(3, 4), range(2, 10)]})
>>> tickets['seats'].apply(single_seats_comma)
0    [100, 1022]
1            [3]
2         [2, 9]
Name: seats, dtype: object
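If the column might hold a mix of strings such as '1,2', tuples and ranges, one possible variant is sketched below. This is not the answerer's function: the name `to_seat_list` is hypothetical, and it assumes every element can be converted with int(). Unlike the function above, it returns every value in a range rather than only the first and last.

```python
def to_seat_list(cell):
    """Return the seat numbers in `cell` as a list of ints."""
    if isinstance(cell, str):
        # '(1,2)' or '1,2': strip brackets, split on commas, skip empty pieces
        parts = cell.strip('()[] ').split(',')
        return [int(p) for p in parts if p.strip()]
    # range, tuple, list: iterate over the elements directly
    return [int(e) for e in cell]
```

It can be passed to apply in the same way, e.g. tickets['seats'].apply(to_seat_list).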
Answer 3:
Thanks to all contributors for getting me closer to a solution. The solution is actually quite simple.
The challenge was that pandas held (1,2) as a range and not as a string. However, the goal was to create a list of all values, which I originally tried to do by splitting a string on ','. That is not needed!
Calling list() on the range does the job already. Here is the example and solution:
list(range(11, 17))
>> [11, 12, 13, 14, 15, 16]
tickets['seats'][0]
>> range(1, 3)
list(tickets['seats'][0])
>> [1, 2]
So the solutions are:
def single_seats_comma(row):
    strlist = list(row)
    return strlist
tickets['seats'].apply(single_seats_comma)
or
tickets['seats'].apply(lambda row: list(row))
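Since the original goal was to turn each seat into its own row, the list conversion can be followed by Series/DataFrame explode (available from pandas 0.25). The sketch below uses made-up data: the 'order' column and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: the column mixes a range object and a tuple
tickets = pd.DataFrame({
    'order': ['a', 'b'],
    'seats': [range(1, 3), (30, 31, 32)],
})

# list() accepts any iterable, so it can be passed to apply directly
tickets['seats'] = tickets['seats'].apply(list)

# explode gives each seat its own row and repeats the other columns
per_seat = tickets.explode('seats').reset_index(drop=True)
```

Here per_seat has five rows, one per seat, with 'order' repeated for each seat of the same ticket.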
Source: https://stackoverflow.com/questions/48019434/force-pandas-to-interpret-1-2-in-column-as-string-and-not-as-range