问题
I know this question was asked before but each case is different... My plea is this:
df = pd.read_csv(‘file.csv’)
# convert the string into a datetime object
time = pd.to_datetime(df.dttm_utc)
Month=time.dt.month
Day=time.dt.day
Hour=time.dt.Hour
InDayLightSavings=True
if (Month<3): InDayLightSavings=False
if (Month==3) and (Day<11) and (Hour<2): InDayLightSavings=False
if (Month>11): InDayLightSavings=False
if (Month==11) and (Day>4)and (Hour>=2): InDayLightSavings=False
if (InDayLightSavings):
time=time-datetime.timedelta(hours=1)
And it returns, as you guessed correctly,Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). I used this with timestamp, changing it to an ISO8601 before and this method works but it doesn't work for series apparently. And I tried adding .any() and it doesn't work. I also change and to & as suggested in other thread. A par of my file.csv looks like this, running til end of 2012:
timestamp dttm_utc value
1325376300 2012-01-01 0:05:00 16.9444
1325376600 2012-01-01 0:10:00 16.6837
1325376900 2012-01-01 0:15:00 16.6837
1325377200 2012-01-01 0:20:00 16.9444
1325377500 2012-01-01 0:25:00 16.1623
1325377800 2012-01-01 0:30:00 16.6837
3/13/2016 1:00 51
3/13/2016 1:15 48
3/13/2016 1:30 50.4
3/13/2016 1:45 51
3/13/2016 3:00 47.4
3/13/2016 3:15 49.8
3/13/2016 3:30 51
3/13/2016 3:45 51
3/13/2016 4:00 48.6
Any help is appreciated.Thank you!
回答1:
The exception you are seeing is due to the fact that you try to evaluate a series with many different entries against a set of single conditions. Briefly, let's have a look what you do:
Error analysis (why not to do it like that):
First, you did take a pandas dataframe column and then converted it to datetime, which of course also returns a column (series).
time = pd.to_datetime(df.dttm_utc) # Convert content of dttm_utc COLUMN to datetime
# This returns a dataframe COLUMN / series
Month = time.dt.month # Convert content of your COLUMN/series to month
Day = time.dt.day # Convert content of your COLUMN/series to month
Hour = time.dt.Hour # Convert content of your COLUMN/series to month
Your mistake: You then try to assess specific conditions along the series:
if (Month == whatever_condition):
do_something()
However, you can't compare a single condition to a series, at least not like that. Python doesn't know which entry in the series you mean, as some values in it may be different to others. That means, for some items in the series the condition may be fulfilled, for others not. Hence the ValueError: The truth value of a series is ambiguous
.
What you want to do instead:
Evaluate item by item, ideally in a vectorized way. My suggestion: stay in the pandas dataframe all time:
df['Datetime'] = pd.to_datetime(df['dttm_utc']) # Add second column with datetime format
df['Month'] = df.Datetime.dt.month # Extract month and add to new column
# Same for day
df.loc[(df.Month < 3), 'InDayLightSavings'] = False
# You can add multiple conditions here
# Finally, your filter:
df.loc[(df.InDayLightSavings == True), 'Time'] = df['Time'] - dt.timedelta(hours=1)
# dt when import datetime as dt, else just datetime
Further reading here, here, here and here.
回答2:
How about using & instead of and?
The or and and python statements require truth-values. We do this because in pandas these are considered ambiguous so we should use bitwise operators "|" (or) or "&" (and) operations.
if (Month<3): InDayLightSavings=False
if (Month==3) & (Day<11) & (Hour<2): InDayLightSavings=False
if (Month>11): InDayLightSavings=False
if (Month==11) & (Day>4) & (Hour>=2): InDayLightSavings=False
(PS : I could've commented this, but dont have the reputation to do so.)
来源:https://stackoverflow.com/questions/50828894/python-use-if-function-valueerrortruth-value-of-a-series-is-ambiguous-use-a-e