问题
I am using a module called pyhaystack to retrieve data (rest API) from a building automation system based on 'tags.' Python will return a dictionary of the data. Im trying to use pandas with an If Else statement further below that I am having trouble with. The pyhaystack is working just fine to get the data...
This connects me to the automation system: (works just fine)
from pyhaystack.client.niagara import NiagaraHaystackSession
import pandas as pd
session = NiagaraHaystackSession(uri='http://0.0.0.0', username='Z', password='z', pint=True)
This code finds my tags called 'znt', converts dictionary to Pandas, and filters for time: (works just fine for the two points)
znt = session.find_entity(filter_expr='znt').result
znt = session.his_read_frame(znt, rng= '2018-01-01,2018-02-12').result
znt = pd.DataFrame.from_dict(znt)
znt.index.names=['Date']
znt = znt.fillna(method = 'ffill').fillna(method = 'bfill').between_time('08:00','17:00')
What I am most interested in is the column name, where ultimately I want Python to return the column named based on conditions:
print(znt.columns)
print(znt.values)
Returns:
Index(['C.Drivers.NiagaraNetwork.Adams_Friendship.points.A-Section.AV1.AV1ZN~2dT', 'C.Drivers.NiagaraNetwork.points.A-Section.AV2.AV2ZN~2dT'], dtype='object')
[[ 65.9087 66.1592]
[ 65.9079 66.1592]
[ 65.9079 66.1742]
...,
[ 69.6563 70.0198]
[ 69.6563 70.2873]
[ 69.5673 70.2873]]
I am most interested in this name of the Pandas dataframe. C.Drivers.NiagaraNetwork.Adams_Friendship.points.A-Section.AV1.AV1ZN~2dT
For my two arrays, I am subtracting the value of 70 for the data in the data frames. (works just fine)
znt_sp = 70
deviation = znt - znt_sp
deviation = deviation.abs()
deviation
And this is where I am getting tripped up in Pandas. I want Python to print the name of the column if the deviation is greater than four else print this zone is Normal. Any tips would be greatly appreciated..
if (deviation > 4).any():
print('Zone %f does not make setpoint' % deviation)
else:
print('Zone %f is Normal' % deviation)
The columns names in Pandas are the: C.Drivers.NiagaraNetwork.Adams_Friendship.points.A-Section.AV1.AV1ZN~2dT
回答1:
Solution:
You can iterate over columns
for col in df.columns:
if (df[col] > 4).any(): # or .all if needed
print('Zone %s does not make setpoint' % col)
else:
print('Zone %s is Normal' % col)
Or by defining a function and using apply
def _print(x):
if (x > 4).any():
print('Zone %s does not make setpoint' % x.name)
else:
print('Zone %s is Normal' % x.name)
df.apply(lambda x: _print(x))
# you can even do
[_print(df[col]) for col in df.columns]
Advice: maybe you would keep the result in another structure, change the function to return a boolean series that "is normal":
def is_normal(x):
return not (x > 4).any()
s = df.apply(lambda x: is_normal(x))
# or directly
s = df.apply(lambda x: not (x > 4).any())
it will return a series s
where index
is column names of your df
and values
a boolean corresponding to your condition.
You can then use it to get all the Normal columns names s[s].index
or the non-normal s[~s].index
Ex : I want only the normal columns of my df: df[s[s].index]
A complete example
For the example I will use a sample df
with a different condition from yours (I check if no element is lower than 4 - Normal else Does not make the setpoint )
df = pd.DataFrame(dict(a=[1,2,3],b=[2,3,4],c=[3,4,5])) # A sample
print(df)
a b c
0 1 2 3
1 2 3 4
2 3 4 5
Your use case: Print if normal or not - Solution
for col in df.columns:
if (df[col] < 4).any():
print('Zone %s does not make setpoint' % col)
else:
print('Zone %s is Normal' % col)
Result
Zone a is Normal
Zone b is does not make setpoint
Zone c is does not make setpoint
To illustrate my Advice : Keep the is_normal
columns in a series
s = df.apply(lambda x: not (x < 4).any()) # Build the series
print(s)
a True
b False
c False
dtype: bool
print(df[s[~s].index]) #Falsecolumns df
b c
0 2 3
1 3 4
2 4 5
print(df[s[s].index]) #Truecolumns df
a
0 1
1 2
2 3
回答2:
I think DataFrame would be a good way to handle what you want. Starting with znt you can make all the calculation there :
deviation = znt - 70
deviation = deviation.abs()
# and the cool part is filtering in the df
problem_zones =
deviation[deviation['C.Drivers.NiagaraNetwork.Adams_Friendship.points.A-
Section.AV1.AV1ZN~2dT']>4]
You can play with this and figure out a way to iterate through columns, like :
for each in df.columns:
# if in this column, more than 10 occurences of deviation GT 4...
if len(df[df[each]>4]) > 10:
print('This zone have a lot of troubles : ', each)
edit
I like adding columns to a DataFrame instead of just building an external Series.
df[‘error_for_a’] = df[a] - 70
This open possibilities and keep everything together. One could use
df[df[‘error_for_a’]>4]
Again, all() or any() can be useful but in a real life scenario, we would probably need to trig the “fault detection” when a certain number of errors are present.
If the schedule has been set ‘occupied’ at 8hAM.... maybe the first entries won’t be correct.... (any would trig an error even if the situation gets better 30minutes later). Another scenario would be a conference room where error is tiny....but as soon as there are people in it...things go bad (all() would not see that).
来源:https://stackoverflow.com/questions/48874706/python-pandas-pyhaystack