Question
I have a collection of internet-connected devices hanging around in various places. I have a dataframe that contains seven rows, one for each day in the past week. Each row contains the serial number of each device that didn't connect to my server that day. I am trying to compile a report that creates an 8th row, which contains the serial number of each device that failed to communicate for seven straight days. Here is a simplified mock-up of my dataframe:
2016-10-01, AAAA, BBBB, CCCC, EEEE
2016-10-02, AAAA, BBBB, EEEE,
2016-10-03, AAAA, BBBB, CCCC, EEEE
2016-10-04, AAAA, BBBB, CCCC, EEEE
2016-10-05, BBBB, CCCC, DDDD, EEEE
2016-10-06, AAAA, BBBB, CCCC, EEEE
2016-10-07, AAAA, BBBB, CCCC, FFFF
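For reference, the mock-up above can be rebuilt as a small DataFrame, and the desired result can be computed by intersecting the serial numbers found in each row. This is only a sketch: the names `mock_rows`, `daily_sets` and `missing_all_week` are made up for illustration, and it assumes that the first column holds the date and that every other non-empty cell holds a serial number.

```python
import pandas

# Mock-up rows from the question: a date followed by the serial numbers
# that failed to connect that day (rows may differ in length).
mock_rows = [
    ["2016-10-01", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-02", "AAAA", "BBBB", "EEEE"],
    ["2016-10-03", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-04", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-05", "BBBB", "CCCC", "DDDD", "EEEE"],
    ["2016-10-06", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-07", "AAAA", "BBBB", "CCCC", "FFFF"],
]

# pandas pads the shorter rows so that every row has the same width.
df = pandas.DataFrame(mock_rows)

# One set of serial numbers per day: skip column 0 (the date) and drop padding.
daily_sets = [set(row.dropna()) for _, row in df.iloc[:, 1:].iterrows()]

# A device missing on all seven days appears in every daily set.
missing_all_week = sorted(set.intersection(*daily_sets))
print(missing_all_week)
```

With this mock-up, only BBBB is present in all seven rows, so that is the one serial number the report should list.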
Here is the code block that is giving me problems. I am trying to compare the values in the first column with the values of each of the other columns. If I get 6 Trues, I add the serial number to a new list, then try to tack that on to the dataframe.
import datetime
import pandas

# Count every installed device for this facility.
cursor = localConnection.cursor()
cursor.execute(allInstalledQuery % ('', fac[0]))
cursor.fetchall()
totalDevices = cursor.rowcount
cursor.close()

# Build one row per day, oldest first: date, summary, then the missing serial numbers.
for i in range(7, 0, -1):
    loopCursor = localConnection.cursor()
    print(i)
    sns = []
    d = datetime.datetime.strptime(todayDate, "%Y-%m-%d") + datetime.timedelta(days=-i)
    sns.append(d.strftime("%Y-%m-%d"))
    if i != 1:
        loopCursor.execute(missingReportQuery % (i, 1, i, 1, '', fac[0]))
    else:
        loopCursor.execute(missingReportQuery % (i, 0, i, 0, '', fac[0]))
    rows = loopCursor.fetchall()
    numMissing = loopCursor.rowcount
    missingSummary = "%d / %d devices missing"
    sns.append(missingSummary % (numMissing, totalDevices))
    for row in rows:
        sns.append(row[4])
    masterList.append(sns)
    loopCursor.close()
df = pandas.DataFrame(masterList)
EDIT: The following code block is deprecated:
firstDayData = df.iloc[[0], :].values
missingSevenDays = []
for s in firstDayData:
    print(s)
    a = pandas.Series(df[1])
    b = pandas.Series(df[2])
    c = pandas.Series(df[3])
    d = pandas.Series(df[4])
    e = pandas.Series(df[5])
    sn = list(str(s))
    if a.isin(sn) is True and b.isin(sn) is True and c.isin(sn) is True and d.isin(sn) is True \
            and e.isin(sn) is True:
        missingSevenDays.append(sn)
df.append(missingSevenDays)
and has been replaced by this:
counts = df.stack().value_counts()
seven_day = counts[counts == 7]
filtered_df = df[seven_day.index]
missingSevenDays = []
for neat in filtered_df.values:
    print(neat)
I want this to print out the serial number of all devices that have been missing for 7 days. As it stands, it is just printing out []. I'm afraid I'm completely bewildered as to how to use these data structures.
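A minimal sketch of the counting approach, assuming each row is laid out the way the loop above builds it (date, summary string, then serial numbers) and that a serial number appears at most once per day. The sample rows, the total of 6 devices, and the names `master_list`, `missing_seven_days` and `report` are invented for illustration. Note that `df[seven_day.index]` selects columns by label, whereas the serial numbers are held in the index of `seven_day` itself, so the sketch reads them from there.

```python
import pandas

# Invented sample in the layout produced by the loop: date, summary, serials.
master_list = [
    ["2016-10-01", "4 / 6 devices missing", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-02", "3 / 6 devices missing", "AAAA", "BBBB", "EEEE"],
    ["2016-10-03", "4 / 6 devices missing", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-04", "4 / 6 devices missing", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-05", "4 / 6 devices missing", "BBBB", "CCCC", "DDDD", "EEEE"],
    ["2016-10-06", "4 / 6 devices missing", "AAAA", "BBBB", "CCCC", "EEEE"],
    ["2016-10-07", "4 / 6 devices missing", "AAAA", "BBBB", "CCCC", "FFFF"],
]
df = pandas.DataFrame(master_list)

# Count serial numbers only: columns 0 and 1 hold the date and the summary.
serials = df.iloc[:, 2:]
counts = serials.stack().value_counts()  # value_counts ignores padding cells

# Keep the values seen once per row; the index holds the serial numbers.
seven_day = counts[counts == len(df)]
missing_seven_days = sorted(seven_day.index)
print(missing_seven_days)

# Add the result as an eighth row; unused cells are left empty.
summary = "%d devices missing all week" % len(missing_seven_days)
report_row = pandas.DataFrame([["7 days", summary] + missing_seven_days])
report = pandas.concat([df, report_row], ignore_index=True)
```

Using `len(df)` instead of a hard-coded 7 keeps the filter correct if the report window changes, and slicing off the first two columns prevents a repeated date or summary string from being counted as a serial number.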
Source: https://stackoverflow.com/questions/40244094/python-pandas-determine-if-values-in-column-0-are-repeated-in-each-subsequent