Averaging down a column of averaged data

耗尽温柔 posted on 2019-12-06 12:00:18

I'm not sure I understand which columns you want to average in 3), but maybe this does what you want:

with open("test2.xls") as w:
    next(w)  # skip over header row
    for row in w:
        # split the tab-separated row into its 26 fields
        (date, time, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t,
         u, LZA, SZA, LAM) = row.rstrip("\n").split("\t")

        # average each consecutive group of three columns
        A = [(float(a) + float(b) + float(c))/3,
             (float(d) + float(e) + float(f))/3,
             (float(g) + float(h) + float(i))/3,
             (float(j) + float(k) + float(l))/3,
             (float(m) + float(n) + float(o))/3,
             (float(p) + float(q) + float(r))/3,
             (float(s) + float(t) + float(u))/3]
        print(('[' + ', '.join(['{:.6f}']*len(A)) + ']').format(*A))
        avg = sum(A)/len(A)  # average of the seven group averages
        print(avg)

You could do the same thing a little more concisely with code like the following:

avg = lambda nums: sum(nums)/float(len(nums))

with open("test2.xls") as w:
    next(w)  # skip over header row
    for row in w:
        cols = row.rstrip("\n").split("\t")  # split into columns
        # then split that into fields; columns 2..22 hold the 21 numeric values
        date, time, values, LZA, SZA, LAM = (cols[0], cols[1],
                                             [float(x) for x in cols[2:23]],
                                             cols[23], cols[24], cols[25])
        # average each consecutive group of three values
        A = [avg(values[i:i+3]) for i in range(0, 21, 3)]
        print(('[' + ', '.join(['{:.6f}']*len(A)) + ']').format(*A))
        print(avg(A))

You can use the decimal module to control the precision of the numbers you display.

from decimal import *
getcontext().prec = 6 # sets the precision to 6

Note that the precision counts significant digits, not decimal places, which means that:

print(Decimal(1) / Decimal(7))    # 0.142857
print(Decimal(100) / Decimal(7))  # results in 14.2857

So to get 6 decimal places you probably need to set the precision to a higher value and format the result when printing, for example:

from decimal import *
getcontext().prec = 28
print("{0:.6f}".format(Decimal(100) / Decimal(7))) # 14.285714
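If you want a fixed number of decimal places regardless of how large the number is, Decimal.quantize is another option. This is only a minimal sketch: the helper name round6 and the choice of ROUND_HALF_UP are my own assumptions, not something from the question.

```python
from decimal import Decimal, ROUND_HALF_UP

# Quantize target: six digits after the decimal point.
SIX_PLACES = Decimal("0.000001")


def round6(value):
    """Round a Decimal to six decimal places, whatever its magnitude.

    round6 is a hypothetical helper name used for illustration only.
    """
    return value.quantize(SIX_PLACES, rounding=ROUND_HALF_UP)


# Uses the default context precision (28 significant digits).
small = round6(Decimal(1) / Decimal(7))
large = round6(Decimal(100) / Decimal(7))
print(small)  # 0.142857
print(large)  # 14.285714
```

Unlike getcontext().prec, which limits significant digits for every operation, quantize only fixes the exponent of the final result, so intermediate calculations keep their full precision.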

To give a complete answer to your question, could you explain what average you are looking for? The average over all (21) columns? Could you maybe post some test_data.xls?

I would consider using numpy. I'm not sure how to read in xls files, but there seem to be packages out there that provide this functionality. I'd do something like this:

import numpy as np

with open("test2.txt") as f:
    next(f)  # skip the header row (remove this line if the file has none)
    for row in f:
        # row is a string, split on tabs, but ignore the values that
        # don't go into the average.  If you need to keep those you
        # might want to look into genfromtxt and defining special datatypes
        data = np.array(row.split('\t')[2:23]).astype(float)  # np.float no longer exists in current NumPy
        # split the data array into 7 separate arrays (3 columns each);
        # np.mean then averages over all of them and returns a single number
        avg = np.mean(np.array_split(data, 7))
        print(avg)

I'm not sure if the avg above is exactly what you want. You might need to save off the smaller arrays (smallArrays = np.array_split(data,7)) then iterate over those, calculating the average as you go.
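If you do want the seven per-group averages rather than a single number, a reshape is one way to get them without looping. Treat this as a sketch under assumptions: the two sample rows are made up, and the column layout (date, time, 21 values, then LZA, SZA, LAM) is taken from the answers above. np.genfromtxt also accepts a filename in place of the list of lines.

```python
import numpy as np

# Hypothetical sample in the assumed layout:
# date, time, 21 numeric columns, then LZA, SZA, LAM (tab-separated).
header = "\t".join(["date", "time"] + ["v%d" % i for i in range(21)]
                   + ["LZA", "SZA", "LAM"])
row1 = "\t".join(["2013-01-01", "12:00"]
                 + [str(float(i)) for i in range(21)] + ["1", "2", "3"])
row2 = "\t".join(["2013-01-01", "12:05"]
                 + [str(float(i + 1)) for i in range(21)] + ["1", "2", "3"])
lines = [header, row1, row2]

# Read only the 21 numeric columns and skip the header row.
data = np.genfromtxt(lines, delimiter="\t", skip_header=1,
                     usecols=list(range(2, 23)))

# View the data as (rows, 7 groups, 3 values per group) and average each group.
group_means = data.reshape(-1, 7, 3).mean(axis=2)  # shape (n_rows, 7)

# Average of the seven group averages, one value per input row.
row_means = group_means.mean(axis=1)

# Each averaged column averaged down the whole file.
column_means = group_means.mean(axis=0)

print(group_means)
print(row_means)
print(column_means)
```

Here group_means corresponds to the list A in the earlier answers, row_means to the per-row avg, and column_means is what you get when you average down each column of averaged data.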

Even if this isn't exactly what you want, I recommend looking into numpy. I've found it to be really easy to use and very helpful when it comes to doing calculations like you're trying to do.
