Python using Kalman Filter to improve simulation but getting worse results

Submitted by 久未见 on 2020-01-02 19:46:13

Question


I have questions about the behavior I am seeing when applying a Kalman filter (KF) to the following forecast problem. I have included a simple code sample.

Goal: I would like to know whether a KF is suitable for improving a forecast/simulation result for a day ahead (at t+24 hours), using the measurement obtained now (at t). The goal is to get the forecast as close to the measurement as possible.

Assumption: We assume the measurement is perfect (i.e. if we can get the forecast to match the measurement perfectly, we are happy).

We have a single measurement variable (z, real wind speed), and a single simulated variable (x, predicted wind speed).

The simulated wind speed x is produced by NWP (numerical weather prediction) software using a variety of meteorological data (a black box to me). A simulation file is produced daily, containing data at half-hour intervals.

I attempt to correct the t+24 hr forecast with a scalar Kalman filter, using the measurement obtained now and the forecast valid now (generated 24 hours ago, at t-24 hr). For reference, I used: http://www.swarthmore.edu/NatSci/echeeve1/Ref/Kalman/ScalarKalman.html

Code:

#! /usr/bin/python

import numpy as np
import pylab

import os


def main():

    # x = 336 data points of simulated wind speed for 7 days * 24 hours * 2 (every half hour)
    # Imagine that at time t we get a value x_t for step t+48, i.e. 24 hours later.
    x = load_x()

    # this is a list that will contain 336 data points of our corrected data
    x_sample_predict_list = []

    # z = 336 data points for 7 days * 24 hour * 2 of actual measured wind speed (every half an hour)
    z = load_z()

    # Here is the setup of the scalar kalman filter
    # reference: http://www.swarthmore.edu/NatSci/echeeve1/Ref/Kalman/ScalarKalman.html
    # state transition matrix (we simply have a scalar)
    # what you need to multiply the last time's state to get the newest state
    # we get the x_t+1 = A * x_t, since we get the x_t+1 directly for simulation
    # we will have a = 1
    a = 1.0

    # observation matrix
    # what you need to multiply to the state, convert it to the same form as incoming measurement 
    # both state and measurements are wind speed, so set h = 1
    h = 1.0

    Q = 16.0    # expected process variance of predicted Wind Speed
    R = 9.0 # expected measurement variance of Wind Speed

    p_j = Q # initial error covariance estimate, set equal to the process variance Q

    # Kalman gain is k = h * p^-_j / (h**2 * p^-_j + R). With a perfect measurement
    # (R = 0), k reduces to k = 1/h, which is 1
    k = 1.0

    # one week data
    # original R2 = 0.183
    # with delay = 6, R2 = 0.295
    # with delay = 12, R2 = 0.147   
    # with delay = 48, R2 = 0.075
    delay = 6 

    # Kalman loop
    for t, x_sample in enumerate(x):

        if t <= delay:          
            # for the first `delay` steps of the forecast,
            # we don't have forecast data and a measurement
            # from `delay` steps earlier to do the correction
            x_sample_predict = x_sample             
        else: # t > delay
            # for the a priori estimate we take x_sample as is:
            # x_sample = x^-_j = a * x^_(j-1) + b * u_j
            # (assumption: inside the NWP (numerical weather prediction) model,
            # x_sample has already been propagated from the previous state x^_(j-1))

            x_sample_predict_prior = a * x_sample

            # we use the measurement from t-delay (ie. could be a day ago)
            # and forecast data from t-delay, to produce a leading residual that can be used to
            # correct the forecast.
            residual = z[t-delay] - h * x_sample_predict_list[t-delay]


            p_j_prior = a**2 * p_j + Q

            k = h * p_j_prior / (h**2 * p_j_prior + R)

            # we update our prediction based on the residual
            x_sample_predict = x_sample_predict_prior + k * residual

            p_j = p_j_prior * (1 - h * k)

            #print k
            #print p_j_prior
            #print p_j
            #raw_input()

        x_sample_predict_list.append(x_sample_predict)

    # initial goodness of fit
    R2_val_initial = calculate_regression(x,z)
    R2_string_initial = "R2 initial: {0:10.3f}, ".format(R2_val_initial)    
    print(R2_string_initial)     # R2_val_initial = 0.183

    # final goodness of fit
    R2_val_final = calculate_regression(x_sample_predict_list,z)
    R2_string_final = "R2 final: {0:10.3f}, ".format(R2_val_final)  
    print(R2_string_final)       # R2_val_final = 0.117, which is worse


    timesteps = range(len(x))
    pylab.plot(timesteps,x,'r-', timesteps,z,'b:', timesteps,x_sample_predict_list,'g--')
    pylab.xlabel('Time')
    pylab.ylabel('Wind Speed')
    pylab.title('Simulated Wind Speed vs Actual Wind Speed')
    pylab.legend(('predicted','measured','kalman'))
    pylab.show()


def calculate_regression(x, y):
    # R2 of the least-squares straight-line fit y ~ m*x + c
    A = np.array( [x, np.ones(len(x))] )
    model, resid = np.linalg.lstsq(A.T, y)[:2]
    R2_val = 1 - resid[0] / (y.size * y.var())
    return R2_val

def load_x():
    return np.array([2, 3, 3, 5, 4, 4, 4, 5, 5, 6, 5, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10, 11, 11,
     11, 10, 8, 8, 8, 8, 6, 3, 4, 5, 5, 5, 6, 5, 5, 5, 6, 5, 5, 6, 6, 7, 6, 8, 9, 10,
     12, 11, 10, 10, 10, 11, 11, 10, 8, 8, 9, 8, 9, 9, 9, 9, 8, 9, 8, 11, 11, 11, 12,
     12, 13, 13, 13, 13, 13, 13, 13, 14, 13, 13, 12, 13, 13, 12, 12, 13, 13, 12, 12, 
     11, 12, 12, 19, 18, 17, 15, 13, 14, 14, 14, 13, 12, 12, 12, 12, 11, 10, 10, 10, 
     10, 9, 9, 8, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 7, 7, 8, 8, 8, 6, 5, 5, 
     5, 5, 5, 5, 6, 4, 4, 4, 6, 7, 8, 7, 7, 9, 10, 10, 9, 9, 8, 7, 5, 5, 5, 5, 5, 5, 
     5, 5, 6, 5, 5, 5, 4, 4, 6, 6, 7, 7, 7, 7, 6, 6, 5, 5, 4, 2, 2, 2, 1, 1, 1, 2, 3,
     13, 13, 12, 11, 10, 9, 10, 10, 8, 9, 8, 7, 5, 3, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6,
     7, 7, 7, 6, 6, 6, 7, 6, 6, 5, 4, 4, 3, 3, 3, 2, 2, 1, 5, 5, 3, 2, 1, 2, 6, 7, 
     7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 
     7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 11, 11, 11, 11, 10, 10, 9, 10, 10, 10, 2, 2,
     2, 3, 1, 1, 3, 4, 5, 8, 9, 9, 9, 9, 8, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7,
     7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 7, 5, 5, 5, 5, 5, 6, 5])

def load_z():
    return np.array([3, 2, 1, 1, 1, 1, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 1, 1, 2, 2, 2,
     2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 5, 4, 4, 5, 5, 5, 6, 6,
     6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 8, 8, 8, 8, 8, 9, 10, 9, 9, 10, 10, 9,
     9, 10, 9, 9, 10, 9, 8, 9, 9, 7, 7, 6, 7, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 7, 6, 7,
     8, 8, 7, 8, 9, 9, 9, 9, 10, 9, 9, 9, 8, 8, 10, 9, 10, 10, 9, 9, 9, 10, 9, 8, 7, 
     7, 7, 7, 8, 7, 6, 5, 4, 3, 5, 3, 5, 4, 4, 4, 2, 4, 3, 2, 1, 1, 2, 1, 2, 1, 4, 4,
     4, 4, 4, 3, 3, 3, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3, 3, 2, 2, 5, 4, 2, 5, 4, 1, 1, 1, 
     1, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3, 4, 3, 4, 3, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4,
     4, 4, 5, 5, 5, 4, 3, 3, 3, 3, 3, 3, 3, 3, 1, 2, 2, 3, 3, 1, 2, 1, 1, 2, 4, 3, 1,
     1, 2, 0, 0, 0, 2, 1, 0, 0, 2, 3, 2, 4, 4, 3, 3, 4, 5, 5, 5, 4, 5, 4, 4, 4, 5, 5, 
     4, 3, 3, 4, 4, 4, 3, 3, 3, 4, 4, 4, 5, 5, 5, 4, 5, 5, 5, 5, 6, 5, 5, 8, 9, 8, 9,
     9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 9, 10, 9, 8, 8, 9, 8, 9, 9, 10, 9, 9, 9,
     7, 7, 9, 8, 7, 6, 6, 5, 5, 5, 5, 3, 3, 3, 4, 6, 5, 5, 6, 5])

if __name__ == '__main__': main()  # this avoids executing main on import your_module

-------------------------

Observations:

1) If yesterday's forecast was over-predicting (positive bias), then today I would correct by subtracting that bias. In practice, if today I happen to be under-predicting, then subtracting the positive bias leads to an even worse prediction. I actually observe a wider swing in the data and a poorer overall fit. What is wrong with my example?

2) Most Kalman filter resources indicate that the Kalman filter minimizes the a posteriori covariance p_j = E{(x_j – x^_j)^2}, and give a proof that selects K to minimize p_j. But can someone explain how minimizing the a posteriori covariance actually minimizes the effect of the process white noise w? In a real-time case, let's say the actual wind speed and the measured wind speed are both 5 m/s. The predicted wind speed is 6 m/s, i.e. there was a noise of w = 1 m/s. The residual is 5 – 6 = -1 m/s. You correct by taking 1 m/s off your prediction to get back to 5 m/s. Is that how the effect of the process noise is minimized?

3) Here is a paper that mentions applying a KF to smooth weather forecasts: http://hal.archives-ouvertes.fr/docs/00/50/59/93/PDF/Louka_etal_jweia2008.pdf (pg 9, eq (7)). The interesting point is that "as soon as the new observation value y_t is known, the estimate of x at time t becomes x_t = x_t/t-1 + K(y_t – H_t * x_t/t-1)". If I were to paraphrase it in terms of actual time: "as soon as the new observation value is known now, the estimate now becomes x_t ...". I get how a KF can bring your data close to the measurement in real time. But if you are correcting the forecast data at t=now with measurement data from t=now, how is that a forecast anymore?
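
To make the arithmetic in observation 2 concrete, here is a minimal sketch of a single scalar measurement update, assuming the Q = 16 and R = 9 values from the code above and the 5 m/s vs. 6 m/s example. The function name `scalar_update` is hypothetical and only serves the illustration. It shows that with a noisy measurement the estimate moves only part of the way toward the measurement, while with a perfect measurement (R = 0) the gain is 1 and the estimate lands exactly on it.

```python
def scalar_update(x_prior, p_prior, z, R, h=1.0):
    """One scalar Kalman measurement update; returns (x_post, p_post, k)."""
    k = h * p_prior / (h ** 2 * p_prior + R)   # Kalman gain
    x_post = x_prior + k * (z - h * x_prior)   # correct the prior with the residual
    p_post = p_prior * (1.0 - h * k)           # a posteriori error variance
    return x_post, p_post, k


# a priori variance: a**2 * p_j + Q with a = 1 and p_j = Q = 16
p_prior = 1.0 ** 2 * 16.0 + 16.0

# Noisy measurement (R = 9): the estimate moves only part of the way to z
x_post, p_post, k = scalar_update(x_prior=6.0, p_prior=p_prior, z=5.0, R=9.0)
print(k, x_post, p_post)

# Perfect measurement (R = 0): k = 1 and the estimate equals the measurement
x_perfect, p_perfect, k_perfect = scalar_update(6.0, p_prior, 5.0, R=0.0)
print(k_perfect, x_perfect, p_perfect)
```

The a posteriori variance is smaller than the a priori one in both cases, which is the sense in which the gain "minimizes" the remaining uncertainty.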

Thanks!

UPDATE1:

4) I have added a delay to the code to investigate how far ahead of the current measurement (and the bias computed from it) the forecast can be while the R2 between the Kalman-processed series and the measured series still improves on the R2 of the unprocessed series. In this example, if the measurement is used to correct the forecast 6 time steps (3 hours) ahead, it is still useful (R2 goes from 0.183 to 0.295). But if the measurement is used to correct the forecast 1 day ahead, it destroys the correlation (R2 drops to 0.075).
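
One possible way to reason about the effect of the delay, as a sketch under simplifying assumptions: if the raw forecast error e_t = z_t - x_t is zero-mean and stationary with lag-d autocorrelation rho_d, then correcting with a fixed gain k gives E[(e_t - k*e_(t-d))^2] = var(e) * (1 + k^2 - 2*k*rho_d), which is an improvement only when rho_d > k/2. The code below checks this on a synthetic AR(1) error series, not on the wind data above; the helper names `lagged_autocorr` and `mse_ratio` are hypothetical. It also ignores that the loop in the question forms its residual from the already-corrected series.

```python
import numpy as np


def lagged_autocorr(e, d):
    """Sample autocorrelation of the error series e at lag d."""
    e = np.asarray(e, dtype=float) - np.mean(e)
    return float(np.dot(e[d:], e[:-d]) / np.dot(e, e))


def mse_ratio(e, d, k):
    """MSE of the lag-corrected error e_t - k*e_(t-d), relative to the MSE of e_t."""
    e = np.asarray(e, dtype=float)
    corrected = e[d:] - k * e[:-d]
    return float(np.mean(corrected ** 2) / np.mean(e[d:] ** 2))


# Synthetic AR(1) forecast error (an assumption made for illustration only)
rng = np.random.default_rng(0)
phi, n = 0.9, 20000
e = np.zeros(n)
for t in range(1, n):
    e[t] = phi * e[t - 1] + rng.normal()

for d in (6, 48):
    # a ratio below 1 means the lagged correction helped, above 1 means it hurt
    print(d, lagged_autocorr(e, d), mse_ratio(e, d, k=0.5))
```

Under these assumptions, a short lag, where the error is still strongly correlated with itself, reduces the MSE, while a long lag mostly adds the old error back as noise.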


Answer 1:


I updated my test scalar implementation so that it no longer assumes a perfect measurement (R = 0), which was what reduced the Kalman gain to a constant value of 1. Now I am seeing an improvement in the time series, with a reduced RMSE.

#! /usr/bin/python

import numpy as np
import pylab

import os

# RMSE improved
def main():

    # x = 336 data points of simulated wind speed for 7 days * 24 hours * 2 (every half hour)
    # Imagine that at time t we get a value x_t for step t+48, i.e. 24 hours later.
    x = load_x()

    # this is a list that will contain 336 data points of our corrected data
    x_sample_predict_list = []

    # z = 336 data points for 7 days * 24 hour * 2 of actual measured wind speed (every half an hour)
    z = load_z()

    # Here is the setup of the scalar kalman filter
    # reference: http://www.swarthmore.edu/NatSci/echeeve1/Ref/Kalman/ScalarKalman.html
    # state transition matrix (we simply have a scalar)
    # what you need to multiply the last time's state to get the newest state
    # we get the x_t+1 = A * x_t, since we get the x_t+1 directly for simulation
    # we will have a = 1
    a = 1.0

    # observation matrix
    # what you need to multiply to the state, convert it to the same form as incoming measurement 
    # both state and measurements are wind speed, so set h = 1
    h = 1.0

    Q = 1.0     # expected process noise of predicted Wind Speed    
    R = 1.0     # expected measurement noise of Wind Speed

    p_j = Q # initial error covariance estimate, set equal to the process variance Q

    # Kalman gain is k = h * p^-_j / (h**2 * p^-_j + R). With a perfect measurement
    # (R = 0), k reduces to k = 1/h, which is 1
    k = 1.0

    # one week data
    # original R2 = 0.183
    # with delay = 6, R2 = 0.295
    # with delay = 12, R2 = 0.147   
    # with delay = 48, R2 = 0.075
    delay = 6 

    # Kalman loop
    for t, x_sample in enumerate(x):

        if t <= delay:          
            # for the first `delay` steps of the forecast,
            # we don't have forecast data and a measurement
            # from `delay` steps earlier to do the correction
            x_sample_predict = x_sample             
        else: # t > delay
            # for the a priori estimate we take x_sample as is:
            # x_sample = x^-_j = a * x^_(j-1) + b * u_j
            # (assumption: inside the NWP (numerical weather prediction) model,
            # x_sample has already been propagated from the previous state x^_(j-1))

            x_sample_predict_prior = a * x_sample

            # we use the measurement from t-delay (ie. could be a day ago)
            # and forecast data from t-delay, to produce a leading residual that can be used to
            # correct the forecast.
            residual = z[t-delay] - h * x_sample_predict_list[t-delay]

            p_j_prior = a**2 * p_j + Q

            k = h * p_j_prior / (h**2 * p_j_prior + R)

            # we update our prediction based on the residual
            x_sample_predict = x_sample_predict_prior + k * residual

            p_j = p_j_prior * (1 - h * k)

            #print k
            #print p_j_prior
            #print p_j
            #raw_input()

        x_sample_predict_list.append(x_sample_predict)

    # initial goodness of fit
    R2_val_initial = calculate_regression(x,z)
    R2_string_initial = "R2 original: {0:10.3f}, ".format(R2_val_initial)   
    print(R2_string_initial)     # R2_val_original = 0.183

    original_RMSE = (((x-z)**2).mean())**0.5
    print("original_RMSE")
    print(original_RMSE)
    print("\n")

    # final goodness of fit
    R2_val_final = calculate_regression(x_sample_predict_list,z)
    R2_string_final = "R2 final: {0:10.3f}, ".format(R2_val_final)  
    print(R2_string_final)       # R2_val_final = 0.267, which is better

    # use the whole corrected series, not only the last corrected sample
    final_RMSE = (((np.array(x_sample_predict_list)-z)**2).mean())**0.5
    print("final_RMSE")
    print(final_RMSE)
    print("\n")


    timesteps = range(len(x))
    pylab.plot(timesteps,x,'r-', timesteps,z,'b:', timesteps,x_sample_predict_list,'g--')
    pylab.xlabel('Time')
    pylab.ylabel('Wind Speed')
    pylab.title('Simulated Wind Speed vs Actual Wind Speed')
    pylab.legend(('predicted','measured','kalman'))
    pylab.show()


def calculate_regression(x, y):
    # R2 of the least-squares straight-line fit y ~ m*x + c
    A = np.array( [x, np.ones(len(x))] )
    model, resid = np.linalg.lstsq(A.T, y)[:2]
    R2_val = 1 - resid[0] / (y.size * y.var())
    return R2_val

def load_x():
    return np.array([2, 3, 3, 5, 4, 4, 4, 5, 5, 6, 5, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10, 11, 11,
     11, 10, 8, 8, 8, 8, 6, 3, 4, 5, 5, 5, 6, 5, 5, 5, 6, 5, 5, 6, 6, 7, 6, 8, 9, 10,
     12, 11, 10, 10, 10, 11, 11, 10, 8, 8, 9, 8, 9, 9, 9, 9, 8, 9, 8, 11, 11, 11, 12,
     12, 13, 13, 13, 13, 13, 13, 13, 14, 13, 13, 12, 13, 13, 12, 12, 13, 13, 12, 12, 
     11, 12, 12, 19, 18, 17, 15, 13, 14, 14, 14, 13, 12, 12, 12, 12, 11, 10, 10, 10, 
     10, 9, 9, 8, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 7, 7, 8, 8, 8, 6, 5, 5, 
     5, 5, 5, 5, 6, 4, 4, 4, 6, 7, 8, 7, 7, 9, 10, 10, 9, 9, 8, 7, 5, 5, 5, 5, 5, 5, 
     5, 5, 6, 5, 5, 5, 4, 4, 6, 6, 7, 7, 7, 7, 6, 6, 5, 5, 4, 2, 2, 2, 1, 1, 1, 2, 3,
     13, 13, 12, 11, 10, 9, 10, 10, 8, 9, 8, 7, 5, 3, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6,
     7, 7, 7, 6, 6, 6, 7, 6, 6, 5, 4, 4, 3, 3, 3, 2, 2, 1, 5, 5, 3, 2, 1, 2, 6, 7, 
     7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 
     7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 11, 11, 11, 11, 10, 10, 9, 10, 10, 10, 2, 2,
     2, 3, 1, 1, 3, 4, 5, 8, 9, 9, 9, 9, 8, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7,
     7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 7, 5, 5, 5, 5, 5, 6, 5])

def load_z():
    return np.array([3, 2, 1, 1, 1, 1, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 1, 1, 2, 2, 2,
     2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 5, 4, 4, 5, 5, 5, 6, 6,
     6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 8, 8, 8, 8, 8, 9, 10, 9, 9, 10, 10, 9,
     9, 10, 9, 9, 10, 9, 8, 9, 9, 7, 7, 6, 7, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 7, 6, 7,
     8, 8, 7, 8, 9, 9, 9, 9, 10, 9, 9, 9, 8, 8, 10, 9, 10, 10, 9, 9, 9, 10, 9, 8, 7, 
     7, 7, 7, 8, 7, 6, 5, 4, 3, 5, 3, 5, 4, 4, 4, 2, 4, 3, 2, 1, 1, 2, 1, 2, 1, 4, 4,
     4, 4, 4, 3, 3, 3, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3, 3, 2, 2, 5, 4, 2, 5, 4, 1, 1, 1, 
     1, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3, 4, 3, 4, 3, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4,
     4, 4, 5, 5, 5, 4, 3, 3, 3, 3, 3, 3, 3, 3, 1, 2, 2, 3, 3, 1, 2, 1, 1, 2, 4, 3, 1,
     1, 2, 0, 0, 0, 2, 1, 0, 0, 2, 3, 2, 4, 4, 3, 3, 4, 5, 5, 5, 4, 5, 4, 4, 4, 5, 5, 
     4, 3, 3, 4, 4, 4, 3, 3, 3, 4, 4, 4, 5, 5, 5, 4, 5, 5, 5, 5, 6, 5, 5, 8, 9, 8, 9,
     9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 9, 10, 9, 8, 8, 9, 8, 9, 9, 10, 9, 9, 9,
     7, 7, 9, 8, 7, 6, 6, 5, 5, 5, 5, 3, 3, 3, 4, 6, 5, 5, 6, 5])

if __name__ == '__main__': main()  # this avoids executing main on import your_module
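
For the RMSE comparison, a small helper may make the intent clearer. This is a minimal sketch, assuming both series have the same length; the name `rmse` is hypothetical and the numbers are toy values, not the wind data above. It converts list input to arrays so that the error is averaged over every time step of the series.

```python
import numpy as np


def rmse(pred, obs):
    """Root-mean-square error between two equal-length series."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    if pred.shape != obs.shape:
        raise ValueError("series must have the same shape")
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


# Toy values for illustration only
obs = [5.0, 5.0, 6.0, 7.0]
raw = [6.0, 7.0, 6.0, 9.0]
corrected = [5.5, 6.0, 6.0, 8.0]
print(rmse(raw, obs), rmse(corrected, obs))
```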



Answer 2:


This line does not follow the scalar Kalman filter equations:

residual = z[t-delay] - h * x_sample_predict_list[t-delay]

In my opinion, it should be:

residual = z[t-delay] - h * x_sample_predict_prior
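
For comparison, here is a minimal sketch of the textbook scalar predict/update cycle, in which the residual is formed from the a priori estimate of the same step. It is a generic filter with a = h = 1 and no delay, so it illustrates the standard equations rather than the delayed forecast correction from the question; the function name `scalar_kalman` and the toy measurements are hypothetical.

```python
def scalar_kalman(zs, x0, p0, Q, R, a=1.0, h=1.0):
    """Run a scalar Kalman filter over measurements zs; returns the posterior estimates."""
    x_post, p_post = x0, p0
    estimates = []
    for z in zs:
        # Predict: propagate the previous posterior estimate and variance
        x_prior = a * x_post
        p_prior = a ** 2 * p_post + Q
        # Update: the residual uses the a priori estimate of this same step
        residual = z - h * x_prior
        k = h * p_prior / (h ** 2 * p_prior + R)
        x_post = x_prior + k * residual
        p_post = p_prior * (1.0 - h * k)
        estimates.append(x_post)
    return estimates


# Constant measurement of 5 with an initial estimate of 6 (toy values)
est = scalar_kalman([5.0, 5.0, 5.0, 5.0], x0=6.0, p0=1.0, Q=1.0, R=1.0)
print(est)
```

With these toy inputs the estimates move step by step from the initial guess toward the measured value.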


Source: https://stackoverflow.com/questions/19440881/python-using-kalman-filter-to-improve-simulation-but-getting-worse-results
