Python utiliza el filtro Kalman para mejorar la simulación pero obtiene peores resultados

Tengo preguntas sobre el comportamiento que veo al aplicar el filtro de Kalman (KF) al siguiente problema de pronóstico. He incluido un ejemplo de código simple.

Objetivo: Me gustaría saber si KF es adecuado para mejorar el pronóstico / simulación para un día por delante (en t + 24 horas), utilizando el resultado de medición obtenido ahora (en t). El objective es obtener el pronóstico lo más cerca posible de la medición.

Supuesto: Suponemos que la medición es perfecta (es decir, si podemos lograr que el pronóstico coincida perfectamente con la medición, estamos contentos).

Tenemos una única variable de medición (z, velocidad real del viento) y una única variable simulada (x, velocidad pronosticada del viento).

La velocidad del viento simulada x se produce a partir de un software de PNT (predicción numérica del tiempo) que utiliza una variedad de datos meteorológicos (caja negra para mí). El archivo de simulación se produce diariamente y contiene datos cada media hora.

Intento corregir el pronóstico de t + 24 h usando la medición que obtuve ahora y los datos de pronóstico ahora (generados hace t-24 h) usando un filtro escalar de Kalman. Para referencia, utilicé: http://www.swarthmore.edu/NatSci/echeeve1/Ref/Kalman/ScalarKalman.html

Código:

#! /usr/bin/python import numpy as np import pylab import os def main(): # x = 336 data points of simulated wind speed for 7 days * 24 hour * 2 (every half an hour) # Imagine at time t, we will get a x_t fvalue or t+48 or a 24 hours later. x = load_x() # this is a list that will contain 336 data points of our corrected data x_sample_predict_list = [] # z = 336 data points for 7 days * 24 hour * 2 of actual measured wind speed (every half an hour) z = load_z() # Here is the setup of the scalar kalman filter # reference: http://www.swarthmore.edu/NatSci/echeeve1/Ref/Kalman/ScalarKalman.html # state transition matrix (we simply have a scalar) # what you need to multiply the last time's state to get the newest state # we get the x_t+1 = A * x_t, since we get the x_t+1 directly for simulation # we will have a = 1 a = 1.0 # observation matrix # what you need to multiply to the state, convert it to the same form as incoming measurement # both state and measurements are wind speed, so set h = 1 h = 1.0 Q = 16.0 # expected process variance of predicted Wind Speed R = 9.0 # expected measurement variance of Wind Speed p_j = Q # process covariance is equal to the initial process covariance estimate # Kalman gain is equal to k = hp-_j / (hp-_j + R). With perfect measurement # R = 0, k reduces to k=1/h which is 1 k = 1.0 # one week data # original R2 = 0.183 # with delay = 6, R2 = 0.295 # with delay = 12, R2 = 0.147 # with delay = 48, R2 = 0.075 delay = 6 # Kalman loop for t, x_sample in enumerate(x): if t  48 # for a priori estimate we take x_sample as is # x_sample = x^-_j = ax^-_j_1 + b u_j # Inside the NWP (numerical weather prediction, # the x_sample should be on x_sample_j-1 (assumption) x_sample_predict_prior = a * x_sample # we use the measurement from t-delay (ie. could be a day ago) # and forecast data from t-delay, to produce a leading residual that can be used to # correct the forecast. residual = z[t-delay] - h * x_sample_predict_list[t-delay] p_j_prior = a**2 * p_j + Q k = h * p_j_prior / (h**2 * p_j_prior + R) # we update our prediction based on the residual x_sample_predict = x_sample_predict_prior + k * residual p_j = p_j_prior * (1 - h * k) #print k #print p_j_prior #print p_j #raw_input() x_sample_predict_list.append(x_sample_predict) # initial goodness of fit R2_val_initial = calculate_regression(x,z) R2_string_initial = "R2 initial: {0:10.3f}, ".format(R2_val_initial) print R2_string_initial # R2_val_initial = 0.183 # final goodness of fit R2_val_final = calculate_regression(x_sample_predict_list,z) R2_string_final = "R2 final: {0:10.3f}, ".format(R2_val_final) print R2_string_final # R2_val_final = 0.117, which is worse timesteps = xrange(len(x)) pylab.plot(timesteps,x,'r-', timesteps,z,'b:', timesteps,x_sample_predict_list,'g--') pylab.xlabel('Time') pylab.ylabel('Wind Speed') pylab.title('Simulated Wind Speed vs Actual Wind Speed') pylab.legend(('predicted','measured','kalman')) pylab.show() def calculate_regression(x, y): R2 = 0 A = np.array( [x, np.ones(len(x))] ) model, resid = np.linalg.lstsq(AT, y)[:2] R2_val = 1 - resid[0] / (y.size * y.var()) return R2_val def load_x(): return np.array([2, 3, 3, 5, 4, 4, 4, 5, 5, 6, 5, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10, 11, 11, 11, 10, 8, 8, 8, 8, 6, 3, 4, 5, 5, 5, 6, 5, 5, 5, 6, 5, 5, 6, 6, 7, 6, 8, 9, 10, 12, 11, 10, 10, 10, 11, 11, 10, 8, 8, 9, 8, 9, 9, 9, 9, 8, 9, 8, 11, 11, 11, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 13, 13, 12, 13, 13, 12, 12, 13, 13, 12, 12, 11, 12, 12, 19, 18, 17, 15, 13, 14, 14, 14, 13, 12, 12, 12, 12, 11, 10, 10, 10, 10, 9, 9, 8, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 7, 7, 8, 8, 8, 6, 5, 5, 5, 5, 5, 5, 6, 4, 4, 4, 6, 7, 8, 7, 7, 9, 10, 10, 9, 9, 8, 7, 5, 5, 5, 5, 5, 5, 5, 5, 6, 5, 5, 5, 4, 4, 6, 6, 7, 7, 7, 7, 6, 6, 5, 5, 4, 2, 2, 2, 1, 1, 1, 2, 3, 13, 13, 12, 11, 10, 9, 10, 10, 8, 9, 8, 7, 5, 3, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 7, 6, 6, 6, 7, 6, 6, 5, 4, 4, 3, 3, 3, 2, 2, 1, 5, 5, 3, 2, 1, 2, 6, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 11, 11, 11, 11, 10, 10, 9, 10, 10, 10, 2, 2, 2, 3, 1, 1, 3, 4, 5, 8, 9, 9, 9, 9, 8, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 7, 5, 5, 5, 5, 5, 6, 5]) def load_z(): return np.array([3, 2, 1, 1, 1, 1, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 5, 4, 4, 5, 5, 5, 6, 6, 6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 8, 8, 8, 8, 8, 9, 10, 9, 9, 10, 10, 9, 9, 10, 9, 9, 10, 9, 8, 9, 9, 7, 7, 6, 7, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 7, 6, 7, 8, 8, 7, 8, 9, 9, 9, 9, 10, 9, 9, 9, 8, 8, 10, 9, 10, 10, 9, 9, 9, 10, 9, 8, 7, 7, 7, 7, 8, 7, 6, 5, 4, 3, 5, 3, 5, 4, 4, 4, 2, 4, 3, 2, 1, 1, 2, 1, 2, 1, 4, 4, 4, 4, 4, 3, 3, 3, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3, 3, 2, 2, 5, 4, 2, 5, 4, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3, 4, 3, 4, 3, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 4, 3, 3, 3, 3, 3, 3, 3, 3, 1, 2, 2, 3, 3, 1, 2, 1, 1, 2, 4, 3, 1, 1, 2, 0, 0, 0, 2, 1, 0, 0, 2, 3, 2, 4, 4, 3, 3, 4, 5, 5, 5, 4, 5, 4, 4, 4, 5, 5, 4, 3, 3, 4, 4, 4, 3, 3, 3, 4, 4, 4, 5, 5, 5, 4, 5, 5, 5, 5, 6, 5, 5, 8, 9, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 9, 10, 9, 8, 8, 9, 8, 9, 9, 10, 9, 9, 9, 7, 7, 9, 8, 7, 6, 6, 5, 5, 5, 5, 3, 3, 3, 4, 6, 5, 5, 6, 5]) if __name__ == '__main__': main() # this avoids executing main on import your_module 

————————-

Observaciones:

1) Si el pronóstico de ayer es sobre-predecible (sesgo positivo), entonces hoy, haría correcciones restando el sesgo. En la práctica, si hoy estoy por debajo de las predicciones, restar el sesgo positivo conduce a una predicción aún peor. Y en realidad observo un mayor intercambio de datos con un ajuste general más deficiente. ¿Qué hay de malo en mi ejemplo?

2) La mayoría del recurso de filtro de Kalman indica que el filtro de Kalman minimiza la covarianza a posteriori p_j = E {(x_j – x ^ _j)}, y tiene la prueba de seleccionar K para minimizar el p_j. Pero, ¿puede alguien explicar cómo minimizar la covarianza a posteriori minimiza los efectos del ruido blanco del proceso w? En un caso en tiempo real, digamos que la velocidad real del viento y la velocidad medida del viento es de 5 m / s. La predicción de la velocidad del viento es de 6 m / s, es decir. hubo un ruido de w = 1 m / s. El residual es 5 – 6 = -1 m / s. Se corrige tomando los 1 m / s de su predicción para recuperar 5 m / s. ¿Es así como se minimiza el efecto del ruido de proceso?

3) Aquí hay un documento que menciona la aplicación de KF para suavizar el pronóstico del tiempo. http://hal.archives-ouvertes.fr/docs/00/50/59/93/PDF/Louka_etal_jweia2008.pdf . El punto interesante está en pg 9 eq (7) que “tan pronto como se conoce el nuevo valor de observación y_t, la estimación de x en el tiempo t se convierte en x_t = x_t / t-1 = K (y_t – H_t * x_t / t- 1) ”. Si tuviera que parafrasearlo en referencia al tiempo real, entonces “tan pronto como se conozca el nuevo valor de observación ahora, la estimación ahora se convierte en x_t…. “Entiendo cómo KF puede acercar sus datos a la medición en tiempo real. Pero si está corrigiendo los datos de pronóstico en t = ahora, con los datos de medición de t = ahora, ¿cómo es eso un pronóstico más?

¡Gracias!

ACTUALIZACIÓN1:

4) He agregado una demora en el código para investigar cuánto más tarde puede ser el pronóstico que el sesgo actual calculado a partir de la medición actual, si queremos que el R2 entre los datos procesados ​​de Kalman y las series temporales de datos de medición mejore a partir de los datos no procesados vs. datos de medida. En este ejemplo, si la medición se utiliza para mejorar el tiempo de pronóstico 6 (3 horas a partir de ahora), sigue siendo útil (R2 va de 0.183 a 0.295). Pero si la medición se usa para mejorar el pronóstico dentro de 1 día, entonces se destruye la correlación (el R2 baja a 0.075).

Actualicé mi implementación escalar de prueba, sin suponer una medida perfecta R de 1, que fue lo que redujo la ganancia de kalman a un valor constante de 1. Ahora estoy viendo una mejora en la serie de tiempo con un error RMSE reducido.

 #! /usr/bin/python import numpy as np import pylab import os # RMSE improved def main(): # x = 336 data points of simulated wind speed for 7 days * 24 hour * 2 (every half an hour) # Imagine at time t, we will get a x_t fvalue or t+48 or a 24 hours later. x = load_x() # this is a list that will contain 336 data points of our corrected data x_sample_predict_list = [] # z = 336 data points for 7 days * 24 hour * 2 of actual measured wind speed (every half an hour) z = load_z() # Here is the setup of the scalar kalman filter # reference: http://www.swarthmore.edu/NatSci/echeeve1/Ref/Kalman/ScalarKalman.html # state transition matrix (we simply have a scalar) # what you need to multiply the last time's state to get the newest state # we get the x_t+1 = A * x_t, since we get the x_t+1 directly for simulation # we will have a = 1 a = 1.0 # observation matrix # what you need to multiply to the state, convert it to the same form as incoming measurement # both state and measurements are wind speed, so set h = 1 h = 1.0 Q = 1.0 # expected process noise of predicted Wind Speed R = 1.0 # expected measurement noise of Wind Speed p_j = Q # process covariance is equal to the initial process covariance estimate # Kalman gain is equal to k = hp-_j / (hp-_j + R). With perfect measurement # R = 0, k reduces to k=1/h which is 1 k = 1.0 # one week data # original R2 = 0.183 # with delay = 6, R2 = 0.295 # with delay = 12, R2 = 0.147 # with delay = 48, R2 = 0.075 delay = 6 # Kalman loop for t, x_sample in enumerate(x): if t <= delay: # for the first day of the forecast, # we don't have forecast data and measurement # from a day before to do correction x_sample_predict = x_sample else: # t > 48 # for a priori estimate we take x_sample as is # x_sample = x^-_j = ax^-_j_1 + b u_j # Inside the NWP (numerical weather prediction, # the x_sample should be on x_sample_j-1 (assumption) x_sample_predict_prior = a * x_sample # we use the measurement from t-delay (ie. could be a day ago) # and forecast data from t-delay, to produce a leading residual that can be used to # correct the forecast. residual = z[t-delay] - h * x_sample_predict_list[t-delay] p_j_prior = a**2 * p_j + Q k = h * p_j_prior / (h**2 * p_j_prior + R) # we update our prediction based on the residual x_sample_predict = x_sample_predict_prior + k * residual p_j = p_j_prior * (1 - h * k) #print k #print p_j_prior #print p_j #raw_input() x_sample_predict_list.append(x_sample_predict) # initial goodness of fit R2_val_initial = calculate_regression(x,z) R2_string_initial = "R2 original: {0:10.3f}, ".format(R2_val_initial) print R2_string_initial # R2_val_original = 0.183 original_RMSE = (((xz)**2).mean())**0.5 print "original_RMSE" print original_RMSE print "\n" # final goodness of fit R2_val_final = calculate_regression(x_sample_predict_list,z) R2_string_final = "R2 final: {0:10.3f}, ".format(R2_val_final) print R2_string_final # R2_val_final = 0.267, which is better final_RMSE = (((x_sample_predict-z)**2).mean())**0.5 print "final_RMSE" print final_RMSE print "\n" timesteps = xrange(len(x)) pylab.plot(timesteps,x,'r-', timesteps,z,'b:', timesteps,x_sample_predict_list,'g--') pylab.xlabel('Time') pylab.ylabel('Wind Speed') pylab.title('Simulated Wind Speed vs Actual Wind Speed') pylab.legend(('predicted','measured','kalman')) pylab.show() def calculate_regression(x, y): R2 = 0 A = np.array( [x, np.ones(len(x))] ) model, resid = np.linalg.lstsq(AT, y)[:2] R2_val = 1 - resid[0] / (y.size * y.var()) return R2_val def load_x(): return np.array([2, 3, 3, 5, 4, 4, 4, 5, 5, 6, 5, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10, 11, 11, 11, 10, 8, 8, 8, 8, 6, 3, 4, 5, 5, 5, 6, 5, 5, 5, 6, 5, 5, 6, 6, 7, 6, 8, 9, 10, 12, 11, 10, 10, 10, 11, 11, 10, 8, 8, 9, 8, 9, 9, 9, 9, 8, 9, 8, 11, 11, 11, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 13, 13, 12, 13, 13, 12, 12, 13, 13, 12, 12, 11, 12, 12, 19, 18, 17, 15, 13, 14, 14, 14, 13, 12, 12, 12, 12, 11, 10, 10, 10, 10, 9, 9, 8, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 7, 7, 8, 8, 8, 6, 5, 5, 5, 5, 5, 5, 6, 4, 4, 4, 6, 7, 8, 7, 7, 9, 10, 10, 9, 9, 8, 7, 5, 5, 5, 5, 5, 5, 5, 5, 6, 5, 5, 5, 4, 4, 6, 6, 7, 7, 7, 7, 6, 6, 5, 5, 4, 2, 2, 2, 1, 1, 1, 2, 3, 13, 13, 12, 11, 10, 9, 10, 10, 8, 9, 8, 7, 5, 3, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 7, 6, 6, 6, 7, 6, 6, 5, 4, 4, 3, 3, 3, 2, 2, 1, 5, 5, 3, 2, 1, 2, 6, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 11, 11, 11, 11, 10, 10, 9, 10, 10, 10, 2, 2, 2, 3, 1, 1, 3, 4, 5, 8, 9, 9, 9, 9, 8, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 7, 5, 5, 5, 5, 5, 6, 5]) def load_z(): return np.array([3, 2, 1, 1, 1, 1, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 5, 4, 4, 5, 5, 5, 6, 6, 6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 8, 8, 8, 8, 8, 9, 10, 9, 9, 10, 10, 9, 9, 10, 9, 9, 10, 9, 8, 9, 9, 7, 7, 6, 7, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 7, 6, 7, 8, 8, 7, 8, 9, 9, 9, 9, 10, 9, 9, 9, 8, 8, 10, 9, 10, 10, 9, 9, 9, 10, 9, 8, 7, 7, 7, 7, 8, 7, 6, 5, 4, 3, 5, 3, 5, 4, 4, 4, 2, 4, 3, 2, 1, 1, 2, 1, 2, 1, 4, 4, 4, 4, 4, 3, 3, 3, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3, 3, 2, 2, 5, 4, 2, 5, 4, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3, 4, 3, 4, 3, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 4, 3, 3, 3, 3, 3, 3, 3, 3, 1, 2, 2, 3, 3, 1, 2, 1, 1, 2, 4, 3, 1, 1, 2, 0, 0, 0, 2, 1, 0, 0, 2, 3, 2, 4, 4, 3, 3, 4, 5, 5, 5, 4, 5, 4, 4, 4, 5, 5, 4, 3, 3, 4, 4, 4, 3, 3, 3, 4, 4, 4, 5, 5, 5, 4, 5, 5, 5, 5, 6, 5, 5, 8, 9, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 9, 10, 9, 8, 8, 9, 8, 9, 9, 10, 9, 9, 9, 7, 7, 9, 8, 7, 6, 6, 5, 5, 5, 5, 3, 3, 3, 4, 6, 5, 5, 6, 5]) if __name__ == '__main__': main() # this avoids executing main on import your_module 

Esta línea no respeta el filtro escalar de Kalman :

 residual = z[t-delay] - h * x_sample_predict_list[t-delay] 

En mi opinión deberías haber hecho:

  residual = z[t -delay] - h * x_sample_predict_prior