Second order extended ensemble Kalman filter with stochastically perturbed innovation for initializing artificial neural network weights

https://doi.org/10.51867/ajernet.6.3.42

Authors

  • Cavin Oyugi Ongere Department of Mathematics, Masinde Muliro University of Science and Technology, Kenya
  • David Angwenyi Department of Mathematics, Masinde Muliro University of Science and Technology, Kenya https://orcid.org/0000-0002-6958-2817
  • Robert Oryiema Department of Mathematics, Masinde Muliro University of Science and Technology, Kenya

Keywords:

Bayesian method, Convergence time, Non-linear filtering, Non-linear state space dynamic models

Abstract

Artificial neural networks are widely applied to non-linear state-space dynamic models, yet inaccurate initial weights remain a critical bottleneck: the choice of weight initialization method strongly influences convergence speed and model efficiency. Conventional approaches such as random initialization and filtering techniques are commonly used, while the Bayesian method, though highly accurate, suffers from the computational burden of inverting high-dimensional matrices. This study develops a novel alternative: the Second Order Extended Ensemble Filter with Perturbed Innovation (SoEEFPI). Derived from a second-order Taylor expansion of the stochastically perturbed Kushner-Stratonovich equation, SoEEFPI provides a tractable numerical treatment of the inverse covariance matrix problem. The filter is validated on the Lorenz-63 system, comparing its performance in MATLAB against the Second Order Extended Kalman-Bucy Filter (SoEKBF) and the First Order Extended Ensemble Filter (FoEEF). SoEEFPI is then employed to initialize neural network weights, yielding a new model whose convergence time, RMSE, and epoch count are evaluated. Results demonstrate improved convergence efficiency and accuracy, positioning SoEEFPI as a robust alternative for neural network weight initialization.
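The filter's derivation lives in the full paper, but the scheme the abstract outlines (an ensemble propagated through the Lorenz-63 dynamics and corrected with stochastically perturbed innovations) can be sketched briefly. The Python/NumPy fragment below is a minimal illustration only: it implements a plain stochastic ensemble Kalman filter with perturbed innovations on Lorenz-63 and omits the second-order Taylor correction that defines SoEEFPI. Every numerical choice here (textbook Lorenz parameters, explicit Euler stepping, full-state observations with H = I, the noise covariance R, and the ensemble size) is an assumption made for the example, not a value taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Textbook Lorenz-63 parameters (assumed, not necessarily the paper's).
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0
DT = 0.01            # integration step (assumed)
R = 2.0 * np.eye(3)  # observation-noise covariance (assumed)

def lorenz63(x):
    """Lorenz-63 drift f(x)."""
    return np.array([
        SIGMA * (x[1] - x[0]),
        x[0] * (RHO - x[2]) - x[1],
        x[0] * x[1] - BETA * x[2],
    ])

def step(x):
    """One explicit Euler step of the deterministic dynamics."""
    return x + DT * lorenz63(x)

# Truth run and noisy observations of the full state (H = I).
T = 500
truth = np.empty((T, 3))
truth[0] = np.array([1.0, 1.0, 1.0])
for t in range(1, T):
    truth[t] = step(truth[t - 1])
obs = truth + rng.multivariate_normal(np.zeros(3), R, size=T)

# Ensemble Kalman filter with stochastically perturbed innovations:
# each member sees the observation corrupted by a fresh noise draw,
# so the innovation (y + eta_i - x_i) is itself randomly perturbed.
N = 50                                    # ensemble size (assumed)
ens = rng.normal(0.0, 5.0, size=(N, 3))   # spread-out initial ensemble
est = np.empty_like(truth)

for t in range(T):
    ens = np.apply_along_axis(step, 1, ens)    # forecast each member
    X = ens - ens.mean(axis=0)                 # forecast anomalies
    P = X.T @ X / (N - 1)                      # sample covariance
    K = P @ np.linalg.solve(P + R, np.eye(3))  # Kalman gain, H = I
    for i in range(N):
        eta = rng.multivariate_normal(np.zeros(3), R)  # perturbation
        ens[i] = ens[i] + K @ (obs[t] + eta - ens[i])  # perturbed innovation
    est[t] = ens.mean(axis=0)                  # analysis estimate

rmse = np.sqrt(np.mean((est - truth) ** 2))
print(f"filter RMSE over the run: {rmse:.3f}")
```

For the weight-initialization application described above, the same update loop would be run with the network's weight vector as the filter state and the training targets as observations; the ensemble mean after a short assimilation window then serves as the initial weight vector handed to backpropagation.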


Published

2025-08-16

How to Cite

Ongere, C. O., Angwenyi, D., & Oryiema, R. (2025). Second order extended ensemble Kalman filter with stochastically perturbed innovation for initializing artificial neural network weights. African Journal of Empirical Research, 6(3), 521–538. https://doi.org/10.51867/ajernet.6.3.42

Issue

Vol. 6 No. 3 (2025)

Section

Articles