Jiarui Rao, Qian Zhang, Zong Ke, Shaoyu Liu, & Xinqi Liu. (2023). Integrating Textual Analytics with Time Series Forecasting Models: Enhancing Predictive Accuracy in Global Energy and Commodity Markets. Innovations in Applied Engineering and Technology, 2(1), 1–7. https://doi.org/10.62836/iaet.v2i1.265

Integrating Textual Analytics with Time Series Forecasting Models: Enhancing Predictive Accuracy in Global Energy and Commodity Markets

This study presents a framework for predicting crude oil prices by integrating textual features extracted from news headlines into a time series forecasting model. Headlines are used instead of full articles for two reasons: they encapsulate the essence of the news, and the choice is consistent with previous research by Li et al. Futures news is preferred over gold news because the dataset is larger and because futures prices, including those of gold, natural gas, and crude oil, are closely interrelated. The methodology extracts thematic and sentiment information from news headlines using text mining techniques, constructs daily topic strength indices, and develops an emotional strength index that accounts for the decay of news influence over time. A vector autoregression model is used to determine the optimal lags of the exogenous sequences, including the topic and sentiment indices, relative to crude oil prices. The forecasting models are Random Forest Regression (RF), Support Vector Regression (SVR), Autoregressive Integrated Moving Average (ARIMA), and their extended versions with exogenous variables (ARIMAX). Model performance is evaluated with Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). The results indicate that incorporating textual features significantly improves the prediction accuracy of the RF, SVR, and AdaBoost models, while the traditional ARIMA model performs well without textual features. The study also introduces an approach that combines Ensemble Empirical Mode Decomposition (EEMD) with Independent Component Analysis to analyze non-linear and non-stationary time series, applied specifically to gold prices. The EEMD-BPNN-ADD model is identified as the most accurate for forecasting, and interval predictions are provided for gold prices. This research contributes to the field by demonstrating the effectiveness of integrating textual analysis with traditional financial models for improved market forecasting.
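
As an illustration of the three evaluation metrics named in the abstract, the following minimal sketch computes RMSE, MAE, and MAPE using their standard textbook definitions. The function names and the sample price values are hypothetical and are not taken from the study. MAPE is expressed here as a percentage, and the actual values are assumed to be non-zero.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is non-zero.
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical prices, chosen only to exercise the functions.
actual = [50.0, 80.0, 100.0]
predicted = [45.0, 88.0, 100.0]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free, which is presumably why the three are reported together when comparing models.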

financial time series; decomposition techniques; trend analysis; seasonality; forecasting; risk management

References

  1. Li S, Mo Y, Li Z. Automated Pneumonia Detection in Chest X-Ray Images Using Deep Learning Model. Innovations in Applied Engineering and Technology 2022; 1: 1–6.
  2. Wu Z, Wang Q, Gribok AV, Chen KP. Pipeline Degradation Evaluation Based on Distributed Fiber Sensors and Convolutional Neural Networks (CNNs). In Proceedings of the 27th International Conference on Optical Fiber Sensors, Alexandria, VA, USA, 29 August–2 September 2022. https://doi.org/10.1364/OFS.2022.W4.41.
  3. Wang Q, Jian J, Wang M, Wu J, Mao Z-H, Gribok AV, Chen KP. Pipeline Defects Detection and Classification Based on Distributed Fiber Sensors and Neural Networks. In Proceedings of the Optical Fiber Sensors Conference 2020 Special Edition, OSA Technical Digest, Washington, DC, USA, 8–12 June 2020. https://doi.org/10.1364/OFS.2020.W2B.3.
  4. Peng Z, Jian J, Wang M, Wang Q, Boyer T, Wen H, Liu H, Mao Z-H, Chen KP. Big Data Analytics on Fiber-Optical Distributed Acoustic Sensing with Rayleigh Enhancements. In Proceedings of the 2019 IEEE Photonics Conference (IPC), San Antonio, TX, USA, 29 September–3 October 2019; pp. 1–3. https://doi.org/10.1109/IPCon.2019.8908496.
  5. Wang Q, Zhao K, Badar M, Yi X, Lu P, Buric M, Mao Z-H, Chen KP. Improving OFDR Distributed Fiber Sensing by Fibers with Enhanced Rayleigh Backscattering and Image Processing. IEEE Sensors Journal 2022; 22: 18471–18478. https://doi.org/10.1109/JSEN.2022.3197730.
  6. Badar M, Lu P, Wang M, Wang Q, Chen KP, Buric M, Ohodnicki PR. Integrated Auxiliary Interferometer to Correct Non-Linear Tuning Errors in OFDR. In Proceedings of the SPIE, Optical Waveguide and Laser Sensors, 114050G, Online, 8 May 2020; Volume 11405. https://doi.org/10.1117/12.2558910.
  7. Kumada H, Li Y, Yasuoka K, Naito F, Kurihara T, Sugimura T, Sakae T. Current Development Status of iBNCT001, Demonstrator of a LINAC-based Neutron Source for BNCT. Journal of Neutron Research 2022; 24(3–4): 347–358. https://doi.org/10.3233/JNR-220029.
  8. Chen M, Chen Y, Zhang Q. A Review of Energy Consumption in the Acquisition of Bio-Feedstock for Microalgae Biofuel Production. Sustainability 2021; 13(16): 8873.
  9. Li Y, Mizumoto M, Oshiro Y, Nitta H, Saito T, Iizumi T, Sakurai H. A Retrospective Study of Renal Growth Changes after Proton Beam Therapy for Pediatric Malignant Tumor. Current Oncology 2023; 30: 1560–1570. https://doi.org/10.3390/curroncol30020120.
  10. Li Y, Shimizu S, Mizumoto M, Iizumi T, Numajiri H, Makishima H, Sakurai H. Proton Beam Therapy for Multifocal Hepatocellular Carcinoma (HCC) Showing Complete Response in Pathological Anatomy After Liver Transplantation. Cureus 2022; 14: e25744. https://doi.org/10.7759/cureus.25744.
  11. Li Y, Matsumoto Y, Chen L, Sugawara Y, Oe E, Fujisawa N, Sakurai H. Smart Nanofiber Mesh with Locally Sustained Drug Release Enabled Synergistic Combination Therapy for Glioblastoma. Nanomaterials 2023; 13: 414. https://doi.org/10.3390/nano13030414.
  12. Chen M. Investigating the Influence of Interannual Precipitation Variability on Terrestrial Ecosystem Productivity. Doctoral Dissertation, Massachusetts Institute of Technology, Cambridge, MA, USA, 2023.
  13. Chen M. Annual Precipitation Forecast of Guangzhou Based on Genetic Algorithm and Backpropagation Neural Network (GA-BP). In Proceedings of the International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021), 19–21 November 2021; Volume 12156, pp. 182–186.
  14. Dong S, Xu T, Chen M. Solar Radiation Characteristics in Shanghai. In Journal of Physics: Conference Series; IOP Publishing: Bristol, UK, 2022; Volume 2351, p. 012016.
  15. Wang R, Shapiro V. Topological Semantics for Lumped Parameter Systems Modeling. Advanced Engineering Informatics 2019; 42: 100958.
  16. Wang R, Behandish M. Surrogate Modeling for Physical Systems with Preserved Properties and Adjustable Tradeoffs. arXiv 2022, arXiv:2202.01139.
  17. Wang J, Tong J, Tan K, Vorobeychik Y, Kantaros Y. Conformal Temporal Logic Planning Using Large Language Models: Knowing When to Do What and When to Ask for Help. arXiv 2023, arXiv:2309.10092.
  18. Shimizu S, Nakai K, Li Y, Mizumoto M, Kumada H, Ishikawa E, Sakurai H. Boron Neutron Capture Therapy for Recurrent Glioblastoma Multiforme: Imaging Evaluation of a Case with Long-Term Local Control and Survival. Cureus 2023; 15: e33898. https://doi.org/10.7759/cureus.33898.
  19. Shimizu S, Mizumoto M, Okumura T, Li Y, Baba K, Murakami M, Sakurai H. Proton Beam Therapy for a Giant Hepatic Hemangioma: A Case Report and Literature Review. Clinical and Translational Radiation Oncology 2021; 27: 152–156. https://doi.org/10.1016/j.ctro.2021.01.014.
  20. Li S, Mo Y, Li Z. Automated Pneumonia Detection in Chest X-Ray Images Using Deep Learning Model. Innovations in Applied Engineering and Technology 2022; 1(1): 1–6. https://doi.org/10.62836/iaet.vli1.002.
  21. Li Z, et al. Stock Market Analysis and Prediction Using LSTM: A Case Study on Technology Stocks. Innovations in Applied Engineering and Technology 2023; 2(1): 1–6. https://doi.org/10.62836/iaet.v2i1.162.

Supporting Agencies

  1. Funding: Not applicable.