UTAR Institutional Repository

Integrating natural language processing (NLP) for enhanced stock market prediction through text and news data fusion

Ong, Dun Yi (2025) Integrating natural language processing (NLP) for enhanced stock market prediction through text and news data fusion. Final Year Project, UTAR.

    Abstract

    This study addresses critical challenges in stock market forecasting by introducing FusionStockBERT, a transformer-based, multi-modal model that predicts both the next-minute price movement direction (up/down) and the expected return; the latter reframes traditional stock price regression as a more stable return regression task. Existing NLP methods for stock market movement prediction are limited by context drift, extrapolation errors, out-of-vocabulary words in lexicons, growing dictionary sizes, and an inability to capture complex textual semantics. To address these issues, we propose a self-supervised learning approach that further fine-tunes FinBERT (a pre-trained BERT model) directly on directional movement labels and augments its final [CLS] representation with engineered trading features via an intermediate neural fusion layer for downstream tasks. Evaluated on minute-level Bloomberg news transcripts paired with trading data, FusionStockBERT achieves 80.53% accuracy on the development set and 71.42% on a held-out validation set for directional movement prediction, substantially outperforming both non-attention CNN baselines and majority-class baselines. In the return regression task, it delivers an MSE of 8.9×10⁻⁴ on the validation set, demonstrating competitive precision in estimating the return of the directional movement. These results highlight that integrating fine-tuned transformer embeddings with structured market data provides a powerful, real-time tool for high-frequency trading decision support.
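
The two prediction targets named in the abstract can be illustrated with a minimal sketch. The abstract does not define how returns or direction labels are computed, so the following are assumptions made only for illustration: a simple return (p[t+1] - p[t]) / p[t], a flat minute labelled as "down" (0), and the hypothetical helper names `next_minute_targets` and `mean_squared_error`. It is not the thesis implementation.

```python
from typing import List, Tuple


def next_minute_targets(prices: List[float]) -> Tuple[List[float], List[int]]:
    """Build (return, direction) targets from consecutive minute prices.

    Assumptions (not stated in the abstract): simple returns are used,
    and a flat minute (return == 0) is labelled 0, i.e. "down".
    """
    # Return from each minute to the next: (p[t+1] - p[t]) / p[t]
    returns = [(nxt - cur) / cur for cur, nxt in zip(prices, prices[1:])]
    # Direction label: 1 for up, 0 for down or flat
    directions = [1 if r > 0 else 0 for r in returns]
    return returns, directions


def mean_squared_error(y_true: List[float], y_pred: List[float]) -> float:
    """Plain mean squared error, the metric reported for the return task."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    minute_prices = [100.0, 101.0, 100.0, 100.0]  # made-up prices
    rets, dirs = next_minute_targets(minute_prices)
    print(rets, dirs)
    # Error of a naive "always predict zero return" baseline
    print(mean_squared_error(rets, [0.0] * len(rets)))
```

Returns stay in a narrow range around zero however high the price level is, which is one plausible reading of why the abstract calls return regression "more stable" than regressing on raw prices.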

    Item Type: Final Year Project / Dissertation / Thesis (Final Year Project)
    Subjects: H Social Sciences > HB Economic Theory
    T Technology > T Technology (General)
    T Technology > TD Environmental technology. Sanitary engineering
    Divisions: Faculty of Information and Communication Technology > Bachelor of Computer Science (Honours)
    Depositing User: ML Main Library
    Date Deposited: 29 Aug 2025 11:33
    Last Modified: 29 Aug 2025 11:33
    URI: http://eprints.utar.edu.my/id/eprint/7324
