S&P 500 Trend Prediction
Shasha Yu,
Qinchen Zhang and
Yuwei Zhao
Papers from arXiv.org
Abstract:
This project aims to predict short-term and long-term upward trends in the S&P 500 index using machine learning models and feature engineering based on the "101 Formulaic Alphas" methodology. The study employed multiple models, including Logistic Regression, Decision Trees, Random Forests, Neural Networks, K-Nearest Neighbors (KNN), and XGBoost, to identify market trends from historical stock data collected from Yahoo! Finance. Data preprocessing involved handling missing values, standardization, and iterative feature selection to ensure relevance and variability. For short-term predictions, KNN emerged as the most effective model, delivering robust performance with high recall for upward trends, while for long-term forecasts, XGBoost demonstrated the highest accuracy and AUC scores after hyperparameter tuning and class imbalance adjustments using SMOTE. Feature importance analysis highlighted the dominance of momentum-based and volume-related indicators in driving predictions. However, models exhibited limitations such as overfitting and low recall for positive market movements, particularly in imbalanced datasets. The study concludes that KNN is ideal for short-term alerts, whereas XGBoost is better suited for long-term trend forecasting. Future enhancements could include advanced architectures like Long Short-Term Memory (LSTM) networks and further feature refinement to improve precision and generalizability. These findings contribute to developing reliable machine learning tools for market trend prediction and investment decision-making.
Date: 2024-12
New Economics Papers: this item is included in nep-big, nep-cmp and nep-fmk
References: Add references at CitEc
Citations:
Downloads: (external link)
http://arxiv.org/pdf/2412.11462 Latest version (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2412.11462
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().