About SmartStockPredictor


Approach & Key Concepts

We train on a lookback window of 700 to 1,800 days of historical market data to forecast stock behavior over the next 14 to 45 days. The lookback is long enough to provide sufficient post-COVID market context, and the forecast horizon is long enough to smooth out short-term volatility while still capturing market trends.

For each stock or ETF, we calculate an extensive set of features (detailed below) and build a custom predictive model, so that each model can capture the nuances of its own asset. Each model is then backtested and saved.

Every night we introduce a new parameter setting and retrain the models for 50+ assets so that predictions stay up to date. Our API then delivers these forecasts. The nightly runs also enable meta-learning: we analyze how different models and parameter settings perform and use the results to improve our overall approach.

While our APIs are flexible and support all stocks and public ETFs, we recommend focusing on the consumer goods sector. Our findings show that companies in this sector tend to align with macroeconomic trends and technical indicators, making them ideal candidates for our predictive models. Examples include:

  • Procter & Gamble (PG)
  • Johnson & Johnson (JNJ)
  • Coca-Cola (KO)
  • Consumer Staples Select Sector SPDR Fund (XLP)

Our models integrate a wide array of features:

  • Stock Prices: Open, High, Low, Close, Volume
  • Technical Indicators: RSI, MACD, MACD Signal, MACD Histogram, ADX, Bollinger Bands (Upper and Lower Bands)
  • Market Performance: S&P 500 Return, NASDAQ Return
  • Economic Indicators: GDP Growth, Inflation Rate, Unemployment Rate
  • Company Financials: Market Capitalization, P/E Ratio, Dividend Yield, Beta
  • Options Data: Put/Call Ratio, Implied Volatility
  • Market Sentiment: Sentiment Scores derived from AI-generated news analysis
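
To make the technical-indicator features more concrete, here is a minimal sketch of how two of them (Bollinger Bands and RSI) can be derived from closing prices using only the Python standard library. The function names, the 20-day and 14-day windows, and the simple-average RSI variant are illustrative assumptions based on common conventions, not a description of our production settings.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (upper, lower) bands: moving average +/- num_std standard deviations.

    Entries are None until a full window of prices is available.
    The 20-day window and 2.0 multiplier are conventional defaults (assumed).
    """
    upper, lower = [], []
    for i in range(len(closes)):
        if i < window - 1:
            upper.append(None)
            lower.append(None)
            continue
        chunk = closes[i - window + 1 : i + 1]
        mid, sd = mean(chunk), pstdev(chunk)
        upper.append(mid + num_std * sd)
        lower.append(mid - num_std * sd)
    return upper, lower


def rsi(closes, period=14):
    """Relative Strength Index from simple averages of recent gains and losses.

    Wilder's smoothed average is another common variant; this sketch uses the
    plain average over the last `period` price changes. When there are no
    losses in the window, the value is set to 100 by convention.
    """
    out = [None] * len(closes)
    changes = [b - a for a, b in zip(closes, closes[1:])]
    for i in range(period, len(closes)):
        recent = changes[i - period : i]  # the `period` changes ending at day i
        avg_gain = sum(c for c in recent if c > 0) / period
        avg_loss = sum(-c for c in recent if c < 0) / period
        if avg_loss == 0:
            out[i] = 100.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out
```

Both functions use only prices up to and including the current day, so the resulting features do not look into the future.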

Our Predictive Models

We use XGBoost for binary classification: each model predicts whether a stock's price will be higher or lower X days from now. We chose XGBoost for its ability to handle non-linear relationships, its robustness to outliers, and its strong performance on tabular data.
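
The binary target can be illustrated with a short sketch. It assumes the label is built by comparing each day's close with the close X trading days later; the function name and the choice to treat an unchanged price as "not an increase" are hypothetical details chosen for the example.

```python
def direction_labels(closes, horizon):
    """Label each day 1 if the close `horizon` days later is higher, else 0.

    The last `horizon` days get None because their future close is not yet
    known; those rows must be excluded from training.
    """
    labels = []
    for i, price in enumerate(closes):
        j = i + horizon
        labels.append(None if j >= len(closes) else int(closes[j] > price))
    return labels


closes = [100.0, 101.5, 99.0, 102.0, 103.0, 98.5]
print(direction_labels(closes, horizon=2))  # [0, 1, 1, 0, None, None]
```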

Why Binary Classification?
  • Provides clear directional forecasts to simplify investment decisions.
  • Focuses on price movement direction rather than precise price levels.

Advantages of XGBoost:
  • Efficiently manages complex, non-linear data.
  • Delivers interpretable feature importance scores.
  • Optimized for speed and high performance.

For predicting exact returns X days into the future, we employ the Seasonal Autoregressive Integrated Moving Average with Exogenous Regressors (SARIMAX) model.
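
For reference, one standard textbook way to write a SARIMAX model of order (p, d, q)(P, D, Q) with seasonal period s is as a regression on the exogenous variables with seasonal ARIMA errors. The notation below is generic and does not reflect any specific orders or regressors used in our models:

```latex
\phi_p(B)\,\Phi_P(B^s)\,(1-B)^d\,(1-B^s)^D\,\bigl(y_t - \beta^\top x_t\bigr)
  = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t
```

Here y_t is the series being forecast, x_t is the vector of exogenous regressors, B is the backshift operator (B y_t = y_{t-1}), \phi_p and \theta_q are the non-seasonal autoregressive and moving-average polynomials, \Phi_P and \Theta_Q are their seasonal counterparts, and \varepsilon_t is white noise.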

Why SARIMAX?
  • Incorporates exogenous variables such as economic indicators and technical data, provided only values known at forecast time are used, so that no data leaks from the future.
  • Effectively captures autocorrelation and seasonal trends in time series data.
  • Offers high interpretability through detailed model diagnostics.

Benefits:
  • Ideal for small to medium-sized datasets.
  • Balances complexity to prevent overfitting.
  • Widely accepted and efficient in financial forecasting.
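
A model with exogenous regressors needs regressor values for the period being forecast, so the regressors must be aligned carefully to stay leak-free. One common approach, sketched here as an assumption rather than as a description of our pipeline, is to pair each target with regressor values observed one forecast horizon earlier. The function name and the sample values are hypothetical.

```python
def lag_exogenous(rows, horizon):
    """Pair each target with exogenous values observed `horizon` steps earlier.

    rows: list of (target, exog_dict) tuples in chronological order.
    The first `horizon` rows are dropped because no sufficiently old
    exogenous observation exists for them.
    """
    return [
        (rows[i][0], rows[i - horizon][1])
        for i in range(horizon, len(rows))
    ]


# Illustrative values only: (return, {indicator: value}) per day.
rows = [
    (0.010, {"rsi": 55.0}),
    (-0.004, {"rsi": 61.0}),
    (0.007, {"rsi": 48.0}),
    (0.002, {"rsi": 52.0}),
]
aligned = lag_exogenous(rows, horizon=2)
# aligned[0] pairs the day-3 return with the day-1 indicator value.
```

With this alignment, every regressor value used for a forecast was already known when the forecast would have been made.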

Model Evaluation & Validation

We use cross-validation to estimate how each model performs on unseen data, which helps us detect overfitting.

For classification models, we evaluate:

  • Accuracy
  • Precision
  • Recall
  • F1-score
  • ROC-AUC
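
As a minimal sketch of how the first four of these metrics relate to each other, the following computes them from binary up/down labels. The function name and sample labels are illustrative; ROC-AUC is left out because it requires predicted probabilities rather than hard labels.

```python
def direction_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = price up)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


metrics = direction_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1])
```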

For regression models, we assess:

  • RMSE
  • MAE
  • MAPE
  • Confidence Scores
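
The three error metrics can be sketched as follows. The function name and sample values are illustrative, and skipping zero-valued actuals in MAPE is an assumption made for the example: MAPE is undefined when the actual value is zero and becomes unstable when returns are close to zero.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """RMSE, MAE and MAPE (in percent) for paired actual/predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Skip zero actuals, for which the percentage error is undefined.
    pct = [abs(e / a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return {"rmse": rmse, "mae": mae, "mape": mape}


scores = regression_metrics([2.0, 4.0, -1.0], [1.0, 5.0, -1.0])
```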

By analyzing feature importance, we ensure that our models make decisions based on the most meaningful predictors, which helps maintain transparency in how forecasts are generated.

We meticulously separate training and testing datasets in our time series analyses to ensure that future data does not influence model training. This step is critical for producing reliable and unbiased predictions.
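
One way to enforce this kind of separation is a walk-forward split, where every test block lies strictly after the data used for training. The sketch below is an illustration under assumed parameters, not our exact validation scheme; the function name, fold layout, and the optional gap are hypothetical.

```python
def walk_forward_splits(n_samples, n_folds, test_size, gap=0):
    """Yield (train_indices, test_indices) with training strictly before testing.

    gap: number of samples skipped between the end of training and the start
    of testing, e.g. the forecast horizon, so that labels built from future
    prices do not overlap the test period.
    """
    for fold in range(n_folds):
        test_end = n_samples - (n_folds - 1 - fold) * test_size
        test_start = test_end - test_size
        train_end = test_start - gap
        if train_end <= 0:
            raise ValueError("not enough samples for the requested folds")
        yield range(0, train_end), range(test_start, test_end)


for train_idx, test_idx in walk_forward_splits(10, n_folds=3, test_size=2, gap=1):
    print(list(train_idx), list(test_idx))
```

Each fold trains on an expanding window of past observations and tests on the block that follows, so no test observation ever precedes the data the model was fitted on.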

User Insights & FAQs

We are committed to transparency. Below, you’ll find common questions users ask about our process, along with detailed answers. If you have additional questions, feel free to reach out.

Our models undergo extensive testing and validation. We provide performance metrics alongside our forecasts to ensure full transparency regarding accuracy.

We implement cross-validation, regularization techniques, and carefully manage model complexity to effectively prevent overfitting.

Consumer goods stocks typically exhibit stability and are highly influenced by macroeconomic trends, making them ideal for our predictive models.

Our models are continuously retrained with the latest available data to ensure that our predictions remain accurate and reflective of current market conditions.

Yes, we provide detailed performance metrics and explanations for each model, ensuring transparency and helping users understand our forecasting approach.

Disclaimer

While our models are built using robust statistical methods and comprehensive datasets, they are intended as predictive tools rather than guarantees of future performance. We encourage users to consult with financial advisors and consider multiple sources of information before making any investment decisions.