AI Forecasts Supply Chain Disruptions from News

TL;DR

New AI trained on news articles predicts supply chain shocks more accurately than general models, helping businesses act before disruptions hit.

Supply chain disruptions are costly and difficult to anticipate, often catching firms and policymakers off guard with delayed or incomplete signals. Traditional indicators lag behind real-time developments, creating a forecasting gap where decisions must be made before reliable data arrives. A new study addresses this by using artificial intelligence to predict disruptions from news articles, offering a timely way to leverage unstructured text about geopolitical tensions, trade restrictions, and labor disputes. This approach could transform how businesses manage risk by providing early warnings based on publicly available information, moving beyond reactive measures to proactive forecasting.

The researchers developed a framework that trains large language models (LLMs) to produce calibrated probabilistic forecasts of supply chain disruptions using realized outcomes as supervision. They focused on a one-month-ahead task, predicting whether a disruption index for a specific country or product would experience a large increase, defined as exceeding one standard deviation of historical changes. The model, based on GPT-OSS-120B and fine-tuned with Low-Rank Adaptation (LoRA), outperformed strong baselines, including GPT-5, on key metrics such as Brier score, calibration error, and precision. For example, the fine-tuned model achieved a Brier score of 0.0791 and an expected calibration error (ECE) of 0.0525 on the test set, compared to 0.1433 and 0.1740 for the pretrained base model, indicating substantial improvements in accuracy and reliability.
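To make the prediction target concrete, here is a minimal sketch of how a "large increase" label might be derived from a monthly disruption index. The function name `large_increase_label` and the sample values are hypothetical. The choice to estimate the threshold as the population standard deviation of month-over-month changes observed up to the prediction month is an assumption; the study may use a different window or estimator.

```python
from statistics import pstdev


def large_increase_label(index_values, t):
    """Return 1 if the change from month t to t+1 exceeds one standard
    deviation of the changes observed up to month t, else 0.

    Assumption: the threshold uses only data available at month t,
    so the label definition itself introduces no look-ahead.
    """
    if t < 2 or t + 1 >= len(index_values):
        raise ValueError("need at least two past changes and one future month")
    history = index_values[: t + 1]
    past_changes = [b - a for a, b in zip(history, history[1:])]
    threshold = pstdev(past_changes)
    next_change = index_values[t + 1] - index_values[t]
    return int(next_change > threshold)


# Hypothetical index values for one country or product, one per month.
series = [1.0, 1.1, 0.9, 1.0, 1.1, 2.0]
print(large_increase_label(series, 3))  # small rise from month 3 to month 4
print(large_increase_label(series, 4))  # sharp jump from month 4 to month 5
```

Because the threshold scales with each series' own volatility, the same rule can be applied across countries and products with very different index levels.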

The methodology involved constructing a dataset linking timestamped news articles to future disruption outcomes, ensuring a strictly forward-looking setup without look-ahead bias. Each forecasting question included a news context with articles available up to the prediction month, current index values, and a binary outcome for the following month. The model was trained using a reinforcement learning objective from the Foresight Learning framework, where rewards were based on the log score of predicted probabilities against realized events. This end-to-end approach allowed the model to jointly identify salient signals in unstructured text, reason over them adaptively, and produce likelihood estimates aligned with observed outcomes, without relying on feature engineering or downstream models.
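The log score mentioned above can be illustrated with a short sketch. The source says only that rewards were based on the log score, so the clipping constant `eps`, the function names, and the grid search below are assumptions made for illustration, not the study's actual reward implementation.

```python
import math


def log_score_reward(p, outcome, eps=1e-6):
    """Log score of a predicted probability p against a binary outcome.

    Probabilities are clipped to [eps, 1 - eps] so the reward stays finite
    (the clipping value is an assumption for this sketch).
    """
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p) if outcome == 1 else math.log(1.0 - p)


def expected_reward(p, true_rate):
    """Average reward for always predicting p when events occur at true_rate."""
    return (true_rate * log_score_reward(p, 1)
            + (1.0 - true_rate) * log_score_reward(p, 0))


# The log score is a proper scoring rule: the expected reward peaks when
# the predicted probability equals the true event frequency.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: expected_reward(p, true_rate=0.10))
print(best_p)
```

This property is why a log-score reward pushes a model toward calibrated probabilities rather than overconfident yes/no answers: confident mistakes are penalized heavily, and hedging beyond what the evidence supports also lowers the expected reward.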

Results from the held-out test set, covering October 2025 to January 2026, showed the fine-tuned model's superiority across all evaluation metrics. It achieved a Brier skill score of 16.9% relative to a historical baseline, meaning its squared forecast error was nearly 17% lower than that baseline's. Precision@10%, which measures the fraction of true disruptions among the top 10% of highest-confidence predictions, was 0.3478 for the fine-tuned model, compared to 0.1304 for the pretrained base model, indicating better ranking quality for practical decision-making. Calibration improvements were particularly notable, with ECE decreasing by nearly 70%, as shown in Figure 2, where predicted probabilities closely tracked empirical disruption frequencies, enhancing the reliability of probability estimates for real-world use.
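For readers unfamiliar with these metrics, the following sketch shows one common way to compute them. The function names, the ten equal-width bins used for ECE, and the toy forecasts are assumptions for illustration; the study's exact binning and tie-handling choices are not stated here.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def brier_skill_score(model_brier, reference_brier):
    """Fractional reduction in Brier score relative to a reference forecast."""
    return 1.0 - model_brier / reference_brier


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average gap between mean confidence and observed frequency,
    using equal-width probability bins (an assumed binning scheme)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            freq = sum(y for _, y in b) / len(b)
            ece += len(b) / len(probs) * abs(mean_p - freq)
    return ece


def precision_at_top_fraction(probs, outcomes, fraction=0.10):
    """Share of true events among the highest-probability predictions."""
    k = max(1, round(len(probs) * fraction))
    ranked = sorted(zip(probs, outcomes), key=lambda pair: pair[0], reverse=True)
    return sum(y for _, y in ranked[:k]) / k


# Toy forecasts (hypothetical, not data from the study).
probs = [0.9, 0.8, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
outcomes = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(brier_score(probs, outcomes))
print(precision_at_top_fraction(probs, outcomes))

# Relative ECE reduction implied by the reported values (0.1740 -> 0.0525),
# roughly 0.70, consistent with the "nearly 70%" figure in the text.
print(1.0 - 0.0525 / 0.1740)
```

Brier score and ECE reward well-calibrated probabilities, while Precision@10% only looks at ranking, which is why the article reports them together: an alerting workflow needs both trustworthy probabilities and a short list of the riskiest cases.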

Beyond quantitative performance, the training induced more structured and forward-looking reasoning behavior in the model. An automated evaluator detected increases in probabilistic reasoning patterns, such as base-rate anchoring (from 0.09 to 0.50 frequency), statistical modeling (from 0.48 to 1.00), and uncertainty refinement (from 0.33 to 1.00), as detailed in Table 4. The fine-tuned model more consistently linked news signals to future outcomes, used explicit forecasting models, and performed iterative refinement, without explicit prompting. This suggests that foresight-oriented training not only boosts accuracy but also enhances the model's ability to reason probabilistically, making its forecasts more interpretable and decision-relevant for supply chain analysts.

The implications of this research are significant for industries reliant on stable supply chains, as it demonstrates that AI can effectively aggregate heterogeneous textual signals to anticipate disruptions before they materialize. By improving calibration and precision, the model offers a practical tool for prioritizing high-risk alerts, potentially reducing costs and improving resilience. However, the study has limitations: the relationship between news and disruptions is noisy and incomplete, with some relevant developments unreported or not translating into measurable events. The analysis is confined to a one-month-ahead binary task and the post-2022 period, a single regime with elevated disruption levels, so the results may not generalize to longer horizons or different economic conditions. Future work could extend the framework to multi-period forecasting and integrate additional data sources for broader applicability.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
