
AI Manages Gas Storage Better Than Humans

A reinforcement learning system called GasRL can optimize natural gas stockpiles to prevent supply disruptions while keeping the market stable, and it does so without being trained on historical price data.

AI Research
November 05, 2025

Natural gas powers homes, industries, and economies, but its supply is vulnerable to sudden disruptions that can trigger price spikes and shortages. In Europe, where countries must maintain minimum gas storage levels, finding the right balance between security and cost has been a persistent challenge. A new study introduces GasRL, an artificial intelligence system that uses reinforcement learning to manage gas storage more effectively than conventional methods, offering a tool that could help policymakers and operators avoid crises.

The key finding is that GasRL, when trained with the Soft Actor-Critic (SAC) algorithm, learns to set gas prices that lead to optimal stockpile management. This AI-driven approach achieves multiple objectives simultaneously: it maximizes profits for storage operators, minimizes price volatility, and ensures market stability by preventing supply failures. Remarkably, the system reproduces realistic price patterns, including seasonal variations, without being explicitly trained on historical price data. This indicates that the AI learns economically coherent behaviors through interaction with a simulated market environment.

The methodology combines a calibrated model of Italy's natural gas market, one of the largest in the EU, with a reinforcement learning agent that acts as a monopolistic storage operator. The simulator, built using Python and Gymnasium, models supply and demand dynamics over 30-year episodes, with each time step representing one month (360 steps per episode). At each step the agent observes market conditions, such as current stock levels and seasonal demand shifts, and chooses a gas price. The environment then calculates how much gas is bought or sold at that price and updates the operator's inventory and cash balance. The agent's reward function encourages profitable operations while penalizing market failures, excessive price swings, and non-compliance with storage mandates.
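The loop described above can be sketched in a few lines of plain Python. This is a toy illustration, not the GasRL simulator: the class name `ToyGasStorageEnv`, the linear surplus rule, and every coefficient and penalty weight are assumptions made for the example. It only mimics the shape of Gymnasium's `reset`/`step` interface (including the five-value return from `step`) without importing the library.

```python
import math
import random


class ToyGasStorageEnv:
    """Toy monthly storage market with a Gymnasium-style reset/step interface.

    Every number and formula here is an illustrative assumption, not GasRL's.
    """

    def __init__(self, capacity=100.0, base_price=30.0, horizon=360, seed=0):
        self.capacity = capacity      # maximum storable volume (arbitrary units)
        self.base_price = base_price  # price at which the toy market roughly clears
        self.horizon = horizon        # 360 monthly steps = one 30-year episode
        self.seed = seed

    def reset(self):
        self.rng = random.Random(self.seed)
        self.t = 0
        self.stock = 0.5 * self.capacity
        self.cash = 0.0
        self.prev_price = self.base_price
        return self._obs()

    def _obs(self):
        # Observation: storage fill ratio and calendar month (0 = January).
        return (self.stock / self.capacity, self.t % 12)

    def step(self, price):
        # Assumed seasonal demand shift, highest in January.
        seasonal = 8.0 * math.cos(2 * math.pi * (self.t % 12) / 12)
        noise = self.rng.gauss(0.0, 1.0)
        # Assumed linear market: a higher price leaves more surplus gas for
        # storage to buy; a lower price creates a shortfall storage must cover.
        surplus = 1.5 * (price - self.base_price) - seasonal + noise

        target = self.stock + surplus
        new_stock = min(max(target, 0.0), self.capacity)
        failed = new_stock != target     # gas wasted (overflow) or demand unmet
        traded = new_stock - self.stock  # > 0: bought, < 0: sold
        cash_flow = -price * traded      # buying costs cash, selling earns it
        self.stock = new_stock
        self.cash += cash_flow

        # Reward: profit minus assumed penalties for failures and price swings.
        reward = cash_flow
        reward -= 100.0 if failed else 0.0
        reward -= 0.5 * abs(price - self.prev_price)
        self.prev_price = price
        self.t += 1
        truncated = self.t >= self.horizon
        info = {"failed": failed, "cash": self.cash}
        return self._obs(), reward, False, truncated, info


def run_constant_price_episode(env, price):
    """Roll out one episode at a fixed price; returns (steps, total_reward)."""
    env.reset()
    steps, total, done = 0, 0.0, False
    while not done:
        _, reward, terminated, truncated, _ = env.step(price)
        total += reward
        steps += 1
        done = terminated or truncated
    return steps, total
```

Calling `run_constant_price_episode(ToyGasStorageEnv(), 30.0)` rolls out one 360-step episode with a fixed price. A learning agent such as SAC would replace that constant with a policy that maps each observation to a price.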

Results show that SAC outperforms the other reinforcement learning algorithms tested (DDPG, TD3, A2C, and PPO) in both cumulative reward and stability. Over 1.5 million training steps, SAC agents consistently achieved higher profits, eliminated market failures (cases where demand exceeds supply or gas is wasted), and maintained inventory levels that naturally reached 83% of capacity in November, matching EU regulatory targets, before settling at 73%. As shown in Figure 2 of the paper, SAC-generated price series closely mirror real-world data from the TTF futures market (2010–2024), with similar seasonal peaks in November. The distribution of price changes had a standard deviation of 27%, comparable to the 25% observed in recent volatile years.
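For readers who want to run the same kind of comparison on their own price series, the two statistics mentioned here can be computed roughly as follows. The article does not say how the paper defines a "price change", so this sketch assumes simple month-over-month percentage changes. The function names and the synthetic series are made up for illustration and are not data from the study.

```python
import math
import statistics


def pct_changes(prices):
    """Month-over-month relative changes: (p[t+1] - p[t]) / p[t]."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def peak_calendar_month(prices, first_month=0):
    """Calendar month (0 = January) with the highest average price."""
    by_month = {}
    for i, p in enumerate(prices):
        by_month.setdefault((first_month + i) % 12, []).append(p)
    return max(by_month, key=lambda m: statistics.mean(by_month[m]))


# Synthetic three-year monthly series built to peak in November (index 10).
prices = [30.0 + 5.0 * math.cos(2 * math.pi * (m - 10) / 12) for m in range(36)]

volatility = statistics.stdev(pct_changes(prices))  # sample standard deviation
peak = peak_calendar_month(prices)
print(f"std of monthly changes: {volatility:.1%}, peak month index: {peak}")
```

Applying the same two functions to a simulated series and to a historical one gives directly comparable numbers, which is the kind of check the 27% versus 25% comparison reflects.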

In practical terms, GasRL provides a testbed for evaluating energy policies. For example, the researchers simulated the impact of an EU-mandated storage threshold requiring 83% fullness by November. They found that imposing this rule slightly improved market resilience to supply shocks, such as unexpected increases in volatility, but reduced operator profitability and had no significant effect on average price levels. This result highlights the system's utility for assessing regulatory trade-offs before implementation, helping avoid costly missteps in real markets.
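One way such a rule could enter a simulation is as an extra reward term that only applies in the deadline month. The sketch below is a guess at the general shape, not the paper's formulation. The linear shortfall penalty, the `weight` value, and the function names are all assumptions.

```python
def mandate_penalty(fill_ratio, month, threshold=0.83,
                    deadline_month=10, weight=1000.0):
    """Penalty for missing the storage threshold in the deadline month.

    Months are indexed from 0 = January, so 10 = November. The linear form
    and the weight are illustrative assumptions.
    """
    if month % 12 != deadline_month:
        return 0.0
    shortfall = max(0.0, threshold - fill_ratio)
    return weight * shortfall


def shaped_reward(profit, fill_ratio, month, rule_enabled):
    """Operator reward with the mandate switched on or off for comparison."""
    penalty = mandate_penalty(fill_ratio, month) if rule_enabled else 0.0
    return profit - penalty


# Same November state evaluated under both policy settings.
print(shaped_reward(50.0, 0.73, 10, rule_enabled=False))
print(shaped_reward(50.0, 0.73, 10, rule_enabled=True))
```

Training one agent with `rule_enabled=True` and another with it off, then comparing profits, prices, and failure counts, mirrors the with-and-without comparison described above.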

Limitations of the study include the model's focus on a single, closed national market (Italy), which may not fully capture international gas trade dynamics. Future work could extend GasRL to multi-agent scenarios with competing operators or integrate cross-border pipeline and LNG infrastructure. The code is open-source, allowing others to reproduce results and explore further applications in energy policy and commodity markets.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
