Smarter ≠ Safer – Risk-Aware Q-Learning

What happens when an agent learns not just to win, but to protect itself from loss?

This post is part of my series exploring how to simulate, analyze, and optimize roulette strategies using reinforcement learning.


After building a Q-learning agent that could learn from wins and losses, I wanted to take things a step further.

What if the agent didn’t just chase reward, but also learned to avoid risk?

I created a risk-aware roulette agent, using enhanced state features and reward shaping to emphasize survival, bankroll preservation, and streak awareness.


Key Additions

Expanded State Space

The agent’s state included:

  • Bankroll (bucketed)
  • Drawdown from peak bankroll
  • Current win/loss streak
  • Last spin result (win or loss)

This gave the agent memory of context — not just the current bankroll, but whether it was trending up or spiraling down.
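The state features above can be sketched as a discretization function. This is a minimal illustration, not the post's exact implementation; the bucket size, drawdown bands, and streak clipping limits are assumptions I've chosen for the example.

```python
# Hypothetical sketch of the expanded state encoding. Bucket size,
# drawdown bands, and streak clipping are illustrative assumptions.

def encode_state(bankroll, peak_bankroll, streak, last_won,
                 bucket_size=100):
    """Discretize the agent's context into a hashable Q-table key."""
    # Bucketed bankroll keeps the Q-table small.
    bankroll_bucket = int(bankroll // bucket_size)

    # Drawdown from the session's peak, as a coarse 10%-wide band.
    drawdown = 0.0 if peak_bankroll == 0 else 1 - bankroll / peak_bankroll
    drawdown_band = min(int(drawdown * 10), 9)

    # Clip streaks so rare long runs don't blow up the state space.
    streak_clipped = max(-5, min(5, streak))

    return (bankroll_bucket, drawdown_band, streak_clipped, int(last_won))
```

For example, `encode_state(850, 1000, -3, False)` yields `(8, 1, -3, 0)`: an 800-bucket bankroll, a 10-20% drawdown band, a three-loss streak, and a losing last spin.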

Reward Shaping

In addition to win/loss payouts, the agent received:

  • Penalty for large drawdowns (>50% of starting bankroll)
  • Penalty for long losing streaks
  • Bonus for surviving every 100 spins
  • Bonus for growing bankroll >1.5x

This shaped the agent toward stable, long-lived behavior.
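The shaping terms above can be folded into a single reward function along these lines. The thresholds (50% drawdown, 100-spin survival, 1.5x growth) come from the post; the penalty and bonus magnitudes are placeholder assumptions for illustration.

```python
# Illustrative reward-shaping sketch. Thresholds follow the post;
# the magnitudes (-50, -10/loss, +25, +50) are assumed placeholders.

def shaped_reward(payout, bankroll, start_bankroll, peak_bankroll,
                  loss_streak, spin_count):
    """Augment the raw win/loss payout with risk-aware terms."""
    reward = payout

    # Penalty for drawing down more than 50% of the starting bankroll.
    if peak_bankroll - bankroll > 0.5 * start_bankroll:
        reward -= 50

    # Penalty that scales with long losing streaks.
    if loss_streak >= 5:
        reward -= 10 * loss_streak

    # Survival bonus every 100 spins.
    if spin_count % 100 == 0:
        reward += 25

    # Bonus for growing the bankroll past 1.5x its starting value.
    if bankroll > 1.5 * start_bankroll:
        reward += 50

    return reward
```

The net effect is that the Q-values reflect more than the bet's payout: an action that wins a little while deep in a drawdown still scores poorly, which is exactly what pushes the agent toward capital preservation.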


Results

After 1000 training episodes, I evaluated the risk-aware agent on 100 new roulette sessions.

| Strategy | Avg Reward | Median Reward | Avg Spins | % Profitable |
| --- | --- | --- | --- | --- |
| Risk-Aware Agent | -$14,300 | -$9,978 | 404 | 0% |
| Flat Betting | -$32,660 | -$29,917 | 73 | 41% |

Takeaways

  • The risk-aware agent lost less money than flat betting on average
  • However, it never produced profitable outcomes
  • By avoiding all risk, it eliminated all upside

In trying to prevent failure, the agent also stopped itself from succeeding.

This highlights a core trade-off: protecting capital versus pursuing growth. A viable strategy needs both, and this version had only one.


What’s Next?

The next step was to bring in deep learning — by moving from a Q-table to a neural network (DQN) that could handle continuous states and learn richer patterns.

It’s more complex, but it opens the door to more powerful policies — and possibly, something closer to intelligent betting.


Continue to Post 5: From Q-Tables to Neural Nets – A DQN Roulette Agent
