The Alpha Arena trading competition revealed how even advanced AI systems struggle to stay profitable in today’s volatile market. The results suggest that discipline and risk control still outperform speed and aggression, whether the trader is human or machine.
                    
Alpha Arena, the AI trading competition hosted by NOF1, officially concluded on November 3.
The champion was Qwen3-Max, which pulled off a dramatic comeback in the final moments to overtake DeepSeek and secure the top spot.
The arena featured six advanced models from leading AI research labs: GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, Grok 4, DeepSeek v3.1, and Qwen3-Max. These models represented the state of the art across both closed and open-source systems from the United States and China.
The action space was simple: buy to enter (long), sell to enter (short), hold, or close. The tradable coin universe was constrained to six popular cryptocurrencies on Hyperliquid: BTC, ETH, SOL, BNB, DOGE, and XRP.
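As a rough illustration, the four actions and six coins described above can be modeled as a small, closed decision space. This is only a sketch: the names `Action`, `COINS`, and `is_valid_decision` are hypothetical and are not taken from NOF1 or Hyperliquid.

```python
from enum import Enum


class Action(Enum):
    """The four actions described in the article (identifiers are hypothetical)."""
    BUY_TO_ENTER = "buy_to_enter"    # open a long position
    SELL_TO_ENTER = "sell_to_enter"  # open a short position
    HOLD = "hold"                    # keep the current position
    CLOSE = "close"                  # exit the current position


# The six tradable coins named in the article.
COINS = ("BTC", "ETH", "SOL", "BNB", "DOGE", "XRP")


def is_valid_decision(action: str, coin: str) -> bool:
    """Return True if the (action, coin) pair falls inside the allowed space."""
    return action in {a.value for a in Action} and coin in COINS
```

With four actions and six coins, a model has only 24 possible (action, coin) pairs to choose from at each step, before position sizing or leverage is considered.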
DeepSeek and Qwen3 Outtraded the Competition
Throughout the competition, the top rankings were consistently dominated by DeepSeek and Qwen3, both Chinese models. Some discussions claimed that Chinese AIs could not take short positions due to certain local factors, which supposedly gave them an advantage since the market was mostly bullish during the competition period.
However, that claim is not true. Both models did take short trades, although they generally preferred long positions. In a way, that preference mirrors a human-like mindset: the potential upside of a long position is theoretically unlimited, while an unleveraged short can yield at most 100 percent profit, which happens only if the price falls to zero.
Meanwhile, the American models (Claude, Grok, Gemini, and GPT) performed more weakly overall.
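The long/short asymmetry can be checked with simple arithmetic. The sketch below assumes an unleveraged position and ignores fees and funding; the function name `position_return` and the prices used are hypothetical.

```python
def position_return(entry: float, exit_price: float, side: str) -> float:
    """Fractional return of an unleveraged position, ignoring fees and funding."""
    if entry <= 0 or exit_price < 0:
        raise ValueError("entry must be positive and exit_price non-negative")
    change = (exit_price - entry) / entry
    if side == "long":
        return change    # gains grow without bound as the price rises
    if side == "short":
        return -change   # gains are capped at 1.0 (price falls to zero)
    raise ValueError("side must be 'long' or 'short'")


# A long from 100 to 400 returns 300 percent.
print(position_return(100, 400, "long"))   # 3.0
# A short from 100 to 0 is the best case for a short: 100 percent.
print(position_return(100, 0, "short"))    # 1.0
# The same short loses 300 percent if the price goes to 400 instead.
print(position_return(100, 400, "short"))  # -3.0
```

The last case shows the other side of the asymmetry: a short's loss is unbounded, while a long can lose at most 100 percent.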
Fun Fact: ChatGPT will no longer provide medical, legal, or financial advice. Is it because of Alpha Arena?
So, What Type of Traders Are These AI?
TL;DR:
- Disciplined Swing / Position Traders: Qwen3 MAX, DeepSeek Chat V3.1
- Inconsistent Analysts: Claude Sonnet 4.5
- Experimental Traders: Grok 4
- Over-Traders: Gemini 2.5 Pro, GPT-5

Gemini went full degen with 238 trades, the highest among all models. GPT-5 was also highly active with 116 trades. Both entered positions non-stop, but efficiency wasn’t their strength. With Sharpe ratios of -0.566 and -0.525, respectively, they took plenty of swings but ended up trading noise more than signal, showing high volatility, low consistency, and little control.
On the other side, Claude barely moved with just 36 trades, while DeepSeek and Qwen3 kept it steady at 41 and 43. That restraint paid off. DeepSeek recorded the best Sharpe ratio at 0.359, followed by Qwen3 at 0.273, proving that fewer, smarter trades still outperform blind aggression.
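For readers unfamiliar with the metric, a Sharpe ratio divides the average excess return by the volatility of those returns. The article does not say exactly how the competition computed its figures, so the sketch below uses the basic per-period definition as an assumption, and the return series are invented purely for illustration.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std deviation."""
    excess = [r - risk_free for r in returns]
    if len(excess) < 2:
        raise ValueError("need at least two returns")
    volatility = stdev(excess)
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    return mean(excess) / volatility


# Hypothetical per-trade returns, not competition data.
steady = [0.02, 0.01, 0.03, 0.02]    # small, consistent gains
noisy = [0.10, -0.12, 0.08, -0.09]   # big swings that net out negative

print(sharpe_ratio(steady) > 0)  # True
print(sharpe_ratio(noisy) < 0)   # True
```

A negative value, like those reported for Gemini and GPT-5, means the average return was below the risk-free benchmark, regardless of how many trades were placed.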
Who Manages Risk Best?
A commonly favored risk-to-reward benchmark among traders is 1:3, meaning the potential reward should be at least three times greater than the potential risk.
The theory is simple. For a good trader, wins should be larger than losses, ideally by a multiple.
In this analysis, we use the following interpretation:
- Ratio > 3 → Excellent risk-to-reward management.
- Ratio > 1 but < 3 → Acceptable, but with room for improvement.
- Ratio < 1 → Poor asymmetry, where losses outweigh gains, indicating weak risk control.
Within this dataset, DeepSeek and Qwen3 exemplify strong risk-to-reward discipline, achieving ratios well above 3:1. Grok 4 and Claude fall into the acceptable range, maintaining a positive but modest edge. Meanwhile, Gemini 2.5 Pro and GPT-5 display poor asymmetry, suggesting excessive drawdowns or weak profit-capture discipline.
For simplicity, this analysis uses only each model's biggest win and biggest loss to calculate the Win/Loss Ratio, defined as:
Win/Loss Ratio = Biggest Win / Biggest Loss
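The ratio and the three bands above can be expressed as a short sketch. The function names and dollar amounts are hypothetical. The article leaves ratios of exactly 1 and exactly 3 unclassified, so this sketch assumes a ratio of exactly 3 counts as acceptable and exactly 1 as poor.

```python
def win_loss_ratio(biggest_win: float, biggest_loss: float) -> float:
    """Biggest win divided by the size of the biggest loss."""
    loss_size = abs(biggest_loss)  # accept losses written as negative numbers
    if loss_size == 0:
        raise ValueError("biggest_loss must be non-zero")
    return biggest_win / loss_size


def classify(ratio: float) -> str:
    """Map a win/loss ratio onto the article's three bands."""
    if ratio > 3:
        return "excellent"
    if ratio > 1:
        return "acceptable"  # assumption: exactly 3 lands here
    return "poor"            # assumption: exactly 1 lands here


# Hypothetical figures, not competition data.
print(classify(win_loss_ratio(900, -200)))  # excellent (ratio 4.5)
print(classify(win_loss_ratio(300, -200)))  # acceptable (ratio 1.5)
print(classify(win_loss_ratio(100, -200)))  # poor (ratio 0.5)
```

Because it relies on a single best and a single worst trade, this measure is sensitive to outliers and says nothing about how often a model wins.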

Trader Archetypes Based on Stability and Analytical Skills
Stability and analytical skills represent two fundamental dimensions that define trader performance:
- Stability reflects how consistent, risk-controlled, and disciplined a trader is in managing volatility and exposure.
- Analytical skills measure how effectively a trader interprets data, adapts to market conditions, and identifies profitable opportunities.
The table below outlines how these dimensions are quantified to evaluate AI trading performance, along with their corresponding interpretations.

Based on these criteria, the chart below presents the resulting classification of traders across different archetypes.

High Stability + High Analytical Skills
- Profitable, disciplined, and consistent. Demonstrates strong analytical reasoning, positive expectancy, and solid risk control.
- Trader Example: Qwen3 MAX / DeepSeek Chat V3.1
Low Stability + High Analytical Skills
- Intelligent and data-driven but inconsistent. Shows good analytical ability with weak discipline or emotional control, leading to volatility.
- Trader Example: Claude Sonnet 4.5
Low Stability + Low Analytical Skills
- Volatile and unstructured. Negative expectancy, poor Sharpe ratios, and overleveraged positions indicate a lack of both analytical precision and risk discipline.
- Trader Example: GPT-5 / Gemini 2.5 Pro / Grok 4
Bull or Bear? Even AI Is Struggling to Decide
The market has remained volatile in recent weeks, teetering between bullish optimism and bearish pullbacks. This environment has made it increasingly difficult to sustain consistent profits, even for advanced AI systems designed to adapt and optimize in real time.
Alpha Arena ran from October 18 to November 3. With the season now concluded, all eyes turn to Season 1.5.
 