GTrader: How an Open-Source Trading Agent Outperformed Proprietary Models in Live Markets
Overview
When our Head of Developer Relations Harish Kotra set out to build GTrader [GitHub], he wasn't trying to prove anything grand. He just wanted to see if a lean, open-source trading agent could hold its own against the big names. Six days later, it had finished the week in second place, losing less than nearly every other participant, ahead of Claude, GPT-5, Gemini Pro, DeepSeek, Grok, and the proprietary Qwen3 Max.
More interesting than the ranking? Every agent lost money. GTrader's ability to minimize losses while competing against 11 other agents in a live, high-stakes environment proved something fundamental: open-source LLMs can perform remarkably well when they are built for a specific use case.

The Setup: Why This Actually Mattered
Most AI benchmarks are theater. Standardized datasets, predetermined conditions, everything optimized for the leaderboard. You finish first on ImageNet and nobody knows if you can actually see.
The Alpha Arena was different. Nof1.ai and Recall.Network ran a live trading competition with real money. 12 participants total—a mix of foundation models and community-built agents. Each deployed capital across six volatile assets: BTC, ETH, SOL, BNB, XRP, DOGE. The competition ran from October 27th through October 31st, with a Boost Window from October 26th–29th.
The infrastructure was clever too. Recall's skill markets didn't just rank agents—they let the community stake RECALL tokens behind the ones they believed in. It created a real economic signal. If an agent was actually good, people would fund it. If it was all hype, the market would know.
By the end, 264 positions had been opened across all agents. $6,736,970 in total volume moved. Average equity across participants sat at $4,472—meaning most agents had bled significant capital.
The brutal reality: Every agent lost money. The question wasn't who made money. It was who lost the least.
The Agent: Built for Speed, Not Elegance
Here's what GTrader wasn't: it wasn't a reasoning engine. No multi-step chains. No token counting. No elaborate prompts designed to trick the model into thinking harder.
Here's what it was: fast, focused, and purpose-built.
The Architecture
AI Inference: Gaia running open-source models fully locally on Harish's own machine (https://gaianet.ai)
- Qwen3-30B-A3B-Q5_K_M (quantized for efficiency)
- No API rate limits
- Local inference = latency advantage over cloud-based competitors
Crypto Pricing: CoinGecko API (coingecko.com)
- Fast, reliable data source
- No API friction
Trading Execution: Hyperliquid API (app.hyperliquid.xyz)
- Direct market access
- Autonomous position management
The model choice was deliberate. Qwen3-30B is solid but not flashy. It's the kind of model that works, not the kind that makes headlines. But in a live market, "works" beats "impressive" every single time.
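To make the pipeline concrete, here is a minimal sketch of how those three layers could fit together: pull a spot price from CoinGecko, ask the local model for a decision, and hand the result to an execution layer. It assumes the Gaia node exposes its usual OpenAI-compatible chat endpoint on localhost; the port, prompt, and one-word decision format are illustrative placeholders, not GTrader's actual logic, and the Hyperliquid order placement is deliberately left as a stub.

```python
import requests

# Assumed local Gaia endpoint; adjust host/port for your own node.
GAIA_CHAT_URL = "http://localhost:8080/v1/chat/completions"
COINGECKO_URL = "https://api.coingecko.com/api/v3/simple/price"

def fetch_price(coin_id: str) -> float:
    """Pull a spot price in USD from CoinGecko's public simple/price endpoint."""
    resp = requests.get(
        COINGECKO_URL,
        params={"ids": coin_id, "vs_currencies": "usd"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()[coin_id]["usd"]

def ask_model(prompt: str) -> str:
    """Send one chat completion request to the local Gaia node."""
    payload = {
        "model": "Qwen3-30B-A3B-Q5_K_M",  # model named in the article; id is a placeholder
        "messages": [{"role": "user", "content": prompt}],
    }
    resp = requests.post(GAIA_CHAT_URL, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def decide(coin_id: str, symbol: str) -> str:
    """Turn a price reading into a one-word trading decision."""
    price = fetch_price(coin_id)
    prompt = (
        f"{symbol} is trading at ${price:,.4f}. "
        "Reply with exactly one word: LONG, SHORT, or HOLD."
    )
    return ask_model(prompt).strip().upper()

if __name__ == "__main__":
    print("DOGE decision:", decide("dogecoin", "DOGE"))
    # Order placement would go through the Hyperliquid API here;
    # it is intentionally left as a stub in this sketch.
```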
A Multi-Layer Build
Harish's personal insight: "I've personally noticed that when you combine multiple platforms to build something complex like this, you actually end up getting a good prototype."
The agent was built using:
- Google AI Studio – for building the structure of the agent as a mono repo
- Qwen3-Coder – for fixing issues with TAAPI API and Hyperliquid API
- DeepSeek – for updating the strategy and minor bug fixes when placing orders on Hyperliquid
This multi-platform approach wasn't a constraint. It was a feature. Each tool was optimized for what it did best, then integrated into a cohesive system.
This is the kind of decision that separates builders from theorists. Theorists optimize for elegance. Builders optimize for what actually matters: in this case, latency, signal quality, and execution speed.
What Happened: The Good, the Bad, and the Churning
The Numbers
Losing money isn't usually something to celebrate, but context matters. Every agent lost money. GTrader's achievement wasn't turning a profit; it was minimizing losses while competing against 11 other agents, including models from some of the world's largest AI labs.

| Rank | Agent/Model | Outcome |
|---|---|---|
| 🥇 | Bull vs Bear | Best performer |
| 🥈 | GTrader | 2nd place, least losses |
| 🥉 | cassh | 3rd place |
| 4 | Gemini Pro | Significant losses |
| 5 | Claude | Significant losses |
| 6 | Qwen3 Max | Significant losses |
| — | GPT-5 | Losses |
| — | DeepSeek | Losses |
| — | Grok | Losses |
| — | 5 other agents | Losses |
Where It Broke: The Vulnerability Pattern
Harish's critical self-assessment: "GTrader was capable of generating small, frequent wins but is highly vulnerable to large, infrequent losses that wipe out all progress."
This is the signature pattern of the agent's performance:
ETH and SOL – Catastrophic Drawdowns
- "The positions in SOL and ETH have led to catastrophic drawdowns"
- The sentiment model didn't adapt to volatility regime changes
- When volatility spiked, sentiment became noise
- GTrader kept trading anyway
XRP – The Churn Problem
- "The focus on high-frequency trading with XRP isn't yielding results"
- 150+ trades. Almost no net gain
- This is what traders call "churning"—the agent was making decisions constantly but those decisions weren't profitable
- High frequency, low conviction (one way to guard against both failure modes is sketched below)
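GTrader's actual risk logic isn't public, but the failure modes Harish describes map to a well-known style of fix: stop opening positions when short-term volatility detaches from its recent baseline, and cap trade frequency so churn can't accumulate. The sketch below illustrates that idea; the window sizes and thresholds are arbitrary placeholders, not tuned values and not GTrader's implementation.

```python
import time
from collections import deque
from statistics import pstdev

class RegimeGuard:
    """Illustrative guard: block new positions when short-term realized
    volatility spikes far above its longer baseline, and cap how many
    trades can be opened per hour to limit churn."""

    def __init__(self, window=48, spike_ratio=2.0, max_trades_per_hour=10):
        self.recent = deque(maxlen=window)        # short-term returns
        self.baseline = deque(maxlen=window * 4)  # longer-term returns
        self.spike_ratio = spike_ratio
        self.max_trades_per_hour = max_trades_per_hour
        self.trade_times = deque()                # timestamps of opened trades

    def update(self, pct_return: float) -> None:
        """Record the latest percentage return for both volatility windows."""
        self.recent.append(pct_return)
        self.baseline.append(pct_return)

    def calm_regime(self) -> bool:
        """True while short-term volatility stays near its longer baseline."""
        if len(self.recent) < self.recent.maxlen:
            return True  # not enough history yet; don't block
        short_vol = pstdev(self.recent)
        long_vol = pstdev(self.baseline) or 1e-9
        return short_vol < self.spike_ratio * long_vol

    def may_trade(self, now=None) -> bool:
        """Allow a new trade only in a calm regime and under the hourly cap."""
        now = time.time() if now is None else now
        while self.trade_times and now - self.trade_times[0] > 3600:
            self.trade_times.popleft()
        return self.calm_regime() and len(self.trade_times) < self.max_trades_per_hour

    def record_trade(self, now=None) -> None:
        self.trade_times.append(time.time() if now is None else now)
```

The specific numbers don't matter; the point is that the decision to trade at all gets gated by something other than the sentiment signal itself.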
Where It Actually Worked
DOGE was the star. Lower volatility, clearer sentiment trends, fewer regime breaks. GTrader's model was built for exactly this environment. Consistent profits. Steady signals. The kind of performance that makes you think the agent actually understands something.
Key Insights: Harish's Takeaways
1. Open-Source LLMs Excel in Specific Use-Cases
"Open-source LLMs can significantly perform well with specific use-cases (not jack of all trades)"
This is the core insight. GTrader didn't try to be everything. It was purpose-built for crypto sentiment trading. That focus—that constraint—is what allowed it to outperform generalist models from the world's largest AI labs.
Claude, GPT-5, Gemini Pro, DeepSeek, and Grok are all brilliant general-purpose models, but they're not built for trading. Dropping them into a live market with no fine-tuning was like asking a philosophy professor to day-trade.
GTrader, by contrast, was specialized. Constrained. Focused. That's why it came out ahead of them.
2. Multi-Platform Integration Beats Single-Stack Optimization
"When you combine multiple platforms to build something complex like this, you actually end up getting a good prototype"
Just as a developer in the real world often needs a peer or a senior engineer to help resolve an issue, with AI-generated code it can pay to ask another model for an alternative approach to a bug. Surprisingly often, that leads to a quicker fix.
This challenges the conventional wisdom that you should stick to one stack. Sometimes the best solution is the one that combines the best tools, even if they're from different ecosystems.
3. The Interesting Frontier: Multiple SLMs Competing
"It will be super interesting to try multiple Gaia Nodes (with different SLMs like Gemma3-4B, Llama 3.2 1B, Qwen3-4B, etc) compete against each other using the same prompts."
This is Harish's vision for what comes next. Not bigger models. Smaller, specialized models running in parallel. Each optimized for different aspects of the trading problem. Competing against each other using the same prompts.
This is the frontier of decentralized AI. Not centralized compute. Distributed intelligence.
The Infrastructure Layer: Why Gaia Mattered
GTrader ran on a single Gaia node. That's not a constraint—that's the entire point.
Gaia's decentralized compute network meant:
- No API rate limits – unlike agents built on Claude or Gemini, which depend on rate-limited cloud API calls
- Local inference – Latency advantage over cloud-based competitors
- Transparent costs – Every token accounted for
- Full control – Running open-source models on your own machine
In a 6-day live trading scenario, this added up. Speed compounds. Transparency builds trust. Efficiency scales.
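Because a Gaia node speaks the same OpenAI-compatible wire protocol as the hosted providers, pointing an agent at local inference instead of a cloud API is essentially a base-URL change. A minimal sketch, assuming a node listening on localhost; the port, dummy API key, and model name are illustrative, not GTrader's actual configuration:

```python
from openai import OpenAI

# Assumption: a local Gaia node serving an OpenAI-compatible API at this address.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-locally")

reply = client.chat.completions.create(
    model="Qwen3-30B-A3B-Q5_K_M",  # placeholder model id
    messages=[{
        "role": "user",
        "content": "BTC just dropped 3% in an hour. Reply with one word: LONG, SHORT, or HOLD.",
    }],
)
print(reply.choices[0].message.content)
```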
What's Next: Open-Sourcing GTrader
Harish's commitment: "I'm open-sourcing the repo next week and drop a 👋 in the comment below and I'll send the repo to you."
This is significant. GTrader isn't staying proprietary. The code, the prompts, the trading logic—all going open-source.
That means:
- Community audit of strategy and risk management
- Collaborative improvements to regime detection and loss management
- Real-time feedback from traders and engineers
- Redeployment in future Alpha Arenas with community-driven improvements
The Future: Multiple SLMs Competing
Harish's vision for the next iteration: "It will be super interesting to try multiple Gaia Nodes (with different SLMs like Gemma3-4B, Llama 3.2 1B, Qwen3-4B, etc) compete against each other using the same prompts."
This isn't about bigger models. It's about specialized models. Smaller language models (SLMs) like:
- Gemma3-4B
- Llama 3.2 1B
- Qwen3-4B
Each running on its own Gaia node. Each optimized for different aspects of the trading problem. Competing against each other using identical prompts and conditions.
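A rough sketch of what that experiment could look like: broadcast the same prompt to several local nodes, one per SLM, and compare the replies side by side. The endpoints, ports, and model names below are hypothetical placeholders, not a running deployment.

```python
import requests

# Hypothetical local endpoints, one Gaia node per small model.
NODES = {
    "Gemma3-4B":    "http://localhost:8081/v1/chat/completions",
    "Llama-3.2-1B": "http://localhost:8082/v1/chat/completions",
    "Qwen3-4B":     "http://localhost:8083/v1/chat/completions",
}

PROMPT = "DOGE is up 4% on rising volume. Reply with one word: LONG, SHORT, or HOLD."

def ask(model_name: str, url: str) -> str:
    """Send the identical prompt to one node and return its one-word decision."""
    payload = {
        "model": model_name,  # placeholder; each node serves one configured model
        "messages": [{"role": "user", "content": PROMPT}],
    }
    resp = requests.post(url, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"].strip().upper()

# Identical prompt, identical market context; only the underlying SLM differs.
votes = {name: ask(name, url) for name, url in NODES.items()}
print(votes)
```

Whether the replies get combined by majority vote, weighted by recent accuracy, or simply logged for comparison is an open design question; the appeal is that the whole experiment costs nothing but local compute.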
This is the frontier. Not centralized AI. Distributed intelligence. Not bigger models. Specialized models.
The Shift: What This Means for AI Development
GTrader proved something fundamental about the future of AI:
Specialized agents beat generalist models in specific domains. You don't need a 100B parameter model to trade crypto. You need a focused agent.
Open-source infrastructure can match centralized services. Gaia proved this. One node. Competitive results against labs with infinite compute.
Loss minimization in adversarial environments is a real achievement. When every participant loses money, finishing 2nd isn't luck. It's strategy.
Multi-platform development creates better prototypes. DeepSeek + Qwen3-Coder + Google AI Studio = a system better than any single platform alone.
The future isn't about bigger models. It's about better agents.
👉 GTrader Github: https://github.com/harishkotra/GTrader
Disclaimer
"This is not trading advice or a recommendation to do crypto trading. This is just a showcase of what I built."
GTrader demonstrates what's possible with AI agents in autonomous trading. It is not a recommendation to trade cryptocurrency. Crypto markets are highly volatile, risky, and unpredictable. GTrader itself lost money during the competition. Past performance does not indicate future results.
Competition Snapshot
| Metric | Value |
|---|---|
| Duration | October 27–31, 2025 |
| Boost Window | October 26–29 |
| Total Participants | 12 |
| Total Positions Opened | 264 |
| Total Trading Volume | $6,736,970 |
| Average Equity | $4,472 |
| Total Rewards | 40,000 RECALL tokens |
| Trading Pairs | BTC, ETH, SOL, BNB, XRP, DOGE |
| Key Result | Every agent lost money; GTrader had the least losses |
GTrader Technical Specs
| Component | Technology |
|---|---|
| AI Inference | Gaia (open-source) + Qwen3-30B-A3B-Q5_K_M |
| Crypto Pricing | CoinGecko API |
| Trading Execution | Hyperliquid API |
| Hosting | Railway |
| Development Stack | DeepSeek, Qwen3-Coder, Google AI Studio |
| Infrastructure | Single Gaia node (local inference) |
Build with Gaia
👉 Explore the network at www.gaianet.ai
👉 Contribute on GitHub
👉 Follow us on X @GaiaNet_AI