Data Rights in Backtesting: The Ghost in the Prop Trading Time Machine

Imagine every trader has a time machine — a sleek console powered not by steam or lightning, but by data.

This time machine lets them test strategies in the past before risking them in the present. It’s not guesswork; it’s backtesting. They can ask, “What if I shorted Tesla after every earnings day?” or “Would this volatility filter work during 2020’s COVID crash?”

But here’s the catch: the fuel for this time machine — historical market data — is not always theirs to use.

Welcome to the foggy world of data rights in backtesting, where prop trading intersects with law, ethics, and digital ownership.

The Invisible Library

Think of proprietary data as a vast private library. Some prop trading firms own detailed, granular datasets: tick-by-tick pricing, order books, private indicators, and alternative data feeds such as satellite imagery or credit card swipes.

They’ve paid millions to gather and clean this data. Now imagine a trader — freelance, remote, or even part of another platform — sneaking into this library at night, copying pages, and using them in a backtest to build a high-performing strategy.


The trader may see it as harmless research. But the librarian sees it differently: “You’re building your weapon with our blueprint.”

The Ethical Mirage

  • Backtesting is foundational.
  • But not all backtesting is fair play.

For instance:

  • A trader downloads proprietary data from a firm’s internal dashboard and uses it to qualify for funding at another prop trading firm.
  • A quant scrapes order-flow data from a private Discord server and embeds it in a historical signal model.

Each of these acts raises a crucial question: Who owns the past when it comes to price?

Legal Shadows and Bright Lines

While copyright laws often apply to creative works, financial data lives in a gray zone.

  • Raw price data (like OHLC) from public exchanges is often treated as factual information that copyright does not cover, although exchanges usually attach license terms to their feeds. The aggregation, formatting, or enrichment of that data can be protected.
  • Alternative data, such as foot traffic, mobile usage, or sentiment signals, is usually licensed, not owned. That means traders must follow strict use agreements.
  • Even generated datasets created by machine learning models on top of real data can have usage restrictions.

And many prop trading firms now add digital watermarks or unique timestamp variations to catch misuse of their backtest data.
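To make the timestamp idea concrete, here is a minimal Python sketch of how per-recipient timestamp variations might work. It is an illustration under stated assumptions, not any firm's actual method: the function names (`watermark`, `identify`), the secret key, the recipient labels, and the 0 to 3 microsecond offset range are all hypothetical. Each recipient gets a copy whose timestamps are nudged by tiny offsets derived from a keyed hash, so a leaked copy can be matched back to the recipient whose offsets it carries.

```python
import hashlib
import hmac


def _offsets(secret: bytes, recipient: str, n: int, max_us: int = 3) -> list[int]:
    """Derive n small, repeatable microsecond offsets for one recipient."""
    offsets = []
    for i in range(n):
        # Keyed hash of (recipient, row index): repeatable for the data owner,
        # unpredictable to anyone without the secret.
        digest = hmac.new(secret, f"{recipient}:{i}".encode(), hashlib.sha256).digest()
        offsets.append(digest[0] % (max_us + 1))
    return offsets


def watermark(timestamps_us: list[int], secret: bytes, recipient: str) -> list[int]:
    """Return a copy of the timestamps with this recipient's offsets added."""
    offs = _offsets(secret, recipient, len(timestamps_us))
    return [t + o for t, o in zip(timestamps_us, offs)]


def identify(original_us: list[int], leaked_us: list[int],
             secret: bytes, recipients: list[str]) -> tuple[str, dict[str, int]]:
    """Score each known recipient by how many offsets match the leaked copy."""
    diffs = [leaked - orig for leaked, orig in zip(leaked_us, original_us)]
    scores = {}
    for r in recipients:
        expected = _offsets(secret, r, len(diffs))
        scores[r] = sum(d == e for d, e in zip(diffs, expected))
    best = max(scores, key=scores.get)
    return best, scores
```

In use, the data owner would call `watermark(original, secret, "trader_a")` before handing out a copy, then later run `identify(original, suspect_copy, secret, known_recipients)` on a suspicious dataset. A real scheme would also need to survive rounding, resampling, and partial copies, which this sketch does not attempt.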

The Moral Compass for Prop Traders

Ethically, it’s not just about legality. It’s about integrity.

Backtesting on stolen or unauthorized data is like cheating on a flight simulator exam before you get your wings. You may pass, but when it’s time to fly with real capital, you’re misrepresenting your skills.

Prop trading platforms, especially those funding remote talent, are starting to audit backtest provenance — asking not just how you backtested, but on what.

Solutions: Clearing the Data Air

  1. Transparent Data Agreements
    Some prop firms now issue formal policies such as “Use only approved datasets for backtesting before evaluation,” which set clear boundaries.
  2. Sandbox Environments
    Some platforms offer a secure backtest zone — preloaded with licensed data — where traders can build with confidence and compliance.
  3. Data Provenance Logs
    Blockchain-inspired logs can verify what data was used, when, and by whom — a growing trend in advanced prop trading ecosystems.
  4. Open-Source Data Advocacy
    A movement is rising to democratize historical data. Sources like FRED, Yahoo Finance data, and community tick archives support ethical testing without shady scraping, as long as their own terms of use are followed.
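The provenance-log idea from item 3 can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not a description of any platform's system: the `ProvenanceLog` class, its field names, and the all-zero genesis hash are assumptions made for the example. Each entry records who used which dataset and when, stores a SHA-256 fingerprint of the data, and includes the hash of the previous entry, so editing an old record breaks every link after it.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def fingerprint(data: bytes) -> str:
    """SHA-256 fingerprint of the raw dataset bytes."""
    return hashlib.sha256(data).hexdigest()


def _hash_record(record: dict) -> str:
    """Hash a record deterministically by serializing with sorted keys."""
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


@dataclass
class ProvenanceLog:
    entries: list = field(default_factory=list)

    def append(self, user: str, dataset_name: str,
               dataset_bytes: bytes, timestamp: str) -> dict:
        """Add an entry chained to the hash of the previous entry."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS_HASH
        record = {
            "user": user,
            "dataset": dataset_name,
            "dataset_sha256": fingerprint(dataset_bytes),
            "timestamp": timestamp,
            "prev_hash": prev_hash,
        }
        # The entry hash covers every field above, including prev_hash.
        record["entry_hash"] = _hash_record(record)
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every hash and check that each link matches."""
        prev = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != prev or _hash_record(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True
```

An auditor could call `log.verify()` to confirm that no entry was altered after the fact, and compare `dataset_sha256` against the fingerprints of approved datasets. An in-memory list only detects tampering by someone who cannot rewrite the whole chain; a production design would need the log stored or countersigned somewhere the trader does not control.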

Fair Time Machines

Using borrowed data without permission is like rewriting history to win the future: it breaks the spell.

To preserve trust in the model, prop traders must not just build great strategies — they must build them on clean foundations.

So the next time you step into your backtest time machine, ask:
“Whose map am I using?”
Because in prop trading, even the past must be earned.
