I've had a project on the backburner for several years now: a small-time trading strategy based on an offhand notion my brother had back when he was working in fintech.
The notion: On a short enough time scale, price changes are just noise. That wasn't a real dip or spike just then. It'll surely revert back to the mean. It's all just volatility, you know?
Whenever you see the price fall, surely it's going to come back up in (very) short order. Specifically, we're aiming for 'a few minutes': neither cutthroat High-Frequency Trading, nor regular speculation. Rather, a kind of listless, slow HFT. Buy tiny minute-level dips, wait for them to rise, and then take your tiny bit of profit.
Theoretically, this only works if you are indeed a drop in the ocean, else your presence itself would be signal. The hypothesis in question here is that these changes are noise. You can only do this at a very small scale. But because you can only do this small-time, then maybe that means nobody else has bothered doing it?
That was where I started, at least.
Where I Started, At Least
My first attempt was what I described above. Minute-level mean reversion:
v1: Naive Mean Reversion
Code: traderbot (feature/integrate-strategy branch)
if not holding and price < moving_average * (1 - threshold):
    BUY     # dip below the mean: expect reversion
if holding and (price > take_profit or price < stoploss):
    SELL    # reverted (profit) or kept falling (bail)
if holding and hold_time > max_hold_time:
    SELL    # position went stale: exit and reset
Thesis: Prices always stay the same. If the price changed, that's noise. Dips will ~always revert.
The stoploss exists to protect us from the times when that's not the case. Additionally, if the price hasn't moved enough in either direction for long enough, we exit our position and reset. We should be holding for maybe 5-ish minutes, maybe more. It's a hyperparameter to tune.
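For concreteness, here's roughly what v1 looks like as runnable Python. This is a reconstruction, not the actual branch code; the parameter values and state handling are invented for illustration:

```python
# Hedged reconstruction of the v1 loop -- parameter values and state
# handling are invented, not the actual feature/integrate-strategy code.
def v1_step(price, ma, state, threshold=0.004, take_profit=0.003,
            stop_loss=0.005, max_hold_bars=5):
    if state["entry"] is None:
        if price < ma * (1 - threshold):          # dip below the mean
            state.update(entry=price, bars_held=0)
            return "BUY"
        return "HOLD"
    state["bars_held"] += 1
    reverted = price >= state["entry"] * (1 + take_profit)  # the noise reverted
    stopped = price <= state["entry"] * (1 - stop_loss)     # it wasn't noise
    stale = state["bars_held"] >= max_hold_bars             # waited too long
    if reverted or stopped or stale:
        state["entry"] = None
        return "SELL"
    return "HOLD"

# usage: state = {"entry": None, "bars_held": 0}, called once per minute bar
```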
In Practice: This strategy was terrible, atrocious even. It would essentially always hit stop-losses before any reversion, capturing the wrong side of volatility: it rode dips down and then sat out the rises. It was so bad, in fact, that I got it in my head that it was anti-signal.
But if it was anti-signal, maybe some contrary hypothesis would be signal?
v2: Momentum / Peak-Skip
Code: traderbot (master branch)
if price has risen from its recent low:
    BUY     # and ride the momentum
if price drops after gains:
    SELL    # and back off
Thesis: Ride spikes, skip dips.
In Sooth: Returns correlated strongly with the underlying. With the right hyperparameters, the strategy could maybe post smaller drawdowns than buy-and-hold, but it was easy to overfit, and in the best case it was still just delivering beta (that is, following the market) with extra transaction costs.
I kept tweaking this algorithm in search of better entry and exit signals, and wound up arriving at a trailing stoploss: after the price rises to a certain threshold, activate the trailing stop; if the price ever falls x percent from its highest high since entry, SELL. That proved more effective than cashing out at a predetermined take_profit, where the algorithm would simply sell once the price rose enough off its entry.
Given its effectiveness, I started thinking about doing the same thing as an entry signal. What if, after a sufficient dip while not holding, we waited for the price to rise enough off its lowest low (within a fixed timeframe), and bought in once it did? (Or, rather than requiring 'a sufficient dip' first, what if we omitted that criterion entirely and just waited for a rise off the lowest low?)
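Sketched as mirrored predicates (the percentages and names here are placeholders, not the repo's actual parameters):

```python
def trailing_stop_exit(price, highest_high, trail_pct=0.005):
    # exit once price has fallen trail_pct off the highest high seen
    # since entry (highest_high is updated bar by bar while holding)
    return price <= highest_high * (1 - trail_pct)

def bounce_entry(price, lowest_low, bounce_pct=0.005):
    # the mirror image: enter once price has risen bounce_pct off the
    # lowest low seen within the lookback window
    return price >= lowest_low * (1 + bounce_pct)
```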
This momentum-based entry didn't make a big difference in this algorithm, but it did seem to be possessed of some signal... especially as an indicator for mean-reversion.
v3: VWAP Mean-Overshoot
Code: mr-traderbot
if price < VWAP * (1 ± entry_threshold) and price has bounced off its low:
    BUY
if price > VWAP * (1 + overshoot) or price > entry * (1 + take_profit):
    SELL    # mean reversion achieved
Thesis: We return, hat in hand, to our original thesis. Buy below some anchor, sell when price overshoots back.
However, this time I returned with some more wisdom. I had learned many finance words and phrases at this point, and one of those phrases was 'Volume Weighted Average Price', which I thought had potential as a more sophisticated mean than a moving average over some arbitrary previous number of bars. At least, it had one fewer hyperparameter (no need to specify a previous number of bars), which made it k times easier to search setups for, where k is the number of previous-bar-lengths I was testing.
(Though my testing suite was still tooled to test both VWAP-based and Moving-average-based versions of the strategy).
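For reference, the anchor computations themselves are tiny. A minimal sketch assuming pandas minute bars; the column names are my choice:

```python
import pandas as pd

def vwap(bars: pd.DataFrame) -> pd.Series:
    # cumulative volume-weighted average price since session open;
    # note there's no lookback-length hyperparameter to tune
    typical = (bars["high"] + bars["low"] + bars["close"]) / 3
    return (typical * bars["volume"]).cumsum() / bars["volume"].cumsum()

def moving_average(bars: pd.DataFrame, n_bars: int) -> pd.Series:
    # the alternative anchor: a rolling mean whose window length,
    # n_bars, is exactly the hyperparameter VWAP lets you drop
    return bars["close"].rolling(n_bars).mean()
```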
In any case, it seemed like there was some serious potential here. In fact, I believe there may even be some kind of structural edge in how arbitrageurs push the price of inverse leveraged ETFs like SQQQ back toward the mean. The price seems to reliably overshoot the VWAP even when the macro trend of the day is ever more downward.
Truly I tell you: this thing looked great in backtests. I was mostly testing on coarse historical data for TQQQ and SQQQ (the triple-leveraged and inverse-triple-leveraged NASDAQ index symbols), mainly because they're very high volume and I wanted to remain a 'drop in the ocean'.
I searched for hyperparameter sets that seemed resilient even on data I hadn't used to find them (that is, out-of-sample data), and indeed I was getting robust 5-20% returns on both symbols with ~9% drawdowns. Really exciting stuff. Too good to be true, even.
And indeed it was too good to be true.
| Config | Year | No Slippage | 1 bps Slippage |
|---|---|---|---|
| SQQQ (compromise gate) | 2022 | +5.2% | -20.4% |
| SQQQ (compromise gate) | 2023 | +4.3% | -4.6% |
| TQQQ (drift-gated) | 2022 | +23.7% | -1.0% |
Full parameter sets for the curious
**SQQQ "Compromise Gate"**: Uses VWAP anchor with a rolling 30-min window after the first 2 hours. Entry requires price to be 0.7% below VWAP. Exits on 0.15% overshoot above VWAP, 0.35% take-profit, or -0.45% stop-loss. Max hold 20 minutes. Exponential backoff after stops (10 min base, 2 hour max). The "compromise gate" pauses trading for the rest of the hour if recent trades show >60% stop rate or >30% timeout rate. **TQQQ "Drift-Gated"**: Similar VWAP setup, but entry is allowed up to 0.2% *above* VWAP. Wider exits: exit on the lower between a 0.3% overshoot of VWAP or a 0.6% take-profit above entry, plus a -0.7% stop-loss. The "drift gate" blocks whole days of trading, specifically only allowing trading when the 5-day return is positive ≥1%. This was actually an innovation intended to help with slippage (just don't trade on grindy days), but it alas wasn't good enough to manage that. Both configs assume a $10k position size for testing purposes, and both force flat before market close.Indeed the numbers were too good to be true. I wasn't taking into account slippage, another word I'd recently learned. And when I did, it completely destroyed my edge -- Even 1 basis point of slippage per side. Even assuming I could exit on limit with no slippage and full fills if I hit my exit signal! I still had no success. I was just making too many transactions to make a profit.
Those greedy market makers were slurping up all my precious alpha. And in return, they were doing what? Only providing the liquidity I absolutely required, without which I wouldn't even be able to execute my strategy in the first place?
Anyways, faced with this failure, I gave up, and that was the end. All of these ideas were too reliant on making tons of transactions for very little benefit per transaction. Lackadaisical high-frequency trading is just untenable in a retail environment, was my conclusion, at least on the finance side of things.
However, this project wasn't necessarily about actually being profitable. The real profit is the ~~friends we made~~ lessons we learned along the way, isn't that right?
The Lessons We Learned Along the Way
As I was iterating on algorithms, I was also iterating on how the code was organized, and there was no small amount of iteration to be done. You see, when I first began, I wasn't sure there was anything real here worth pursuing. I knew little about finance and less about algorithmic trading, and while the premise, "prices always stay the same (on a short enough timescale)" sounded plausible enough, I didn't really think it was gonna turn into anything. As such, I started off just hacking things together with no regard for the future.
But the future kept coming.
Phase 1: Everything in One File
Originally, strategy logic lived directly in the backtester:
┌─────────────────────────────────────────┐
│ backtest.py │
│ │
│ for bar in historical_data: │
│ if price < threshold: ← hardcoded │
│ simulate_buy() │
│ │
└─────────────────────────────────────────┘
Problem (to say the least): This was an absolutely ridiculous way of doing things. The 'backtesting infrastructure' (that is, the core loop of the script) was responsible for everything. The strategy was hardcoded. If I remember correctly, even the data file to test on was hardcoded, and I'd have to change the filename within the script to try a different dataset. Even after breaking some of that out, it was all still intertwined. If responsibilities were separated, it was in a truly ugly way. An affront to God and man.
But that was fine. The goal was to see if there was anything worth pursuing. Which, ironically, there wasn't at first. Yet the initial mean-reversion strategy was so bad that I thought there might actually be something to its ability to lose money even in the best markets.
With everything hardcoded in, substantially changing the strategy meant a massive refactor anyways, so I basically ripped out the strategy code entirely as I went, resulting in:
Phase 2: Strategy Extraction
I pulled the strategy logic into its own class:
┌──────────────┐ ┌──────────────┐
│ Strategy │◄─────│ Backtester │
│ │ │ │
│ should_buy() │ │ (still │
│ should_sell()│ │ hardwired) │
└──────────────┘ └──────────────┘
Improvement: I mean, really, this is just obvious. These are two very different things, and with this split I was able to swap strategies without touching the backtester. At this time I also rigged up a way of testing suites of hyperparameters for coarse searching. Not complicated stuff: just a yaml that enumerates all the different hyperparams to try, plus infrastructure for running every possible combination of them through the backtesting suite, as sketched below.
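Something like the following, reconstructed from memory; the parameter names and values are placeholders rather than the real sweep file:

```python
import itertools
import yaml

# stand-in for the real sweep file: each hyperparameter maps to
# the list of values to try
SWEEP = yaml.safe_load("""
threshold:     [0.002, 0.004, 0.006]
take_profit:   [0.003, 0.005]
max_hold_time: [5, 10, 20]
""")

def every_combination(sweep: dict):
    # Cartesian product of all the per-parameter value lists
    keys = list(sweep)
    for values in itertools.product(*(sweep[k] for k in keys)):
        yield dict(zip(keys, values))

for params in every_combination(SWEEP):
    print(params)  # the real harness ran a backtest per combination
```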
Remaining problem: Nevertheless, the backtester still had a hardwired relationship with the strategy, and still owned some decision logic that was, in some sense, 'meta' to the actual trading strategy (e.g., risk-mitigation stuff like refusing to trade in low-volatility regimes). I never did put that stuff into the 'strategy', but the backtester certainly wasn't the right place for it either. The backtester was also responsible for things like tracking the position, which needed doing but wouldn't translate to live trading. That didn't belong in the strategy either, but it was what it was for the time being.
What was important was that these changes allowed more iteration and more features. I added trailing stops to the strategy logic, volatility regimes, and timeouts for slow-bleeding price descents that weren't steep enough to trigger the stoploss. There was some real improvement here, and the dopamine hit of seeing profitable backtests was intoxicating (though, as we've established, misleading).
Eventually I gave up on the momentum-based price-following algorithm, though, and with a new strategy came another round of introspection. The backtester was still one big kludge. Everything was mixed together in ways that made it hard to maintain, and harder still to rip out and replace the strategy code. And if I ever wanted to run this thing live, I'd have to reimplement half that logic separately and trust that it matched.
Plus, somewhere down the line I'd put in enough effort that I wanted to show people what I was wasting my time on, but that codebase? Sharing that setup would have been humiliating. So anyways, please don't look at it.
Instead, look at:
Phase 3: Full Separation (Agent + Backend)
Code: mr-traderbot (current architecture)
The final architecture:
┌──────────────────────────────────────────────────────┐
│ STRATEGY │
│ (Signal Generation) │
│ │
│ • Anchor calculation (VWAP, rolling VWAP, EMA) │
│ • Entry/exit signals (deviation, momentum) │
│ • Market filters (regime, volatility) │
│ • 'Tactical' state (recent exit tracking) │
│ • No I/O, no position tracking │
└──────────────────────┬───────────────────────────────┘
│ should_buy() / should_sell()
│ signals
▼
┌──────────────────────────────────────────────────────┐
│ AGENT │
│ (Orchestration Layer) │
│ │
│ • Main loop: bar → strategy → order → execute │
│ • Position tracking & P&L │
│ • Risk limits (backoff, daily loss, stop clusters) │
│ • Order construction from signals │
│ • Same code for backtest and live │
└──────────────────────┬───────────────────────────────┘
│ OrderIntent
│ (side, qty, limit_price)
▼
┌──────────────────────────────────────────────────────┐
│ EXECUTION BACKEND │
│ (Pluggable Interface) │
│ │
│ ┌────────────────────┐ ┌─────────────────────┐ │
│ │ BacktestBackend │ │ LiveAlpacaBackend │ │
│ │ • Historical bars │ │ • Polls 1-min bars │ │
│ │ • Simulated fills │ │ • Real orders (IEX) │ │
│ │ • Slippage model │ │ • Alpaca API │ │
│ └────────────────────┘ └─────────────────────┘ │
└──────────────────────────────────────────────────────┘
The main point: The Agent is identical for backtest and live. The only thing that changes is the backend. Indeed, the agent in some sense shouldn't even know whether it's in a backtest, performing paper trading, or handling real money.
The Strategy is responsible for signal generation: it takes in market data (minute bars) and tells you whether to enter or exit based on price action and market conditions. It maintains some state that I call 'tactical' to delude myself into thinking it's fine for that state to live in the strategy script. Specifically, it tracks recent exits to detect bad trading patterns. The philosophical notion is that it stays focused on what to trade, not how to trade it.
On the flip side, the Agent handles account-level concerns: position tracking, risk limits (exponential backoff after losses, daily loss limits), and order construction. I'm not stoked about that either though. I wonder if risk-management ought to be broken out of one or both components into its own thing that the agent talks to.
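As a concrete example of the kind of risk logic the agent owns: the exponential backoff amounts to a doubling cooldown after consecutive stop-outs. The base and cap below come from the configs earlier; the doubling factor itself is my assumption:

```python
def backoff_minutes(consecutive_stops: int, base: float = 10.0,
                    cap: float = 120.0) -> float:
    # cooldown after stopped-out trades: 10 min, 20 min, 40 min, ...
    # capped at 2 hours (base and cap per the configs above; the
    # doubling factor is assumed for illustration)
    if consecutive_stops == 0:
        return 0.0
    return min(base * 2 ** (consecutive_stops - 1), cap)
```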
Separate from strategic considerations (though intimately related to 'when to back off' signals, to be fair), the Agent handles processing price data and conveying order execution. It plugs into a Backend, gets price data (in the form of 1-min OHLCV bars) via stream_bars(), and makes trades with execute(order). Whether the backend is the backtester or a wrapper around the Alpaca API, it knows not and cares not.
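In code terms, the seam looks roughly like this. stream_bars(), execute(order), and OrderIntent are named above; the Bar type, its fields, and the run_agent loop are illustrative guesses, not the repo's actual definitions:

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Protocol

@dataclass
class Bar:          # illustrative 1-min OHLCV bar
    open: float
    high: float
    low: float
    close: float
    volume: int

@dataclass
class OrderIntent:
    side: str       # "buy" or "sell"
    qty: int
    limit_price: Optional[float] = None

class ExecutionBackend(Protocol):
    # the only surface the Agent ever sees; BacktestBackend and
    # LiveAlpacaBackend both satisfy it
    def stream_bars(self) -> Iterator[Bar]: ...
    def execute(self, order: OrderIntent) -> None: ...

def run_agent(backend: ExecutionBackend, strategy, qty: int = 1) -> None:
    # identical loop whether the bars come from a CSV or from Alpaca
    for bar in backend.stream_bars():
        if strategy.should_buy(bar):
            backend.execute(OrderIntent("buy", qty))
        elif strategy.should_sell(bar):
            backend.execute(OrderIntent("sell", qty))
```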
This is, of course, basically dependency injection applied to trading systems. I probably should have considered building it this way in the first place, but I like to think that at any given moment I was taking the path of least resistance. I had to accumulate enough technical debt to make the refactor feel necessary. I had to write the bad version to even know why the good version would be good. I didn't know a thing about what I was doing at the start of this.
Conclusions
- The verisimilitude of one's simulation is pretty important.
I overlooked slippage, and spent a lot of computer cycles searching for hyperparameters over barren spaces devoid of alpha. The strategy looked profitable until it faced down the cruel demands of the market makers, or at least my rough approximation of those demands. Even the quite optimistic 1 basis point per side destroyed the edge. Perhaps if I'd been simulating it all along, I'd have given up much sooner.
- Separation of concerns makes it easier to iterate.
I like to think I was taking the path of least resistance to start with, but frankly some thought about the architecture beforehand would have gone a long way, even if my lack of knowledge about fintech meant I'd need to add a bunch of risk controls later or something. With Strategy, Agent, and Backend separate, you can change one without breaking the others. Testing strategy logic doesn't require standing up execution infrastructure. Changing risk rules doesn't touch signal generation. Obviously. So anyways I might do that next time.
Though it's just so easy to create a single monolithic script. Maybe next lesson I'll learn it's bad to try to separate out responsibilities before you know what belongs where.
- What costs am I ignoring?
No, but seriously, I should have added slippage simulation earlier. Maybe I'd have had some success seeking out strategies that make fewer trades if I hadn't pigeonholed myself early on. But then again, maybe my torpid HFT strategy was doomed from the start. Either way, it would have behooved me to figure out what frictions I wasn't accounting for.
Compromises and Concessions
While I'm not unhappy with the architecture, many expedient decisions made along the way still echo through the codebase. Some were pragmatic tradeoffs, others would probably be worth fixing if I was planning to continue with this:
Data & Execution Fidelity:
- 1-Minute bars: I get no tick data, no quote data, no sub-minute signals. Intrabar price movements and partial fills aren't modeled. That was mainly because, frankly, that was the data I had available for backtesting. I pulled some CSVs and contorted my project around using them.
- IEX feed vs SIP: Live trading uses Alpaca's free IEX feed instead of the consolidated SIP tape (which costs money). Having already made the 1-minute bar concession, I claim that IEX is fine for this particular strategy. Everything is predicated around not needing high resolution. But of course the key thing I'm trading off for is not spending money in this bullet point.
- Polling vs streaming: The live agent polls for 1-minute bars on a 15-second interval rather than using websockets. This keeps the agent loop structurally identical to backtests (simple synchronous iteration) but is pretty questionable. A proper websocket implementation, however, would need async/await and queue management. Still would be worth doing if I was gonna keep working on this.
Architecture Decisions:
- State in strategy: The strategy maintains state on recent exits and some info for regime gates. I claim this is fine because these are part of signal generation (philosophically: "should I trade right now?"). As mentioned above, execution state (positions, P&L, backoff) lives in the agent. Philosophically, that question is "at the account level, can I afford to trade?", but honestly that's cope and the agent is definitely participating in strategy. I think what I'd probably want to do is break the risk management out into its own component, but maybe the best thing would be to wait until this setup starts causing trouble, as with the previous refactors. Hard to say.
- Simplified live execution: Live trades use market orders with assumed immediate fills. No partial-fill modeling, no realistic fee/slippage modeling in production. In the agent version, limit exits assume full fills if the price touches the limit (the idea being: if that optimistic version was profitable, then I could try simulating partial fills, but it wasn't, so I didn't).
- Fixed position sizing: Uses a fixed target_dollars parameter to size positions. No portfolio-aware sizing, no margin considerations, no position scaling based on volatility or recent performance. This was just expedient. Another thing I'd have considered attempting to build out if I got more promising results.
Tooling & Testing:
- Agent/backtester parity: The agent-based backtester doesn't actually implement all the features its predecessor had. The legacy backtester has more features (partial limit fills, exact EOD session slicing) that never got ported because, as mentioned above, I didn't think they were necessary given what I deemed insufficient results (mainly against slippage).
- Optuna parallelism: No built-in distributed tuning. To run Optuna studies in parallel, I would just manually launch multiple processes in different tmux windows, all pointing at the same Postgres database (see the sketch after this list). I only ever ran so many of these studies, so I never bothered rigging up any sort of parallel kickoff script.
- Restart handling: Live agent can rehydrate open positions on restart, but entry time is approximated to "now" rather than the actual entry timestamp. If I was doing more live or paper trading, I'd fix this, but as it is, it is what it is.
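The tmux trick works because Optuna's storage-backed studies coordinate through the database: any process that opens the same study name against the same Postgres instance just picks up new trials. A minimal sketch; the connection string, study name, and run_backtest objective are invented:

```python
import optuna

def objective(trial: optuna.Trial) -> float:
    # hypothetical objective: suggest hyperparams, backtest, return a score
    threshold = trial.suggest_float("threshold", 0.001, 0.01)
    return run_backtest(threshold=threshold)  # stand-in for the real harness

study = optuna.create_study(
    study_name="mr-traderbot-sweep",                    # shared by all processes
    storage="postgresql://user:pass@localhost/optuna",  # same DB in every tmux window
    direction="maximize",
    load_if_exists=True,                                # later processes join the study
)
study.optimize(objective, n_trials=100)
```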
Cost Modeling:
- Backtests vs live: Backtests can model slippage and commissions. Live paper trading assumes zero fees and no explicit slippage model. Live with real money also doesn't account for fees or slippage; it just believes in itself. (Or theoretically it does. I never did run live with real money.)
Anyways, some of these would definitely need addressing for serious production use, but as it stands, none of them is the reason the project is dead in the water (except perhaps data resolution, but that would mean a fundamentally different algorithm and testing infrastructure, so).
The Code
Repos are now public:
- traderbot: The momentum/peak-skip experiment (archived)
  - feature/integrate-strategy: Original naive mean reversion (stale branch)
  - master: Momentum strategy implementation
- mr-traderbot: The VWAP mean-overshoot experiment
  - Full Agent/Backend architecture
  - Bayesian parameter tuning, walk-forward validation
  - Live trading via Alpaca
Epilogue
So: the bot doesn't make money. Tragically, I have failed. The market makers took all the imaginary funds I risked across hundreds of thousands of backtests, and all I have to show for it is this project. Importantly, however, I learned a lot of new words and phrases, and I might even have learned what they mean. I also got some practical exposure to the joys of refactoring. And the project itself functions as intended, which is its own genre of success.
Moving forward, I'll give architecture some more (a lot more) thought earlier on. Part of the process here involved bootstrapping from roughly zero knowledge of trading, but there are still plenty of quite obvious things I could have saved a lot of time on with some initial thought. Even a novice's spec would have been better than what I had, which was no spec.
Aside from that, from a practical perspective on algorithmic trading specifically, one thing I might do next time is simply start running a strategy with a token amount of actual money. With stuff like this you can actually directly purchase data about your own code from the financial markets.