Realistic Backtesting: Accounting for Slippage, Commissions, and Liquidity

Backtesting is the bedrock of algorithmic strategy development. It promises a simulated glimpse into how a trading system would have performed historically. Yet, a vast chasm separates a promising backtest from a profitable live strategy. The primary culprit? Unrealistic assumptions. Ignoring the three pillars of trading friction—slippage, commissions, and liquidity—transforms a backtest from a predictive tool into a misleading mirage. This article dissects each friction point, providing the mathematical frameworks and practical methodologies required to build backtests that reflect the brutal realities of the market.

The Slippage Spectrum: Beyond a Fixed Percentage

Slippage is the difference between the expected price of a trade and the actual price at which it is executed. Novice backtests often apply a flat slippage deduction (e.g., half a tick). This is dangerously naive. Slippage is not static; it is a dynamic function of volatility, order size relative to market depth, and execution speed.

The Slippage Formula

A more robust model incorporates the concept of market impact. For a limit order book (LOB), slippage can be approximated by:
\[
S = \frac{\text{Order Volume}}{\text{Market Depth at Best Bid/Ask}} \times \frac{\text{Spread}}{2}
\]
Where:

  • Spread is the difference between bid and ask.
  • Market Depth is the volume available at the best price levels.

For example, if you attempt to buy 500 shares of a stock with a bid-ask spread of $0.10 and only 200 shares offered at the ask, your order will “walk the book.” The first 200 shares execute at the ask, but the remaining 300 shares execute at higher prices (the next ask levels). This creates size-dependent slippage: the larger the order relative to displayed depth, the worse the average fill. High-volatility days amplify the effect, as market makers widen spreads and thin out their quotes to compensate for risk.
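A minimal sketch of walking the book, assuming the ask side is available as a list of (price, size) levels sorted from best to worst. The function name `walk_book` and the example prices are hypothetical; only the 200-share depth and the 500-share order come from the example above.

```python
def walk_book(order_size, levels):
    """Fill a marketable buy against (price, size) ask levels, best price first.

    Returns (average_fill_price, filled_shares). The average is None when
    nothing fills.
    """
    remaining = order_size
    cost = 0.0
    for price, size in levels:
        if remaining <= 0:
            break
        take = min(remaining, size)  # cannot take more than the level displays
        cost += take * price
        remaining -= take
    filled = order_size - remaining
    avg_price = cost / filled if filled else None
    return avg_price, filled


# Hypothetical ask side: 200 shares at the best ask, more size one level up.
asks = [(20.10, 200), (20.15, 400)]
avg_price, filled = walk_book(500, asks)
slippage_per_share = avg_price - asks[0][0]  # cost relative to the quoted best ask
```

With these illustrative levels, 200 shares fill at 20.10 and 300 at 20.15, so the average fill sits about three cents above the quoted ask even though the order was "marketable at the ask".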

Historical Slippage Modeling

To backtest realistically, access historical tick data or trade and quote (TAQ) data. Calculate the average slippage for your strategy’s trade sizes during specific market regimes (e.g., pre-market, post-earnings, low-volume periods). Alternatively, use a slippage queue model: assume your order joins the back of the order book, and execution occurs only when prior orders are filled. This reveals that slippage can be zero on liquid, slow days but catastrophic during flash crashes or news events.
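The queue model can be sketched as follows, under two simplifying assumptions that are not in the text above: nothing ahead of you cancels, and you can observe the volume traded at your limit price after you join. The name `queue_fill` and its inputs are hypothetical.

```python
def queue_fill(queue_ahead, order_size, traded_volumes):
    """Simulate a limit order that joins the back of the queue at one price level.

    queue_ahead:    shares resting ahead of us when we join
    traded_volumes: volumes traded at our price, in time order, after we join
    Returns the number of our shares that fill.
    """
    ahead = queue_ahead
    filled = 0
    for volume in traded_volumes:
        # Prints first consume the orders that were ahead of us.
        consumed_ahead = min(ahead, volume)
        ahead -= consumed_ahead
        volume -= consumed_ahead
        # Whatever is left of this print can fill our order.
        take = min(volume, order_size - filled)
        filled += take
        if filled == order_size:
            break
    return filled


filled_quiet = queue_fill(500, 200, [300])            # queue never clears
filled_busy = queue_fill(500, 200, [300, 300, 100])   # queue clears, we fill
```

Ignoring cancellations makes this pessimistic about fill timing; a more detailed model would shrink `queue_ahead` as resting orders are pulled.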

Commission Structures: The Silent Wealth Eroder

Commissions are often treated as a linear cost, yet they vary wildly by asset class, broker, and order type. For a realistic backtest, you must simulate the exact fee schedule, including hidden costs.

Per-Share vs. Per-Trade Models

Retail brokers typically charge either a flat fee per trade or a per-share rate (e.g., $0.005 per share), while institutional desks charge a lower rate plus market access fees. Strategies that trade often in small sizes are disproportionately hurt by per-order minimums (e.g., $1.00 minimum per order): at $0.005 per share, any order under 200 shares pays the minimum rather than the per-share rate. Failing to account for this can make a strategy appear profitable when it actually loses money on small sizes.

The SEC Fee and Regulatory Costs

In the U.S., stock sales incur an SEC Section 31 fee. The rate is reset periodically, so check the current figure; this article uses $13.60 per $1,000,000 of principal for illustration. Options and futures have exchange and clearing fees. In a realistic backtest, these should be calculated as a percentage of notional value, not just trade count. For example, at that rate a $100,000 short sale incurs approximately $1.36 in SEC fees. While small, these accumulate.

Simulating Tiered Commissions

Use a piecewise function in your backtesting engine:

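def commission(volume, price):
    """Per-share commission with a per-order minimum, plus the SEC fee on notional."""
    notional = volume * price
    per_share_fee = volume * 0.005             # $0.005 per share
    sec_fee = notional * 0.0000136             # $13.60 per $1,000,000 of principal
    return max(1.00, per_share_fee) + sec_fee  # $1.00 minimum applies below 200 shares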

This ensures that small trades are disproportionately taxed, mirroring real broker behavior.

Liquidity: The Invisible Constraint

Liquidity determines whether your backtest’s hypothetical trades are executable. A strategy that looks fantastic on daily OHLC data may be impossible to implement if it relies on entering large positions in thinly traded assets.

The Liquidity Score

Quantify liquidity using the Amihud Illiquidity Ratio:
\[
\text{Illiquidity} = \frac{|\text{Daily Return}|}{\text{Dollar Volume}}
\]
A high ratio indicates that small trades cause large price moves. For backtesting, reject any trade signal that exceeds a certain percentage of the average daily volume (ADV). Common thresholds:

  • 1% of ADV: Safe for most mid-cap stocks.
  • 5% of ADV: Requires careful execution and likely significant slippage.
  • >10% of ADV: Implies market impact that will destroy the strategy’s edge.
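The ratio and the ADV filter translate into a few lines. This is a sketch: the function names and the 5% default cap are illustrative assumptions, and published versions of the Amihud measure average the daily ratio over many days rather than using a single day.

```python
def amihud_illiquidity(daily_return, dollar_volume):
    """Single-day illiquidity: absolute return per dollar traded."""
    if dollar_volume <= 0:
        raise ValueError("dollar_volume must be positive")
    return abs(daily_return) / dollar_volume


def passes_adv_filter(order_shares, adv_shares, max_participation=0.05):
    """True when the order is no larger than max_participation of ADV."""
    if adv_shares <= 0:
        raise ValueError("adv_shares must be positive")
    return abs(order_shares) / adv_shares <= max_participation


# Hypothetical inputs: a -2% day on $1,000,000 traded, and two order sizes
# checked against an ADV of 100,000 shares.
illiq = amihud_illiquidity(-0.02, 1_000_000)
small_ok = passes_adv_filter(1_000, 100_000)    # 1% of ADV
large_ok = passes_adv_filter(12_000, 100_000)   # 12% of ADV
```

In a backtest loop, a signal that fails the filter would be skipped or scaled down to the cap rather than filled in full.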

The Bid-Ask Bounce Effect

In illiquid markets, the bid-ask spread creates a false signal known as the bid-ask bounce. When you backtest using only closing prices, a stock might appear to post a significant return when the last trade has merely moved from the bid to the ask. To correct this, use mid-prices (the average of bid and ask) for signal calculations, then simulate the actual fill at the ask for buys and at the bid for sells. Filling at the touch already charges the half-spread, so do not subtract the spread a second time.

Volume Profile Constraints

Liquidity is not constant throughout the day. A strategy that trades in the first 30 minutes after the open faces very different liquidity from one that trades over the lunch hour. Use time-of-day volume profiles to weight slippage accordingly. For instance, a stock might have 30% of its daily volume in the first hour but only 5% in the lunch hour. Adjust your slippage multiplier inversely to volume.
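One way to turn a volume profile into a slippage multiplier is to compare each time bucket's share of daily volume with an even split across the session. The sketch below assumes a 6.5-hour session and hourly buckets; the function name and the even-split baseline are illustrative choices, and the two volume shares are the example figures from the paragraph above.

```python
def slippage_multiplier(volume_share, baseline_share):
    """Scale slippage inversely to a bucket's share of daily volume.

    A bucket trading exactly the baseline share gets a multiplier of 1.0.
    """
    if volume_share <= 0:
        raise ValueError("volume_share must be positive")
    return baseline_share / volume_share


HOURS_IN_SESSION = 6.5            # assumed regular-session length
baseline = 1 / HOURS_IN_SESSION   # share of volume per hour if volume were flat

profile = {"first_hour": 0.30, "lunch_hour": 0.05}
multipliers = {name: slippage_multiplier(share, baseline)
               for name, share in profile.items()}
```

Under these assumptions the lunch-hour multiplier is six times the first-hour multiplier, which is simply the ratio of the two volume shares.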

Integrating the Three Factors into a Unified Model

Realistic backtesting requires a simulator that processes all three inputs simultaneously. Here is a step-by-step integration framework:

Step 1: Define the Execution Algorithm

Assume your strategy generates a signal to buy 1,000 shares of XYZ at $50.00. The algorithm must simulate the order’s path through the LOB.

Step 2: Calculate Base Slippage

Using historical tick data, determine the spread and depth at the time of the signal. If the spread is $0.05 and depth at ask is 600 shares, your base slippage for the first 600 shares is zero (assuming you pay the ask). The remaining 400 shares will slip to the next price level, say $50.06. Weighted average fill price:
\[
\frac{600 \times 50.00 + 400 \times 50.06}{1000} = \$50.024
\]

Step 3: Add Commissions and Fees

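Apply your commission model:
\[
\text{Commission} = (1000 \times 0.005) + (1000 \times 50.024 \times 0.0000136) = \$5.00 + \$0.68 = \$5.68
\]
Total cost per share: $0.00568. Strictly, the Section 31 fee is levied only on sales; charging it on this buy as well is a conservative simplification.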

Step 4: Adjust for Liquidity Constraints

If the trade size (1,000 shares) exceeds 5% of the average 5-minute volume (e.g., 15,000 shares), apply a liquidity penalty multiplier to slippage. For our example, 1,000 / 15,000 = 6.67%, which exceeds 5%. Apply a multiplier of 1.5 to the market impact portion of slippage, which is the $0.024 by which the Step 2 fill exceeds the $50.00 ask: 0.024 × 1.5 = 0.036. The new weighted average fill becomes $50.036.
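The arithmetic in Steps 2 through 4 can be checked with a short script. All inputs are the example figures from the text; the variable names are illustrative, and the SEC fee is charged on the buy only because the worked example does so.

```python
# Inputs from the worked example
order_shares = 1_000
ask_price = 50.00
depth_at_ask = 600
next_level_price = 50.06
avg_5min_volume = 15_000

# Step 2: weighted average fill after walking to the next level
shares_at_next = order_shares - depth_at_ask
base_fill = (depth_at_ask * ask_price
             + shares_at_next * next_level_price) / order_shares

# Step 3: per-share commission plus SEC fee on notional
commission = order_shares * 0.005 + order_shares * base_fill * 0.0000136
cost_per_share = commission / order_shares

# Step 4: liquidity penalty on the market impact portion only
participation = order_shares / avg_5min_volume
impact = base_fill - ask_price
multiplier = 1.5 if participation > 0.05 else 1.0
adjusted_fill = ask_price + impact * multiplier

print(round(base_fill, 3), round(commission, 2), round(adjusted_fill, 3))
```

Keeping the impact portion separate from the quoted ask is what lets the penalty scale only the part of the cost caused by order size.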

Step 5: Account for Partial Fills

In illiquid markets, your entire order may not fill. Modify the backtest to only execute the filled portion. This can drastically reduce the strategy’s win rate, as unfilled positions often miss the largest moves.
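A minimal way to model this is to cap the executed quantity at the liquidity the simulator believes was available and carry only the filled portion into the P&L. The helper name `apply_partial_fill` and the notion of a single `available_shares` figure are assumptions for illustration.

```python
def apply_partial_fill(order_shares, available_shares):
    """Cap execution at available liquidity.

    order_shares is signed (positive buy, negative sell).
    Returns (filled, unfilled), both carrying the sign of the order.
    """
    sign = 1 if order_shares >= 0 else -1
    wanted = abs(order_shares)
    filled = min(wanted, max(available_shares, 0))
    return sign * filled, sign * (wanted - filled)


filled, unfilled = apply_partial_fill(1_000, 700)   # only 700 shares available
position_pnl = filled * (51.00 - 50.00)             # hypothetical $1.00 move
```

Here the backtest books the gain on 700 shares rather than 1,000; what happens to the unfilled remainder (cancel, or re-queue on the next bar) is a separate design choice.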

Data Frequency and Survivorship Bias

A backtest is only as good as its data. Survivorship bias occurs when using only stocks that exist today, excluding those that were delisted or went bankrupt. This inflates returns by removing failures. Use a comprehensive point-in-time dataset that includes delisted securities.

Tick-Level Data vs. Minute Bars

Slippage and liquidity are invisible on daily charts. For strategies holding positions for less than a day, use tick data or, at minimum, one-second bars. For longer-term strategies, minute bars with volume-weighted average price (VWAP) are acceptable, but you must still estimate intra-bar slippage. A common conservative method: assume buys fill at the bar's high and sells at its low, plus a spread component, rather than at the open or close.
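The conservative bar-fill rule can be written as a small helper. The text does not fix the size of the spread component, so using half the spread here is an assumption, as is the function name.

```python
def pessimistic_bar_fill(side, high, low, spread):
    """Worst-case fill inside a bar.

    Buys fill at the bar high and sells at the bar low, each made worse
    by half the spread (an assumed size for the spread component).
    """
    if side == "buy":
        return high + spread / 2
    if side == "sell":
        return low - spread / 2
    raise ValueError("side must be 'buy' or 'sell'")


# Hypothetical minute bar: high 50.10, low 49.90, spread 0.02
buy_fill = pessimistic_bar_fill("buy", 50.10, 49.90, 0.02)
sell_fill = pessimistic_bar_fill("sell", 50.10, 49.90, 0.02)
```

This bounds the damage from above: a strategy that survives worst-in-bar fills is unlikely to depend on lucky intra-bar timing.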

The Slippage-Volatility Feedback Loop

A critical nuance: slippage is not independent of volatility. A large market sell order can trigger stop-losses, creating a cascade of additional slippage. Advanced backtests model this using volume-synchronized probability of informed trading (VPIN) or order flow toxicity. When VPIN is high, liquidity providers withdraw, widening spreads and increasing slippage. Incorporate a VPIN indicator to dynamically adjust slippage during high-toxicity periods.

Practical Code Snippet for Realistic Backtest

Below is a Python pseudocode snippet integrating these factors:

def realistic_fill(signal_size, current_price, spread, bid_depth, ask_depth, avg_volume_5min):
    """Return (fill_price, total_fee) for a signed order.

    signal_size > 0 is a buy, < 0 a sell; current_price is the mid-price.
    avg_volume_5min is the average 5-minute share volume.
    """
    size = abs(signal_size)
    if size == 0:
        return current_price, 0.0

    # Liquidity penalty: applies when the order exceeds 5% of average 5-minute volume
    participation = size / avg_volume_5min
    liq_mult = 1.5 + participation * 0.2 if participation > 0.05 else 1.0

    # Slippage: half-spread scaled by the liquidity multiplier, plus a depth
    # penalty when the order is larger than the size displayed at the touch
    side = 1 if signal_size > 0 else -1
    depth = ask_depth if side > 0 else bid_depth
    if depth <= 0:
        raise ValueError("no displayed depth on the side being hit")
    slip = (spread / 2) * liq_mult
    if size > depth:
        shares_slipped = size - depth
        depth_penalty = (shares_slipped / depth) * 0.01  # 1% slip per depth unit
        slip += depth_penalty * current_price
    fill_price = current_price + side * slip  # buys pay up, sells receive less

    # Commissions: $0.005 per share with a $1.00 minimum, plus the SEC fee.
    # The SEC fee is levied on sales only; charging both sides is a
    # conservative simplification.
    dollar_vol = fill_price * size
    sec_fee = dollar_vol * 0.0000136
    per_share_fee = size * 0.005
    total_fee = max(1.00, per_share_fee) + sec_fee

    # Keep sub-penny precision: average fills rarely land on a whole cent
    return round(fill_price, 4), round(total_fee, 2)

Stress Testing the Friction Model

A backtest with slippage and commissions should be stress-tested under adverse scenarios. Run the simulation with 2x, 5x, and 10x the base slippage estimates. If the strategy’s Sharpe ratio drops from 2.0 to below 0.5 under 5x slippage, it is fragile. Also, test during periods of known illiquidity, such as the 2010 Flash Crash or the March 2020 COVID-19 selloff. A robust strategy will show reduced but still positive returns during these events.
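A slippage stress test can be as simple as re-scoring the same trades under scaled costs. The sketch below is an assumption-laden illustration: it works on a hypothetical list of per-trade gross returns, charges a constant slippage per trade, and reports an unannualized mean-over-standard-deviation ratio rather than a properly annualized Sharpe ratio.

```python
import statistics


def sharpe_like(returns):
    """Mean divided by sample standard deviation (not annualized)."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else float("nan")


def stress_slippage(gross_returns, base_slippage, multipliers=(1, 2, 5, 10)):
    """Re-score the same trades with slippage scaled by each multiplier."""
    results = {}
    for m in multipliers:
        net = [r - m * base_slippage for r in gross_returns]
        results[m] = sharpe_like(net)
    return results


# Hypothetical per-trade gross returns and a base slippage of 5 basis points
gross = [0.004, -0.002, 0.003, 0.001, -0.001, 0.005]
report = stress_slippage(gross, base_slippage=0.0005)
for m, score in report.items():
    print(f"{m}x slippage: {score:.2f}")
```

With these made-up returns the ratio turns negative by 5x slippage, which is the kind of fragility the test is meant to expose.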

The Hidden Cost of Order Routing

In real trading, not all orders reach the same exchange. Payment for order flow (PFOF) or dark pool routing can improve or worsen fill prices. Backtesting software rarely accounts for this. A conservative approach is to assume a marketable order fills one full spread away from the mid-market price: mid plus one spread for buys, mid minus one spread for sells. For limit orders, assume a 50% chance of being filled, with the fill happening at the limit price.

Macro-Level Liquidity Metrics

For portfolio-level backtests, aggregate liquidity across all positions. If a strategy holds 50 stocks, but the total notional value exceeds 15% of the market’s average daily liquidity, the strategy is uninvestable. Use the liquidity-weighted portfolio turnover metric: sum the absolute value of each trade’s notional divided by the respective stock’s ADV. A ratio above 0.1 indicates severe market impact.
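The turnover metric is a one-line sum. The sketch below assumes each trade is supplied as a pair of trade notional and the stock's ADV in dollars; the function name and the sample trades are hypothetical.

```python
def liquidity_weighted_turnover(trades):
    """Sum of |trade notional| / ADV notional across all trades.

    trades: iterable of (trade_notional, adv_notional) pairs in dollars.
    """
    total = 0.0
    for trade_notional, adv_notional in trades:
        if adv_notional <= 0:
            raise ValueError("adv_notional must be positive")
        total += abs(trade_notional) / adv_notional
    return total


# Hypothetical rebalance: a small buy in a liquid name, a large sell in a thin one
trades = [(50_000, 5_000_000), (-200_000, 2_000_000)]
ratio = liquidity_weighted_turnover(trades)
flagged = ratio > 0.1   # threshold from the text
```

Because each term is scaled by that stock's own ADV, a single oversized trade in a thin name can push the whole portfolio over the threshold.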

Avoiding Over-Optimization of Friction Parameters

It is tempting to tune slippage and commission parameters to make a backtest look better (e.g., setting slippage to 0.01 ticks). This is a form of overfitting. Instead, hold these parameters constant across all strategies in the same asset class. Only vary them for specific execution algorithms (e.g., VWAP vs. TWAP). Document all assumptions explicitly in the backtest report.

The Liquidity Constraint on Strategy Frequency

High-frequency strategies (holding periods under a minute) are most sensitive to liquidity. If your backtest shows a win rate >60% but uses 1-second bars without accounting for LOB depth, it is almost certainly overestimating returns. For such strategies, use Cumulative Volume Delta (CVD) to simulate how many contracts are actually traded at each price level. If the CVD at the ask is insufficient to cover your hypothetical order, the trade does not execute.

Exchange-Specific Nuances

Different exchanges have unique liquidity characteristics. Nasdaq stocks typically have tighter spreads but less depth than NYSE-listed stocks. Futures contracts like ES have deep liquidity but variable spreads at rollover dates. Options have wide bid-ask spreads with no central limit order book for some strikes. Backtest with an exchange-specific liquidity lookup table to apply correct depth and spread values.

The Role of Backtest Horizon

Friction costs compound over time. The fraction of frictionless equity that survives can be modeled with a compound decay formula: (1 - friction_per_trade)^number_of_trades. For example, 1,000 trades with 0.1% friction each reduce final equity by about 63% (0.999^1000 ≈ 0.368). A strategy that trades once a day for 10 years makes roughly 2,500 trades; at the same 0.1% per trade it keeps only about 8% of its frictionless final capital (0.999^2500 ≈ 0.082). This starkly illustrates why realistic friction modeling is non-negotiable.
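The decay formula is easy to evaluate for any trade count. This sketch assumes friction is charged as a fixed fraction of equity on every trade, which is what the formula implies; the function name is illustrative.

```python
def remaining_equity_fraction(friction_per_trade, number_of_trades):
    """Fraction of frictionless equity left after compounding per-trade friction."""
    return (1.0 - friction_per_trade) ** number_of_trades


for n in (100, 1_000):
    kept = remaining_equity_fraction(0.001, n)
    print(f"{n} trades at 0.1% friction: {kept:.3f} of frictionless equity kept")
```

At 0.1% per trade, about 90% of equity survives 100 trades but only about 37% survives 1,000, so the drag grows much faster than the trade count suggests.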
