Building an Arbitrage Bot: Finding Arbitrage Opportunities

In the fast-evolving world of decentralized finance (DeFi), automated trading strategies like arbitrage have become essential tools for maximizing returns. One of the most effective ways to capitalize on market inefficiencies is by building an arbitrage bot capable of identifying and exploiting price discrepancies across decentralized exchanges. This article walks you through the core process of detecting arbitrage opportunities between liquidity pools that trade the same token pairs—focusing on ETH-based pairs for simplicity and efficiency.

We’ll cover token pair selection, derive the mathematical model for optimal trade sizing, and implement a practical algorithm to surface profitable opportunities. By the end, you'll understand how to pre-select viable trading pairs and compute potential profits using real on-chain data.

Selecting Token Pairs for Arbitrage

Defining the Arbitrage Strategy Scope

Before scanning for opportunities, it's crucial to define the operational boundaries of your arbitrage bot. The safest and most straightforward strategy involves ETH-centric arbitrage, where both legs of a trade involve Ethereum (or its wrapped version, WETH). Since gas fees on Ethereum are paid in ETH, concluding trades with ETH ensures you maintain liquidity for transaction costs.

However, this widely adopted approach also means increased competition—popular ETH-based arbitrage routes are often saturated, reducing profitability over time. Still, for beginners, focusing on ETH pairs provides a stable foundation due to deeper liquidity and fewer risks associated with volatile or low-cap tokens.

For this implementation:

We only consider token pairs involving WETH.
We limit ourselves to direct two-pool arbitrage (i.e., no multi-hop routes).
We ignore opportunities requiring more than two swaps to reduce complexity and execution risk.

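While future enhancements could include stablecoin inventory management or statistical arbitrage on illiquid "shitcoins," this guide sticks to atomic arbitrage within well-established pools: each trade either completes in full or reverts, so the capital at risk is limited to the gas spent on a reverted transaction.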

Filtering Eligible Token Pairs

To identify eligible pairs, we start by fetching all liquidity pools from major DEX factory contracts (e.g., Uniswap V2). Each factory emits a PairCreated event when a pool is deployed, so we read those event logs to extract the deployed pairs and keep the ones containing WETH (0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2).

Next, we group pools by their token pair. Only pairs listed on two or more distinct pools are retained—since arbitrage requires at least two price sources to compare.

Here’s a simplified version of the filtering logic:

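from collections import defaultdict

WETH = "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2"

# pairDataList: pool records decoded from the factory event logs,
# each with at least 'token0' and 'token1' addresses.
pair_pool_dict = defaultdict(list)

for pool in pairDataList:
    # Compare addresses case-insensitively: logs may not use checksum casing.
    token0, token1 = pool['token0'].lower(), pool['token1'].lower()
    if WETH.lower() not in (token0, token1):
        continue
    # Sort so that (A, B) and (B, A) map to the same key.
    pair = tuple(sorted([token0, token1]))
    pair_pool_dict[pair].append(pool)

# Keep only pairs with multiple pools
eligible_pairs = {k: v for k, v in pair_pool_dict.items() if len(v) >= 2}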

At the time of analysis, this process yielded:

1,431 unique ETH-based pairs
3,081 total pools
The most-traded pair: WETH/USDT, with 16 active pools

This volume of data is manageable—reserves for all pools can be fetched in under a second using public RPC endpoints.
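As a rough illustration of what fetching reserves involves, the sketch below builds the raw eth_call payload for a Uniswap V2 pair's getReserves() function and decodes the three 32-byte words it returns. The helper names and the pool address are hypothetical, and actually sending the payload to an RPC endpoint is left out; where an endpoint supports JSON-RPC batching, many such payloads can be sent as one list.

```python
# 4-byte selector of getReserves() on Uniswap V2 pair contracts.
GET_RESERVES_SELECTOR = "0x0902f1ac"

def build_get_reserves_call(pool_address, request_id=1):
    """Build a JSON-RPC eth_call payload (hypothetical helper name)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": pool_address, "data": GET_RESERVES_SELECTOR}, "latest"],
    }

def decode_reserves(result_hex):
    """Decode the ABI-encoded (reserve0, reserve1, blockTimestampLast) tuple."""
    hex_body = result_hex[2:] if result_hex.startswith("0x") else result_hex
    raw = bytes.fromhex(hex_body)
    if len(raw) < 96:
        raise ValueError("expected three 32-byte words")
    reserve0 = int.from_bytes(raw[0:32], "big")
    reserve1 = int.from_bytes(raw[32:64], "big")
    timestamp = int.from_bytes(raw[64:96], "big")
    return reserve0, reserve1, timestamp

# Placeholder address, for illustration only.
payload = build_get_reserves_call("0x0000000000000000000000000000000000000001")

# A made-up response: reserves of 1000 and 2000 base units.
fake_result = "0x" + format(1000, "064x") + format(2000, "064x") + format(1700000000, "064x")
print(decode_reserves(fake_result))
```

Reserves come back in each token's smallest unit, so tokens with different decimals need no special handling as long as both pools of a pair are compared in the same units.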

Detecting Profitable Arbitrage Opportunities

Understanding Price Discrepancies

An arbitrage opportunity arises when two pools trading the same token pair display different prices. However, not every discrepancy is exploitable. Factors like pool reserves (liquidity depth) and transaction gas costs determine whether a trade will be profitable after fees.

Our goal is to calculate the maximum net profit achievable from a two-swap sequence:

Buy Token Y with ETH in Pool A (where Token Y is cheaper in ETH terms)
Sell Token Y for ETH in Pool B (where Token Y is more expensive in ETH terms)

The challenge lies in determining the optimal input size—the amount of ETH that maximizes profit before gas expenses.

Mathematical Model for Optimal Trade Size

Automated market makers (AMMs) like Uniswap V2 use the constant product formula: x * y = k. When a swap occurs, reserves shift, altering the effective price. This non-linear behavior means larger trades erode potential gains due to slippage.

Let:

(a1, b1) = reserves of Pool A (ETH, Token)
(a2, b2) = reserves of Pool B (ETH, Token)
fee = 0.003 (0.3% swap fee)

The output of a single swap is given by:

swap_output(x, a, b) = b * (1 - a / (a + x * (1 - fee)))

For two consecutive swaps (A → B), gross profit as a function of input x becomes:

profit(x) = swap_output(swap_output(x, a1, b1), b2, a2) - x
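The two formulas above translate almost directly into Python. The following is a minimal sketch; the reserve figures are made up for illustration, and all amounts are treated as floats rather than the integer base units used on-chain.

```python
FEE = 0.003  # 0.3% swap fee, as defined above

def swap_output(x, a, b, fee=FEE):
    """Amount of the asset with reserve b received for x of the asset with reserve a."""
    return b * (1 - a / (a + x * (1 - fee)))

def profit(x, reserves1, reserves2, fee=FEE):
    """Gross ETH profit of buying tokens in pool 1 and selling them in pool 2."""
    a1, b1 = reserves1  # (ETH, token) reserves of the pool we buy from
    a2, b2 = reserves2  # (ETH, token) reserves of the pool we sell into
    tokens = swap_output(x, a1, b1, fee)         # ETH -> token in pool 1
    eth_back = swap_output(tokens, b2, a2, fee)  # token -> ETH in pool 2
    return eth_back - x

# Made-up reserves: the token is cheaper in pool_a than in pool_b.
pool_a = (100.0, 200_000.0)
pool_b = (100.0, 190_000.0)

print(profit(0.5, pool_a, pool_b))  # buying in A, selling in B
print(profit(0.5, pool_b, pool_a))  # the reverse direction loses money
```

Note that the reserves are passed in swapped order for the second leg, because the input asset there is the token rather than ETH.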

Using calculus, we find the value of x that maximizes profit by solving d(profit)/dx = 0 (equivalently, the point where one extra unit of ETH input returns exactly one unit of ETH output). The solution yields the optimal trade size:

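import math

def optimal_trade_size(reserves1, reserves2, fee=0.003):
    """Return the ETH input that maximizes gross profit for the route A -> B.

    reserves1 = (a1, b1) and reserves2 = (a2, b2) are (ETH, token) reserves.
    A result <= 0 means there is no profitable trade in this direction.
    """
    a1, b1 = reserves1
    a2, b2 = reserves2
    numerator = math.sqrt(a1 * b1 * a2 * b2 * (1 - fee)**4 * (b1 * (1 - fee) + b2)**2)
    numerator -= a1 * b2 * (1 - fee) * (b1 * (1 - fee) + b2)
    denominator = ((1 - fee) * (b1 * (1 - fee) + b2)) ** 2
    return numerator / denominator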

This formula allows us to precisely compute the best input amount for any given pool configuration.
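For readers who want to check the closed form, here is one way to derive it, writing \gamma = 1 - fee. This is a reconstruction from the definitions above rather than the original derivation, and it arrives at a simplified expression that is algebraically equal to what the function returns.

```latex
% ETH returned by the two consecutive swaps, as a function of the input x
\mathrm{out}(x) = \frac{\gamma^{2} a_2 b_1 \, x}{a_1 b_2 + \gamma (\gamma b_1 + b_2)\, x}

% First-order condition for profit(x) = out(x) - x
\frac{d\,\mathrm{out}}{dx}
  = \frac{\gamma^{2} a_1 b_1 a_2 b_2}{\bigl(a_1 b_2 + \gamma (\gamma b_1 + b_2)\, x\bigr)^{2}} = 1

% Solving for x gives the optimal input
x^{*} = \frac{\gamma \sqrt{a_1 b_1 a_2 b_2} - a_1 b_2}{\gamma (\gamma b_1 + b_2)}
```

The optimum is positive only when \gamma^{2} a_2 b_1 > a_1 b_2, that is, when the token's ETH price in Pool A (a_1 / b_1) is lower than in Pool B (a_2 / b_2) by more than the two swap fees. Because out(x) is concave in x, this stationary point is a maximum.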

Implementing the Arbitrage Scanner

With the mathematical foundation in place, we now scan all eligible pairs and pool combinations.

Step-by-Step Execution Flow

Fetch reserves for all 3,081 eligible pools.
For each token pair:
- Reorder reserves so WETH is always first.
- Compare every pool combination (A vs B).
- Skip invalid cases (zero reserves or identical pools).
Compute optimal input and gross profit using the derived formulas.
Store all opportunities in a list.
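The steps above could be sketched roughly as follows. This is an illustrative version, not the exact scanner used for the figures in this article: the input structure (`pools_by_pair`, with `address` and `reserves` keys) and the sample pools are assumptions, reserves are assumed to be already reordered as (WETH, token), and `optimal_trade_size` uses an algebraically simplified form of the closed-form optimum.

```python
import math
from itertools import permutations

FEE = 0.003  # 0.3% swap fee, as in the model above

def swap_output(x, a, b, fee=FEE):
    """Constant-product output for input x against reserves (a in, b out)."""
    return b * (1 - a / (a + x * (1 - fee)))

def optimal_trade_size(reserves1, reserves2, fee=FEE):
    """Simplified closed-form optimum; a value <= 0 means no profitable trade."""
    a1, b1 = reserves1
    a2, b2 = reserves2
    r = 1 - fee
    return (r * math.sqrt(a1 * b1 * a2 * b2) - a1 * b2) / (r * (r * b1 + b2))

def find_opportunities(pools_by_pair):
    """Scan every ordered pool combination of every pair (hypothetical input shape)."""
    opportunities = []
    for pair, pools in pools_by_pair.items():
        # permutations yields each ordered (A, B) of two distinct pools,
        # so a pool is never compared with itself.
        for pool_a, pool_b in permutations(pools, 2):
            (a1, b1), (a2, b2) = pool_a["reserves"], pool_b["reserves"]
            if min(a1, b1, a2, b2) <= 0:
                continue  # skip empty pools
            x = optimal_trade_size((a1, b1), (a2, b2))
            if x <= 0:
                continue  # no profitable trade in this direction
            tokens = swap_output(x, a1, b1)
            gross = swap_output(tokens, b2, a2) - x
            opportunities.append({
                "pair": pair,
                "buy_pool": pool_a["address"],
                "sell_pool": pool_b["address"],
                "input": x,
                "profit": gross,
            })
    return opportunities

# Made-up sample data: one pair listed on two pools with different prices.
sample = {
    ("WETH", "TOKEN"): [
        {"address": "0xPoolA", "reserves": (100.0, 200_000.0)},
        {"address": "0xPoolB", "reserves": (100.0, 190_000.0)},
    ]
}
for opp in find_opportunities(sample):
    print(opp)
```

Checking both orderings of each pool combination matters because the optimal input is only positive in the direction where the token is bought in the cheaper pool.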

After processing, we identified 1,791 potential arbitrage routes.

Estimating Net Profitability

Gross profit isn't enough—we must subtract gas costs. A basic estimate assumes:

Each swap consumes ~43k gas
Two swaps + 21k base transaction cost = ~107k gas total

Using current gas prices:

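# w3 is assumed to be a connected web3.py Web3 instance.
gas_price = w3.eth.gas_price  # current gas price in wei
for opp in opportunities:
    # Convert the gas cost from wei to ETH before subtracting it from the gross profit.
    opp['net_profit'] = opp['profit'] - (107000 * gas_price / 1e18)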
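To make the unit conversion concrete, here is a small self-contained sketch of the same calculation without a live node. The function names, the 30 gwei gas price, and the 0.01 ETH gross profit are assumptions chosen for illustration.

```python
GAS_PER_ARBITRAGE = 107_000  # two swaps (~43k each) plus the 21k base cost
WEI_PER_ETH = 10**18

def gas_cost_eth(gas_price_wei, gas_units=GAS_PER_ARBITRAGE):
    """Cost of the transaction in ETH for a gas price given in wei."""
    return gas_units * gas_price_wei / WEI_PER_ETH

def net_profit_eth(gross_profit_eth, gas_price_wei, gas_units=GAS_PER_ARBITRAGE):
    """Gross profit minus the estimated gas cost, both in ETH."""
    return gross_profit_eth - gas_cost_eth(gas_price_wei, gas_units)

assumed_gas_price = 30 * 10**9  # 30 gwei, an assumed value
print(gas_cost_eth(assumed_gas_price))          # about 0.00321 ETH
print(net_profit_eth(0.01, assumed_gas_price))  # about 0.00679 ETH
```

At that assumed gas price, any opportunity with a gross profit below roughly 0.0032 ETH would be unprofitable, which is why most of the raw candidates drop out.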

Sorting by net profit reveals only 57 initially positive opportunities. However, many involve toxic tokens—malicious ERC-20 contracts designed to trap traders by restricting sells or enabling rug pulls.

After manual filtering:

42 legitimate opportunities remain
Average net profit: ~0.004 ETH (~$7.60 at $1,900/ETH)
Input sizes range from 0.008 to 0.27 ETH

These values represent best-case scenarios; actual gas costs vary based on contract complexity and network congestion.

Frequently Asked Questions

How do I detect toxic tokens in liquidity pools?

Toxic tokens often manipulate balance tracking or restrict transfers. To detect them:

Check for functions like isBlacklisted, sellTax, or maxSellAmount in the contract.
Use tools that check token behavior against known malicious patterns.
Avoid tokens with extremely low trading volume or a single dominant holder.
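The first check can be partly automated when a token's ABI is available. The sketch below flags ABI functions whose names match the red flags listed above; the function name and the sample ABI are hypothetical. It is only a heuristic: a malicious contract can hide the same logic behind innocuous names, and unverified contracts publish no ABI at all.

```python
# Red-flag function names from the list above, compared case-insensitively.
SUSPICIOUS_NAMES = {"isblacklisted", "selltax", "maxsellamount"}

def flag_suspicious_functions(abi):
    """Return the names of ABI functions that match a known red-flag name."""
    return sorted(
        entry["name"]
        for entry in abi
        if entry.get("type") == "function"
        and entry.get("name", "").lower() in SUSPICIOUS_NAMES
    )

# Made-up ABI fragment for illustration.
sample_abi = [
    {"type": "function", "name": "transfer"},
    {"type": "function", "name": "isBlacklisted"},
    {"type": "event", "name": "Transfer"},
    {"type": "function", "name": "maxSellAmount"},
]
print(flag_suspicious_functions(sample_abi))
```

A stronger test is to simulate a buy followed by a sell of the token and compare the received amounts with what the pool math predicts.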

Why focus only on two-pool arbitrage?

Two-pool arbitrage executed in a single transaction is atomic: either both swaps succeed or the whole transaction reverts, so the only capital at risk is the gas spent. Multi-hop routes add complexity, slippage risk, and failure probability. Starting simple ensures reliability before scaling.

Can I run this bot profitably on mainnet?

Possibly—but competition is fierce. Most low-hanging opportunities are claimed within milliseconds by specialized bots. To succeed, you need:

Low-latency infrastructure
Access to private RPCs or MEV relays
Advanced filtering to avoid front-running

What’s next after finding an opportunity?

Next steps involve:

Simulating transaction execution via eth_call
Building a flash loan-capable smart contract
Submitting transactions via MEV bundlers or RPCs

How accurate is the gas cost estimation?

The 107k gas estimate is a lower bound. Real-world usage may exceed this due to:

Complex token logic
Contract storage updates
Variability in pool implementations

Precise cost estimation requires simulating the full transaction path.

Should I include stablecoin pairs?

Yes, but with caution. Stablecoin pairs like USDC/DAI often exhibit small but frequent mispricings. However:

Profits are typically smaller
You need to hold stablecoin inventory
A stablecoin may depeg during execution

They’re excellent for diversification once your core ETH strategy is stable.

By combining rigorous mathematical modeling with efficient data processing, this framework lays the groundwork for a functional DeFi arbitrage bot. While raw profitability may seem limited after filtering and gas costs, optimization through better infrastructure and expanded strategies can unlock significant gains. In the next article, we'll build the smart contract that executes these trades—turning theory into action.