ChainETL

Extract blocks, transactions, logs, and token transfers from any EVM chain. Load into PostgreSQL or JSON Lines. Resumable syncs, reorg detection, and batch processing built in.

Supported chains: Ethereum, Base, Polygon, Arbitrum

Quick Start

pip install chainetl
cp .env.example .env  # Set RPC URL + database

# Sync 10 Ethereum blocks to Postgres
chainetl sync --chain ethereum --start-block 18000000 --count 10

# Sync to JSON Lines (no database needed)
chainetl sync --chain polygon --start-block 50000000 --count 100 --destination jsonl

# Resume from checkpoint
chainetl sync --chain ethereum --resume --count 1000

# Check status
chainetl status --chain ethereum

Architecture

chainetl/
├── extractors/           # Chain-specific data extraction
│   ├── evm.py            # Unified EVM extractor (shared by all chains)
│   ├── ethereum.py       # Ethereum (thin wrapper)
│   ├── base_l2.py        # Base L2 (thin wrapper)
│   ├── polygon.py        # Polygon (thin wrapper)
│   └── arbitrum.py       # Arbitrum (thin wrapper)
├── loaders/              # Output destinations
│   ├── postgres.py       # PostgreSQL via SQLAlchemy
│   └── jsonl.py          # JSON Lines file output
├── models/               # Pydantic data models
│   ├── block.py          # Block (number, hash, timestamp, txs)
│   ├── transaction.py    # Transaction (legacy + EIP-1559)
│   ├── log.py            # Event logs
│   ├── token_transfer.py # ERC-20/721 transfers
│   └── checkpoint.py     # Sync progress
├── utils/
│   ├── rpc.py            # JSON-RPC client with retry
│   ├── retry.py          # Exponential backoff
│   └── token_parser.py   # ERC-20/721 event detection
└── cli.py                # Typer CLI
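
The utils/retry.py entry above names exponential backoff as the retry strategy. A minimal sketch of that pattern follows; the function name, parameters, and defaults (retry_with_backoff, max_attempts, base_delay) are hypothetical and not taken from the ChainETL source.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Waits grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

The sleep argument is injectable so the wait schedule can be checked in tests without actually pausing.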

How Extraction Works

ChainETL uses the standard Ethereum JSON-RPC interface (eth_getBlockByNumber, eth_getTransactionReceipt, etc.). Because the supported EVM chains expose the same JSON-RPC methods, a single EVMExtractor class handles every chain, and the per-chain extractors are thin wrappers around it.
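
To make the interface concrete, here is a sketch of the request bodies those two methods expect. The method names and parameter shapes follow the Ethereum JSON-RPC spec; the rpc_request helper and the placeholder transaction hash are illustrative assumptions, not ChainETL code.

```python
import json
from itertools import count

_ids = count(1)  # each request gets a fresh id so responses can be matched


def rpc_request(method: str, params: list) -> dict:
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# Block numbers are sent as hex quantities; True asks for full transaction
# objects instead of just transaction hashes.
block_req = rpc_request("eth_getBlockByNumber", [hex(18000000), True])

# Receipts are looked up by transaction hash (placeholder value here).
placeholder_tx_hash = "0x" + "ab" * 32
receipt_req = rpc_request("eth_getTransactionReceipt", [placeholder_tx_hash])

print(json.dumps(block_req))
print(json.dumps(receipt_req))
```

The same payloads work against any of the configured RPC URLs, which is what lets one extractor class serve several chains.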

Extraction happens in layers:

  1. Block metadata (number, hash, timestamp, parent hash)
  2. Transaction data (from, to, value, gas, input data)
  3. Transaction receipts (logs, gas used, status)
  4. Token transfers (parsed from ERC-20/ERC-721 Transfer events)
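
Layer 4 can be sketched as follows. ERC-20 and ERC-721 share the same Transfer event signature, so a common heuristic is to tell them apart by the number of topics: ERC-20 indexes only from and to (3 topics, amount in data), while ERC-721 also indexes tokenId (4 topics). The function names and the returned dict shape are assumptions for illustration; ChainETL's token_parser.py may work differently.

```python
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), shared by ERC-20 and ERC-721
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def hex_to_int(value: str) -> int:
    """Decode a 0x-prefixed hex string; an empty payload ("0x") decodes to 0."""
    return int(value, 16) if value not in ("", "0x") else 0


def parse_transfer(log: dict) -> Optional[dict]:
    """Return a transfer record for a Transfer log, or None for any other log."""
    topics = log.get("topics", [])
    if not topics or topics[0].lower() != TRANSFER_TOPIC:
        return None
    if len(topics) == 3:
        # ERC-20: from and to are indexed, the amount sits in data
        return {
            "standard": "ERC-20",
            "from": topic_to_address(topics[1]),
            "to": topic_to_address(topics[2]),
            "value": hex_to_int(log["data"]),
        }
    if len(topics) == 4:
        # ERC-721: tokenId is the third indexed parameter
        return {
            "standard": "ERC-721",
            "from": topic_to_address(topics[1]),
            "to": topic_to_address(topics[2]),
            "token_id": hex_to_int(topics[3]),
        }
    return None
```

Logs with any other topic count are skipped rather than guessed at, since some contracts emit Transfer events with non-standard indexing.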

Python SDK

from chainetl.extractors.ethereum import EthereumExtractor

extractor = EthereumExtractor(rpc_url="https://eth.llamarpc.com")

# Single block
block = extractor.extract_block(18000000)

# Block with full data (transactions + logs + token transfers)
block, txs, logs, transfers = extractor.extract_block_with_full_data(18000000)
print(f"{len(txs)} transactions, {len(transfers)} token transfers")

Data Models

All data is modeled with Pydantic v2:

  • Block — number, hash, parent_hash, timestamp, transaction hashes
  • Transaction — full EIP-1559 support (maxFeePerGas, maxPriorityFeePerGas), contract creation (to=None), signature (v, r, s)
  • Log — address, topics (indexed params), data (non-indexed), log_index
  • TokenTransfer — ERC-20 (fungible) and ERC-721 (NFT) parsed from log topics
  • Checkpoint — chain, last_synced_block, last_synced_hash, synced_at, status
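
The Checkpoint's last_synced_hash and the Block's parent_hash are enough to detect a reorg. This README does not describe ChainETL's actual detection logic, so the following is only a sketch of one common approach, using plain dataclasses as stand-ins for the Pydantic models and a hypothetical is_reorg helper.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    last_synced_block: int
    last_synced_hash: str


@dataclass
class Block:
    number: int
    hash: str
    parent_hash: str


def is_reorg(checkpoint: Checkpoint, next_block: Block) -> bool:
    """True when next_block does not build on the checkpointed block."""
    if next_block.number != checkpoint.last_synced_block + 1:
        raise ValueError("next_block must directly follow the checkpoint")
    # If the chain was reorganized, the block we stored is no longer the parent.
    return next_block.parent_hash.lower() != checkpoint.last_synced_hash.lower()
```

When the check returns True, a sync would typically walk back to the last block whose hash still matches the chain and resume from there.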

All models have a from_rpc() classmethod that handles hex→int conversion from raw RPC responses.
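
A sketch of what such a classmethod might look like for a block. It uses a stdlib dataclass instead of Pydantic to stay dependency-free, and the field name transaction_hashes is an assumption; the RPC keys (number, hash, parentHash, timestamp, transactions) are the ones returned by eth_getBlockByNumber.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    number: int
    hash: str
    parent_hash: str
    timestamp: int
    transaction_hashes: List[str]

    @classmethod
    def from_rpc(cls, raw: dict) -> "Block":
        """Build a Block from a raw eth_getBlockByNumber result."""
        txs = raw.get("transactions", [])
        return cls(
            # Quantities arrive as 0x-prefixed hex strings
            number=int(raw["number"], 16),
            hash=raw["hash"],
            parent_hash=raw["parentHash"],
            timestamp=int(raw["timestamp"], 16),
            # Entries are hashes, or full objects when the block was
            # requested with full transaction data
            transaction_hashes=[t if isinstance(t, str) else t["hash"] for t in txs],
        )
```

Hashes stay as hex strings; only numeric quantities are converted to int.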

CLI Reference

Command                                                     Description
chainetl sync --chain ethereum --start-block N --count N    Sync blocks to destination
chainetl sync --destination jsonl                           Output to JSON Lines files
chainetl sync --resume --count 1000                         Resume from last checkpoint
chainetl status --chain ethereum                            Show sync status + checkpoint
chainetl chains                                             List supported blockchains

Configuration

# .env
ETHEREUM_RPC_URL=https://eth.llamarpc.com
BASE_RPC_URL=https://mainnet.base.org
POLYGON_RPC_URL=https://polygon-rpc.com
ARBITRUM_RPC_URL=https://arb1.arbitrum.io/rpc
DATABASE_URL=postgresql://localhost/chainetl_dev
LOG_LEVEL=INFO
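
The variables above follow a <CHAIN>_RPC_URL naming pattern. As a rough illustration of how a chain name maps to its variable, here is a stdlib-only sketch; rpc_url_for is a hypothetical helper, and ChainETL's own config.py may load settings differently.

```python
import os
from typing import Mapping


def rpc_url_for(chain: str, env: Mapping[str, str] = os.environ) -> str:
    """Look up the RPC URL for a chain, e.g. "polygon" -> POLYGON_RPC_URL."""
    key = f"{chain.upper()}_RPC_URL"
    try:
        return env[key]
    except KeyError:
        raise KeyError(f"{key} is not set; add it to .env") from None
```

Passing env explicitly makes the lookup easy to exercise with a plain dict instead of the real process environment.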

Adding a New Chain

Any EVM-compatible chain can be added. There are three files to touch:

# 1. extractors/optimism.py
from chainetl.extractors.evm import EVMExtractor

class OptimismExtractor(EVMExtractor):
    def __init__(self, rpc_url: str) -> None:
        super().__init__(rpc_url, chain="optimism")

# 2. Add to config.py:
optimism_rpc_url: HttpUrl = HttpUrl("https://mainnet.optimism.io")

# 3. Add to cli.py SUPPORTED_CHAINS:
"optimism": (OptimismExtractor, str(settings.optimism_rpc_url))