ChainETL
Extract blocks, transactions, logs, and token transfers from any EVM chain. Load into PostgreSQL or JSON Lines. Resumable syncs, reorg detection, and batch processing built in.
Supported chains: Ethereum, Base, Polygon, Arbitrum
Quick Start
pip install chainetl
cp .env.example .env # Set RPC URL + database
# Sync 10 Ethereum blocks to Postgres
chainetl sync --chain ethereum --start-block 18000000 --count 10
# Sync to JSON Lines (no database needed)
chainetl sync --chain polygon --start-block 50000000 --count 100 --destination jsonl
# Resume from checkpoint
chainetl sync --chain ethereum --resume --count 1000
# Check status
chainetl status --chain ethereum
Architecture
chainetl/
├── extractors/ # Chain-specific data extraction
│ ├── evm.py # Unified EVM extractor (shared by all chains)
│ ├── ethereum.py # Ethereum (thin wrapper)
│ ├── base_l2.py # Base L2 (thin wrapper)
│ ├── polygon.py # Polygon (thin wrapper)
│ └── arbitrum.py # Arbitrum (thin wrapper)
├── loaders/ # Output destinations
│ ├── postgres.py # PostgreSQL via SQLAlchemy
│ └── jsonl.py # JSON Lines file output
├── models/ # Pydantic data models
│ ├── block.py # Block (number, hash, timestamp, txs)
│ ├── transaction.py # Transaction (legacy + EIP-1559)
│ ├── log.py # Event logs
│ ├── token_transfer.py # ERC-20/721 transfers
│ └── checkpoint.py # Sync progress
├── utils/
│ ├── rpc.py # JSON-RPC client with retry
│ ├── retry.py # Exponential backoff
│ └── token_parser.py # ERC-20/721 event detection
└── cli.py # Typer CLI
How Extraction Works
ChainETL uses the standard Ethereum JSON-RPC interface (eth_getBlockByNumber, eth_getTransactionReceipt, etc.). Because the supported EVM chains expose the same core methods, a single EVMExtractor class handles every chain, and the per-chain extractors are thin wrappers around it.
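To make the shared interface concrete, here is a minimal sketch of the JSON-RPC 2.0 request bodies behind the two methods named above. The helper names (`rpc_payload`, `block_request`, `receipt_request`) are hypothetical and are not taken from ChainETL's `utils/rpc.py`; only the method names and the hex encoding of block numbers come from the Ethereum JSON-RPC convention.

```python
from itertools import count

# Hypothetical helpers for illustration; not ChainETL's actual RPC client.
_request_ids = count(1)  # JSON-RPC 2.0 echoes the id back in the response


def rpc_payload(method: str, params: list) -> dict:
    """Build a JSON-RPC 2.0 request body that any EVM node should accept."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }


def block_request(number: int, full_transactions: bool = True) -> dict:
    # Block numbers travel as 0x-prefixed hex quantities.
    # The boolean asks for full transaction objects instead of hashes only.
    return rpc_payload("eth_getBlockByNumber", [hex(number), full_transactions])


def receipt_request(tx_hash: str) -> dict:
    # Receipts carry the logs, gas used, and status for one transaction.
    return rpc_payload("eth_getTransactionReceipt", [tx_hash])
```

The same payloads can be posted to any of the RPC URLs in the configuration, which is why one extractor class can plausibly serve every chain.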
Extraction happens in layers:
- Block metadata (number, hash, timestamp, parent hash)
- Transaction data (from, to, value, gas, input data)
- Transaction receipts (logs, gas used, status)
- Token transfers (parsed from ERC-20/ERC-721 Transfer events)
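The last layer, token transfer parsing, can be sketched as follows. This is an illustration under stated assumptions, not the code in `utils/token_parser.py`: the function names and the returned dict shape are hypothetical. What is standard is the `Transfer(address,address,uint256)` event signature hash and the convention that ERC-20 puts the amount in `data` (three topics) while ERC-721 indexes the token ID (four topics).

```python
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), shared by ERC-20 and ERC-721.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def parse_transfer(log: dict) -> Optional[dict]:
    """Classify a raw log as an ERC-20 or ERC-721 transfer (hypothetical helper)."""
    topics = log.get("topics", [])
    if not topics or topics[0].lower() != TRANSFER_TOPIC:
        return None  # not a Transfer event

    if len(topics) == 3:
        # ERC-20: from and to are indexed, the amount sits in the data field.
        data = log.get("data", "0x")
        value = int(data, 16) if data not in ("", "0x") else 0
        return {
            "standard": "erc20",
            "from": topic_to_address(topics[1]),
            "to": topic_to_address(topics[2]),
            "value": value,
        }

    if len(topics) == 4:
        # ERC-721: the token ID is indexed as the fourth topic.
        return {
            "standard": "erc721",
            "from": topic_to_address(topics[1]),
            "to": topic_to_address(topics[2]),
            "token_id": int(topics[3], 16),
        }

    return None  # same signature hash but an unexpected topic layout
```

Counting topics is a heuristic: a contract that emits a non-standard Transfer event with the same signature would need separate handling.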
Python SDK
from chainetl.extractors.ethereum import EthereumExtractor
extractor = EthereumExtractor(rpc_url="https://eth.llamarpc.com")
# Single block
block = extractor.extract_block(18000000)
# Block with full data (transactions + logs + token transfers)
block, txs, logs, transfers = extractor.extract_block_with_full_data(18000000)
print(f"{len(txs)} transactions, {len(transfers)} token transfers")
Data Models
All data is modeled with Pydantic v2:
- Block — number, hash, parent_hash, timestamp, transaction hashes
- Transaction — full EIP-1559 support (maxFeePerGas, maxPriorityFeePerGas), contract creation (to=None), signature (v, r, s)
- Log — address, topics (indexed params), data (non-indexed), log_index
- TokenTransfer — ERC-20 (fungible) and ERC-721 (NFT) parsed from log topics
- Checkpoint — chain, last_synced_block, last_synced_hash, synced_at, status
All models have a from_rpc() classmethod that handles hex→int conversion from raw RPC responses.
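As a rough illustration of what a `from_rpc()` classmethod does, the sketch below converts a raw `eth_getBlockByNumber` result into typed fields. It uses a standard-library dataclass as a stand-in for the Pydantic v2 model so that it runs without dependencies; `BlockSketch` and `hex_to_int` are hypothetical names, and the real `Block` model may differ in fields and validation.

```python
from dataclasses import dataclass
from typing import List


def hex_to_int(value: str) -> int:
    """Convert a 0x-prefixed JSON-RPC quantity such as "0x112a880" to an int."""
    return int(value, 16)


@dataclass(frozen=True)
class BlockSketch:
    """Dataclass stand-in for the Pydantic Block model (illustrative only)."""

    number: int
    hash: str
    parent_hash: str
    timestamp: int
    transaction_hashes: List[str]

    @classmethod
    def from_rpc(cls, raw: dict) -> "BlockSketch":
        txs = raw.get("transactions", [])
        return cls(
            number=hex_to_int(raw["number"]),
            hash=raw["hash"],
            parent_hash=raw["parentHash"],  # RPC uses camelCase keys
            timestamp=hex_to_int(raw["timestamp"]),
            # Full-transaction responses carry dicts; hash-only responses carry strings.
            transaction_hashes=[t["hash"] if isinstance(t, dict) else t for t in txs],
        )
```

Keeping the conversion in one classmethod means loaders only ever see Python integers and never raw hex strings.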
CLI Reference
| Command | Description |
|---|---|
| chainetl sync --chain ethereum --start-block N --count N | Sync blocks to destination |
| chainetl sync --destination jsonl | Output to JSON Lines files |
| chainetl sync --resume --count 1000 | Resume from last checkpoint |
| chainetl status --chain ethereum | Show sync status + checkpoint |
| chainetl chains | List supported blockchains |
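How `--resume` and reorg detection interact is not spelled out here, so the following is only a sketch of one common approach, assuming the checkpoint fields listed under Data Models. The names `CheckpointSketch`, `next_range`, and `is_reorg` are hypothetical. The idea is that a resumed sync starts one block after the checkpoint, and that a mismatch between the next block's parent hash and the stored hash signals that the checkpointed block was reorganized away.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CheckpointSketch:
    """Subset of the Checkpoint fields needed for resuming (illustrative only)."""

    chain: str
    last_synced_block: int
    last_synced_hash: str


def next_range(checkpoint: CheckpointSketch, count: int) -> Tuple[int, int]:
    """Return the inclusive (start, end) block range for a resumed sync."""
    start = checkpoint.last_synced_block + 1
    return start, start + count - 1


def is_reorg(checkpoint: CheckpointSketch, next_parent_hash: str) -> bool:
    """True if the block after the checkpoint does not build on the stored hash."""
    return next_parent_hash.lower() != checkpoint.last_synced_hash.lower()
```

With a checkpoint at block 18000009, `next_range(checkpoint, 1000)` yields blocks 18000010 through 18001009. If `is_reorg` returns True, a sync would typically walk back until the stored and on-chain hashes agree before continuing.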
Configuration
# .env
ETHEREUM_RPC_URL=https://eth.llamarpc.com
BASE_RPC_URL=https://mainnet.base.org
POLYGON_RPC_URL=https://polygon-rpc.com
ARBITRUM_RPC_URL=https://arb1.arbitrum.io/rpc
DATABASE_URL=postgresql://localhost/chainetl_dev
LOG_LEVEL=INFO
Adding a New Chain
Any EVM-compatible chain can be added by touching three files:
# 1. extractors/optimism.py
from chainetl.extractors.evm import EVMExtractor
class OptimismExtractor(EVMExtractor):
    def __init__(self, rpc_url: str) -> None:
        super().__init__(rpc_url, chain="optimism")
# 2. Add to config.py:
optimism_rpc_url: HttpUrl = HttpUrl("https://mainnet.optimism.io")
# 3. Add to cli.py SUPPORTED_CHAINS:
"optimism": (OptimismExtractor, str(settings.optimism_rpc_url))