Glossary
Archive Node
A specialized full node that stores every historical blockchain state at each block. Unlike standard full nodes, archive nodes enable deep historical queries — such as past balances, contract states, and storage roots. They’re essential for explorers, indexers, MEV searchers, and analytics platforms. Due to massive storage requirements (e.g., Ethereum > 25 TB), they’re typically deployed on bare metal or high-performance environments.
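Example (illustrative sketch, not a prescribed setup): the TypeScript below asks a node for an address’s balance at an old block over JSON-RPC. The endpoint URL, address, and block number are placeholders; a query this far back is the kind of request only an archive node can reliably serve, since standard full nodes keep only recent state.

```ts
// Minimal sketch: query a balance at a historical block over JSON-RPC.
// ARCHIVE_RPC_URL, the address, and the block number are placeholders.
const ARCHIVE_RPC_URL = "https://archive-node.example.com"; // hypothetical endpoint

async function balanceAtBlock(address: string, blockNumber: number): Promise<bigint> {
  const res = await fetch(ARCHIVE_RPC_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "eth_getBalance",
      params: [address, "0x" + blockNumber.toString(16)], // historical block tag
    }),
  });
  const { result } = await res.json();
  return BigInt(result); // balance in wei at that block
}

// e.g. the balance of an address at block 1,000,000 (placeholder values)
balanceAtBlock("0x0000000000000000000000000000000000000000", 1_000_000)
  .then((wei) => console.log(`balance at block 1,000,000: ${wei} wei`));
```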
Bare Metal
Dedicated physical server hardware with no virtualization layer. Bare metal gives full control over compute, storage, and networking, offering low latency, predictable performance, and strong reliability. It’s frequently used for validators, sequencers, RPC clusters, indexers, and MEV routers where every millisecond matters. Often combined with colocation for optimal network proximity.
Client
Software that connects to and interacts with a blockchain network.
- Node Clients (e.g., Geth) run the protocol, validate blocks, and maintain state.
- Application Clients (SDKs or libraries) let wallets, dApps, RPC endpoints, sequencers, and other systems communicate with nodes.
Node clients define how networks reach consensus; application clients define how external apps access blockchain state.
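Example (minimal sketch of the application-client side, assuming a node client listening on a local JSON-RPC port; the URL is a placeholder):

```ts
// A small library-style wrapper that talks JSON-RPC to a node client such as Geth.
const NODE_RPC_URL = "http://localhost:8545"; // hypothetical local node client

async function rpc<T>(method: string, params: unknown[] = []): Promise<T> {
  const res = await fetch(NODE_RPC_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ jsonrpc: "2.0", id: 1, method, params }),
  });
  const { result } = await res.json();
  return result as T;
}

// The node client maintains the chain and its state; the application client just asks for it.
const chainId = await rpc<string>("eth_chainId");
const head = await rpc<string>("eth_blockNumber");
console.log(`chain ${parseInt(chainId, 16)}, head block ${parseInt(head, 16)}`);
```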
Cluster
A group of interconnected nodes or servers that act as a single system to provide scalability, high availability, and performance. Clusters power RPC endpoints, validator sets, sequencers, and indexers. They enable horizontal scaling, automatic failover, geo-distribution, and centralized orchestration, forming the backbone of production-grade infrastructure.
Colocation (Co-Lo)
Colocation is the practice of placing your own physical servers inside professionally managed data centers near major internet exchange points, chain hubs, or low-latency peering locations. Instead of relying on virtualized cloud infrastructure, you control the hardware, network configuration, and placement, while the facility provides power, cooling, security, and connectivity.
This setup gives teams deterministic network performance, minimal latency to peers, reduced egress costs, and hardware-level optimization.
Example: Hosting a Geth validator in a Tokyo Equinix TY2 data center to sit closer to Ethereum peers and propagation routes.
Edge Computing
Edge computing is about where and how workloads run, not just where machines are located. It distributes software — such as RPC endpoints, indexers, caching layers, and sequencers — closer to users or data sources to reduce latency, improve redundancy, and optimize cost. It can run on co-lo hardware, cloud, or both.
Example: Deploying a caching RPC node in Singapore to serve Asia users while the main node remains in Tokyo.
Nuance: Co-lo = physical placement. Edge = workload distribution.
Egress
The process of transferring data out of a cloud environment, often to another provider or the public internet. In Web3, egress is a major cost factor and can introduce latency and jitter. Teams reduce these issues through private interconnects or colocation, gaining more predictable network performance and avoiding hyperscaler egress fees.
Full Node
A server that stores and validates the entire blockchain ledger from genesis. It maintains state, verifies transactions and blocks, and propagates data across the network. Full nodes provide the foundation for RPC services, explorers, sequencers, and DeFi apps, ensuring network security and data availability.
Geth (Go Ethereum)
Geth (Go Ethereum) is a software client and the most widely used implementation of Ethereum. It can run in full, light, or archive mode, exposing RPC and WebSocket APIs. Geth handles chain sync, transaction propagation, validation, and account management. It’s the reference implementation for Ethereum infra and is often deployed in containerized or bare metal setups.
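Example (rough sketch, assuming a Geth node with its WebSocket API enabled and a runtime that provides a global WebSocket, such as recent Node.js or a browser; the URL is a placeholder): a client can stream new block headers from Geth like this.

```ts
// Subscribe to new block headers over Geth's WebSocket API.
const ws = new WebSocket("ws://localhost:8546"); // hypothetical local Geth WS endpoint

ws.onopen = () => {
  // eth_subscribe with "newHeads" streams each new block header as it arrives.
  ws.send(JSON.stringify({ jsonrpc: "2.0", id: 1, method: "eth_subscribe", params: ["newHeads"] }));
};

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data as string);
  if (msg.method === "eth_subscription") {
    console.log("new block:", parseInt(msg.params.result.number, 16));
  }
};
```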
Hyperscaler
A hyperscaler is a large-scale cloud service provider that offers vast, elastic compute, storage, and networking resources globally. Examples include Amazon Web Services, Google Cloud, and Microsoft Azure. Hyperscalers provide flexibility and coverage but often come with trade-offs like unpredictable latency, noisy neighbors, and high egress costs — making them less ideal for deterministic Web3 infra compared to co-lo or edge setups.
Ingestion
The entry point of the blockchain data pipeline. Ingestion involves pulling raw data — transactions, logs, events, and state updates — from nodes into processing systems. Efficient ingestion enables real-time indexing, analytics, and trading, especially for latency-sensitive workloads like MEV searchers, oracles, or cross-chain routers.
Note: Cloud providers often make ingestion free or cheap to attract workloads; the real cost typically comes from egress.
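Example (minimal ingestion sketch, assuming a reachable node endpoint; the URL and block range are placeholders): pull raw event logs for a block range via JSON-RPC so a downstream system can process them. Real pipelines add batching, retries, and progress tracking.

```ts
const RPC_URL = "http://localhost:8545"; // hypothetical node endpoint

interface Log { address: string; topics: string[]; data: string; blockNumber: string }

async function ingestLogs(fromBlock: number, toBlock: number): Promise<Log[]> {
  const res = await fetch(RPC_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "eth_getLogs",
      params: [{ fromBlock: "0x" + fromBlock.toString(16), toBlock: "0x" + toBlock.toString(16) }],
    }),
  });
  const { result } = await res.json();
  return result as Log[];
}

// Pull a small range and hand it to whatever indexes or analyzes it next.
const logs = await ingestLogs(19_000_000, 19_000_010); // placeholder block range
console.log(`ingested ${logs.length} logs`);
```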
Indexing
The process of organizing raw blockchain data into structured, queryable formats. Indexing powers explorers, dashboards, and analytics platforms by enabling fast data retrieval without rescanning the entire chain. A strong indexing layer improves performance and makes blockchain data usable for applications like sequencers and intent routers.
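Example (simplified sketch): the snippet below takes raw logs, for instance from an ingestion step, and organizes them into an in-memory index keyed by contract address. Production indexers persist this into a database, but the idea of trading a one-time pass for fast lookups is the same.

```ts
interface Log { address: string; topics: string[]; data: string; blockNumber: string }

// Organize raw logs into a queryable index so lookups don't rescan the chain.
function buildIndex(logs: Log[]): Map<string, Log[]> {
  const byAddress = new Map<string, Log[]>();
  for (const log of logs) {
    const key = log.address.toLowerCase();
    const bucket = byAddress.get(key) ?? [];
    bucket.push(log);
    byAddress.set(key, bucket);
  }
  return byAddress;
}

// Querying the index is now a single map lookup per address (placeholder values).
const index = buildIndex([]); // would receive real logs from an ingestion step
console.log(index.get("0x0000000000000000000000000000000000000000") ?? "no logs indexed");
```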
Kubernetes (K8s)
Kubernetes is an open-source orchestration platform for managing containerized workloads. It’s widely used to run RPC clusters, validators, sequencers, and indexing services. Kubernetes automates rollouts, enables scaling, improves uptime, and standardizes infra operations across regions and chains.
Light Node (Light Client)
A lightweight client that stores only minimal blockchain data (headers and proofs), fetching additional data from full nodes as needed. It’s ideal for mobile devices and other low-resource environments, allowing fast sync and broad network participation without storing the entire chain.
Load Balancing
The process of distributing network traffic or workloads across multiple servers to avoid bottlenecks and ensure optimal performance. In Web3, load balancers improve RPC throughput, provide automatic failover, enable geo-routing, and help clusters scale reliably under high demand.
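Example (minimal client-side sketch of the idea, with three placeholder RPC endpoints): requests rotate round-robin and fail over to the next endpoint on error. Production setups usually run a dedicated load balancer in front of the cluster instead of doing this in application code.

```ts
const ENDPOINTS = [
  "https://rpc-1.example.com",
  "https://rpc-2.example.com",
  "https://rpc-3.example.com",
];
let next = 0;

async function balancedRpc<T>(method: string, params: unknown[] = []): Promise<T> {
  for (let attempt = 0; attempt < ENDPOINTS.length; attempt++) {
    const url = ENDPOINTS[next];
    next = (next + 1) % ENDPOINTS.length; // round-robin rotation
    try {
      const res = await fetch(url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ jsonrpc: "2.0", id: 1, method, params }),
      });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const { result } = await res.json();
      return result as T;
    } catch {
      // failover: try the next endpoint in the rotation
    }
  }
  throw new Error("all RPC endpoints failed");
}

console.log(await balancedRpc<string>("eth_blockNumber"));
```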
Managed Service
A fully operated infrastructure offering provided by a third party. Managed services handle provisioning, scaling, upgrades, and monitoring for workloads like RPC endpoints, validator nodes, sequencers, and indexers. This lets teams focus on product instead of infrastructure, often backed by SLAs for reliability.
RPC Node
A node that exposes RPC interfaces for applications to read blockchain data and broadcast transactions. RPC nodes power wallets, explorers, and protocols. They can run as full or archive nodes, and their latency, uptime, and capacity directly shape user experience.
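Example (illustrative sketch against a placeholder endpoint): the two things applications do against an RPC node, read state and broadcast transactions. The endpoint, address, and pre-signed transaction hex are placeholders; signing happens client-side and is outside this sketch.

```ts
const RPC_URL = "https://rpc.example.com"; // hypothetical RPC node

async function rpc<T>(method: string, params: unknown[]): Promise<T> {
  const res = await fetch(RPC_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ jsonrpc: "2.0", id: 1, method, params }),
  });
  const { result } = await res.json();
  return result as T;
}

// Read: current nonce for an address (placeholder address).
const nonce = await rpc<string>("eth_getTransactionCount", [
  "0x0000000000000000000000000000000000000000",
  "latest",
]);
console.log("nonce:", parseInt(nonce, 16));

// Write: broadcast an already-signed transaction (placeholder hex, shown commented out).
// const txHash = await rpc<string>("eth_sendRawTransaction", ["0x<signed transaction bytes>"]);
```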
Virtualization
A computing method that abstracts physical hardware into multiple virtual machines or containers, allowing multiple workloads to share the same physical server. Virtualization enables flexible scaling and efficient resource allocation but can add latency and reduce determinism compared to bare metal. It’s widely used in cloud environments to deploy validators, sequencers, and RPC services at scale.