Skip to main content

How StateMesh Works

StateMesh runs customer workloads in various geographic regions to make the network globally available to users worldwide. Each geographic region can have one or more clusters of nodes called availability zones. Currently, the following regions and zones are available or planned for the near future:

RegionAvailability ZoneStatus
Europeeu-central-1
Live
Europeeu-north-1
Planned
Singaporesg-1
Live
US Eastus-east-1
Planned
US Westus-west-1
Planned
Hong Konghk-1
Planned
Indiain-1
Planned
tip

We are always open to adding more regions and zones based on customer demand.

Architecture

The StateMesh network is comprised or Regions, Availability Zones, Nodes and the StateMesh Blockchain.

StateMesh Network Architecture
Figure: StateMesh Network Architecture

The entire system is an interplay of hardware and software components that work together to provide a reliable, secure and performant platform for running customer workloads. The network is designed to be highly available, fault-tolerant and scalable. It is built on top of proven technologies and best practices in the industry.

Regions

A region is a geographical that has one or more availability zones.

Regions and Availability Zones
Figure: Regions and Availability Zones

Regions maintain a global view of their network and are responsible for securing inter-zone communication or even provide inter-zone communication if necessary. Each region has its own mesh private network that is not accessible from the public internet. It's a VPN-like network optimized for low latency and high throughput, with ChaCha20 encryption with Poly1305 authentication, dynamic route assignments and MTU management and dynamic membership. The mesh network runs in kernel space to provide the best performance.

The mesh network introduces a latency compared to the public internet, it's impact is negligible for most workloads. Our benchmarks show that the extra latency is around 0.1ms and the throughput is around 50% of the public network bandwidth. Over a 10Gbps public network, the mesh network can provide around 5Gbps of throughput.

Availability Zones

Each availability zone is a cluster of nodes that are geographically close to each other. Each node is operated by a node operator, and node operators are responsible for running the node software and for maintaining the node's uptime.

While a technical solution is already in place to enable cross-zone communication at the region level, we could not find a valid use-case to enable it. We are open to enabling cross-zone communication if there is a valid use-case for it.

The main software components of an availability zone are:

  • Master Nodes: cluster services for crowd-sourcing compute resources, work distribution, and task execution
  • Metric & Billing services: services that collect metrics, compute billing information and handle payments
  • Ingress Routers: services that routes incoming requests to the correct node in the cluster, with policies for traffic shaping and rate limiting
  • Network Storage Services: services that aggregate storage from nodes and attaches them as block devices to customer workloads. Handles replication, backup, and recovery.
  • BYOS: Bring Your Own Storage services that allow customers to bring their own S3-compatible storage to the network. This includes advanced metadata caching and storage
  • Nodes: the actual compute nodes that run customer workloads and/or provide storage
Architecture of an Availability Zone
Figure: Architecture of an Availability Zone

Node Cluster

By far the most complex software component of an Availability Zone is the node cluster. At its core it uses Kubernetes to orchestrate customer workloads like containers, storage, networking and virtual machines, but is heavily customized for edge computing and to fit the unique needs of the StateMesh network.

This component handles the following tasks:

  • Task scheduling and execution: The node cluster is responsible for scheduling and executing customer workloads. It uses a custom scheduler that takes into account the Time Tower of the node operator, the customer's requirements, the node's capabilities and the current load of the node.
  • Node monitoring and health checks: The node cluster monitors the health of the nodes and the workloads running on them. It uses a custom health check system that is optimized for low latency and high throughput.
  • Intra-zone routing and networking: Handles IP assignments, ingress and routes intra-zone container and VM traffic. It uses eBPF to provide the best performance and minimal latency.
  • Load balancing and failover: Manages the load balancing and automatic failover of customer workloads for including HTTP, TCP and UDP traffic.
  • Workflow storage management: Manages the storage of customer workloads, including block storage, object storage and file storage. It uses a custom storage system that is optimized for low latency and high throughput.
  • Node operator management: Manages the node operators, their Time Towers, their rewards and their onboarding.
  • Security policies: Enforces security policies like customer workload isolation and access control lists.

The node cluster software also handles nodes coming online and going offline, and is able to quickly recover from node failures. If a node fails, the node cluster will automatically reschedule the workloads running on the failed node to other nodes in the cluster in a matter of seconds.

Ingress routing

Ingress routing services handle north-south traffic and routes it to the correct node and container in the cluster. It also handles rate limiting, traffic shaping and DDoS protection.

Each workload running on StateMesh can be exposed to the outside world using a generated public DNS address over a TLS connection (eg: https://app-abcdef-xjlfg.eu-central-1.statemesh.net. Routing is performed based on hostname and path for HTTPS services and TLS SNI for TCP services. Because UDP traffic does not support TLS, routing is based on the incoming port.

StateMesh Ingress routing
Figure: StateMesh Ingress routing

Nodes

Worker nodes are the actual compute nodes that run customer workloads. They can be virtual machines or physical machines with an x86-64 or amd64 CPU architecture. arm64 is on the roadmap and planned in next releases.

Nodes are operated by node operators. Each node operator is responsible for maintaining the node's uptime and keeping a minimum MESH balance to confirm the node's presence on the network.

info

MESH is used only for paying gas fees associated with confirming the node's presence on the network.

Nodes must run a fresh installation of the Ubuntu 22.04 operating system. Other configurations are not supported.

Blockchain

StateMesh uses its own L1 blockchain to store and validate node states using Time Towers. At the end of each chain epoch, the tower registry is reconfigured for each availability zone to reflect jailed nodes and elected node operators for the next epoch.

The blockchain is also used to handle automated stablecoin payments and bridging stablecoins from/to Statemesh to other blockchains.

Time Towers

Each node operator must run the StateMesh Tower Service service on his machine to join the global miner pool. Being in the miner pool is necessary to be able to join a cluster. The higher the Time Tower, the more likely the node is to be elected for task execution.