πŸΊπŸ’Ύ WolfDisk

Distributed File System for Linux

Overview

WolfDisk is a distributed file system that uses the same proven consensus infrastructure as WolfScale to replicate files across Linux machines. Mount a shared directory and have your data automatically synchronized across all nodes.

βœ… Status: WolfDisk Phases 1-5 are complete! The FUSE filesystem, clustering (node roles, leader election, auto-discovery), failover and delta sync, full replication, and the S3-compatible API are ready for testing. Phase 6 (benchmarks, stress testing, and multi-node validation) is in progress.
πŸ“

FUSE Integration

Mount as a regular Linux directory. All standard file operations work seamlessly.

πŸ”„

File Replication

Changes automatically sync across all nodes in the cluster.

πŸ”

Deduplication

Content-addressed storage with SHA256 - same content stored only once.

⚑

Chunk-Based

Large files split into 4MB chunks for efficient transfer and sync.

☁️

S3-Compatible API

WolfDisk exposes an S3-compatible endpoint. Mount from WolfStack using rust-s3 (pure Rust, works on IBM Power/ppc64le). Compatible with AWS S3 tools, Cyberduck, and rclone.

πŸ–₯️

WolfStack Integration

Mount WolfDisk S3 buckets directly from the WolfStack dashboard. Native storage manager support with sync and auto-mount on boot.
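
Because the S3-compatible endpoint above speaks the standard protocol, ordinary S3 tooling can talk to it directly. The sketch below uses the AWS CLI; the endpoint address, port, bucket name, and credentials are placeholders, so substitute the values from your own WolfDisk configuration.

bash
# Placeholder endpoint, bucket, and credentials: replace with your own values
export AWS_ACCESS_KEY_ID=wolfdisk
export AWS_SECRET_ACCESS_KEY=changeme
export AWS_DEFAULT_REGION=us-east-1

# List the contents of a bucket exposed by the WolfDisk S3 endpoint
aws --endpoint-url http://192.168.1.10:9600 s3 ls s3://shared/

# Copy a file into the bucket
aws --endpoint-url http://192.168.1.10:9600 s3 cp report.pdf s3://shared/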

Operating Modes

Mode | Description | Best For
--- | --- | ---
Replicated | Data copied to N nodes with quorum writes | High availability, disaster recovery
Shared | Single leader, others access as clients | Team collaboration, shared home dirs

Node Roles

WolfDisk supports four node roles to match your deployment needs:

Role | Storage | Replication | Use Case
--- | --- | --- | ---
Leader | βœ… Yes | Broadcasts to followers | Primary write node
Follower | βœ… Yes | Receives from leader | Read replicas, failover
Client | ❌ No | None (mount-only) | Access shared drive remotely
Auto | βœ… Yes | Dynamic election | Default: lowest ID becomes leader
πŸ’‘ Client Mode: Perfect for workstations that just need to access the shared filesystem without storing data locally. The client mounts the drive and forwards operations to the leader.
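
For example, a workstation that only needs the mount could use a minimal client configuration like the sketch below. The node ID and peer addresses are illustrative, and your installation may need additional sections; the keys follow the configuration format shown later on this page.

bash
# Hypothetical client-only node: no local replication, just a mount
sudo tee /etc/wolfdisk/config.toml >/dev/null <<'EOF'
[node]
id = "workstation-1"
role = "client"
bind = "0.0.0.0:9500"
data_dir = "/var/lib/wolfdisk"

[cluster]
peers = ["192.168.1.10:9500", "192.168.1.11:9500"]

[mount]
path = "/mnt/wolfdisk"
allow_other = true
EOF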

Quick Install

Install WolfDisk with an interactive installer:

bash
curl -sSL https://raw.githubusercontent.com/wolfsoftwaresystemsltd/WolfScale/main/wolfdisk/setup.sh | bash

The installer will prompt you for:

  • Node ID β€” Unique identifier (defaults to hostname)
  • Role β€” auto, leader, follower, or client
  • Bind IP address β€” IP to listen on (auto-detected)
  • Discovery β€” Auto-discovery, manual peers, or standalone
  • Mount path β€” Where to mount the filesystem
⚠️ Proxmox Users: If running in an LXC container, enable FUSE in container options: Options β†’ Features β†’ FUSE
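
If you prefer the Proxmox CLI to the web UI, the same option can be set from the host. The container ID below is a placeholder, and note that --features replaces the whole feature list, so include any features the container already uses (for example nesting=1).

bash
# Enable FUSE for an LXC container from the Proxmox host (101 is a placeholder ID)
pct set 101 --features fuse=1

# Restart the container so the change takes effect
pct stop 101 && pct start 101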

Architecture

text
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Linux Applications                        β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                   mount /mnt/wolfdisk                        β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                     FUSE (fuser)                             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                   WolfDisk Core                              β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚   β”‚ File Indexβ”‚  β”‚  Chunks   β”‚  β”‚ Replication Engine      β”‚  β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚               Network Layer (WolfScale-based)                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Components

Component | Description
--- | ---
FUSE Layer | Intercepts file operations (read, write, mkdir, etc.)
File Index | Metadata mapping paths to chunk references
Chunk Store | Content-addressed storage with SHA256 hashing
Replication | Uses WolfScale's Raft-based consensus
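
To make the chunk store idea concrete, here is a rough shell illustration of content addressing; it is not WolfDisk's actual on-disk layout. A file is cut into 4MB pieces, each piece is named by its SHA256 hash, and a piece whose hash already exists in the store is discarded rather than stored twice.

bash
# Conceptual sketch only, not WolfDisk's real storage format
mkdir -p store
split -b 4M -d bigfile chunk.                # cut the file into 4MB pieces
for c in chunk.*; do
  h=$(sha256sum "$c" | awk '{print $1}')     # the content hash identifies the chunk
  if [ -e "store/$h" ]; then
    rm "$c"                                  # identical chunk already stored: deduplicated
  else
    mv "$c" "store/$h"                       # new content: store it under its hash
  fi
done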

Configuration

Edit /etc/wolfdisk/config.toml:

toml
[node]
id = "node1"
role = "auto"    # auto, leader, follower, or client
bind = "0.0.0.0:9500"
data_dir = "/var/lib/wolfdisk"

[cluster]
# Auto-discovery via UDP multicast:
discovery = "udp://239.255.0.1:9501"
# Or specify peers manually:
# peers = ["192.168.1.10:9500", "192.168.1.11:9500"]

[replication]
mode = "shared"      # or "replicated"
factor = 3           # Number of copies (replicated mode)
chunk_size = 4194304 # 4MB chunks

[mount]
path = "/mnt/wolfdisk"
allow_other = true
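
If the wolfdisk service is already running, restart it after editing the file so the new settings take effect:

bash
sudo systemctl restart wolfdisk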

CLI Reference

Command | Description
--- | ---
wolfdisk init | Initialize the data directory
wolfdisk mount -m /mnt/wolfdisk | Mount the filesystem
wolfdisk unmount -m /mnt/wolfdisk | Unmount the filesystem
wolfdisk status | Show node configuration
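
A typical first run on a new node, using only the commands above, looks like this (add sudo if your data directory or mount point requires it):

bash
# Initialize the data directory, mount the filesystem, then confirm the node configuration
wolfdisk init
wolfdisk mount -m /mnt/wolfdisk
wolfdisk status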

wolfdiskctl (Control Utility)

Query the running service for cluster information:

Command | Description
--- | ---
wolfdiskctl status | Show live status from the running service
wolfdiskctl list servers | List all discovered servers in the cluster
wolfdiskctl stats | Live cluster statistics (refreshes every second)

Systemd Service

bash
# Start the service
sudo systemctl start wolfdisk

# Check status
sudo systemctl status wolfdisk

# View logs
sudo journalctl -u wolfdisk -f
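
To have WolfDisk start automatically at boot, enable the unit with standard systemd tooling:

bash
# Start the service now and on every boot
sudo systemctl enable --now wolfdisk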

Leader Failover

WolfDisk automatically handles leader failures with fast failover:

  • Heartbeat Detection β€” Nodes monitor the leader with 2-second timeout
  • Automatic Election β€” Lowest node ID becomes the new leader
  • Seamless Transition β€” Followers continue serving reads during failover
text
Initial State:
  node-a (leader) ←→ node-b (follower) ←→ node-c (follower)

node-a goes down:
  ❌ node-a         node-b detects timeout (2s)
                    node-b becomes leader (next lowest ID)

node-a returns:
  node-a syncs from node-b (gets missed changes)
  node-a becomes leader again (lowest ID)

Sync and Catchup

When a node starts or recovers from downtime, it automatically syncs with the leader:

  • Version Tracking β€” Every write increments the index version
  • Delta Sync β€” Only modified/new/deleted files are transferred
  • Efficient Catchup β€” A node that was down briefly only receives missed changes

Multiple Drives

You can mount multiple independent WolfDisk filesystems on the same node by running separate instances, each with its own configuration file. Each instance needs:

  • Unique node ID β€” e.g., node1-drive2
  • Separate bind port β€” e.g., 9501 instead of 9500
  • Separate data directory β€” e.g., /var/lib/wolfdisk2
  • Separate mount point β€” e.g., /mnt/wolfdisk2

Example: Second Drive Config

Create /etc/wolfdisk/config2.toml:

toml
[node]
id = "node1-drive2"
role = "auto"
bind = "0.0.0.0:9501"
data_dir = "/var/lib/wolfdisk2"

[cluster]
discovery = "udp://239.255.0.1:9502"
# Or: peers = ["192.168.1.10:9501", "192.168.1.11:9501"]

[replication]
mode = "shared"

[mount]
path = "/mnt/wolfdisk2"
allow_other = true

Mount the Second Drive

bash
# Mount the second drive with its own config
wolfdisk mount -m /mnt/wolfdisk2 --config /etc/wolfdisk/config2.toml

# Check status
wolfdisk status --config /etc/wolfdisk/config2.toml
πŸ’‘ Tip: Each WolfDisk instance is completely independent β€” separate cluster, separate data, separate port. Peers for each drive must use the same port across all nodes (e.g., all nodes use 9501 for drive 2).
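
If the second drive should also come up at boot, one option is a small dedicated systemd unit. This is a hedged sketch rather than something the installer creates: the binary path and the assumption that wolfdisk mount stays in the foreground may need adjusting for your installation.

bash
# Hypothetical unit for the second instance; adjust paths and behaviour to your install
sudo tee /etc/systemd/system/wolfdisk-drive2.service >/dev/null <<'EOF'
[Unit]
Description=WolfDisk second drive (/mnt/wolfdisk2)
After=network-online.target

[Service]
# Assumes the wolfdisk binary lives at /usr/local/bin and runs in the foreground
ExecStart=/usr/local/bin/wolfdisk mount -m /mnt/wolfdisk2 --config /etc/wolfdisk/config2.toml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now wolfdisk-drive2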

WolfDisk vs Other Solutions

Feature | NFS | GlusterFS | Ceph | WolfDisk
--- | --- | --- | --- | ---
Replication | ❌ None | βœ… Yes | βœ… Yes | βœ… Yes
Setup Complexity | Easy | Medium | Complex | Easy
Deduplication | ❌ No | ❌ No | βœ… Yes | βœ… Yes
Metadata Server | Single | None | Required | None
WolfScale Integration | ❌ No | ❌ No | ❌ No | βœ… Native
S3-Compatible API | ❌ No | ❌ No | βœ… Yes (RGW) | βœ… Yes (built-in)

Roadmap

βœ…
Phase 1: Foundation

FUSE filesystem, chunk storage, CLI, setup scripts

βœ…
Phase 2: Clustering

Leader election, node discovery, node roles, cluster state

βœ…
Phase 3: Failover & Sync

Leader failover (2s), delta sync, thin-client mode, live stats CLI

βœ…
Phase 4: Full Replication

Write replication (leader β†’ followers), read caching, chunk fetching

βœ…
Phase 5: S3 Integration

S3-compatible API endpoint, WolfStack Storage Manager integration via rust-s3 (pure Rust), sync to/from S3 buckets, rclone.conf import

πŸ”„
Phase 6: Production

Benchmarks, stress testing, multi-node validation