Skip to content

Fix CLUSTER NODES and CLUSTER SHARDS metadata output#1657

Merged
vazois merged 3 commits into
mainfrom
fix/cluster-metadata-endpoint-1650
Apr 1, 2026
Merged

Fix CLUSTER NODES and CLUSTER SHARDS metadata output#1657
vazois merged 3 commits into
mainfrom
fix/cluster-metadata-endpoint-1650

Conversation

@vazois

@vazois vazois commented Mar 31, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #1650

This PR fixes the metadata output for CLUSTER NODES, CLUSTER SLOTS, and CLUSTER SHARDS commands to align with Redis-compatible field names, ordering, and endpoint type handling.

Important: cluster-preferred-endpoint-type behavior

The --cluster-preferred-endpoint-type option does not replace IP addresses with hostnames in fields that are defined as IP fields. The ip field in CLUSTER SHARDS and the <ip:port@cport> portion of CLUSTER NODES always contain the node's IP address regardless of this setting.

What cluster-preferred-endpoint-type does control:

  • CLUSTER SLOTS: determines which value appears in the primary endpoint position vs. metadata
  • CLUSTER SHARDS: determines the value of the endpoint field (IP, hostname, or ?)
  • -MOVED / -ASK redirects: determines the address used in redirect messages

Changes

CLUSTER NODES (GetNodeInfo)

  • Only append ,hostname in the address field when hostname is non-empty (was unconditionally appending empty value)
  • No change to <ip:port@cport> format — this always uses the IP address per Redis specification

CLUSTER SHARDS (GetShardsInfo)

  • Replaced address field with Redis-compatible ip, endpoint, and optional hostname fields
  • Added ClusterPreferredEndpointType support — endpoint = IP (default), hostname (when Hostname), or ? (when Unknown)
  • Role now outputs master/slave instead of enum names PRIMARY/REPLICA
  • Dynamic field count (14 base + 2 when hostname present)
  • Refactored to use StringBuilder to reduce allocations

CLUSTER SLOTS

  • Refactored to use StringBuilder to reduce allocations (no behavioral changes)

Endpoint validation (Options.cs)

  • Fixed --cluster-announce-ip validation to accept any IP when bind address is 0.0.0.0 or ::, since the server listens on all interfaces

Tests

  • ClusterShardsTest: 5 test cases covering Ip/Hostname/Unknown endpoint types with and without announce hostname
  • ClusterNodesHostnameTest: 2 test cases validating hostname formatting in CLUSTER NODES address field
  • Improved ClusterShards parser with even-length guard and explicit role validation

CLUSTER SHARDS field ordering

id, port, ip, endpoint, [hostname], role, replication-offset, health
  • ip: always present — the node's IP address (never replaced by hostname)
  • endpoint: always present — equals ip (Ip type), hostname (Hostname type), or ? (Unknown type)
  • hostname: only present when non-empty

- CLUSTER NODES: Only append ,hostname when hostname is non-empty
- CLUSTER SHARDS: Replace 'address' field with Redis-compatible 'ip',
  'endpoint', and optional 'hostname' fields
- CLUSTER SHARDS: Honor ClusterPreferredEndpointType for endpoint field
  (ip address when Ip, hostname when Hostname)
- CLUSTER SHARDS: Fix role output to use 'master'/'slave' instead of
  enum names 'PRIMARY'/'REPLICA'
- Update NodeInfo struct and ClusterShards parser in test utilities
  for dynamic key-value field parsing
- Add ClusterShardsTest and ClusterNodesHostnameTest unit tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 31, 2026 20:06

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates Garnet’s cluster metadata serialization to better match Redis-compatible output for CLUSTER NODES / CLUSTER SHARDS, and adds/updates cluster tests to validate the new field layout and hostname handling.

Changes:

  • Adjust CLUSTER NODES formatting to only append ,hostname when a hostname is present.
  • Rework CLUSTER SHARDS node metadata to emit Redis-compatible ip, endpoint, and optional hostname fields, and pass ClusterPreferredEndpointType into the formatting path.
  • Update cluster test utilities and add new tests for CLUSTER SHARDS and hostname formatting in CLUSTER NODES.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
test/Garnet.test.cluster/ClusterTestUtils.cs Updates shard parsing/representation to match new CLUSTER SHARDS node fields (ip/endpoint/hostname).
test/Garnet.test.cluster/ClusterManagementTests.cs Adds tests for CLUSTER SHARDS field expectations and CLUSTER NODES hostname suffix formatting.
libs/cluster/Session/RespClusterBasicCommands.cs Threads ClusterPreferredEndpointType into the CLUSTER SHARDS response generation call.
libs/cluster/Server/ClusterConfig.cs Adjusts CLUSTER NODES hostname suffix emission and redefines CLUSTER SHARDS node field set/ordering and role strings.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread libs/cluster/Server/ClusterConfig.cs
Comment thread test/Garnet.test.cluster/ClusterManagementTests.cs Outdated
Comment thread test/Garnet.test.cluster/ClusterTestUtils.cs
Comment thread libs/cluster/Server/ClusterConfig.cs Outdated
vazois and others added 2 commits March 31, 2026 13:18
Replace string concatenation with StringBuilder in GetShardsInfo,
GetSlotsInfo, and their helper methods (AppendFormattedSlotInfo,
AppendNodeNetworkingInfo) to reduce intermediate string allocations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix endpoint validation (Options.cs): allow cluster-announce-ip when
  bind address is 0.0.0.0/:: (wildcard), since the server listens on
  all interfaces. Only require port match in that case.
- Handle ClusterPreferredEndpointType.Unknown in CLUSTER SHARDS: set
  endpoint to '?' consistent with CLUSTER SLOTS and redirects.
- Add ClusterShardsTest cases for Unknown endpoint type.
- Improve ClusterNodesHostnameTest to handle both hostname-present
  and hostname-absent branches.
- Add even-length guard and explicit role validation in ClusterShards
  test parser.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vazois vazois merged commit dafb35f into main Apr 1, 2026
69 of 71 checks passed
@vazois vazois deleted the fix/cluster-metadata-endpoint-1650 branch April 1, 2026 17:53
vazois added a commit that referenced this pull request Apr 2, 2026
* Fix CLUSTER NODES and CLUSTER SHARDS metadata output (#1650)

- CLUSTER NODES: Only append ,hostname when hostname is non-empty
- CLUSTER SHARDS: Replace 'address' field with Redis-compatible 'ip',
  'endpoint', and optional 'hostname' fields
- CLUSTER SHARDS: Honor ClusterPreferredEndpointType for endpoint field
  (ip address when Ip, hostname when Hostname)
- CLUSTER SHARDS: Fix role output to use 'master'/'slave' instead of
  enum names 'PRIMARY'/'REPLICA'
- Update NodeInfo struct and ClusterShards parser in test utilities
  for dynamic key-value field parsing
- Add ClusterShardsTest and ClusterNodesHostnameTest unit tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Refactor CLUSTER SLOTS/SHARDS to use StringBuilder

Replace string concatenation with StringBuilder in GetShardsInfo,
GetSlotsInfo, and their helper methods (AppendFormattedSlotInfo,
AppendNodeNetworkingInfo) to reduce intermediate string allocations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review comments and fix endpoint validation

- Fix endpoint validation (Options.cs): allow cluster-announce-ip when
  bind address is 0.0.0.0/:: (wildcard), since the server listens on
  all interfaces. Only require port match in that case.
- Handle ClusterPreferredEndpointType.Unknown in CLUSTER SHARDS: set
  endpoint to '?' consistent with CLUSTER SLOTS and redirects.
- Add ClusterShardsTest cases for Unknown endpoint type.
- Improve ClusterNodesHostnameTest to handle both hostname-present
  and hostname-absent branches.
- Add even-length guard and explicit role validation in ClusterShards
  test parser.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
vazois added a commit that referenced this pull request Apr 2, 2026
* Add es-metadata.yml (#1641)

* Fix INFO all/default/everything returning empty response (#1645)

* Fix INFO all/default/everything returning empty response

Implement support for INFO all, default, and everything options:
- all: returns all DefaultInfo sections excluding module-generated ones
- default: returns the default set of sections (same as no-arg INFO)
- everything: returns all DefaultInfo sections including modules

Added pre-declared HashSet collections (AllInfoSet, EverythingInfoSet)
in GarnetInfoMetrics.cs derived from DefaultInfo to support these options.

Updated InfoCommand.cs to use UnionWith with the new HashSets instead of
silently skipping the ALL keyword.

Added tests for all three options, verifying correct section inclusion/exclusion.

Fixes #1643

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* make everything option use DefaultInfo

* Add null/empty guard in GetSectionHeaders test helper

Adds explicit asserts before splitting INFO output so test failures
surface a clear message instead of a NullReferenceException.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* wip; restructuring cluster tests to reduce CI duration

* separate dispose from close and configure socket to allow rapid connect

* ensure socket is disposed succesfully

* fix failing test

* update global.json

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix revivification CLI flag ordering dependency (#1644)

The validation in Options.GetServerOptions() checked the raw
enableRevivification flag (from --reviv) instead of the computed
useRevivBinsPowerOf2 state. This caused specifying --reviv alongside
explicit --reviv-bin-record-sizes and --reviv-bin-record-counts to
always fail, despite the --reviv help text documenting that it can be
overridden by the combination of these flags.

Changed the check from enableRevivification to useRevivBinsPowerOf2,
which correctly reflects that --reviv-bin-record-sizes already
overrides the power-of-2 default when present.

Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Vasileios Zois <96085550+vazois@users.noreply.github.com>

* [Bugfix] SortedSet Race Condition - Mutating under Shared Lock (#1642)

* fixed

* SortedSet Race Condition - Mutation under Shared Lock

---------

Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Vasileios Zois <96085550+vazois@users.noreply.github.com>

* Bugfix: Accept Loop socket error handling (#1646)

* Bugfix Accept Loop

* race condition and tiny loggin fix

* Dispose potentially partial sockets.

* Fix accept loop zombie state: tiered error handling with backoff retry

The accept loop in GarnetServerTcp.HandleNewConnection previously treated
all SocketError != Success as fatal, disposing the SocketAsyncEventArgs and
permanently killing the accept loop. This left the server in a zombie state
where existing connections worked but no new connections were accepted.

Changes:
- Extract HandleAcceptError with three-tier error categorization:
  - Tier 1 (fatal): OperationAborted, NotSocket, etc. — stop loop
  - Tier 2 (resource pressure): TooManyOpenSockets, NoBufferSpaceAvailable,
    etc. — backoff via Timer and retry (100ms initial, 5s cap)
  - Tier 3 (client-caused transient): ConnectionReset, etc. — log and continue
- Add ScheduleAcceptRetry using Timer to avoid blocking IOCP threads
- Dispose retry timer on server shutdown
- Add AcceptLoopTests with RST flood and extended attack simulation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* PR comments

* Nit

* Nit

* simplify greatly

* easy win

* small updates

---------

Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Badrish Chandramouli <badrishc@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Vasileios Zois <96085550+vazois@users.noreply.github.com>

* Skip Value Expiration Check When Scanning a Tombstoned Record (#1612)

* fix migrate write test

* prevent expiration check on tombstoned key while scanning

* fix formatting

* ensure reviv pause signal is observed through epoch protection

* make revivPauseEvent readonly

* Update test/Garnet.test.cluster/ClusterMigrateTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* addressing first comments

* update comment in MigrateScanFunctions

* release epoch when acquiring exlucisve SuspendConfigMerge lock

* add ReaderWriterLock custom implementation

* fix formatting

* fixing simple tests

* make pause reviv thread safe

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Bump picomatch from 2.3.1 to 2.3.2 in /website (#1649)

Bumps [picomatch](https://github.com/micromatch/picomatch) from 2.3.1 to 2.3.2.
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](micromatch/picomatch@2.3.1...2.3.2)

---
updated-dependencies:
- dependency-name: picomatch
  dependency-version: 2.3.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump brace-expansion from 1.1.12 to 1.1.13 in /website (#1651)

Bumps [brace-expansion](https://github.com/juliangruber/brace-expansion) from 1.1.12 to 1.1.13.
- [Release notes](https://github.com/juliangruber/brace-expansion/releases)
- [Commits](juliangruber/brace-expansion@v1.1.12...v1.1.13)

---
updated-dependencies:
- dependency-name: brace-expansion
  dependency-version: 1.1.13
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Handle SocketException on accepted socket configuration (#1648)

When a client RSTs between accept completing and socket setup (e.g. NoDelay),
the SocketException was unhandled and crashed the process. Wrap the post-accept
socket configuration in a try/catch, dispose the dead socket, and continue
the accept loop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix CLUSTER NODES and CLUSTER SHARDS metadata output (#1657)

* Fix CLUSTER NODES and CLUSTER SHARDS metadata output (#1650)

- CLUSTER NODES: Only append ,hostname when hostname is non-empty
- CLUSTER SHARDS: Replace 'address' field with Redis-compatible 'ip',
  'endpoint', and optional 'hostname' fields
- CLUSTER SHARDS: Honor ClusterPreferredEndpointType for endpoint field
  (ip address when Ip, hostname when Hostname)
- CLUSTER SHARDS: Fix role output to use 'master'/'slave' instead of
  enum names 'PRIMARY'/'REPLICA'
- Update NodeInfo struct and ClusterShards parser in test utilities
  for dynamic key-value field parsing
- Add ClusterShardsTest and ClusterNodesHostnameTest unit tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Refactor CLUSTER SLOTS/SHARDS to use StringBuilder

Replace string concatenation with StringBuilder in GetShardsInfo,
GetSlotsInfo, and their helper methods (AppendFormattedSlotInfo,
AppendNodeNetworkingInfo) to reduce intermediate string allocations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review comments and fix endpoint validation

- Fix endpoint validation (Options.cs): allow cluster-announce-ip when
  bind address is 0.0.0.0/:: (wildcard), since the server listens on
  all interfaces. Only require port match in that case.
- Handle ClusterPreferredEndpointType.Unknown in CLUSTER SHARDS: set
  endpoint to '?' consistent with CLUSTER SLOTS and redirects.
- Add ClusterShardsTest cases for Unknown endpoint type.
- Improve ClusterNodesHostnameTest to handle both hostname-present
  and hostname-absent branches.
- Add even-length guard and explicit role validation in ClusterShards
  test parser.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Badrish Chandramouli <badrishc@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Hamdaan Khalid <42720645+hamdaankhalid@users.noreply.github.com>
Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 1, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CLUSTER NODES ignores ClusterPreferredEndpointType=hostname and still returns worker bind addresses

3 participants