Skip to content

Cherry Pick Commits from Main#1664

Merged
vazois merged 12 commits into
devfrom
vazois/dev-backmerge
Apr 2, 2026
Merged

Cherry Pick Commits from Main#1664
vazois merged 12 commits into
devfrom
vazois/dev-backmerge

Conversation

@vazois

@vazois vazois commented Apr 2, 2026

Copy link
Copy Markdown
Contributor

This PR attempts to back merge several commits from main

badrishc and others added 10 commits March 27, 2026 14:40
* Fix INFO all/default/everything returning empty response

Implement support for INFO all, default, and everything options:
- all: returns all DefaultInfo sections excluding module-generated ones
- default: returns the default set of sections (same as no-arg INFO)
- everything: returns all DefaultInfo sections including modules

Added pre-declared HashSet collections (AllInfoSet, EverythingInfoSet)
in GarnetInfoMetrics.cs derived from DefaultInfo to support these options.

Updated InfoCommand.cs to use UnionWith with the new HashSets instead of
silently skipping the ALL keyword.

Added tests for all three options, verifying correct section inclusion/exclusion.

Fixes #1643

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* make everything option use DefaultInfo

* Add null/empty guard in GetSectionHeaders test helper

Adds explicit asserts before splitting INFO output so test failures
surface a clear message instead of a NullReferenceException.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* wip; restructuring cluster tests to reduce CI duration

* separate dispose from close and configure socket to allow rapid connect

* ensure socket is disposed succesfully

* fix failing test

* update global.json

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The validation in Options.GetServerOptions() checked the raw
enableRevivification flag (from --reviv) instead of the computed
useRevivBinsPowerOf2 state. This caused specifying --reviv alongside
explicit --reviv-bin-record-sizes and --reviv-bin-record-counts to
always fail, despite the --reviv help text documenting that it can be
overridden by the combination of these flags.

Changed the check from enableRevivification to useRevivBinsPowerOf2,
which correctly reflects that --reviv-bin-record-sizes already
overrides the power-of-2 default when present.

Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Vasileios Zois <96085550+vazois@users.noreply.github.com>
* fixed

* SortedSet Race Condition - Mutation under Shared Lock

---------

Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Vasileios Zois <96085550+vazois@users.noreply.github.com>
* Bugfix Accept Loop

* race condition and tiny loggin fix

* Dispose potentially partial sockets.

* Fix accept loop zombie state: tiered error handling with backoff retry

The accept loop in GarnetServerTcp.HandleNewConnection previously treated
all SocketError != Success as fatal, disposing the SocketAsyncEventArgs and
permanently killing the accept loop. This left the server in a zombie state
where existing connections worked but no new connections were accepted.

Changes:
- Extract HandleAcceptError with three-tier error categorization:
  - Tier 1 (fatal): OperationAborted, NotSocket, etc. — stop loop
  - Tier 2 (resource pressure): TooManyOpenSockets, NoBufferSpaceAvailable,
    etc. — backoff via Timer and retry (100ms initial, 5s cap)
  - Tier 3 (client-caused transient): ConnectionReset, etc. — log and continue
- Add ScheduleAcceptRetry using Timer to avoid blocking IOCP threads
- Dispose retry timer on server shutdown
- Add AcceptLoopTests with RST flood and extended attack simulation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* PR comments

* Nit

* Nit

* simplify greatly

* easy win

* small updates

---------

Co-authored-by: Hamdaan Khalid <hkhalid@microsoft.com>
Co-authored-by: Badrish Chandramouli <badrishc@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Vasileios Zois <96085550+vazois@users.noreply.github.com>
* fix migrate write test

* prevent expiration check on tombstoned key while scanning

* fix formatting

* ensure reviv pause signal is observed through epoch protection

* make revivPauseEvent readonly

* Update test/Garnet.test.cluster/ClusterMigrateTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* addressing first comments

* update comment in MigrateScanFunctions

* release epoch when acquiring exlucisve SuspendConfigMerge lock

* add ReaderWriterLock custom implementation

* fix formatting

* fixing simple tests

* make pause reviv thread safe

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Bumps [picomatch](https://github.com/micromatch/picomatch) from 2.3.1 to 2.3.2.
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](micromatch/picomatch@2.3.1...2.3.2)

---
updated-dependencies:
- dependency-name: picomatch
  dependency-version: 2.3.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [brace-expansion](https://github.com/juliangruber/brace-expansion) from 1.1.12 to 1.1.13.
- [Release notes](https://github.com/juliangruber/brace-expansion/releases)
- [Commits](juliangruber/brace-expansion@v1.1.12...v1.1.13)

---
updated-dependencies:
- dependency-name: brace-expansion
  dependency-version: 1.1.13
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When a client RSTs between accept completing and socket setup (e.g. NoDelay),
the SocketException was unhandled and crashed the process. Wrap the post-accept
socket configuration in a try/catch, dispose the dead socket, and continue
the accept loop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix CLUSTER NODES and CLUSTER SHARDS metadata output (#1650)

- CLUSTER NODES: Only append ,hostname when hostname is non-empty
- CLUSTER SHARDS: Replace 'address' field with Redis-compatible 'ip',
  'endpoint', and optional 'hostname' fields
- CLUSTER SHARDS: Honor ClusterPreferredEndpointType for endpoint field
  (ip address when Ip, hostname when Hostname)
- CLUSTER SHARDS: Fix role output to use 'master'/'slave' instead of
  enum names 'PRIMARY'/'REPLICA'
- Update NodeInfo struct and ClusterShards parser in test utilities
  for dynamic key-value field parsing
- Add ClusterShardsTest and ClusterNodesHostnameTest unit tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Refactor CLUSTER SLOTS/SHARDS to use StringBuilder

Replace string concatenation with StringBuilder in GetShardsInfo,
GetSlotsInfo, and their helper methods (AppendFormattedSlotInfo,
AppendNodeNetworkingInfo) to reduce intermediate string allocations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review comments and fix endpoint validation

- Fix endpoint validation (Options.cs): allow cluster-announce-ip when
  bind address is 0.0.0.0/:: (wildcard), since the server listens on
  all interfaces. Only require port match in that case.
- Handle ClusterPreferredEndpointType.Unknown in CLUSTER SHARDS: set
  endpoint to '?' consistent with CLUSTER SLOTS and redirects.
- Add ClusterShardsTest cases for Unknown endpoint type.
- Improve ClusterNodesHostnameTest to handle both hostname-present
  and hostname-absent branches.
- Add even-length guard and explicit role validation in ClusterShards
  test parser.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 2, 2026 02:00

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Backmerge bringing in cluster/INFO/migration improvements, plus dependency/tooling updates and additional test coverage across the Garnet test suites.

Changes:

  • Add new INFO option aliases (ALL/DEFAULT/EVERYTHING) and update INFO help text and parsing behavior.
  • Update CLUSTER SHARDS / CLUSTER NODES formatting (ip/endpoint/hostname fields, preferred endpoint selection) and strengthen cluster/migration test coverage (including tombstones and cleanup waits).
  • Improve server accept-loop robustness under resource pressure and add a Tsavorite revivification pause handshake used by migration.

Reviewed changes

Copilot reviewed 23 out of 28 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
website/yarn.lock Bumps several JS dependencies/lockfile entries.
test/Garnet.test/RespInfoTests.cs Adds tests for INFO option behaviors and section sets.
test/Garnet.test/GarnetServerConfigTests.cs Adds tests ensuring --reviv parsing is ordering-independent.
test/Garnet.test.cluster/ClusterTestUtils.cs Updates CLUSTER SHARDS parsing, migration cleanup helpers, and slot parsing robustness.
test/Garnet.test.cluster/ClusterTestContext.cs Increases teardown timeout for cluster tests.
test/Garnet.test.cluster/ClusterNegativeTests.cs Adds “node known” waits to reduce cluster flakiness.
test/Garnet.test.cluster/ClusterMigrateTests.cs Improves migration assertions/waits; adds cleanup synchronization.
test/Garnet.test.cluster/ClusterManagementTests.cs Adds tests for CLUSTER SHARDS metadata and CLUSTER NODES hostname formatting.
playground/Bitmap/BitOp.cs Label/indentation adjustments in playground bitmap code.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Tsavorite.cs Adds pause handshake for revivification with epoch bump + wait.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/ContinuePending.cs Minor indentation change.
libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorScan.cs Minor label indentation change.
libs/server/Servers/GarnetServerTcp.cs Adds accept error tiering + exponential backoff and improves handler-setup error handling.
libs/server/Objects/SortedSet/SortedSetObjectImpl.cs Adjusts sorted-set TTL logic to treat expired members as “not found” without bulk purge.
libs/server/Metrics/Info/InfoHelp.cs Extends INFO HELP text with ALL/DEFAULT/EVERYTHING.
libs/server/Metrics/Info/InfoCommand.cs Implements ALL/DEFAULT/EVERYTHING option parsing.
libs/server/Metrics/Info/GarnetInfoMetrics.cs Adds AllInfoSet derived from DefaultInfo minus MODULES.
libs/server/Lua/LuaTimeoutManager.cs Minor indentation change.
libs/host/Configuration/Options.cs Loosens cluster-announce endpoint matching when listening on Any/IPv6Any; fixes reviv parsing check.
libs/common/ReaderWriterLock.cs Adds a new semaphore-based reader/writer lock implementation.
libs/cluster/Session/RespClusterBasicCommands.cs Releases/acquires epoch around operations to avoid deadlocks; passes preferred endpoint type to SHARDS formatting.
libs/cluster/Server/Migration/MigrationDriver.cs Uses new Tsavorite PauseRevivification(timeout, token) during migration.
libs/cluster/Server/Migration/MigrateScanFunctions.cs Avoids treating tombstones as “expired” for migration scan filtering.
libs/cluster/Server/Gossip/Gossip.cs Switches config-merge lock type (but currently has a compile issue; see comments).
libs/cluster/Server/ClusterManager.cs Initializes/disposes the new merge lock instance.
libs/cluster/Server/ClusterConfig.cs Reworks SHARDS/SLOTS formatting to use StringBuilder, and adds ip/endpoint/hostname fields & hostname formatting fixes.
global.json Updates .NET SDK version pin.
es-metadata.yml Adds metadata config file.
Comments suppressed due to low confidence (1)

test/Garnet.test.cluster/ClusterMigrateTests.cs:626

  • The otherPorts filter uses x != sourcePort || x != targetPort, which is always true (a port cannot equal both values at once), so it won’t exclude sourcePort/targetPort from the set. Use an AND condition (exclude when equal to either) so the subsequent loop only checks the other nodes as intended.
            context.clusterTestUtils.WaitForMigrationCleanup(logger: context.logger);
            context.logger.LogDebug("6. Checking configuration update starting");
            // Check if configuration has updated by
            var otherPorts = context.clusterTestUtils.GetEndPoints().Select(x => ((IPEndPoint)x).Port).Where(x => x != sourcePort || x != targetPort);
            while (true)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread libs/cluster/Server/Gossip/Gossip.cs
Comment thread libs/common/ReaderWriterLock.cs
Comment thread libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Tsavorite.cs
Comment thread libs/server/Metrics/Info/InfoHelp.cs
Comment thread libs/server/Metrics/Info/InfoCommand.cs
Comment thread libs/common/ReaderWriterLock.cs Outdated
@vazois vazois changed the title Vazois/dev backmerge Cherry Pick Commits from Main Apr 2, 2026
vazois and others added 2 commits April 2, 2026 10:18
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@vazois vazois merged commit e8760eb into dev Apr 2, 2026
47 of 59 checks passed
@vazois vazois deleted the vazois/dev-backmerge branch April 2, 2026 22:23
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 2, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants