[MergeDups] Separate identical vs similar duplicate finding with intermediate dialog#4039
Conversation
…egration Co-authored-by: imnasnainaec <6411521+imnasnainaec@users.noreply.github.com>
Co-authored-by: imnasnainaec <6411521+imnasnainaec@users.noreply.github.com>
Co-authored-by: imnasnainaec <6411521+imnasnainaec@users.noreply.github.com>
Co-authored-by: imnasnainaec <6411521+imnasnainaec@users.noreply.github.com>
Co-authored-by: imnasnainaec <6411521+imnasnainaec@users.noreply.github.com>
|
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #4039 +/- ##
==========================================
- Coverage 74.92% 74.73% -0.19%
==========================================
Files 295 296 +1
Lines 10911 10958 +47
Branches 1366 1369 +3
==========================================
+ Hits 8175 8190 +15
- Misses 2341 2373 +32
Partials 395 395
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This comment was marked as resolved.
This comment was marked as resolved.
imnasnainaec
left a comment
There was a problem hiding this comment.
@imnasnainaec reviewed 2 files and all commit messages.
Reviewable status: 2 of 14 files reviewed, all discussions resolved.
|
Important Review skippedBot user detected. To trigger a single review, invoke the You can disable this status message by setting the
📝 WalkthroughWalkthroughThis PR implements a feature for handling identical vernacular duplicates in the merge workflow. It adds a backend API endpoint for fast identical duplicate detection, a frontend dialog informing users of completion and next steps, and refactors the merge duplicate loading logic to split identical and similar duplicate searches into two distinct phases. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Frontend as Frontend<br/>(BaseGoalScreen)
participant API as API Client<br/>(merge-api)
participant Backend as Backend<br/>(MergeController)
participant Service as MergeService
User->>Frontend: Open MergeDups Goal
Frontend->>API: findIdenticalDuplicates(projectId, maxInList, maxLists)
API->>Backend: GET /v1/projects/{id}/merge/findidenticaldups/...
Backend->>Service: GetPotentialDuplicates(..., identicalVernacular=true)
Service-->>Backend: List<List<Word>>
Backend-->>API: OkObjectResult(duplicates)
API-->>Frontend: Response with identical duplicates
Frontend->>Frontend: Show IdenticalDuplicatesDialog (Loading state)
alt Has identical duplicates found
rect rgba(100, 149, 237, 0.5)
Note over Frontend,Service: Dialog shows completion congratulations
User->>Frontend: Click "Review Deferred"
Frontend->>Frontend: Dispatch ReviewDeferredDups goal
Frontend->>Frontend: Close dialog
end
else User clicks "Continue"
rect rgba(144, 238, 144, 0.5)
Note over Frontend,Service: Initiate similar duplicates search
Frontend->>API: findSimilarDuplicates(projectId, maxInList, maxLists)
API->>Backend: GET /v1/projects/{id}/merge/findpotentialdups/...
Backend->>Service: GetPotentialDuplicates(..., identicalVernacular=false)
Service-->>Backend: List<List<Word>>
Backend-->>API: OkObjectResult(duplicates)
API-->>Frontend: Response with similar duplicates
Frontend->>Frontend: Close dialog, load similar duplicates
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/goals/DefaultGoal/BaseGoalScreen.tsx`:
- Around line 58-75: The dialog stays open because useEffect uses
setOpenDupsDialog(prev => prev || dataLoadStatus === DataLoadStatus.Loading) so
once true it never resets; change the logic in the useEffect for
GoalType.MergeDups to setOpenDupsDialog(dataLoadStatus ===
DataLoadStatus.Loading && goalType === GoalType.MergeDups && status !==
GoalStatus.Completed) (or otherwise directly tie openDupsDialog to
dataLoadStatus when the goal is active) so the IdenticalDuplicatesDialog
automatically closes when dataLoadStatus is no longer Loading; update references
in the same effect that use useEffect, setOpenDupsDialog, openDupsDialog,
dataLoadStatus, GoalType.MergeDups, and GoalStatus.Completed.
In `@src/goals/MergeDuplicates/IdenticalDuplicatesDialog.tsx`:
- Around line 47-52: The mount useEffect currently calls hasGraylistEntries()
and getFrontierWords() without handling rejections; update the effect that calls
hasGraylistEntries, getFrontierWords (and that calls setHasDeferred and
setFrontierCount) to handle promise failures — e.g., use an async IIFE or attach
.catch handlers and in the catch set safe defaults (false for setHasDeferred and
0 for setFrontierCount) and log or report the error. Ensure both
hasGraylistEntries and getFrontierWords invocations are wrapped so no unhandled
rejection can occur and state updates only happen for successful responses.
🧹 Nitpick comments (4)
Backend.Tests/Controllers/MergeControllerTests.cs (1)
157-172: Add a basic payload assertion to the success test.This strengthens the test by ensuring the endpoint returns a non-null payload, not just an OK status.
✅ Suggested tweak
var result = _mergeController.FindIdenticalPotentialDuplicates(ProjId, 5, 10, false).Result; Assert.That(result, Is.InstanceOf<OkObjectResult>()); + var ok = result as OkObjectResult; + Assert.That(ok, Is.Not.Null); + Assert.That(ok!.Value, Is.Not.Null);src/goals/Redux/tests/GoalRedux.test.ts (1)
37-65: AlignfindSimilarDuplicatesmock with its async signature.Right now it returns
undefined, which can be brittle if the action creator ever awaits or chains it. Mocking it as a resolved promise makes the tests more robust.♻️ Suggested test mock setup
+const mockFindSimilarDuplicates = jest.fn(); + jest.mock("backend", () => ({ addGoalToUserEdit: (...args: any[]) => mockAddGoalToUserEdit(...args), addStepToGoal: jest.fn(), createUserEdit: () => mockCreateUserEdit(), findIdenticalDuplicates: () => mockFindIdenticalDuplicates(), - findSimilarDuplicates: jest.fn(), + findSimilarDuplicates: () => mockFindSimilarDuplicates(), getGraylistEntries: () => Promise.resolve([]), getUserEditById: (...args: any[]) => mockGetUserEditById(...args), hasGraylistEntries: jest.fn(), retrieveDuplicates: () => mockRetrieveDuplicates(), })); function setMockFunctions(): void { mockAddGoalToUserEdit.mockResolvedValue(0); mockCreateUserEdit.mockResolvedValue(mockUser()); mockFindIdenticalDuplicates.mockResolvedValue([]); + mockFindSimilarDuplicates.mockResolvedValue(undefined); mockGetUserEditById.mockResolvedValue(mockUserEdit(true)); mockRetrieveDuplicates.mockResolvedValue(goalDataMock.plannedWords); }src/goals/Redux/GoalActions.ts (2)
174-188: Redundant state access.
currentProjectState.projectis already destructured at line 153-154. Line 181 unnecessarily callsgetState()again.♻️ Suggested simplification
} else if ( currentGoal.goalType === GoalType.MergeDups && dataLoadStatus !== DataLoadStatus.Success ) { // All identical-vernacular duplicates have been processed. // Initialize similar-vernacular duplicate finding in the backend. dispatch(setDataLoadStatus(DataLoadStatus.Loading)); - const currentProj = getState().currentProjectState.project; await Backend.findSimilarDuplicates( maxWordsInMergeList, maxNumSteps(currentGoal.goalType), - currentProj.protectedDataMergeAvoidEnabled === OffOnSetting.On + currentProjectState.project.protectedDataMergeAvoidEnabled === OffOnSetting.On ); return; }
260-279: Wrap switch case declaration in a block.The
const dupsdeclaration inside the switch case can potentially be accessed from other cases. While this case returns immediately, wrapping in a block is best practice and satisfies the linter.♻️ Proposed fix
switch (goalType) { - case GoalType.MergeDups: + case GoalType.MergeDups: { // Catch failure and pass to caller to allow for error dispatch. const dups = dataLoadStatus === DataLoadStatus.Success ? await Backend.retrieveDuplicates().catch(() => {}) : await Backend.findIdenticalDuplicates( maxWordsInMergeList, maxNumSteps(goalType), project.protectedDataMergeAvoidEnabled === OffOnSetting.On ).catch(() => {}); return dups ? checkMergeData(dups) : Promise.reject(); + } case GoalType.ReviewDeferredDups:
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/goals/Redux/GoalActions.ts`:
- Around line 255-271: The MergeDups case in loadGoalData declares local
variable dups directly in the switch case which leaks scope; wrap the entire
case body in a block (i.e., add { ... } after case GoalType.MergeDups:) so
variables like dups are block-scoped, keep the existing async awaits and the
return dups ? checkMergeData(dups) : Promise.reject() inside that block, and
ensure no other switch cases are affected.
- Around line 174-187: Wrap the awaited Backend.findSimilarDuplicates call
inside a try/catch in the asyncAddGoal flow (the branch checking
currentGoal.goalType === GoalType.MergeDups) so rejections don't leave the UI in
Loading; on catch, dispatch setDataLoadStatus(DataLoadStatus.Failure) (matching
the loadGoalData pattern), optionally log or pass the error for diagnostics, and
ensure the function returns after handling the error to avoid continuing as if
successful.
jasonleenaylor
left a comment
There was a problem hiding this comment.
@jasonleenaylor reviewed 21 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @copilot[bot]).
jasonleenaylor
left a comment
There was a problem hiding this comment.
@jasonleenaylor reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @copilot[bot]).
imnasnainaec
left a comment
There was a problem hiding this comment.
@imnasnainaec reviewed all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @copilot[bot]).
When all identical-vernacular duplicates are processed, users need feedback and a choice before the system searches for similar duplicates (which can take minutes on large projects).
With deferred duplicates, with over 1000 frontier entries, still loading:

With deferred duplicates, done loading:

Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.
This change is
Summary by CodeRabbit
New Features
API
Utilities
Tests
✏️ Tip: You can customize this high-level summary in your review settings.