Not finding all duplicates. #1204

Open
pantropia opened this issue Feb 27, 2024 · 9 comments
Labels
bug Bug reports.

Comments

@pantropia

Describe the bug
Some exact duplicates not being found in Standard mode

To Reproduce

  1. Use BeyondCompare to compare two folders and ensure that there are no files in folder A which do not exist in folder B and no content mismatches
  2. Open DupeGuru and set folder A as Normal, folder B as Reference
  3. Scan, remove all duplicates found
  4. Clear cache, scan again - no duplicates found
  5. Refresh BeyondCompare - find that many duplicates do still exist
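
For an independent check of step 5 without BeyondCompare, a content-hash comparison can list the files in folder A whose content still exists in folder B. This is a minimal sketch, not dupeGuru's own algorithm: the function names are hypothetical and SHA-256 is an assumed choice of hash.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_by_digest(root):
    """Map content digest -> list of file paths found under root."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[file_digest(path)].append(path)
    return index


def remaining_duplicates(folder_a, folder_b):
    """Return files in folder_a whose content also exists in folder_b."""
    reference = index_by_digest(folder_b)
    dupes = []
    for digest, paths in index_by_digest(folder_a).items():
        if digest in reference:
            dupes.extend(paths)
    return sorted(dupes)
```

Calling `remaining_duplicates(folder_a, folder_b)` after step 4 should return an empty list if every exact duplicate was removed. The sketch does not handle files that cannot be opened, which may include very long paths on Windows.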

Expected behavior
All exact duplicates should be found.
Thousands of others were, so what's up with these ones? They definitely are exact duplicates because I only just made them.

Screenshots
Screenshot 2024-02-27 082632

Here's a screenshot of the duplicates as shown in BeyondCompare after the scan.

Desktop (please complete the following information):

  • Windows 11 23H2
  • Ryzen 7, 64GB RAM.
  • V4.3.1

Additional context
In this case I'm scanning .eml files.
I noticed an old copy of DupeGuru lurking in a backup of an old desktop and, to my surprise, it ran when I tried it (v3.something). Same result, though I note that it picked up my settings, so I don't know whether it might have been giving me the v3 interface but using the backend from 4.3.1.
I thought duplicates within the Reference folder might be causing an issue, so I cleared the cache, set that folder as Normal, deduped it, then cleared the cache and rescanned to make sure no duplicates were showing. I then cleared the cache again, set the folder back to Reference and ran the deduplication against the other folder again - nothing found.
I looked at BeyondCompare again - over 600 exact duplicates still.
I cleared the cache again, ran it once more with both folders set to Normal, and it found several duplicates - fewer than a screenful - all in the folder which was originally the reference. Only one had a check-box against it.
So I closed the app, reopened it, ran the dedupe again on those two folders... and it found over 15000 duplicates, all in the folder which was originally the reference. I had not marked anything to be removed from results previously.
Now it's finding no duplicates whatsoever, even though there are 613 exact duplicates according to BeyondCompare, which made the original copy.
image

@pantropia pantropia added the bug Bug reports. label Feb 27, 2024
@pantropia
Author

OK - Path/filename length may be relevant: I moved folders to the root of the drives, renamed folders to be shorter etc. It looks like the max path length is 260 characters.
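
To see which files are affected before renaming anything, the over-length paths can be listed. A minimal sketch, assuming the limit of 259 usable characters mentioned in this thread; `overlong_paths` is a hypothetical helper, not part of dupeGuru.

```python
import os

MAX_PATH = 260  # classic Windows limit, counting the terminating NUL


def overlong_paths(root, limit=MAX_PATH - 1):
    """Yield (length, path) for files whose absolute path exceeds limit characters."""
    root = os.path.abspath(root)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if len(path) > limit:
                yield len(path), path
```

For example, `sorted(overlong_paths(folder), reverse=True)` puts the longest paths first. On a system where long paths are not enabled, the walk itself may not be able to descend into some of the deepest directories, so the list can be incomplete.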

@arsenetar
Owner

Based on the information you have provided, the instructions towards the end of this Microsoft doc should resolve this issue for you: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry
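
The registry setting that doc describes is the `LongPathsEnabled` value under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem`. A minimal sketch for checking whether it is currently set, using Python's standard `winreg` module; `long_paths_enabled` is a hypothetical helper name, and the check says nothing about whether a given application has also opted in through its manifest.

```python
import sys


def long_paths_enabled():
    """Return True/False for the LongPathsEnabled value, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return None  # key or value missing, or access denied
    return value == 1
```

This only reads the value; changing it requires administrator rights and is covered in the linked doc.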

@pantropia
Author

Aaaah of course. Windows 11 both supporting and not supporting something at the same time. Makes perfect sense. I was very not looking forward to trying to reduce all the paths over 259 characters as there are rather a lot of them.
I made the registry change and it found 22 more results, but couldn't send them to the recycle bin. The error came up quite small, something about the data being sent being too small. I thought I'd screenshotted it, but I don't see it on my system.
The link mentions changing the app manifest - is that relevant to dupeguru and if so, where do I find it?

@arsenetar
Owner

That error for the recycle operation seems odd, if you get it again, can you capture it?

@pantropia
Author

If it comes up again, yes. At the moment I'm just running some scans on another folder, but I can redo the export to folder A (which I'd deleted as it was finally empty) to do some tests on.

@pantropia
Author

I haven't seen that error again yet but the last scan I ran found a couple of screenfuls of items, which I sent to the recycle bin. When I opened the bin to clear it out, none of those items were in there, just a couple of things I'd deleted manually yesterday.
It does seem to be the case that after running a few scans in a row without closing and reopening the app (even if I clear the cache) it will do Something Weird - but it's been something different each time.

@pantropia
Author

Screenshot 2024-03-03 071939
I couldn't see anything obvious in the debug log.
This was after I'd given up on deleting all of the results at once, saved the result set, restarted the app, reloaded, and tried deleting what I think was about 1/10th of the list (it's deduping two exports of all my mail from two different clients, which use different folder structures).

@pantropia
Author

image
These are all the errors I ended up with after deleting in batches.

@pantropia
Author

I've had several out-of-memory crashes as well, mostly after several hours at the 'fiddling with results' stage, and when closed after such a crash, the app is still showing in task manager until I end it. It looks like what it's having a problem with is a folder which has a lot of zero-size files in it. I've been able to do the subfolders and each one has had more than 15000 of those zero-size files, so I can see why doing Many such folders would be an issue.
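
To gauge how many zero-size files each folder holds before scanning it, they can be counted per directory. A minimal sketch; `zero_size_counts` is a hypothetical helper and the folder passed to it is whatever export folder is being scanned.

```python
import os
from collections import Counter


def zero_size_counts(root):
    """Count zero-byte files per directory under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) == 0:
                    counts[dirpath] += 1
            except OSError:
                continue  # unreadable or vanished file; skip it
    return counts
```

`zero_size_counts(folder).most_common(10)` then shows the ten directories with the most empty files, which may help decide which subfolders to scan separately.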
