Channel: 2BrightSparks

FoC Optimization: Duplicate checking - Don't hash everything

First requested in 2010.

Currently the program hashes all files, because it performs the hash before sorting them by size.
Two files of different sizes cannot be identical, so a file that doesn't have any other file of the same size cannot be a duplicate.

So hashing a huge file when it is the only one of that size is a waste of time.

Please only hash files that share their size with at least one other file.
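
To illustrate the order of operations I mean, here is a rough Python sketch. It is not the program's actual code: the function name `find_duplicates`, the use of SHA-256, and the 1 MiB read size are all my own assumptions. It groups files by size first (a cheap metadata lookup) and only reads and hashes files whose size collides with another file.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(paths, chunk_size=1 << 20):
    """Hypothetical helper: return lists of paths whose contents hash identically.

    Only files that share their size with another file are ever read.
    """
    # Step 1: group by size. This reads file metadata only, not file contents.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size: cannot be a duplicate, so skip the hash

        # Step 2: hash only the files in a size group with two or more members.
        by_hash = defaultdict(list)
        for path in same_size:
            digest = hashlib.sha256()  # assumed algorithm, for illustration only
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    digest.update(chunk)
            by_hash[digest.hexdigest()].append(path)

        duplicates.extend(group for group in by_hash.values() if len(group) > 1)
    return duplicates
```

With this ordering, a folder of large files that all have different sizes would be reported as duplicate-free without any of them being read.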

I just searched for *.iso > 7GB and it took 16 hours to tell me there were no duplicates.
