Duplicate Files Search Performance Options

DupScout is optimized for modern multi-core and multi-CPU servers and is capable of scanning directories and searching for duplicate files using a number of parallel processing threads. To enable a multi-threaded duplicate files search, open the duplicate files search options dialog, select the 'Advanced' tab and set the maximum number of parallel directory scanning threads and the number of parallel processing threads to use for the duplicate files search.
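DupScout's internal implementation is not public, but the two-phase design described above, parallel directory scanning followed by parallel file processing, can be sketched in Python. The thread counts, function names, and the use of SHA-256 content hashing below are illustrative assumptions, not DupScout's actual algorithm.

```python
# Hypothetical sketch of a multi-threaded duplicate files search:
# phase 1 scans directories in parallel, phase 2 hashes files in
# parallel and groups files with identical content signatures.
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

SCAN_THREADS = 4       # assumed number of parallel directory scanning threads
PROCESS_THREADS = 8    # assumed number of parallel file processing threads

def scan_directory(root):
    """Collect all file paths under a single root directory."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in filenames)
    return paths

def file_signature(path):
    """Hash file contents; files with equal signatures are duplicates."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            digest.update(chunk)
    return path, digest.hexdigest()

def find_duplicates(roots):
    # Phase 1: scan the specified directories on parallel threads.
    with ThreadPoolExecutor(max_workers=SCAN_THREADS) as pool:
        paths = [p for batch in pool.map(scan_directory, roots) for p in batch]
    # Phase 2: hash files on parallel threads and group equal signatures.
    groups = defaultdict(list)
    with ThreadPoolExecutor(max_workers=PROCESS_THREADS) as pool:
        for path, signature in pool.map(file_signature, paths):
            groups[signature].append(path)
    # Keep only signatures shared by two or more files.
    return {sig: ps for sig, ps in groups.items() if len(ps) > 1}
```

Hashing is CPU- and disk-bound, which is why the processing thread count, rather than the scanning thread count, dominates throughput on fast SSD disks.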

Duplicate Files Search Performance Options

For example, when searching duplicate files located on SSD disks, the performance of a duplicate files search operation reaches up to 3,100 Files/Sec for a single processing thread and scales very well up to 14,000 Files/Sec when the same operation is performed using 8 parallel processing threads.

Duplicate Files Search Performance SSD Disks

To enable categorization and filtering of duplicate files by user name, and to display the amount of duplicate disk space and the number of duplicate files per user, enable the option to process and display duplicate files user names. However, this option significantly impacts the performance of the duplicate files search operation, and to mitigate the performance degradation, the duplicate files search command should be configured to use 4-8 parallel processing threads.
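The per-user statistics require an extra owner lookup for every file, which is the source of the slowdown. The POSIX-only Python sketch below illustrates one way such lookups can be performed on parallel processing threads; the function names and thread count are assumptions for illustration, and Windows systems would use different APIs.

```python
# Hypothetical sketch of per-file owner lookups performed on a thread
# pool, illustrating why extra processing threads offset the per-file
# cost of resolving user names (POSIX-only; Windows uses other APIs).
import os
import pwd
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def file_owner(path):
    """Resolve a file's owning user name from its uid (extra syscalls per file)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # Fall back to the numeric uid when no account name is known.
        return str(uid)

def duplicates_per_user(paths, threads=8):
    """Count duplicate files per owner, resolving names on parallel threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return Counter(pool.map(file_owner, paths))
```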

Process and Show Duplicate Files User Names Option

For example, when searching duplicate files located on SSD disks with the option to process and display user names enabled, the duplicate files search operation reaches up to 2,400 Files/Sec for a single processing thread and scales very well up to 10,800 Files/Sec when the same duplicate files search operation is performed using 8 parallel processing threads.

Duplicate Files Search Performance SSD Disks With User Names

Searching duplicate files over the network is a demanding operation. To be effective, it requires a high-speed, low-latency network, and the duplicate files search command should be configured to use a number of parallel directory scanning and processing threads. For example, when searching duplicate files over the network with a single processing thread, a duplicate files search operation reaches up to 320 Files/Sec and scales up to 1,144 Files/Sec when the same operation is performed using 8 parallel processing threads.

Duplicate Files Search Performance Over Network

Searching duplicate files over the network with the option to process and display user names enabled is a time-consuming operation because retrieving file user names from remote servers takes a significant amount of time. To mitigate the performance degradation, the duplicate files search command should be configured to use at least 4-8 parallel processing threads, which will retrieve user names for multiple files simultaneously.
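The benefit of parallel threads here comes from overlapping remote round-trip latency rather than from extra CPU work. The toy simulation below, using an assumed fixed per-file delay as a stand-in for a remote metadata request, demonstrates the effect; it is not a model of DupScout's actual network code.

```python
# Toy simulation of latency-bound scaling: each "request" sleeps for a
# fixed delay, so parallel threads overlap the waits and finish sooner.
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.01  # hypothetical per-file network round-trip, in seconds

def fetch_metadata(path):
    time.sleep(LATENCY)   # stand-in for a remote user name inquiry
    return path

def timed_scan(paths, threads):
    """Process all paths with the given thread count; return elapsed seconds."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(fetch_metadata, paths))
    return time.monotonic() - start
```

With one thread the total time is roughly the number of files multiplied by the round-trip latency, while 8 threads overlap 8 round trips at once, mirroring the measured scaling from 110 to 592 Files/Sec.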

Duplicate Files Search Performance Over Network With User Names

For example, when searching duplicate files over the network with the option to process and display duplicate files user names enabled, a duplicate files search operation reaches up to 110 Files/Sec for a single processing thread and scales up to 592 Files/Sec when the same operation is performed using 8 parallel processing threads.