I have about 6 TB of files spread across a number of networked
computers running Mac OS X 10.6, and I want to do two things with them.
1. Find every duplicate of each file (from my investigations thus far
there are probably one or more duplicates for about 50% of the 10
million or so files) -- the number of duplicates varies from 1 to 50.
2. Specify one of the directories as the Master Directory, and then
copy all non-duplicates to the appropriate place in that Master
Directory -- which folder a file is copied to will depend on:
(a) whether the file name is similar to a file name in a folder in
the Master Directory
(b) the name of the set of folders that the file is nested within --
in the majority of cases, this will provide sufficient information
about where the file should be copied to.
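To make goal 1 concrete, here is a minimal sketch of one common way to find duplicates: group files by size first, then hash only the files that share a size. It is only an illustration under stated assumptions, not a tested solution for 10 million files over a network. The function names (find_duplicates, file_digest) are hypothetical, and it assumes the volumes are mounted locally, that the root folders do not overlap, and that a Python interpreter is available.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Return a list of groups; each group is a sorted list of paths
    whose contents are identical (same size and same SHA-256)."""
    # Pass 1: bucket every regular file by size (cheap, no file reads).
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.islink(path):
                    continue  # skip symlinks to avoid counting a file twice
                try:
                    by_size[os.path.getsize(path)].append(path)
                except OSError:
                    continue  # unreadable file: skip it

    # Pass 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_hash[(size, file_digest(path))].append(path)
            except OSError:
                continue

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling find_duplicates with a list of mounted folders would return one group per set of identical files. Filtering by size first means files with a unique size are never read, which matters when about half of the files have no duplicate.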
In most cases the process will require something like the following:
A. Locate all folders with the same name and nested structure.
B. Present the user with any partial matches of the nested structure
for manual intervention, e.g. A>B>C>D>filenameXX.doc and
A>B>D>filenameXX.doc
C. Test for duplicates, and discard all duplicates of files that
already reside in the relevant folder in the Master Directory.
D. Copy all non-duplicates to the appropriate folder in the Master
Directory.
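Steps A and B could be sketched roughly as follows. This is only an illustration of the idea: the function name classify_folder and the 0.6 threshold are assumptions chosen for the example, and the similarity measure (difflib.SequenceMatcher from the Python standard library, applied to the list of folder names) is just one possible way to detect a partial match such as A>B>D against A>B>C>D.

```python
from difflib import SequenceMatcher


def classify_folder(candidate, master_folders, threshold=0.6):
    """Compare one nested folder path (a sequence of folder names) with the
    folder paths in the Master Directory.

    Returns ("exact", [path]) when the same nesting exists in the master,
    ("partial", [paths...]) with the best matches first when only similar
    nestings exist, and ("none", []) otherwise.
    """
    candidate = tuple(candidate)
    masters = [tuple(m) for m in master_folders]

    # Step A: identical name and nested structure.
    if candidate in masters:
        return "exact", [candidate]

    # Step B: similar nesting; these are the cases to show the user.
    scored = []
    for master in masters:
        ratio = SequenceMatcher(None, candidate, master).ratio()
        if ratio >= threshold:
            scored.append((ratio, master))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    if scored:
        return "partial", [master for _ratio, master in scored]
    return "none", []


# The example from step B: A>B>D compared with a master containing A>B>C>D.
kind, matches = classify_folder(["A", "B", "D"], [["A", "B", "C", "D"], ["X", "Y"]])
print(kind, matches)  # partial [('A', 'B', 'C', 'D')]
```

Anything classified as "partial" would go to the user for manual intervention, while "exact" matches could proceed straight to the duplicate test in step C.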
I look forward to your suggestions of apps that can be used together to achieve this task.
Cheers
Peter