Delete duplicate files across multiple folders [closed]

Problem description (votes: 0, answers: 4)

I have a directory containing many subdirectories, sub-subdirectories, and so on. I want to pick one particular folder, say A, and delete every file in the directories other than A if that file also appears in A. Essentially I want to remove duplicates (with A as the reference) while keeping the files in A intact. Additionally, for all files that do not appear in A, I want to keep only one copy across the remaining directories (perhaps the copy in the lexicographically first directory, or by any other selection criterion).

Please help me write a script for this.

linux shell directory-structure
4 Answers
1 vote

For an example of running this script, see Usage below. Note: this script has the actual deletion of files commented out. To enable actual removal of files, you need to uncomment # rm "$rmfn" in the script.

Note 2: to actually delete files you must also give the script a 3rd argument, -d, to cause the files to be deleted. If you provide only argument 1, dirA (the path whose files are to be kept), and argument 2, pathB (the path, including subdirectories, to search for dupes), the script just prints the duplicates it finds so you can verify everything is as you want before performing the removal.

The Linux application fdupes: I would be remiss if I didn't tell you that there is a common application, fdupes, designed to do exactly what you want (and it is far more flexible and thoroughly tested).
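For reference, a typical fdupes run might look like the sketch below (standard fdupes options; the directory paths are only illustrative, and exactly which copy is preserved can vary between versions):

# list duplicate files across both trees, recursing into subdirectories
fdupes -r ~/A ~/B

# delete without prompting, preserving the first file in each duplicate set
# (listing A first means its copies tend to be the ones preserved)
fdupes -rdN ~/A ~/B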

#!/bin/bash

## check input dirs both exist
[ -e "$1" ] && [ -e "$2" ] || {
    printf "\nError: invalid path. Usage %s dirA pathB\n\n" "${0##*/}"
    exit 1
}

tmp="tmp_$(date +%s).txt"               # unique temp file name
[ -f "$tmp" ] && rm "$tmp"              # test if already exists and del
find "$2" -type f > "$tmp"              # fill tmp file with possible dups

while IFS= read -r i; do                # check each file in A ($1) against tmp
    fn="${i##*/}"                       # remove path from A/filename
    if grep -q "$fn" "$tmp"; then       # test if A/file found in pathB ($2)
                                        # (note: substring match, so 'nocomment' also matches 'nocomment.sh')
        if [ "$3" = -d ]; then          # if 3rd arg is '-d', really delete
            while IFS= read -r rmfn; do             # loop over matching filenames
                printf " deleting: %s\n" "$rmfn" >&2    # print record of file deleted
                # rm "$rmfn"            # the delete command (commented)
            done < <(grep "$fn" "$tmp")
        else                            # if no '-d', just print duplicates found
            printf "\n  Duplicate(s) found for: %s\n\n" "$fn"
            grep "$fn" "$tmp"           # output duplicate files found
        fi
    fi
done < <(find "$1" -type f)             # read loop is safe for names with spaces

rm "$tmp"                               # delete tmp file

exit 0

Usage:

The script takes 2 directories as input to scan for duplicates, and it takes a third argument, '-d', to actually delete the duplicates found. Example:

$ bash fdupes.sh ~/scr/utl ~/scr/rmtmp/

  Duplicate(s) found for: bay.sh

/home/david/scr/rmtmp/bay.sh

  Duplicate(s) found for: rsthemes.sh

/home/david/scr/rmtmp/rsthemes.sh

  Duplicate(s) found for: nocomment

/home/david/scr/rmtmp/nocomment.sh

  Duplicate(s) found for: show-rdtcli.sh

/home/david/scr/rmtmp/show-rdtcli.sh
/home/david/scr/rmtmp/subdir1/show-rdtcli.sh
<snip>

To actually delete the duplicates (after uncommenting the rm):

$ bash fdupes.sh ~/scr/utl ~/scr/rmtmp/ -d

 deleting: /home/david/scr/rmtmp/bay.sh
 deleting: /home/david/scr/rmtmp/rsthemes.sh
 deleting: /home/david/scr/rmtmp/nocomment.sh
 deleting: /home/david/scr/rmtmp/show-rdtcli.sh
 deleting: /home/david/scr/rmtmp/subdir1/show-rdtcli.sh

0 votes

In C#:

using System.IO;

string filePathA =.... //get file path in folder A
string filePathB =.... //get file path in folder B
string fileNameA = filePathA.Replace(Server.MapPath("~/ImagesA/"), "");
string fileNameB = filePathB.Replace(Server.MapPath("~/ImagesB/"), "");

if (fileNameA == fileNameB)
    File.Delete(filePathB);

0 votes

In a shell script, you can do the following:

for f in origin/* ; do rm -f "final/$(basename "$f")" ; done

Simple, and it gets the job done. For the second part, you can do a nested iteration, comparing all the files to make sure they are unique:

for o in origin/* ; do
    for f in final/* ; do
        if [ "$(basename "$f")" = "$(basename "$o")" ] ; then
            rm "$f"
        else # if the file is not in A, but you want to check if it's unique

            # do a new iteration
            for d in final/* ; do

                # if the names are equal but they are not the same file, delete the second
                if [ "$(basename "$f")" = "$(basename "$d")" ] ; then
                    if [ "$f" != "$d" ] ; then
                        rm "$d"
                    fi
                fi
            done
        fi
    done
done
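Note that the loops above compare file names only, so two different files that merely share a name would still be treated as duplicates. A minimal sketch that also verifies the contents with cmp -s before deleting (assuming the same origin/final layout as above):

for o in origin/* ; do
    f="final/$(basename "$o")"
    # delete only when a same-named file exists in final AND its bytes match
    if [ -f "$f" ] && cmp -s "$o" "$f" ; then
        rm "$f"
    fi
done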

0 votes

Try the code below. It creates a file with the cksum of every file, sorts the list by cksum value, then iterates over the file list and deletes the duplicate files.

find . -type f -exec cksum {} \; > cksum.txt
sort -n cksum.txt > filelist

oldSum=""
oldFile=""
while read -r sum size filename        # cksum fields: checksum, byte count, file name
do
      echo "oldSum=$oldSum"
      echo "sum=$sum"
      echo "oldFile=$oldFile"
      echo "filename=$filename"
      if [[ "$sum" != "$oldSum" ]] ; then
        oldSum="$sum"
        oldFile=$filename
        echo 
        continue
      fi
      echo "rm -f $filename"
      rm -f "$filename"
      echo
done < filelist
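For reference, each line cksum prints has three fields: the CRC checksum, the byte count, and the file name, which is exactly what the read above splits into its three variables. Illustrative output (the checksum values are made up):

$ cksum a.txt subdir/a.txt b.txt
1740057581 12 a.txt
1740057581 12 subdir/a.txt
3015617425 7 b.txt

Because filelist is sorted numerically by checksum, identical files end up on adjacent lines, so the loop keeps the first file of each run and removes the rest.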