How to sort numbered files and copy them into incrementing folders


So, I have gene files named 1, 2, ... 19500.fa, and I want to sort them into folders 200, 400, 600 ... 19600 for a downstream pipeline. I know one way to do this, but it is pretty horrible:

for file in "${files[@]}"; do

    base_name=$(basename "$file")
    gene_number=$(echo "$base_name" | cut -d'_' -f2 | cut -d'.' -f1)
    to_path=...   # path to the folder containing the 200, 400, ... 19600 folders

    # if it's gene_200.fa, gene_400.fa etc., copy into that dir
    if (( gene_number % 200 == 0 )); then
        cp "$file" "$to_path/$gene_number/$base_name"
    elif (( gene_number < 200 )); then
        cp "$file" "$to_path/200/$base_name"
    elif (( gene_number > 19400 )); then
        cp "$file" "$to_path/19600/$base_name"
    # the endless pain of 200-400, 400-600, 600-800 ... 19200-19400
    elif (( gene_number > 200 && gene_number < 400 )); then
        cp "$file" "$to_path/400/$base_name"
    elif ....

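My question is: is there a less tedious way to do this that never copies a file into more than one folder? (For example, if I only tested whether the gene number is less than the folder name, a file named gene_3.fa would be copied into every folder.)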

bash directory-structure
1 Answer

You can do it like this: just change the value of delta to 200 and add cp or mv as needed.

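#!/usr/bin/env bash

delta=5   # bucket width; set to 200 for the real data
for file in gene_{1..20}.fa; do
    # Extract the number after the underscore; skip names that don't match,
    # so a stale gene_number from the previous iteration is never reused
    if [[ "$file" =~ .*_([0-9]+).* ]]; then
        gene_number="${BASH_REMATCH[1]}"
    else
        continue
    fi
    # Integer division truncates, so this yields the smallest multiple
    # of delta that is strictly greater than gene_number
    bucket=$(( gene_number / delta ))
    bucket=$(( bucket * delta + delta ))
    echo "$file -> $bucket"
done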

$ ./tst.sh
gene_1.fa -> 5
gene_2.fa -> 5
gene_3.fa -> 5
gene_4.fa -> 5
gene_5.fa -> 10
gene_6.fa -> 10
gene_7.fa -> 10
gene_8.fa -> 10
gene_9.fa -> 10
gene_10.fa -> 15
gene_11.fa -> 15
gene_12.fa -> 15
gene_13.fa -> 15
gene_14.fa -> 15
gene_15.fa -> 20
gene_16.fa -> 20
gene_17.fa -> 20
gene_18.fa -> 20
gene_19.fa -> 20
gene_20.fa -> 25
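
As the output above shows, this formula sends exact multiples of delta to the next bucket (gene_5.fa goes to 10), whereas the question puts gene_200.fa into folder 200. If that rule is the one you need, rounding up with ceiling division may be closer to it. The following is only a sketch under assumptions: the helper names bucket_for and copy_into_buckets are made up for illustration, the files are assumed to be named gene_<N>.fa, and the demo runs on throwaway files in a temporary directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: round n up to the nearest multiple of delta,
# so with delta=200: 1..200 -> 200, 201..400 -> 400, and so on.
bucket_for() {
    local n=$1 delta=$2
    echo $(( (n + delta - 1) / delta * delta ))
}

# Hypothetical helper: copy every gene_<N>.fa found in src into dest/<bucket>/.
copy_into_buckets() {
    local src=$1 dest=$2 delta=$3 file base n bucket
    for file in "$src"/gene_*.fa; do
        [[ -e $file ]] || continue                      # glob matched nothing
        base=${file##*/}                                # strip the directory part
        [[ $base =~ ^gene_([0-9]+)\.fa$ ]] || continue  # skip unexpected names
        n=$(( 10#${BASH_REMATCH[1]} ))                  # force base 10 (e.g. gene_08.fa)
        bucket=$(bucket_for "$n" "$delta")
        mkdir -p "$dest/$bucket"                        # create the folder if missing
        cp "$file" "$dest/$bucket/$base"
    done
}

# Demo on throwaway data with delta=5; use 200 for the real files.
demo=$(mktemp -d)
mkdir "$demo/in"
touch "$demo"/in/gene_{1..12}.fa
copy_into_buckets "$demo/in" "$demo/out" 5
ls "$demo/out"
```

Each file gets exactly one bucket, so nothing is copied twice. Swapping cp for mv would move the files instead of copying them.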