snakemake always reports "MissingOutputException in line 44 ... Missing files after 5 seconds"

Question Votes: 1 Answers: 2

I always get the same error report from snakemake in my RNA-seq pipeline:

MissingOutputException in line 44 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/wt2.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

Here is my Snakefile:

SBT=["wt1","wt2","epcr1","epcr2"]

rule all:
    input:
        expand("02_clean/{nico}_1.paired.fq", nico=SBT),
        expand("02_clean/{nico}_2.paired.fq", nico=SBT),
        expand("03_align/{nico}.bam", nico=SBT)

rule trim:
    input:
        "01_raw/{nico}_1.fastq",
        "01_raw/{nico}_2.fastq"
    output:
        "02_clean/{nico}_1.paired.fq.gz",
        "02_clean/{nico}_1.unpaired.fq.gz",
        "02_clean/{nico}_2.paired.fq.gz",
        "02_clean/{nico}_2.unpaired.fq.gz",
    shell:
        "java -jar /software/Trimmomatic-0.36/trimmomatic-0.36.jar PE -threads 16 {input[0]} {input[1]} {output[0]} {output[1]} {output[2]} {output[3]} ILLUMINACLIP:/software/Trimmomatic-0.36/adapters/TruSeq3-PE-2.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 &"

rule gzip:
    input:
        "02_clean/{nico}_1.paired.fq.gz",
        "02_clean/{nico}_2.paired.fq.gz"
    output:
        "02_clean/{nico}_1.paired.fq",
        "02_clean/{nico}_2.paired.fq"
    run:
        shell("gzip -d {input[0]} > {output[0]}")
        shell("gzip -d {input[1]} > {output[1]}")

rule map:
    input:
        "02_clean/{nico}_1.paired.fq",
        "02_clean/{nico}_2.paired.fq"
    output:
        "03_align/{nico}.sam"
    log:
        "logs/map/{nico}.log"
    threads: 40
    shell:
        "hisat2 -p 20 --dta -x /root/s/r/p/A_th/WT-Al_VS_WT-CK/index/tair10 -1 {input[0]} -2 {input[1]} -S {output} >{log} 2>&1 &"

rule sort2bam:
    input:
        "03_align/{nico}.sam"
    output:
        "03_align/{nico}.bam"
    threads:30
    shell:
        "samtools sort -@ 20 -m 20G -o {output} {input} &"

Everything worked fine until I added the "rule sort2bam" part.

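A dry run works fine. But when I actually execute the workflow, it reports the error shown in the question description. Surprisingly, the task that it reports as failed keeps running in the background. However, it only ever runs one task. Like this: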

rule sort2bam:
    input: 03_align/epcr1.sam
    output: 03_align/epcr1.bam
    jobid: 11
    wildcards: nico=epcr1

Waiting at most 5 seconds for missing files.
MissingOutputException in line 45 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/epcr1.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
[Sat Apr 27 06:10:22 2019]
rule sort2bam:
    input: 03_align/wt1.sam
    output: 03_align/wt1.bam
    jobid: 9
    wildcards: nico=wt1

Waiting at most 5 seconds for missing files.
MissingOutputException in line 45 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/wt1.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

[Sat Apr 27 06:23:13 2019]
rule sort2bam:
    input: 03_align/wt2.sam
    output: 03_align/wt2.bam
    jobid: 6
    wildcards: nico=wt2

Waiting at most 5 seconds for missing files.
MissingOutputException in line 44 of /root/s/r/snakemake/my_rnaseq_data/Snakefile:
Missing files after 5 seconds:
03_align/wt2.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

I don't know what is wrong with my code. Any ideas? Thanks in advance!

pipeline snakemake rna-seq
2 Answers
2
votes

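As you have figured out, the & is the problem. The control operator & makes your command run in the background in a subshell, so the shell returns immediately and snakemake thinks the job is complete when in fact it is not. Snakemake then checks for the output file, which has not been written yet, and raises MissingOutputException. In your case, there seems to be no need to use & at all.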

From man bash on the use of & (stolen from this answer):

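If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0.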
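The behaviour quoted above can be reproduced outside snakemake. The following is a minimal sketch, assuming a POSIX shell is available; `sleep 1; touch ...` is a hypothetical stand-in for a slow tool such as samtools, and `run_and_check` is a helper invented for this illustration, not part of snakemake. It runs the same command with and without a trailing & and checks for the output file right after the shell returns, which is roughly what snakemake does after a job.

```python
import os
import shlex
import subprocess
import tempfile
import time


def run_and_check(command, output_path):
    """Run a shell command, then report its exit status and whether the output exists."""
    result = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode, os.path.exists(output_path)


workdir = tempfile.mkdtemp()
bg_out = os.path.join(workdir, "background.txt")
fg_out = os.path.join(workdir, "foreground.txt")

# Hypothetical slow job: wait one second, then create the output file.
bg_job = f"(sleep 1; touch {shlex.quote(bg_out)}) &"  # trailing & as in the question
fg_job = f"sleep 1; touch {shlex.quote(fg_out)}"      # same job without the &

# With &, the shell returns at once with status 0 and the file is not there yet.
bg_status, bg_exists = run_and_check(bg_job, bg_out)
print("background:", bg_status, bg_exists)

# Without &, the shell returns only after the job has written its output.
fg_status, fg_exists = run_and_check(fg_job, fg_out)
print("foreground:", fg_status, fg_exists)

# The background job does finish eventually, just too late for the check.
time.sleep(2)
bg_exists_later = os.path.exists(bg_out)
print("background file present later:", bg_exists_later)
```

The backgrounded command exits with status 0 before its output exists, which matches the symptom in the question: the job looks finished, the .bam file is missing, and samtools keeps running in the background.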


1
votes

I know how to fix it, but I don't know why it works! Just remove the '&' at the end of this command:

samtools sort -@ 20 -m 20G -o {output} {input} &
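
For reference, a sketch of the sort2bam rule with the trailing & removed. Everything else is copied unchanged from the Snakefile in the question; the same change would presumably be needed in the trim and map rules, whose shell commands also end in &.

```
rule sort2bam:
    input:
        "03_align/{nico}.sam"
    output:
        "03_align/{nico}.bam"
    threads: 30
    shell:
        # No trailing "&": the shell now waits for samtools to finish,
        # so the output exists by the time snakemake checks for it.
        "samtools sort -@ 20 -m 20G -o {output} {input}"
```

Raising --latency-wait instead would likely only mask the issue, since the file is missing because the job is still running, not because of filesystem latency.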