MissingOutputException error in Snakemake


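I'm using Snakemake version 7.30.1.

I'm trying to run my Snakemake workflow with snakemake --cores 4. Snakemake seems to find the input files and appears to start working through the first rule of the workflow, but then for some reason it exits with a MissingOutputException saying that the output files for the second of the two samples in my sample list cannot be found. This does not seem to be a problem with the files themselves, because when I swap the order of the files, the new first sample runs and the new second sample does not. I have also tried changing the latency wait, but that did not help.

In my first rule I'm trying to run fastp on two samples with two reads each. It should produce the files M31A_150k_1_final.fq, M28B_150k_1_final.fq, M31A_150k_2_final.fq and M28B_150k_2_final.fq: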

base_path = "/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/"

# Define list of sample names
samples = ["M31A_150k", "M28B_150k"]

rule all:
    input:
        expand(base_path + "bai/{sample}_all.bam.bai", sample=samples),
        expand(base_path + "bai/{sample}_forward.bam.bai", sample=samples),
        expand(base_path + "bai/{sample}_reverse.bam.bai", sample=samples),
        expand(base_path + "bigwig/{sample}.bw", sample=samples),
        expand(base_path + "bigwig/{sample}_forward.bw", sample=samples),
        expand(base_path + "bigwig/{sample}_reverse.bw", sample=samples)

rule fastp_adaptors:
    input:
        R1 = expand(base_path + "testfiles/{sample}_1.fq", sample=samples),
        R2 = expand(base_path + "testfiles/{sample}_2.fq", sample=samples)

    output:
        R1_final = expand(base_path + "trimmed/{sample}_1_final.fq", sample=samples),
        R2_final = expand(base_path + "trimmed/{sample}_2_final.fq", sample=samples)

    shell:
        """
        fastp -w 8 --dont_eval_duplication -i {input.R1} -I {input.R2} -t 10 -F 10 -o {output.R1_final} -O {output.R2_final} --detect_adapter_for_pe
        """

This is the error log I get:

valeriaaizen@Valerias-MacBook-Pro ~/D/c/n/snakemake-attempt (main)> snakemake --cores 4 (myenv_x86)
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 4
Rules claiming more threads will be scaled down.
Job stats:
job                        count    min threads    max threads
-----------------------  -------  -------------  -------------
all                            1              1              1
bowtie2                        1              1              1
deeptools_bigwigall            1              1              1
deeptools_bigwigforward        1              1              1
deeptools_bigwigreverse        1              1              1
fastp_adaptors                 1              1              1
merge_83163                    1              1              1
merge_99147                    1              1              1
reverse                        1              1              1
samtools_indexall              1              1              1
samtools_indexforward          1              1              1
samtools_sort                  1              4              4
samtools_sort147               1              1              1
samtools_sort163               1              1              1
samtools_sort83                1              1              1
samtools_sort99                1              1              1
total                         16              1              4

Select jobs to execute...

[Thu Sep 7 14:39:53 2023]
rule fastp_adaptors:
input: /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_1.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_1.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_2.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_2.fq
output: /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq
jobid: 4
reason: Missing output files: /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq
resources: tmpdir=/var/folders/4c/h8ky28xj143dkssjycttn5lr0000gn/T

Detecting adapter sequence for read1...

    Illumina TruSeq Adapter Read 1
    AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

Detecting adapter sequence for read2...
No adapter detected for read2

Read1 before filtering:
total reads: 150000
total bases: 22500000
Q20 bases: 21987079(97.7204%)
Q30 bases: 21372363(94.9883%)

Read2 before filtering:
total reads: 150000
total bases: 22500000
Q20 bases: 21768444(96.7486%)
Q30 bases: 21103172(93.7919%)

Read1 after filtering:
total reads: 136856
total bases: 18856683
Q20 bases: 18594358(98.6088%)
Q30 bases: 18347138(97.2978%)

Read2 after filtering:
total reads: 136856
total bases: 17587532
Q20 bases: 17259790(98.1365%)
Q30 bases: 16852551(95.821%)

Filtering result:
reads passed filter: 273712
reads failed due to low quality: 2162
reads failed due to too many N: 18
reads failed due to too short: 24108
reads with adapter trimmed: 35295
bases trimmed due to adapters: 2204956

Insert size peak (evaluated by paired-end reads): 150

JSON report: fastp.json
HTML report: fastp.html

fastp -w 8 --dont_eval_duplication -i /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_1.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_1.fq -I /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M31A_150k_2.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/testfiles/M28B_150k_2.fq -t 10 -F 10 -o /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq -O /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq --detect_adapter_for_pe
fastp v0.22.0, time used: 8 seconds
Waiting at most 5 seconds for missing files.
MissingOutputException in rule fastp_adaptors in file /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/Snakefile, line 35:
Job 4 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_1_final.fq
/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M28B_150k_2_final.fq
Removing output files of failed job fastp_adaptors since they might be corrupted:
/Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_1_final.fq, /Users/valeriaaizen/Documents/code/notebooks/snakemake-attempt/trimmed/M31A_150k_2_final.fq
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-09-07T143950.741220.snakemake.log
wildcard missing-data snakemake sequencing
1 Answer
rule fastp_adaptors:
    input:
        R1 = expand(base_path + "testfiles/{sample}_1.fq", sample=samples),
        R2 = expand(base_path + "testfiles/{sample}_2.fq", sample=samples)

    output:
        R1_final = expand(base_path + "trimmed/{sample}_1_final.fq", sample=samples),
        R2_final = expand(base_path + "trimmed/{sample}_2_final.fq", sample=samples)

    shell:
        """
        fastp -w 8 --dont_eval_duplication -i {input.R1} -I {input.R2} -t 10 -F 10 -o {output.R1_final} -O {output.R2_final} --detect_adapter_for_pe
        """

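I suspect fastp_adaptors needs to run once per pair of FASTQ files (twice in total in your case). However, because your input and output directives use expand, fastp_adaptors runs only once, with every pair passed to it at the same time, which causes the error: in the logged command, -i, -I, -o and -O each receive two paths, and only the first sample's output files end up being written. So try removing expand from fastp_adaptors and using the {sample} wildcard directly. (If you are new to Snakemake, this is one of the things that tends to confuse beginners.)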
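As a rough illustration of that suggestion, the rule might look something like the sketch below. It reuses base_path and the file name patterns from the question and only drops expand from fastp_adaptors. The downstream rules are not shown in the question, so this assumes that something further along the workflow (ultimately rule all) requests the trimmed files, which is what lets Snakemake fill in {sample} and schedule one fastp job per sample.

```snakemake
# Sketch only: same paths and fastp options as in the question,
# but with a {sample} wildcard instead of expand().
rule fastp_adaptors:
    input:
        R1 = base_path + "testfiles/{sample}_1.fq",
        R2 = base_path + "testfiles/{sample}_2.fq"

    output:
        R1_final = base_path + "trimmed/{sample}_1_final.fq",
        R2_final = base_path + "trimmed/{sample}_2_final.fq"

    shell:
        """
        fastp -w 8 --dont_eval_duplication -i {input.R1} -I {input.R2} -t 10 -F 10 -o {output.R1_final} -O {output.R2_final} --detect_adapter_for_pe
        """
```

With a rule like this, the job stats table should list fastp_adaptors with a count of 2 rather than 1, and each logged fastp command should contain exactly one path after each of -i, -I, -o and -O. The expand calls in rule all stay as they are, since that is where the full list of target files belongs.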
