How can I use Snakemake wrappers on SLURM cluster compute nodes without internet access?


I am trying to use wrappers in a pipeline on a SLURM cluster where the compute nodes have no internet access.

I first ran the pipeline with --conda-create-envs-only and then changed the wrapper: directives to point to a local folder containing the environment.yaml file. The jobs fail without any specific error; a test rule with the same configuration but without a wrapper works. If I switch the wrapper: directives back to the online form, the rules with wrappers run fine on the login node.
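
For completeness, the env-creation step on the login node was, roughly:

# On the login node (internet available): build all conda envs without running jobs
snakemake --profile myprofile --use-conda --conda-create-envs-only --cores 1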

I am running:

snakemake --profile myprofile --cores 40 --use-conda

An example rule:

# Run FastQC on the fastq.gz files
rule fastqc_fastq_gz:
    input:
        input_dir + "{sample}_{read}_001.fastq.gz",
    output:
        html = output_dir + "fastqc/{sample}_{read}_fastqc.html",
        zip = output_dir + "fastqc/{sample}_{read}_fastqc.zip",
    params: 
        extra = "--quiet",
    log:
        output_dir + "logs/fastqc/{sample}_{read}.log",
    threads: 1
    resources:
        mem_mb = 1024,
    wrapper:
        # "file:///path/envs/v1.31.0/bio/fastqc/"   # <- Run with compute nodes
        "v1.31.0/bio/fastqc"                        # <- Run normally to download env

I also tried with more resources; same behavior.

My profile:

cluster:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --error=logs/{rule}/{rule}-{wildcards}-%j.err
    --account=<account>
    --time={resources.time}
    --parsable
default-resources:
  - partition=<partition>
  - qos=sbatch
  - mem_mb="490G"
  - tmpdir="/path/to/temp/"
  - time="0-10:00:00"
max-jobs-per-second: 10
max-status-checks-per-second: 1
latency-wait: 60
jobs: 16
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-conda: True
cluster-status: status-sacct.sh
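
The conda envs live under the default .snakemake/conda in the working directory, which the compute nodes should see over the shared filesystem. A variant I have not tried would be pinning the env location explicitly with the standard conda-prefix option in the same profile (the path below is a placeholder):

conda-prefix: /shared/path/conda-envs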

The submission log reads as follows:

Error in rule fastqc_fastq_gz:
    jobid: 38
    input: /path/raw/S9_S9_R2_001.fastq.gz
    output: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip
    log: /path/pipeline_out/logs/fastqc/S9_S9_R2.log (check log file(s) for error details)
    conda-env: /path/.snakemake/conda/a116f377bbaddedd93b228a3d4f74b1d_
    cluster_jobid: 1491196

Error executing rule fastqc_fastq_gz on cluster (jobid: 38, external: 1491196, jobscript: /path/.snakemake/tmp.l0raomfw/snakejob.fastqc_fastq_gz.38.sh). For error details see the cluster log and the log files of the involved rule(s).

The tmp file does not exist. The job log simply reads:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 80
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=1000, mem_mib=954, disk_mb=7876, disk_mib=7512
Select jobs to execute...

[Date Time]
rule fastqc_fastq_gz:
    input: /path/raw/S9_S9_R2_001.fastq.gz
    output: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip
    log: /path/pipeline_out/logs/fastqc/S9_S9_R2.log
    jobid: 0
    reason: Missing output files: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html
    wildcards: sample=S9_S9, read=R2
    threads: 2
    resources: mem_mb=1000, mem_mib=954, disk_mb=7876, disk_mib=7512, tmpdir=/path/tmp/snakemake, partition=el7taskp, qos=sbatch, time=0-40:00:00

[Date Time]
Error in rule fastqc_fastq_gz:
    jobid: 0
    input: /path/raw/230319/S9_S9_R2_001.fastq.gz
    output: /path/pipeline_out/fastqc/S9_S9_R2_fastqc.html, /path/pipeline_out/fastqc/S9_S9_R2_fastqc.zip
    log: /path/pipeline_out/logs/fastqc/S9_S9_R2.log (check log file(s) for error details)
    conda-env: /path/.snakemake/conda/a116f377bbaddedd93b228a3d4f74b1d_

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

The FastQC log is empty.
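
As a manual sanity check (a sketch; the env path is copied from the log above), the environment can be activated directly to confirm it was materialized:

conda activate /path/.snakemake/conda/a116f377bbaddedd93b228a3d4f74b1d_
fastqc --version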

bioinformatics slurm snakemake hpc