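I have been trying to convert my SLURM script to Snakemake, which I am not very familiar with. I have paired-end reads from 5 pools. Each set of fastqs sits in its own directory, one per pool. This is my current code. I keep getting a missing output files error and can't figure out why.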
RUNPOOLS, POOLS = glob_wildcards('/mnt/lz01/mel/cld1061/AScott_Data/raw_data/{runpool}/{pool}_R1_001.fastq.gz')

rule all:
    input:
        expand('02demulti_output/{runpool}/{runpool}.1.fq.gz', runpool=RUNPOOLS),
        expand('02demulti_output/{runpool}/{runpool}.2.fq.gz', runpool=RUNPOOLS),
        expand('10-logs/process_radtags/{runpool}.log', runpool=RUNPOOLS)

### demultiplexing
rule demultiplex:
    input:
        fastq_r1=expand('/mnt/lz01/mel/cld1061/AScott_Data/raw_data/{runpool}/{pool}_R1_001.fastq.gz', runpool=RUNPOOLS, pool=POOLS),
        fastq_r2=expand('/mnt/lz01/mel/cld1061/AScott_Data/raw_data/{runpool}/{pool}_R2_001.fastq.gz', runpool=RUNPOOLS, pool=POOLS)
    output:
        directory('02demulti_output/{runpool}'),
        '02demulti_output/{runpool}/{runpool}.1.fq.gz',
        '02demulti_output/{runpool}/{runpool}.2.fq.gz',
        '10-logs/process_radtags/{runpool}.log'
    log:
        '10-logs/process_radtags/{runpool}.log'
    shell:
        '''mkdir -p {output[0]};
        module load anaconda/colsa
        conda activate stacks-2.5
        process_radtags -P -i gzfastq -1 {input.fastq_r1} -2 {input.fastq_r2} -o {output[0]} -b ../01-info_files/barcodes.txt -r -D --index_null --disable_rad_check --retain_header --barcode_dist_1 1 2> {log}'''
Any help is greatly appreciated. Thanks!
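A few comments:
You don't need the directory output. Snakemake will create the parent directories of the outputs for you.
expand uses all combinations of the wildcards; here you probably need to pass it zip.
This is a common problem that you can find in the docs. I'll also post the material from a workshop on converting SLURM scripts to Snakemake pipelines, hope it helps! https://github.com/troycomi/snakemake-training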
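To make the expand point concrete, here is a minimal sketch in plain Python that mimics the two behaviours with itertools.product and zip. It does not call Snakemake itself, and the pool names and the shortened raw_data path are made-up placeholders, not the asker's real files.

```python
from itertools import product

# Hypothetical wildcard values, shaped like what glob_wildcards might
# return for two pools with one R1 file each (names are invented).
RUNPOOLS = ["pool1", "pool2"]
POOLS = ["A_S1", "B_S2"]
PATTERN = "raw_data/{runpool}/{pool}_R1_001.fastq.gz"

# Default expand behaviour: every combination of the wildcard values,
# including paths that mix one pool's directory with another pool's file.
all_combinations = [
    PATTERN.format(runpool=r, pool=p) for r, p in product(RUNPOOLS, POOLS)
]

# Behaviour when zip is passed to expand: values are paired by position,
# so each directory is only matched with its own file.
paired = [PATTERN.format(runpool=r, pool=p) for r, p in zip(RUNPOOLS, POOLS)]

print(len(all_combinations), "paths from all combinations")
for path in paired:
    print(path)
```

In a Snakefile the paired version would be written as expand(pattern, zip, runpool=RUNPOOLS, pool=POOLS). With the default, a path such as raw_data/pool1/B_S2_R1_001.fastq.gz is generated even though no such file exists.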