Snakemake:如何在所有规则中包含配置参数

问题描述 投票:0回答:3

我在 Python 和使用 Snakemake 方面还很陌生,我正在尝试使用配置文件更轻松地修改我的参数并将它们输出到不同的文件中,但我一直停留在规则的定义中,并出现错误:

'expression cannot contain assignment' or 'Missing files for rule all'

我知道 Snakemake 或 expand 不能很好地处理“参数”。部分,但如果我更改它,例如扩展部分中的“

m_qual=config["mqual"]
{mlength}
”,则该文件被认为丢失/以后无法识别。

你能提出一些解释的替代方案吗:)?和/或一些文档可能比 Snakemake 教程页面上的文档更先进一些?

config.yaml

adapt1: 'AGAT exemple'
mlength: 15
mqual: 45

蛇档案

configfile:"config.yaml" 

print(config)

SAMPLES= ["HgSim"]

rule all:
    input:
        expand("01_fastQC/{samples}_fastqc.log",samples=SAMPLES),
expand("Resume_Stats/AR.mlength{params.m_length}.mqual{params.qual}/{samples}.settings",samples=SAMPLES,params.m_length=config["mlength"],params.m_qual=config["mqual"])

        
rule fastQC:
    input: "reads/{samples}.fq"
    output: "01_fastQC/{samples}_fastqc.log"
    shell: "fastqc {input} > {output}"


rule AdapterRemoval_se:
    input: "reads/{samples}.fq"

    output: 
        Settings="Resume_Stats/AR.mlength{params.m_length}.mqual{params.m_qual}/{samples}.settings",
        Truncated="02_AdapterRemoval_pe/mlength{params.mlength}.mqual{params.mqual}/{samples}.truncated",
        Discarded="02_AdapterRemoval_pe/mlength{params.mlength}.mqual{params.mqual}/{samples}.discarded"
    
    threads: 20   

    params:
        adapter_1=config["adapt1"],
        m_length=config["mlength"],
        m_qual=config["mqual"]
    
    shell: "AdapterRemoval --file1 {input}  --discarded {output.Discarded}  --settings {output.Settings} --output1 {output.Truncated} --adapter1 {params.adapter_1} --threads {threads} --minlength {params.m_length} --minquality {params.m_qual} --trimns --trimqualities "

parameters config wildcard snakemake
3个回答
1
投票

使用

configfile:"config.yaml"
声明配置文件后,您可以像这样访问参数:
mlength = config["mlength"]
。然后将值存储在变量中,您可以以任何 pythonic 方式访问它们,例如使用 f-strings:
f"Resume_Stats/AR.mlength{mlength}..."
。或者你可以简单地连接字符串:
"Resume_Stats/AR.mlength" + str(mlength) + "..."

如果是 f 字符串,您将面临另一个问题,

expand
的通配符和参数需要双花括号。所以你的 Snakefile 的开头可能是这样的:

configfile:"config.yaml" 

SAMPLES= ["HgSim"]

rule all:
    input:
        expand("01_fastQC/{samples}_fastqc.log", samples=SAMPLES),
        expand(f"Resume_Stats/AR.mlength{mlength}.mqual{mqual}/{{samples}}.settings", samples=SAMPLES)

您仍然可以按照您的方法扩展参数,但您需要将

params.m_length
重命名为一个简单的标识符,例如
mlength

configfile:"config.yaml" 

SAMPLES= ["HgSim"]

rule all:
    input:
        expand("01_fastQC/{samples}_fastqc.log", samples=SAMPLES),
        expand("Resume_Stats/AR.mlength{mlength}.mqual{mqual}/{samples}.settings", mlength=config["mlength"], mqual=config["mqual"], samples=SAMPLES)

0
投票

您可以修改脚本以使用参数。

configfile:"config.yaml"

SAMPLES= ["HgSim"]

rule all:
   input:
     expand("01_fastQC/{samples}_fastqc.log",samples=SAMPLES),
     expand("Resume_Stats/AR.mlength{mlength}.mqual{mqual}/{samples}.settings", samples=SAMPLES, mlength=config["mlength"], mqual=config["mqual"])

rule fastQC:
    input:
        "reads/{samples}.fq"
    output:
        touch("01_fastQC/{samples}_fastqc.log")

rule AdapterRemoval_se:
    input:
        "reads/{samples}.fq"
    output:
        Settings="Resume_Stats/AR.mlength{mlength}.mqual{mqual}/{samples}.settings",
        Truncated="02_AdapterRemoval_pe/mlength{mlength}.mqual{mqual}/{samples}.truncated",
        Discarded="02_AdapterRemoval_pe/mlength{mlength}.mqual{mqual}/{samples}.discarded"
    params:
        adapt1=config["adapt1"],
        mlength=config["mlength"],
        mqual=config["mqual"]
    shell:
        """
        echo {input} {params.adapt1} {params.mlength} {params.mqual}| tee {output}
        """

0
投票

谢谢你的回答,但我觉得我的理解还有点不清楚。我修改了脚本如下所示:

configfile:"config.yaml" 
print(config)

SAMPLES= ["HgSim"]

rule all:
    input:
        expand("01_fastQC/{samples}_fastqc.log",samples=SAMPLES),
        expand("Resume_Stats/AR.mlength{mlength}.mqual{mqual}/{samples}.settings", samples=SAMPLES, mlength=config["mlength"], mqual=config["mqual"])

rule fastQC:
    input: "reads/{samples}.fq"
    output: "01_fastQC/{samples}_fastqc.log"
    shell: "fastqc {input} > {output}"

rule AdapterRemoval_se:
    input: "reads/{samples}.fq"
    output:
        Settings="Resume_Stats/AR.mlength{params.mlength}.mqual{params.mqual}/{samples}.settings",
        Truncated="02_AdapterRemoval_pe/mlength{params.mlength}.mqual{params.mqual}/{samples}.truncated",
        Discarded="02_AdapterRemoval_pe/mlength{params.mlength}.mqual{params.mqual}/{samples}.discarded"

    threads: 20   

    params:
        adapt1=config["adapt1"],
        mlength=config["mlength"],
        mqual=config["mqual"]
    
    shell: "AdapterRemoval --file1 {input}  --discarded {output.Discarded}  --settings {output.Settings} --output1 {output.Truncated} --adapter1 {params.adaper_1} --threads {threads} --minlength {params.mlength} --minquality {params.mqual} --trimns --trimqualities "

但我仍然有错误:

MissingInputException in line 11 of /*/Snakefile : # in the rule all line 
Missing input files for rule all:
Resume_Stats/AR.mlength15.mqual35/HgSim.settings"

所以我认为问题是如何在“规则 AdapterRemoval_se”中定义参数

对于规则中的其他修改:

f"Resume_Stats/AR.mlength{mlength}.mqual{mqual}/{{samples}}.settings"

我有错误:

Name error in line 14 of /*/Snakefile : # rule all line 
name length is not defined
© www.soinside.com 2019 - 2024. All rights reserved.