Is it possible to supply a function to ruleorder, like this?
def determine_ruleorder(wildcards, config):
    samples_file = pd.read_table(config['sample_file'], sep='\t')
    sample_layout = samples_file.loc[samples_file['sample'] == wildcards.sra, 'layout'].values[0]
    if sample_layout == "single":
        return "fasterq_dump_se > fasterq_dump_pe"
    elif sample_layout == "paired":
        return "fasterq_dump_pe > fasterq_dump_se"
    else:
        raise ValueError(f"Unknown sample type for SRA '{wildcards.sra}': {sample_layout}")

ruleorder: determine_ruleorder
In this case, I get the error:
UnknownRuleException:
Error in ruleorder definition. There is no rule named determine_ruleorder
This would be the sample file:
sample layout
SRR6308266 paired
SRR1525658 single
SRR6308195 paired
SRR6206507 paired
SRR22957809 paired
SRR22957815 paired
SRR21169022 paired
SRR13066396 paired
SRR7904254 paired
SRR6308265 paired
SRR1525659 single
SRR1525660 single
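It doesn't look like it. ruleorder expects rule names, which is why the function name is reported as an unknown rule. ruleorder also affects the execution of the whole workflow, whereas an input function needs the wildcards of an individual job to evaluate, so this seems unlikely to be implemented.
However, you don't need it in this case. If an input function raises a ValueError, the next applicable rule is tried: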
ruleorder:
    dump_se > dump_pe

def dump_se_input(wildcards):
    samples_file = pd.read_table(config['sample_file'], sep='\t')
    sample_layout = samples_file.loc[samples_file['sample'] == wildcards.sra, 'layout'].values[0]
    if sample_layout == "single":
        return FILES
    # Not single-end: raise so that dump_pe is tried next
    raise ValueError("May be paired")

rule dump_se:
    input: dump_se_input
    ...

def dump_pe_input(wildcards):
    samples_file = pd.read_table(config['sample_file'], sep='\t')
    sample_layout = samples_file.loc[samples_file['sample'] == wildcards.sra, 'layout'].values[0]
    if sample_layout == "paired":
        return FILES
    # Neither rule accepted the sample: report the unexpected layout
    raise ValueError(f"Unknown sample type for SRA '{wildcards.sra}': {sample_layout}")

rule dump_pe:
    input: dump_pe_input
    ...
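So samples are first evaluated against the SE rule. If a sample is paired, the ValueError causes the PE rule to be tried instead. The PE input function also contains a check that raises an (uncaught) error, alerting the user that a sample matches neither layout.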
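The fallback described above can be modelled in plain Python. This is only a toy sketch of the idea, not how Snakemake is implemented internally: the `pick_rule` helper, the `CANDIDATES` list, the inlined two-row sample sheet and the `{sra}.sra` file name are all assumptions made for illustration (the answer leaves the real inputs as `FILES`). It also reads the sample sheet once instead of on every call.

```python
import csv
import io

# Two rows from the question's sample file, inlined (tab-separated)
# so that the sketch runs on its own.
SAMPLE_SHEET = "sample\tlayout\nSRR6308266\tpaired\nSRR1525658\tsingle\n"


def load_layouts(text):
    """Parse the sample sheet once into a {sample: layout} dict."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["sample"]: row["layout"] for row in reader}


LAYOUTS = load_layouts(SAMPLE_SHEET)


def dump_se_input(sra):
    """Accept only single-end samples; otherwise signal 'try the next rule'."""
    if LAYOUTS[sra] == "single":
        return [f"{sra}.sra"]  # hypothetical file name, stands in for FILES
    raise ValueError("May be paired")


def dump_pe_input(sra):
    """Accept only paired-end samples; anything else is reported as an error."""
    layout = LAYOUTS[sra]
    if layout == "paired":
        return [f"{sra}.sra"]  # hypothetical file name, stands in for FILES
    raise ValueError(f"Unknown sample type for SRA '{sra}': {layout}")


# Candidate rules in priority order, mirroring `ruleorder: dump_se > dump_pe`.
CANDIDATES = [("dump_se", dump_se_input), ("dump_pe", dump_pe_input)]


def pick_rule(sra):
    """Return the name of the first candidate whose input function succeeds."""
    last_error = None
    for name, input_func in CANDIDATES:
        try:
            input_func(sra)
            return name
        except ValueError as err:
            last_error = err  # remember the failure and try the next candidate
    raise last_error  # no candidate accepted the sample


print(pick_rule("SRR1525658"))
print(pick_rule("SRR6308266"))
```

Because the SE candidate comes first, a paired sample always costs one failed lookup before the PE candidate is reached; with only two rules that overhead is negligible, and the order guarantees that the "unknown layout" message comes from the last rule tried.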