Unable to resolve symbol: >Chr01 in this context


I have a FASTA file:

> cat test.fasta
>Chr01
TTGTTGGAGGCGGAAGATTCGTCTTCAACAGTGATATAATTGATATTCGACCTTCTTATGGAAATGCATACTAGTGACAC
TTGCTTGTATCCTAAAATTTCACTTGGTTGCCTTGTTGCAGCATTTGGTGGGAGATTGCCTAACTTTGATGGCTCATTGG
GATTGTATCCAACTTTTAGAAATATCTTGTAATCGTTAGGGTCAAAATCTTCATCTATTGGCTTTGTAGGGAGTGCCCCA
TTCGATAAAGGATTTTGGGCTACAAACCCTATAAATGGCTTAGGGGAAAACTTTACTGCCTCAATTCGTTTTACCTGAAG
>Chr02
AGTTAGATCCTTTAGCGTGTTAGTTTGGAGATTAGAAGATTCGCCCCCGTCTTTCTTCCTTTTAGTGACATATTGGAGCG
GGGAAACTTACTTTCTTTCCATAAGATACAATACCCCCTTTATGAGATTTATTCAGGTTGGTGTGTACCTCCTTAGTAAT
AGGTTTGGCTTCATCAGTAGCCACCTCTGCTCTTTTTGTTATGGGATCTTCATTATTGCTTTTCATGCTATCATCATCTT
TTAGCTCTTTCACAATGCAATTCTTTAAGTATAACTTTGCATCGGCGAAGTGTGACTCAGCCTCGGTGAATGGCTCATCA
TCAGCAAATATCTTCTTCTCGACTTTACCGTCGTAGTATTTAGACATTGATGGTAGGTAGATGGAACTACTTTATTCTCA
TGTATCCAAGGCCTTCCAAGAAAGACGTTGTAAGAAGACTTTGCATCGATCGCATGTAGCCATGCACTTGATTTCATATC
>Scaf1
TTCAATGGTAATTTCCAATCTGATCGCACCTATGGCCCTTTGGCCCCCCTTGGTTGAATCGTTGGATCATCATACGACTT
TCTGAGAGTTCGTTCATGGGAATGCAAACTTTTTTCACAGTTCGAATTGGCAAAATGTTCGCTGAGGATCCTCCATCAAC
CAAAATTTGATTTACCCTTTCATCACGCATATAGCCAACTAGGTAAAATGGGTGGTTACAAAGAGTGTCACCTAGCAGAA
GATCGTCATTTATGAACATGACTTTTTCTTCCCAACTATTAACTTGTTGAGGAGTGGAATCGATGAGCTTTTCTGGAGAT
>Scaf2
AGTTCTAATGGTAGGTCATCACTCTTTTCTTCCCTTTTGTCTTCATGGTAACAAGAGGCATCGATTCCCTCATGGGAAAT
CTTCGTGCGGAACCAAGATGGAAAGAAATGCTCCAAGGTTATTGTATGTCGTGGCTTTTGAGGGTGGTGCATCTCCACTT
TTTCTTTCTTCAAAAGCCTAACAAGATTCTTTTCTGTAGGTCTTTTTACTGTTATTTTCCTTGTTGGTTGTTCTATTGAT
TCTTTTTCTAGGATCCTTTTGTGGTGCCTACGACAAGTCACCAACGTCCAACCTTCATTATCATCCGATTAATCGTCTCC

I want to make the above FASTA file compatible with the PanSN-spec naming scheme:

[sample_name][delim][haplotype_id][delim][contig_or_scaffold_name]

sample_name := string
delim := #
haplotype_id := number
contig_or_scaffold_name := string
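
As a small illustration of the naming scheme above, the following sketch builds and splits such names. The function names pansn-name and parse-pansn are made up for this example and are not part of PanSN-spec or of any library.

```clojure
(ns pansn-example
  (:require [clojure.string :as str]))

(defn pansn-name
  "Joins the three parts with # as the delimiter."
  [sample-name haplotype-id contig-name]
  (str/join "#" [sample-name haplotype-id contig-name]))

(defn parse-pansn
  "Splits a name into its three parts. The limit of 3 keeps any
  further # characters inside the contig name."
  [s]
  (let [[sample haplotype contig] (str/split s #"#" 3)]
    {:sample sample :haplotype haplotype :contig contig}))

(pansn-name "test1" 1 "Chr01")  ; => "test1#1#Chr01"
(parse-pansn "test1#1#Chr01")   ; => {:sample "test1", :haplotype "1", :contig "Chr01"}
```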

The expected output is a PanSN-spec-compatible FASTA file:

>test1#1#Chr01
TTGTTGGAGGCGGAAGATTCGTCTTCAACAGTGATATAATTGATATTCGACCTTCTTATGGAAATGCATACTAGTGACAC
TTGCTTGTATCCTAAAATTTCACTTGGTTGCCTTGTTGCAGCATTTGGTGGGAGATTGCCTAACTTTGATGGCTCATTGG
GATTGTATCCAACTTTTAGAAATATCTTGTAATCGTTAGGGTCAAAATCTTCATCTATTGGCTTTGTAGGGAGTGCCCCA
TTCGATAAAGGATTTTGGGCTACAAACCCTATAAATGGCTTAGGGGAAAACTTTACTGCCTCAATTCGTTTTACCTGAAG
>test1#1#Chr02
AGTTAGATCCTTTAGCGTGTTAGTTTGGAGATTAGAAGATTCGCCCCCGTCTTTCTTCCTTTTAGTGACATATTGGAGCG
GGGAAACTTACTTTCTTTCCATAAGATACAATACCCCCTTTATGAGATTTATTCAGGTTGGTGTGTACCTCCTTAGTAAT
AGGTTTGGCTTCATCAGTAGCCACCTCTGCTCTTTTTGTTATGGGATCTTCATTATTGCTTTTCATGCTATCATCATCTT
TTAGCTCTTTCACAATGCAATTCTTTAAGTATAACTTTGCATCGGCGAAGTGTGACTCAGCCTCGGTGAATGGCTCATCA
TCAGCAAATATCTTCTTCTCGACTTTACCGTCGTAGTATTTAGACATTGATGGTAGGTAGATGGAACTACTTTATTCTCA
TGTATCCAAGGCCTTCCAAGAAAGACGTTGTAAGAAGACTTTGCATCGATCGCATGTAGCCATGCACTTGATTTCATATC
>test1#1#Scaf1
TTCAATGGTAATTTCCAATCTGATCGCACCTATGGCCCTTTGGCCCCCCTTGGTTGAATCGTTGGATCATCATACGACTT
TCTGAGAGTTCGTTCATGGGAATGCAAACTTTTTTCACAGTTCGAATTGGCAAAATGTTCGCTGAGGATCCTCCATCAAC
CAAAATTTGATTTACCCTTTCATCACGCATATAGCCAACTAGGTAAAATGGGTGGTTACAAAGAGTGTCACCTAGCAGAA
GATCGTCATTTATGAACATGACTTTTTCTTCCCAACTATTAACTTGTTGAGGAGTGGAATCGATGAGCTTTTCTGGAGAT
>test1#1#Scaf2
AGTTCTAATGGTAGGTCATCACTCTTTTCTTCCCTTTTGTCTTCATGGTAACAAGAGGCATCGATTCCCTCATGGGAAAT
CTTCGTGCGGAACCAAGATGGAAAGAAATGCTCCAAGGTTATTGTATGTCGTGGCTTTTGAGGGTGGTGCATCTCCACTT
TTTCTTTCTTCAAAAGCCTAACAAGATTCTTTTCTGTAGGTCTTTTTACTGTTATTTTCCTTGTTGGTTGTTCTATTGAT
TCTTTTTCTAGGATCCTTTTGTGGTGCCTACGACAAGTCACCAACGTCCAACCTTCATTATCATCCGATTAATCGTCTCC

The Clojure script should take the following arguments:

  • the input FASTA file name,
  • the output FASTA file name,
  • the sample name, e.g. test1, and
  • the haplotype_id, e.g. 1

The following script:

(ns convertFASTA2PanSN
  (:require [clojure.java.io :as io]))

(defn transform-header [header sample-name haplotype-id]
  (str ">" sample-name "#" haplotype-id "#" (subs header 1)))

(defn process-fasta-file [input-file output-file sample-name haplotype-id]
  (with-open [rdr (io/reader input-file)
              wrt (io/writer output-file)]
    (doseq [line (line-seq rdr)]
      (if (.startsWith line ">")
        (let [new-header (transform-header line sample-name haplotype-id)]
          (.write wrt (str new-header "\n")))
        (.write wrt (str line "\n"))))))

(defn -main [& args]
  (let [input-file (args 0)
        output-file (args 1)
        sample-name (args 2)
        haplotype-id (args 3)]
    (process-fasta-file input-file output-file sample-name haplotype-id)))

(set! *main-cli-fn* -main)

results in the error:

Unable to resolve symbol: >Chr01 in this context

How can I fix this?

clojure
1 answer

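You are running your test.fasta file instead of the Clojure code. Clojure reads the first line of that file, >Chr01, as a symbol, tries to evaluate it, and fails because no such symbol is defined.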
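
To see why the message names >Chr01, here is a small sketch that feeds the first line of the FASTA file to the Clojure reader and then evaluates the result. It assumes JVM Clojure; the exact wording of the printed message may differ between Clojure versions and other runtimes such as bb.

```clojure
;; The reader turns the text >Chr01 into a symbol, not a string.
(def first-form (read-string ">Chr01"))

(symbol? first-form) ; => true

;; Evaluating the symbol fails, because nothing named >Chr01 is defined.
(try
  (eval first-form)
  (catch Exception e
    (println "Evaluation failed:" (.getMessage e))))
```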

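Since you did not mention how you run it (clj, bb, lein, uberjar, ...), it is hard to give concrete advice. But it looks like you are doing something like this: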

$ $CLOJURE test.fasta

whereas it should look like:

$ $CLOJURE script.clj test.fasta
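
Once the script itself is being run, two further problems in it are likely to surface. The form (args 0) calls a seq as if it were a function, which throws, and *main-cli-fn* is, as far as I know, a ClojureScript var that does not exist in JVM Clojure. A possible corrected version is sketched below. It assumes the Clojure CLI and a script saved as script.clj; the output file name out.fasta in the comment is made up for the example.

```clojure
(ns convertFASTA2PanSN
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

(defn transform-header
  "Turns a header such as >Chr01 into >sample#haplotype#Chr01."
  [header sample-name haplotype-id]
  (str ">" sample-name "#" haplotype-id "#" (subs header 1)))

(defn process-fasta-file
  "Copies input-file to output-file, rewriting only the header lines."
  [input-file output-file sample-name haplotype-id]
  (with-open [rdr (io/reader input-file)
              wrt (io/writer output-file)]
    (doseq [line (line-seq rdr)]
      (if (str/starts-with? line ">")
        (.write wrt (str (transform-header line sample-name haplotype-id) "\n"))
        (.write wrt (str line "\n"))))))

(defn -main
  ;; Destructure the argument seq instead of calling (args 0).
  [& [input-file output-file sample-name haplotype-id]]
  (process-fasta-file input-file output-file sample-name haplotype-id))

;; When loaded as a script, the arguments after the file name are
;; available in *command-line-args*. Only run with all four present.
;; Example: clojure -M script.clj test.fasta out.fasta test1 1
(when (= 4 (count *command-line-args*))
  (apply -main *command-line-args*))
```

The guard around the final call is a design choice for this sketch: it keeps the file loadable from a REPL without starting a conversion.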