I have a FASTA file:
$ cat test.fasta
>Chr01
TTGTTGGAGGCGGAAGATTCGTCTTCAACAGTGATATAATTGATATTCGACCTTCTTATGGAAATGCATACTAGTGACAC
TTGCTTGTATCCTAAAATTTCACTTGGTTGCCTTGTTGCAGCATTTGGTGGGAGATTGCCTAACTTTGATGGCTCATTGG
GATTGTATCCAACTTTTAGAAATATCTTGTAATCGTTAGGGTCAAAATCTTCATCTATTGGCTTTGTAGGGAGTGCCCCA
TTCGATAAAGGATTTTGGGCTACAAACCCTATAAATGGCTTAGGGGAAAACTTTACTGCCTCAATTCGTTTTACCTGAAG
>Chr02
AGTTAGATCCTTTAGCGTGTTAGTTTGGAGATTAGAAGATTCGCCCCCGTCTTTCTTCCTTTTAGTGACATATTGGAGCG
GGGAAACTTACTTTCTTTCCATAAGATACAATACCCCCTTTATGAGATTTATTCAGGTTGGTGTGTACCTCCTTAGTAAT
AGGTTTGGCTTCATCAGTAGCCACCTCTGCTCTTTTTGTTATGGGATCTTCATTATTGCTTTTCATGCTATCATCATCTT
TTAGCTCTTTCACAATGCAATTCTTTAAGTATAACTTTGCATCGGCGAAGTGTGACTCAGCCTCGGTGAATGGCTCATCA
TCAGCAAATATCTTCTTCTCGACTTTACCGTCGTAGTATTTAGACATTGATGGTAGGTAGATGGAACTACTTTATTCTCA
TGTATCCAAGGCCTTCCAAGAAAGACGTTGTAAGAAGACTTTGCATCGATCGCATGTAGCCATGCACTTGATTTCATATC
>Scaf1
TTCAATGGTAATTTCCAATCTGATCGCACCTATGGCCCTTTGGCCCCCCTTGGTTGAATCGTTGGATCATCATACGACTT
TCTGAGAGTTCGTTCATGGGAATGCAAACTTTTTTCACAGTTCGAATTGGCAAAATGTTCGCTGAGGATCCTCCATCAAC
CAAAATTTGATTTACCCTTTCATCACGCATATAGCCAACTAGGTAAAATGGGTGGTTACAAAGAGTGTCACCTAGCAGAA
GATCGTCATTTATGAACATGACTTTTTCTTCCCAACTATTAACTTGTTGAGGAGTGGAATCGATGAGCTTTTCTGGAGAT
>Scaf2
AGTTCTAATGGTAGGTCATCACTCTTTTCTTCCCTTTTGTCTTCATGGTAACAAGAGGCATCGATTCCCTCATGGGAAAT
CTTCGTGCGGAACCAAGATGGAAAGAAATGCTCCAAGGTTATTGTATGTCGTGGCTTTTGAGGGTGGTGCATCTCCACTT
TTTCTTTCTTCAAAAGCCTAACAAGATTCTTTTCTGTAGGTCTTTTTACTGTTATTTTCCTTGTTGGTTGTTCTATTGAT
TCTTTTTCTAGGATCCTTTTGTGGTGCCTACGACAAGTCACCAACGTCCAACCTTCATTATCATCCGATTAATCGTCTCC
I want to make the FASTA file above compatible with the PanSN-spec naming scheme:
[sample_name][delim][haplotype_id][delim][contig_or_scaffold_name]
where
sample_name := string
delim := #
haplotype_id := number
contig_or_scaffold_name := string
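To make the naming scheme concrete, here is a minimal sketch that splits a PanSN-style name into its three fields and checks that the haplotype is a number. The function name `parse-pansn` and the returned map keys are illustrative assumptions, not part of the spec.

```clojure
(require '[clojure.string :as str])

(defn parse-pansn
  "Splits s on the # delimiter into sample, haplotype and contig.
   Returns a map, or nil when s does not have the three-field shape.
   (Hypothetical helper, for illustration only.)"
  [s]
  ;; Limit 3 keeps any further # characters inside the contig name.
  (let [[sample hap contig :as parts] (str/split s #"#" 3)]
    (when (and (= 3 (count parts))
               (seq sample)
               (re-matches #"\d+" hap)
               (seq contig))
      {:sample sample
       :haplotype (Long/parseLong hap)
       :contig contig})))
```

For example, `(parse-pansn "test1#1#Chr01")` yields the sample `test1`, haplotype `1` and contig `Chr01`, while a plain name such as `Chr01` yields `nil`.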
The expected PanSN-spec-compatible FASTA output:
>test1#1#Chr01
TTGTTGGAGGCGGAAGATTCGTCTTCAACAGTGATATAATTGATATTCGACCTTCTTATGGAAATGCATACTAGTGACAC
TTGCTTGTATCCTAAAATTTCACTTGGTTGCCTTGTTGCAGCATTTGGTGGGAGATTGCCTAACTTTGATGGCTCATTGG
GATTGTATCCAACTTTTAGAAATATCTTGTAATCGTTAGGGTCAAAATCTTCATCTATTGGCTTTGTAGGGAGTGCCCCA
TTCGATAAAGGATTTTGGGCTACAAACCCTATAAATGGCTTAGGGGAAAACTTTACTGCCTCAATTCGTTTTACCTGAAG
>test1#1#Chr02
AGTTAGATCCTTTAGCGTGTTAGTTTGGAGATTAGAAGATTCGCCCCCGTCTTTCTTCCTTTTAGTGACATATTGGAGCG
GGGAAACTTACTTTCTTTCCATAAGATACAATACCCCCTTTATGAGATTTATTCAGGTTGGTGTGTACCTCCTTAGTAAT
AGGTTTGGCTTCATCAGTAGCCACCTCTGCTCTTTTTGTTATGGGATCTTCATTATTGCTTTTCATGCTATCATCATCTT
TTAGCTCTTTCACAATGCAATTCTTTAAGTATAACTTTGCATCGGCGAAGTGTGACTCAGCCTCGGTGAATGGCTCATCA
TCAGCAAATATCTTCTTCTCGACTTTACCGTCGTAGTATTTAGACATTGATGGTAGGTAGATGGAACTACTTTATTCTCA
TGTATCCAAGGCCTTCCAAGAAAGACGTTGTAAGAAGACTTTGCATCGATCGCATGTAGCCATGCACTTGATTTCATATC
>test1#1#Scaf1
TTCAATGGTAATTTCCAATCTGATCGCACCTATGGCCCTTTGGCCCCCCTTGGTTGAATCGTTGGATCATCATACGACTT
TCTGAGAGTTCGTTCATGGGAATGCAAACTTTTTTCACAGTTCGAATTGGCAAAATGTTCGCTGAGGATCCTCCATCAAC
CAAAATTTGATTTACCCTTTCATCACGCATATAGCCAACTAGGTAAAATGGGTGGTTACAAAGAGTGTCACCTAGCAGAA
GATCGTCATTTATGAACATGACTTTTTCTTCCCAACTATTAACTTGTTGAGGAGTGGAATCGATGAGCTTTTCTGGAGAT
>test1#1#Scaf2
AGTTCTAATGGTAGGTCATCACTCTTTTCTTCCCTTTTGTCTTCATGGTAACAAGAGGCATCGATTCCCTCATGGGAAAT
CTTCGTGCGGAACCAAGATGGAAAGAAATGCTCCAAGGTTATTGTATGTCGTGGCTTTTGAGGGTGGTGCATCTCCACTT
TTTCTTTCTTCAAAAGCCTAACAAGATTCTTTTCTGTAGGTCTTTTTACTGTTATTTTCCTTGTTGGTTGTTCTATTGAT
TCTTTTTCTAGGATCCTTTTGTGGTGCCTACGACAAGTCACCAACGTCCAACCTTCATTATCATCCGATTAATCGTCTCC
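The Clojure script takes four arguments: the input file, the output file, the sample name, and the haplotype ID.
The following script: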
(ns convertFASTA2PanSN
  (:require [clojure.java.io :as io]))

(defn transform-header [header sample-name haplotype-id]
  (str ">" sample-name "#" haplotype-id "#" (subs header 1)))

(defn process-fasta-file [input-file output-file sample-name haplotype-id]
  (with-open [rdr (io/reader input-file)
              wrt (io/writer output-file)]
    (doseq [line (line-seq rdr)]
      (if (.startsWith line ">")
        (let [new-header (transform-header line sample-name haplotype-id)]
          (.write wrt (str new-header "\n")))
        (.write wrt (str line "\n"))))))

(defn -main [& args]
  (let [input-file (args 0)
        output-file (args 1)
        sample-name (args 2)
        haplotype-id (args 3)]
    (process-fasta-file input-file output-file sample-name haplotype-id)))

(set! *main-cli-fn* -main)
results in the error:
Unable to resolve symbol: >Chr01 in this context
How can I fix this?
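You are running your test.fasta file instead of the Clojure code. The runtime tries to read the first FASTA header as Clojure source, which is why it complains about the symbol >Chr01.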
Since you did not mention how you run it (clj, bb, lein, uberjar, ...), it is hard to give specific advice. But your invocation probably looks like this:
$ $CLOJURE test.fasta
whereas it should look like:
$ $CLOJURE script.clj test.fasta
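Once the script itself is what gets loaded, two further problems are likely to show up on JVM Clojure or babashka: `(args 0)` calls a seq as if it were a function (rest arguments are a seq, not a vector), and `*main-cli-fn*` is a ClojureScript var that does not exist there. Below is a minimal sketch of one way around both, assuming the script is run as a plain file with the four arguments passed after its name. The namespace name, the argument-count check and the usage message are my own additions, not something from your setup.

```clojure
(ns convert-fasta-to-pansn
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

(defn transform-header
  "Turns a header such as >Chr01 into >sample#haplotype#Chr01."
  [header sample-name haplotype-id]
  (str ">" sample-name "#" haplotype-id "#" (subs header 1)))

(defn process-fasta-file
  "Copies input-file to output-file, rewriting only the header lines."
  [input-file output-file sample-name haplotype-id]
  (with-open [rdr (io/reader input-file)
              wrt (io/writer output-file)]
    (doseq [line (line-seq rdr)]
      (.write wrt (str (if (str/starts-with? line ">")
                         (transform-header line sample-name haplotype-id)
                         line)
                       "\n")))))

(defn -main [& args]
  ;; args is a seq, so pass it on with apply instead of indexing it.
  (if (= 4 (count args))
    (apply process-fasta-file args)
    (binding [*out* *err*]
      (println "usage: script.clj INPUT OUTPUT SAMPLE HAPLOTYPE"))))

;; Run -main with whatever arguments followed the script name.
(apply -main *command-line-args*)
```

With that in script.clj, the call would look like `$ $CLOJURE script.clj test.fasta out.fasta test1 1`, where out.fasta is a placeholder output name.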