Conditional Probability-Python

Question (votes: 3, answers: 2)

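I'm working on this Python problem:

Given a sequence of DNA bases {A, C, G, T} stored as a string, return a conditional probability table in a data structure such that one base (b1) can be looked up, and then a second base (b2), to get the probability p(b2 | b1) of the second base occurring immediately after the first. (Assume the length of seq is >= 3, and that any b1 and b2 never seen together have probability 0. Ignore the probability that b1 will be followed by the end of the string.)

You may use the collections module, but no other libraries.

But I've hit a roadblock: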

word = 'ATCGATTGAGCTCTAGCG'

def dna_prob2(seq):
    tbl = dict()
    levels = set(word)
    freq = dict.fromkeys(levels, 0)
    for i in seq:
        freq[i] += 1
    for i in levels:
        tbl[i] = {x:0 for x in levels}
    lastlevel = ''
    for i in tbl:
        if lastlevel != '':
             tbl[lastlevel][i] += 1
        lastlevel = i
    for i in tbl:
        print(i,tbl[i][i] / freq[i])
    return tbl

tbl['T']['T'] / freq[i] 

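Basically, the end result should be the last line you see above, the tbl expression. However, when I try to do that in print(i, tbl[i][i] / freq[i]) and run dna_prob2(word), every result I get is 0.0.

I'd love to know if anyone here can help.

Thanks!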

python probability
2 Answers
0
votes
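def makeprobs(word):
    singles = {}   # b1 -> number of times b1 is followed by another base
    probs = {}     # (b1, b2) -> p(b2 | b1)
    thedict = {}   # (b1, b2) -> number of times b2 directly follows b1
    ll = len(word)
    # Stop at ll-1 so the last base is never counted as a first base.
    for i in range(ll-1):
        x1 = word[i]
        x2 = word[i+1]
        singles[x1] = singles.get(x1, 0)+1.0
        thedict[(x1, x2)] = thedict.get((x1, x2), 0)+1.0
    for i in thedict:
        probs[i] = thedict[i]/singles[i[0]]
    return probs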
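The function above returns a flat dict keyed by (b1, b2) tuples, whereas the question asks for a lookup by b1 and then b2, with 0 for pairs that were never seen. Below is a minimal sketch of one way to wrap such a flat table. The names `to_nested` and `bases` are hypothetical, and the sample `flat` dict is hand-written for illustration rather than computed from a real sequence.

```python
def to_nested(flat, bases="ACGT"):
    """Turn {(b1, b2): p} into {b1: {b2: p}}, filling unseen pairs with 0.0."""
    return {b1: {b2: flat.get((b1, b2), 0.0) for b2 in bases} for b1 in bases}

# Hand-written illustrative input (not output from the answer above).
flat = {("A", "T"): 0.5, ("A", "G"): 0.5}
table = to_nested(flat)
print(table["A"]["T"])  # 0.5
print(table["C"]["G"])  # 0.0, because the pair is absent from `flat`
```

Passing the result of makeprobs(word) as `flat` should give the two-step lookup the problem describes.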

0
votes
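word = 'ATCGATTGAGCTCTAGCG'

def dna_prob2(seq):
    tbl = dict()
    levels = set(seq)
    freq = dict.fromkeys(levels, 0)
    # Count each base only where it has a successor, so the last
    # character of seq is skipped (the problem says to ignore the end).
    for i in seq[:-1]:
        freq[i] += 1
    for i in levels:
        tbl[i] = {x:0 for x in levels}
    lastlevel = ''
    # Walk the sequence itself (not the table) to count adjacent pairs.
    for i in seq:
        if lastlevel != '':
            tbl[lastlevel][i] += 1
        lastlevel = i
    return tbl, freq

condfreq, freq = dna_prob2(word)
# condfreq[b1][b2] counts b2 right after b1, so divide by freq[b1].
print(condfreq['T']['T']/freq['T'])
print(condfreq['G']['A']/freq['G'])
print(condfreq['C']['G']/freq['C'])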
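For context, the two changes that matter relative to the question's code are `set(seq)` in place of `set(word)` and looping over `seq` rather than `tbl`. Iterating over a dict yields each key exactly once, so the question's loop only ever sees each distinct base one time and can never record a base following itself. That is why every `tbl[i][i]` stays at 0. A small sketch of the difference follows; `count_pairs` is a hypothetical helper, not part of either post.

```python
def count_pairs(items):
    """Count adjacent (previous, current) pairs in any iterable."""
    counts = {}
    prev = None
    for cur in items:
        if prev is not None:
            counts[(prev, cur)] = counts.get((prev, cur), 0) + 1
        prev = cur
    return counts

seq = 'ATCGATTGAGCTCTAGCG'
tbl = {b: {} for b in set(seq)}

# Looping over the dict visits only its 4 keys, giving just 3 pairs,
# none of which can be a base followed by itself.
over_keys = count_pairs(tbl)
# Looping over the sequence visits all 17 adjacent pairs.
over_seq = count_pairs(seq)

print(sum(over_keys.values()))       # 3
print(over_keys.get(('T', 'T'), 0))  # 0
print(sum(over_seq.values()))        # 17
print(over_seq[('T', 'T')])          # 1
```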

Hope this helps.
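Since the problem allows the collections module, the same idea can also be sketched with `Counter` and `defaultdict`. This is only one possible layout, not the method from either answer. The name `cond_prob_table` is hypothetical, and the sketch assumes the denominator for each b1 counts only the positions where b1 has a successor, following the instruction to ignore the end of the string.

```python
from collections import Counter, defaultdict

def cond_prob_table(seq):
    """Return table[b1][b2] = estimated p(b2 | b1) for adjacent bases in seq."""
    pair_counts = Counter(zip(seq, seq[1:]))  # (b1, b2) -> occurrences
    first_counts = Counter(seq[:-1])          # b1 -> times it has a successor
    # Nested defaultdicts make any unseen pair read back as 0.0.
    table = defaultdict(lambda: defaultdict(float))
    for (b1, b2), n in pair_counts.items():
        table[b1][b2] = n / first_counts[b1]
    return table

table = cond_prob_table('ATCGATTGAGCTCTAGCG')
print(table['T']['T'])  # 0.2
print(table['G']['A'])  # 0.5
print(table['A']['A'])  # 0.0
```

One thing to keep in mind with this layout: reading a missing key from a defaultdict inserts it, so lookups of unseen pairs add 0.0 entries to the table as a side effect.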
