How do I generate a random sequence from a given probability matrix?

Question  Votes: 2  Answers: 2

The script below builds a transition probability matrix from a given list:

transitions = ['A', 'B', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'B', 'A', 'D']

def rank(c):
   return ord(c) - ord('A')

T = [rank(c) for c in transitions]

#create matrix of zeros

M = [[0]*4 for _ in range(4)]

for (i,j) in zip(T,T[1:]):
   M[i][j] += 1

#now convert counts to probabilities:
for row in M:
   n = sum(row)
   if n > 0:
       row[:] = [f/n for f in row]

#print M:
for row in M:
   print(row)

Output

[0.0, 0.5, 0.0, 0.5]
[0.5, 0.25, 0.25, 0.0]
[0.0, 1.0, 0.0, 0.0]
[0.5, 0.0, 0.0, 0.5]

I now want to do the reverse: generate a new list of A B C D transitions that follows the probability matrix. How can I do this?

python matrix probability
2 Answers
0
votes

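It looks to me like you are trying to build a Markov model. As a bioinformatics student I happen to have experience with (hidden) Markov models, so I will use nested dictionaries to make the matrix easier to work with. Note that I have imported the numpy.random module.

Hope this helps!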

import numpy.random as rnd

alphabet = ['A', 'B', 'C', 'D']
transitions = ['A', 'B', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'B', 'A', 'D']

# Create probability matrix filled with zeroes
# Matrix consists of nested dictionaries
prob_matrix = {}
for i in alphabet:
    prob_matrix[i] = {}
    for j in alphabet:
        prob_matrix[i][j] = 0.0

def rank(c):
   return ord(c) - ord('A')

# fill matrix with numbers based on transitions list
T = [rank(c) for c in transitions]
for (i,j) in zip(T,T[1:]):
    prob_matrix[alphabet[i]][alphabet[j]] += 1

# convert to probabilities
for row in prob_matrix:
   total = sum([prob_matrix[row][column] for column in prob_matrix[row]])
   if total > 0:
       for column in prob_matrix[row]:
           prob_matrix[row][column] /= total

# generate first random sequence letter (uniformly from the alphabet)
outputseq = str(rnd.choice(alphabet))

# generate rest of string based on probability matrix
for i in range(11):
    # row of the matrix for the most recently generated letter
    probabilities = [prob_matrix[outputseq[-1]][j] for j in alphabet]
    outputseq += str(rnd.choice(alphabet, p=probabilities))

# output generated sequence
print(outputseq)
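
What the weighted draw in the loop above amounts to can be sketched by hand with the standard library: build the cumulative sums of a row, draw a uniform point, and find the bucket it falls into. This is only an illustration of the idea, not NumPy's actual implementation; the names `ALPHABET`, `sample_next` and `generate` are made up for this sketch, the matrix is copied from the question's output, and the uniform choice of the first letter is an assumption.

```python
import bisect
import itertools
import random

ALPHABET = ['A', 'B', 'C', 'D']
# Transition matrix copied from the question's output (row = current letter).
M = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.25, 0.25, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.0, 0.5]]

def sample_next(row, rng):
    """Pick a column index with probability proportional to row[index]."""
    cumulative = list(itertools.accumulate(row))
    # Draw a point in [0, total) and locate the first bucket whose upper edge exceeds it.
    point = rng.random() * cumulative[-1]
    # hi = len(row) - 1 keeps the result a valid index even if rounding pushes point to the total.
    return bisect.bisect_right(cumulative, point, 0, len(row) - 1)

def generate(length, rng):
    """Generate `length` letters by walking the matrix one transition at a time."""
    state = rng.randrange(len(ALPHABET))  # assumption: the first letter is chosen uniformly
    out = [ALPHABET[state]]
    for _ in range(length - 1):
        state = sample_next(M[state], rng)
        out.append(ALPHABET[state])
    return out

rng = random.Random(0)  # seeded only so that repeated runs give the same sequence
sequence = generate(12, rng)
print(''.join(sequence))
```

Because a zero entry in a row contributes an empty bucket, a transition with probability 0 (for example A to C) can never be drawn.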

0
votes

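The choices function of the random library may help here. Since the question does not say how the first letter should be chosen, it is chosen here with the same probabilities as the contents of the original list.

Since Python 3.6, random.choices accepts a weights parameter. The weights do not strictly need to be normalized.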
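
As a small illustration of the point about normalization, the sketch below draws from the same four letters twice: once with the raw transition counts out of 'B' in the question's list (B to A twice, B to B once, B to C once, B to D never) and once with the normalized row. The variable names are made up for this example, and the seed is there only to make runs repeatable.

```python
import random
from collections import Counter

nodes = ['A', 'B', 'C', 'D']
raw_counts = [2, 1, 1, 0]              # how often B was followed by A, B, C, D
normalized = [0.5, 0.25, 0.25, 0.0]    # the same row divided by its sum

rng = random.Random(42)
from_counts = Counter(rng.choices(nodes, weights=raw_counts, k=10_000))
from_probs = Counter(rng.choices(nodes, weights=normalized, k=10_000))

# Both tallies should show roughly half 'A', a quarter each of 'B' and 'C', and no 'D'.
print(from_counts)
print(from_probs)
```

So the count matrix can be passed to random.choices directly, as long as every row that is actually used has at least one non-zero entry.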

Using transitions, M and rank from the question:

import random

letter = random.choice(transitions)  # take a starting letter with the same weights as the original list
new_list = [letter]
for _ in range(len(transitions) - 1):
    # pick the next letter's index using the current letter's row as weights
    letter = chr(random.choices(range(4), weights=M[rank(letter)])[0] + ord('A'))
    new_list.append(letter)
print(new_list)

The full code can be generalized somewhat to work with nodes of any type, not just consecutive letters:

from collections import defaultdict
import random

transitions = ['A', 'B', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'B', 'A', 'D']
nodes = sorted(set(transitions))  # a list of all letters used

# dictionary counting the occurrences of each transition (i, j)
M = defaultdict(int)
for (i, j) in zip(transitions, transitions[1:]):
    M[(i, j)] += 1

# dictionary holding, for each node, a list of frequencies for the transition to a next node
T = {i: [M[(i, j)] for j in nodes] for i in nodes}

# node = random.choice(transitions)  # choose the first node randomly with the same probability as the original list
node = random.choice(nodes)  # choose the first node randomly, each node with equal probability
new_list = [node]
for _ in range(9):
    node = random.choices(nodes, T[node])[0]
    new_list.append(node)
print(new_list)
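
One way to check that a generator like the generalized code above really follows the matrix is a round trip: generate a long sequence, re-estimate the transition frequencies from it, and compare them with the original ones. The sketch below does this under a few assumptions of its own: the helper names `count_transitions` and `to_probabilities` are hypothetical, the sequence length of 50,000 is arbitrary, and the seed is fixed only for repeatability.

```python
import random
from collections import defaultdict

transitions = ['A', 'B', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'B', 'A', 'D']
nodes = sorted(set(transitions))

def count_transitions(seq):
    """Count how often each pair (current, next) occurs in seq."""
    counts = defaultdict(int)
    for a, b in zip(seq, seq[1:]):
        counts[(a, b)] += 1
    return counts

def to_probabilities(counts):
    """Turn pair counts into one normalized row per node (all zeros if a node has no successors)."""
    table = {}
    for a in nodes:
        total = sum(counts[(a, b)] for b in nodes)
        table[a] = [counts[(a, b)] / total if total else 0.0 for b in nodes]
    return table

original = to_probabilities(count_transitions(transitions))

rng = random.Random(0)
node = rng.choice(nodes)  # assumption: uniform choice of the first node
generated = [node]
for _ in range(50_000):
    node = rng.choices(nodes, weights=original[node])[0]
    generated.append(node)

estimated = to_probabilities(count_transitions(generated))
for a in nodes:
    print(a, [round(p, 2) for p in estimated[a]])
```

The printed rows should be close to the matrix in the question. Note that random.choices raises an error when all weights are zero, so a list whose last element never appears earlier (and therefore has no outgoing transition) would need special handling.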
