我将同一蛋白质家族的多个蛋白质结构与 pymol 中的参考结构对齐,现在我想计算参考结构和每个目标结构之间每个“对齐”残基对的 Phi psi 值的差异,我知道如何提取 phi使用 phi_psi 命令获取 psi 值,并且我有序列比对,但我不知道如何提取参考和目标之间每个比对残基的 Delta Phi Psi 值。
这里是使用 PyMOL 获取并生成多重比对的代码Super:
super 对齐两个选择。它执行与序列无关(与对齐不同)的基于结构的动态编程对齐,然后进行一系列细化循环,旨在通过消除具有高相对变异性的配对(就像对齐一样)来提高拟合度。对于序列相似性较低的蛋白质,super 比align 更稳健。
import pymol
from pymol import cmd
import time
list_pdbs = ['1xp8' , '3hr8', '1xmv']
pymol.finish_launching()
time.sleep(5)
for _i in list_pdbs :
try :
cmd.fetch(_i)
except :
print('cant load ', _i)
cmd.remove('hetatm')
sel = 'name CA'
ref = list_pdbs[1]
method = 'super'
cmd.do(f"extra_fit {sel} , {ref} , {method}, object = aln_super_all")
cmd.do('set seq_view , 1')
cmd.select('aln_super_all')
print(cmd.get_type('aln_super_all'))
好吧,我试着回答你一半的问题。我做了一个测试答案,展示了如何获得与 PyMOL 对齐的两个结构之间的 delta-phi 、 delta-psi ,这里是我的代码,不确定这是最快/最干净的方法,但我只能找出一种方法。 如果您将更多结构与参考对齐,您最终会得到一个 raw_alignment,如下所示:
[('3hr8', 1988), ('1xmv', 1764), ('1xp8', 1776)]
[('3hr8', 1997), ('1xmv', 1773), ('1xp8', 1785)]
[('3hr8', 2001), ('1xmv', 1777), ('1xp8', 1789)]
[('3hr8', 2017), ('1xp8', 1808)]
[('3hr8', 2028), ('1xmv', 1804), ('1xp8', 1817)]
[('3hr8', 2049), ('1xp8', 1831)]
[('3hr8', 2058), ('1xmv', 1829), ('1xp8', 1839)]
[('3hr8', 2093), ('1xmv', 1860)]
[('3hr8', 2134), ('1xmv', 1906)]
[('3hr8', 2141), ('1xmv', 1914)]
[('3hr8', 2205), ('1xmv', 1972)]
[('3hr8', 2217), ('1xmv', 1978)]
[('3hr8', 2229), ('1xmv', 1990)]
[('3hr8', 2295), ('1xmv', 2021)]
[('3hr8', 2345), ('1xp8', 2081)]
[('3hr8', 2352), ('1xp8', 2089)]
[('3hr8', 2361), ('1xmv', 2085), ('1xp8', 2094)]
[('3hr8', 2372), ('1xmv', 2099), ('1xp8', 2106)]
[('3hr8', 2380), ('1xp8', 2114)]
[('3hr8', 2389), ('1xmv', 2116), ('1xp8', 2119)]
[('3hr8', 2397), ('1xmv', 2124)]
所以你必须使用
cmd.iterate
来选择所有可能的参考目标结构对并获取它们的CA原子耦合索引提取相应的resi(残基编号/索引)并使用phi_psi
命令来获取两个感兴趣的二面体,如下所示:
import pymol
from pymol import cmd
from pymol import stored
import time
pymol.finish_launching()
# get two structures
# fetch from remote
cmd.fetch('3hr8 ')
cmd.fetch('3hr8 1xp8')
# load from locale
# cmd.load('3hr8.cif' , '3hr8')
# cmd.load('1xp8.cif', '1xp8')
time.sleep(3)
cmd.remove('hetatm')
### Super and get raw alignment ###
cmd.do('super 1xp8, 3hr8, object = aln')
print('super ended !!!!')
### get raw alignment ###
raw_aln = cmd.get_raw_alignment('aln')
cmd.do('set seq_view , 1')
# print residue pairs (atom index)
for idx1, idx2 in raw_aln:
print('%s`%d -> %s`%d' % tuple(idx1 + idx2))
## To print residue numbers instead of atom indices:
def myfunc(resi,model,index,name):
if name == 'CA':
idx2resi[model, index] = resi
else :
idx2resi[model, index] = None
idx2resi = {}
cmd.iterate('aln', 'myfunc(resi,model,index,name)', space={'idx2resi': idx2resi , 'myfunc': myfunc})
# print(len(idx2resi), len(raw_aln))
errors = []
# print residue pairs (residue number)
for idx1, idx2 in raw_aln:
# print(idx1 , idx2)
if idx2resi[idx1] and idx2resi[idx2] :
a = idx1[0]+'...'+idx2resi[idx1]
b = idx2[0]+'...'+idx2resi[idx2]
a_die = cmd.phi_psi(f'{idx1[0]} and resi {idx2resi[idx1]}' )
if a_die :
# print(a_die , ' OK')
pass
else :
a_die[idx1] = (0,000000 , 0,00000000)
# print('\n\n error set to 0 !!!!!!!!' , a_die)
errors.append((idx1 , idx2resi[idx1]))
b_die = cmd.phi_psi(f'{idx2[0]} and resi {idx2resi[idx2]}' )
if b_die :
# print(b_die , ' OK')
pass
else :
b_die[idx2] = (0,000000 , 0,00000000)
# print('\n\n error set to 0 !!!!!!!!' , b_die)
errors.append((idx2, idx2resi[idx2]))
# print(a, b)
# print(a_die, b_die)
stored.a_die_resi = str()
stored.b_die_resi = str()
cmd.iterate(f"{idx1[0]} and resi {idx2resi[idx1]}" , "stored.a_die_resi = resn" )
cmd.iterate(f"{idx2[0]} and resi {idx2resi[idx2]}" , "stored.b_die_resi = resn" )
# print('%s -> %s' % (a , b ), ' phi_psi :' , a_die)
s = (f"{a} {stored.a_die_resi} ---> phi_psi: {a_die[idx1]}"
f" {b} {stored.b_die_resi} ---> phi_psi: {b_die[idx2]}"
f" DELTA : {abs(a_die[idx1][0] -b_die[idx2][0])} , {abs(a_die[idx1][1] -b_die[idx2][1])}")
print(s)
print('\n\n Errors : \n ', errors)
如果出于同样的原因(结构中的间隙),其中一个残基不返回 phi_psi,其值被分配为 0 并附加到错误列表中,可能最好将其从结果中删除,但已经花费了很多时间回答这个...所以...)。
最后一位输出:
3hr8...313 VAL ---> phi_psi: (-64.3411865234375, -44.5438232421875) 1xp8...319 ILE ---> phi_psi: (-64.2035903930664, -27.23161506652832) DELTA : 0.13759613037109375 , 17.31220817565918
3hr8...314 GLN ---> phi_psi: (-61.05774688720703, -39.42118453979492) 1xp8...320 ALA ---> phi_psi: (-75.51972198486328, -28.401968002319336) DELTA : 14.46197509765625 , 11.019216537475586
3hr8...315 PHE ---> phi_psi: (-56.17056655883789, -43.013572692871094) 1xp8...321 TYR ---> phi_psi: (-71.33622741699219, -41.687713623046875) DELTA : 15.165660858154297 , 1.3258590698242188
3hr8...316 LEU ---> phi_psi: (-62.212162017822266, -40.03855895996094) 1xp8...322 ILE ---> phi_psi: (-69.78271484375, -35.62084197998047) DELTA : 7.570552825927734 , 4.417716979980469
3hr8...317 LYS ---> phi_psi: (-62.3416633605957, -42.122344970703125) 1xp8...323 ALA ---> phi_psi: (-59.84389114379883, -19.88322639465332) DELTA : 2.497772216796875 , 22.239118576049805
3hr8...318 ASP ---> phi_psi: (-77.59044647216797, -13.573332786560059) 1xp8...324 GLU ---> phi_psi: (-95.1947250366211, -7.630105972290039) DELTA : 17.604278564453125 , 5.9432268142700195
Errors :
[(('1xp8', 1386), '223')]
这里有一张 PyMOL 的漂亮图片,显示了对齐/叠加原子之间的对应关系:
重要
此问题交叉发布到生物信息学 StackExchange 中的如何计算 Pymol 中参考和目标蛋白质结构之间对齐的残基对之间的 Delta Phi Psi