How do I save each ligand from a PDB file separately using Bio.PDB?

Question

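I have a list of PDB files. I want to extract the ligands (i.e. the heteroatoms) of all the files with BioPython's Bio.PDB module and save each ligand to its own PDB file.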

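I tried some existing solutions, such as "Remove heteroatoms from PDB", which I tried to adapt so that it keeps the heteroatoms instead. But all I got was a single output file containing all the ligands of the input file.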

I also tried something like this:

def accept_residue(residue):
    """ Recognition of heteroatoms - Remove water molecules """ 
    res = residue.id[0]
    if res != " ": # Heteroatoms have some flags, that's why we keep only residue with id != " "
        if res != "W": # Don't take in consideration the water molecules
            return True


def extract_ligands(path):
    """ Extraction of the heteroatoms of .pdb files """
    for element in os.listdir(path+'/data/pdb'):
        i=1
        if element.endswith('.pdb'):
            if not element.startswith("lig_"):
                pdb = PDBParser().get_structure(element[:-4], path+'/data/pdb/'+element)
                io = PDBIO()
                io.set_structure(pdb)
                for model in pdb:
                    for chain in model:
                        for residue in chain:
                            if accept_residue(residue):
                                io.save("lig_"+element[:-4]+"_"+str(i)+".pdb", accept_residue(residue))
                                i += 1 # Counter for the result filename



# Main
path = mypath

extract_ligands(path)

Obviously, it raises an error:

AttributeError: 'bool' object has no attribute 'accept_model'

I know this is because of the accept_residue() call in my io.save. But I have not found any logical way to do what I want...

Finally, I tried a solution like this one, using chain.detach_child:

                    ...
                    for chain in model:
                        for residue in chain:
                            res = residue.id[0]
                            if res == " " or res == "W": 
                                chain.detach_child(residue.id)
                        if len(chain) == 0:
                            model.detach_child(chain.id)
                     ...

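In my mind, this would "detach" all the residues that are not heteroatoms (residue.id[0] == " ") and all the waters (residue.id[0] == "W"). But in the end, the residues and waters are all still there, and the result is buggy.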

So, is it possible to do what I need? (Extract all the ligands from all my files and save each of them separately in a PDB file.)

(Sorry for my bad English, and for my poor Python skills :/)

Answer 1

You're close.

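But you have to pass a Select object (an instance of a Select subclass) as the second argument to io.save, not the boolean returned by accept_residue(residue). Have a look at the docstring: it says this argument should provide accept_model, accept_chain, accept_residue and accept_atom.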
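To illustrate just the interface, here is a minimal sketch of a Select subclass that overrides only accept_residue. The toy structure, the pdb_line helper and the name LigandSelect are made up for this demo and are not part of the question. Note that it still writes all ligands into one output, so on its own it does not yet solve the one-file-per-ligand problem.

```python
from io import StringIO

from Bio.PDB import PDBIO, PDBParser, Select


def pdb_line(record, serial, name, resname, chain, resseq, x, y, z, element):
    """Format one fixed-column ATOM/HETATM record (hypothetical helper for this demo)."""
    return (
        f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}{'':10}{element:>2}"
    )


# Toy input (assumption): one protein atom, one ligand atom, one water atom.
TOY_PDB = "\n".join([
    pdb_line("ATOM", 1, "CA", "ALA", "A", 1, 0.0, 0.0, 0.0, "C"),
    pdb_line("HETATM", 2, "O1", "GOL", "A", 101, 5.0, 0.0, 0.0, "O"),
    pdb_line("HETATM", 3, "O", "HOH", "A", 201, 9.0, 0.0, 0.0, "O"),
]) + "\n"


class LigandSelect(Select):
    """Keep hetero residues except water; every other accept_* method
    is inherited from Select and accepts everything."""

    def accept_residue(self, residue):
        hetfield = residue.id[0]  # " " = standard residue, "W" = water
        return hetfield not in (" ", "W")


structure = PDBParser(QUIET=True).get_structure("toy", StringIO(TOY_PDB))
writer = PDBIO()
writer.set_structure(structure)

buffer = StringIO()                  # save() also accepts an open handle
writer.save(buffer, LigandSelect())  # pass an instance, not a bool
output = buffer.getvalue()
print(output)
```

PDBIO calls these accept_* methods itself while writing, which is why passing True or False in their place fails with the AttributeError from the question.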

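I created a class ResidueSelect that inherits from Bio.PDB.PDBIO.Select. That way, I only have to override the methods we need, in our case those for the chain and the residue.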

Since we only want to save the current residue in the current chain, the constructor takes these two as its arguments.

import os

from Bio.PDB import PDBParser, PDBIO, Select


def is_het(residue):
    res = residue.id[0]
    return res != " " and res != "W"


class ResidueSelect(Select):
    def __init__(self, chain, residue):
        self.chain = chain
        self.residue = residue

    def accept_chain(self, chain):
        return chain.id == self.chain.id

    def accept_residue(self, residue):
        """ Recognition of heteroatoms - Remove water molecules """
        return residue == self.residue and is_het(residue)


def extract_ligands(path):
    """ Extraction of the heteroatoms of .pdb files """

    for pdb_file in os.listdir(path + '/data/pdb'):
        i = 1  # Counter for the result filename, reset for each input file
        if pdb_file.endswith('.pdb') and not pdb_file.startswith("lig_"):
            pdb_code = pdb_file[:-4]
            pdb = PDBParser().get_structure(pdb_code, path + '/data/pdb/' + pdb_file)
            io = PDBIO()
            io.set_structure(pdb)
            for model in pdb:
                for chain in model:
                    for residue in chain:
                        if not is_het(residue):
                            continue
                        print(f"saving {chain} {residue}")
                        io.save(f"lig_{pdb_code}_{i}.pdb", ResidueSelect(chain, residue))
                        i += 1


# Main
path = mypath

extract_ligands(path)

Btw: I tried to improve the readability of the code a bit along the way...
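A side note on the detach_child attempt from the question: my guess is that it leaves residues and waters behind because the chain is modified while it is being iterated. As far as I know, a Bio.PDB chain keeps its residues in a plain list, so removing the current element makes the loop skip the next one. The sketch below reproduces that effect with an ordinary Python list of made-up residue ids standing in for the chain; it is an illustration of the suspected cause, not Biopython code.

```python
# Made-up residue ids (hetero flag, residue number) standing in for a chain.
residue_ids = [
    (" ", 1),
    (" ", 2),
    ("H_GOL", 101),
    ("W", 201),
    ("W", 202),
]


def should_detach(residue_id):
    """Same test as in the question: drop standard residues and waters."""
    return residue_id[0] in (" ", "W")


# Buggy variant: remove from the list we are currently iterating over.
buggy = list(residue_ids)
for residue_id in buggy:
    if should_detach(residue_id):
        buggy.remove(residue_id)  # shifts the rest left, the next item is skipped

# Safer variant: iterate over a snapshot, mutate the original.
fixed = list(residue_ids)
for residue_id in list(fixed):
    if should_detach(residue_id):
        fixed.remove(residue_id)

print(buggy)  # [(' ', 2), ('H_GOL', 101), ('W', 202)]
print(fixed)  # [('H_GOL', 101)]
```

In the question's loop, the equivalent change would be to iterate over list(chain) and list(model) before detaching. Even then everything that remains ends up in one file, so the Select approach is still what gives one file per ligand.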
