Follow me on CSDN: https://spike.blog.csdn/
Article URL: https://blog.csdn/caroline_wendy/article/details/129881370
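A protein complex, or multi-chain polypeptide complex, consists of two or more associated polypeptide chains. Protein complexes are a form of quaternary structure. The proteins in a complex are held together by non-covalent protein-protein interactions. These complexes are the cornerstone of many biological processes. By bringing its components into close proximity, a complex greatly increases the speed and selectivity of binding between the complex and its substrates, which improves cellular efficiency. Many of the techniques used to enter cells and isolate proteins are inherently disruptive to these large complexes, which complicates the task of determining a complex's components. In stable complexes, the large hydrophobic interfaces between proteins typically bury more than 2500 Å² of surface area.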
[Figure: structure of the 8CZ5 antibody-antigen complex]
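From the complex PDB file, extract the structures of the antibody (Ab-HL), the heavy chain (H), the light chain (L), and the antigen (Ag), and save each one to its own PDB file. In effect, this splits the PDB structure into its parts.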
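The functions are:
get_multiple_chains_pdb_from_complex: extracts multiple chains from a PDB file
get_single_chain_pdb_from_complex: extracts a single chain from a PDB file
split_antibody_antigen_complex: splits antibody-antigen complex PDB files only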
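To make concrete what "extracting chains" means at the file level, here is a minimal, dependency-free sketch. It is not the Bio.PDB approach used in this article: it simply filters ATOM/HETATM records by the chain ID stored in column 22 of each record. The helper names (`make_atom_line`, `list_chain_ids`, `filter_chains`), the toy coordinates, and the chain IDs H, L, and A are all assumptions made for illustration; they are not taken from 8CZ5.

```python
def make_atom_line(serial, chain_id, res_seq):
    """Build a minimal fixed-width ATOM record (glycine CA atom at the origin)."""
    coords = "   0.000" * 3  # x, y, z: three 8-character fields
    return f"ATOM  {serial:>5}  CA  GLY {chain_id}{res_seq:>4}    {coords}"


def list_chain_ids(pdb_text):
    """Return chain IDs in order of first appearance in ATOM/HETATM records."""
    seen = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain_id = line[21]  # column 22 (1-based) holds the chain ID
            if chain_id not in seen:
                seen.append(chain_id)
    return seen


def filter_chains(pdb_text, chain_ids):
    """Keep only the ATOM/HETATM records that belong to the requested chains."""
    wanted = set(chain_ids)
    kept = [
        line
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
        and len(line) > 21
        and line[21] in wanted
    ]
    kept.append("END")
    return "\n".join(kept) + "\n"


# Toy input: two atoms each for hypothetical chains H and L, one atom for chain A.
EXAMPLE_PDB = "\n".join(
    make_atom_line(i + 1, chain_id, 1) for i, chain_id in enumerate("HHLLA")
) + "\n"

print(list_chain_ids(EXAMPLE_PDB))
antibody_only = filter_chains(EXAMPLE_PDB, ["H", "L"])
print(antibody_only, end="")
```

Listing the chain IDs first is a convenient way to decide which IDs to pass as heavy, light, and antigen. A structure-aware parser such as Bio.PDB is still the better choice for real files, since it also handles models, alternate locations, and record bookkeeping that this text filter ignores.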
The source code is below; parts of it come from ChatGPT:
import os
import shutil

from Bio import PDB


def get_multiple_chains_pdb_from_complex(input_pdb, chain_ids, output_pdb):
    """
    Extract multiple chains from an input complex PDB file and
    write them together into one new complex PDB file.
    Code from ChatGPT.
    """
    # Create a PDB parser object
    parser = PDB.PDBParser(QUIET=True)
    # Parse the input PDB file and get the structure object
    structure = parser.get_structure("input", input_pdb)
    # Create a PDB io object
    io = PDB.PDBIO()
    # Set the structure object on the io object
    io.set_structure(structure)
    # Collect the requested chain objects from the first model;
    # a KeyError is raised if a chain id does not exist
    selected_chains = []
    for chain_id in chain_ids:
        chain = structure[0][chain_id]
        selected_chains.append(chain)

    # Create a select class that only accepts the selected chains
    class Select(PDB.Select):
        def accept_chain(self, chain_select):
            return chain_select in selected_chains

    # Save the selected chains to the output PDB file using the select class
    io.save(output_pdb, Select())


def get_single_chain_pdb_from_complex(input_pdb, chain_id, output_pdb):
    """
    Extract a single chain from a multi-chain PDB file. Code from ChatGPT.
    :param input_pdb: path of the input multi-chain PDB file
    :param chain_id: id of the chain to extract
    :param output_pdb: path of the output single-chain PDB file
    :return: None
    """
    parser = PDB.PDBParser(QUIET=True)
    try:
        structure = parser.get_structure("pdb", input_pdb)
    except IOError as err:
        raise Exception(f"Error: could not read pdb file {input_pdb}") from err

    # Find the first chain with the requested id, searching model by model
    chain = None
    for model in structure:
        for c in model:
            if c.get_id() == chain_id:
                chain = c
                break
        if chain is not None:
            break
    if chain is None:
        raise Exception(f"Error: could not find chain with name {chain_id}")

    io = PDB.PDBIO()

    # Select class that only accepts the chain found above
    class ChainSelect(PDB.Select):
        def accept_chain(self, c):
            return c == chain

    io.set_structure(structure)
    io.save(output_pdb, ChainSelect())


def split_antibody_antigen_complex(input_pdb_path, pdb_name, chain_ids, out_dir):
    """
    Split an antibody-antigen complex:
    1. save the complex pdb.
    2. save the antibody heavy chain pdb, light chain pdb and heavy-light chain complex pdb.
    3. save the antigen chain pdb.
    All pdbs are saved into a subfolder named after the complex.
    :param input_pdb_path: input complex pdb file
    :param pdb_name: name of the complex, used for the subfolder and file names
    :param chain_ids: Attention! the order is heavy, light, antigen.
    :param out_dir: output dir, the function creates a subdir in it.
    """
    assert len(chain_ids) == 3, \
        f"length of chain_ids must be 3, in the order heavy, light, antigen! but input is {chain_ids}!"

    # Create out_dir and the per-complex subfolder if they do not exist yet
    pdb_out_dir = os.path.join(out_dir, pdb_name)
    os.makedirs(pdb_out_dir, exist_ok=True)

    # input: keep a copy of the original complex next to the outputs
    pdb_path = os.path.join(pdb_out_dir, f"{pdb_name}.pdb")
    shutil.copy(input_pdb_path, pdb_path)

    # complex
    ab_ag_path = os.path.join(pdb_out_dir, f"{pdb_name}_{''.join(chain_ids)}_ab_ag.pdb")
    ab_path = os.path.join(pdb_out_dir, f"{pdb_name}_{''.join(chain_ids[:2])}_ab.pdb")

    # single chain
    h_path = os.path.join(pdb_out_dir, f"{pdb_name}_{chain_ids[0]}_hc.pdb")
    l_path = os.path.join(pdb_out_dir, f"{pdb_name}_{chain_ids[1]}_lc.pdb")
    ag_path = os.path.join(pdb_out_dir, f"{pdb_name}_{chain_ids[2]}_ag.pdb")

    get_multiple_chains_pdb_from_complex(pdb_path, chain_ids, ab_ag_path)
    get_multiple_chains_pdb_from_complex(pdb_path, chain_ids[:2], ab_path)
    get_single_chain_pdb_from_complex(pdb_path, chain_ids[0], h_path)
    get_single_chain_pdb_from_complex(pdb_path, chain_ids[1], l_path)
    get_single_chain_pdb_from_complex(pdb_path, chain_ids[2], ag_path)
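
The file-naming scheme used by split_antibody_antigen_complex is easy to lose track of, so the sketch below reproduces only the path construction, without reading or writing any file. The helper name `planned_output_paths`, the output directory `out`, and the chain IDs H, L, and A are hypothetical, chosen for illustration; check the actual chain IDs of your structure before calling the real function.

```python
import os


def planned_output_paths(pdb_name, chain_ids, out_dir):
    """Return the paths the split would write, keyed by role (no file I/O)."""
    if len(chain_ids) != 3:
        raise ValueError(f"expected [heavy, light, antigen], got {chain_ids}")
    heavy, light, antigen = chain_ids
    pdb_out_dir = os.path.join(out_dir, pdb_name)
    return {
        # copy of the input complex
        "input": os.path.join(pdb_out_dir, f"{pdb_name}.pdb"),
        # antibody + antigen, then antibody (heavy + light) only
        "ab_ag": os.path.join(pdb_out_dir, f"{pdb_name}_{''.join(chain_ids)}_ab_ag.pdb"),
        "ab": os.path.join(pdb_out_dir, f"{pdb_name}_{heavy}{light}_ab.pdb"),
        # single chains
        "hc": os.path.join(pdb_out_dir, f"{pdb_name}_{heavy}_hc.pdb"),
        "lc": os.path.join(pdb_out_dir, f"{pdb_name}_{light}_lc.pdb"),
        "ag": os.path.join(pdb_out_dir, f"{pdb_name}_{antigen}_ag.pdb"),
    }


# Hypothetical chain IDs in the required order: heavy, light, antigen.
paths = planned_output_paths("8CZ5", ["H", "L", "A"], "out")
for role, path in paths.items():
    print(f"{role:>5}: {path}")
```

Because the chain IDs are embedded in the file names, passing them in the wrong order still produces files without any error, but with misleading `_hc`, `_lc`, and `_ag` suffixes.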