protein-database

Remove heteroatoms from PDB

让人想犯罪, posted on 2020-05-01 06:09:26
Question: The heteroatoms have to be removed from a PDB file. Here is the code, but it did not work with my test PDB 1C4R.

    for model in structure:
        for chain in model:
            for residue in chain:
                id = residue.id
                if id[0] != ' ':
                    chain.detach_child(id)
            if len(chain) == 0:
                model.detach_child(chain.id)

Any suggestions?

Answer 1: The heteroatoms shouldn't be part of the chain. But you can tell whether a residue is a heteroatom with:

    pdb = PDBParser().get_structure("1C4R", "1C4R.pdb")
    for residue in pdb.get_residues():
        tags =
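A likely cause of the problem in the question is that the loop detaches residues from a chain while iterating over that same chain, which can skip entries. Independently of Biopython, the same goal can be approached by filtering the file text. The following is a minimal sketch, not the answer's method: it assumes a standard PDB file in which heteroatoms sit on `HETATM` records, and the function names are hypothetical.

```python
def strip_hetatm_lines(lines):
    """Return the given PDB lines with every HETATM record dropped."""
    return [line for line in lines if not line.startswith("HETATM")]


def strip_hetatm_file(src_path, dst_path):
    """Copy a PDB file to dst_path, leaving out its HETATM records."""
    with open(src_path) as src:
        kept = strip_hetatm_lines(src.readlines())
    with open(dst_path, "w") as dst:
        dst.writelines(kept)
```

Note that this also drops modified residues stored as `HETATM` (selenomethionine, for example) and leaves related records such as `CONECT` untouched, so it may need refining for a particular structure.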

How to delete lines that match elements from another file

风流意气都作罢, posted on 2019-12-25 02:46:16
Question: I am in the process of learning Perl and I am trying to figure out how to do this task. I have a folder with a bunch of text files, and I have a file, ions_solvents_cofactors, that contains a list of three-letter codes. I wrote a script that opens and reads each file in the folder and should delete those lines whose value in a specific column [3] matches some element of the list. It is not working well. I have a problem at the end of the script and can't figure out what it is. The error I get is:
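The filtering step described in the question can be sketched in Python as follows. This is an illustration under assumptions, not a fix for the asker's Perl script (which is not shown here): columns are taken to be whitespace-separated, `[3]` is read as a zero-based index, and the code list is taken to be whitespace-separated tokens. The function names are hypothetical.

```python
def load_codes(lines):
    """Collect every whitespace-separated token from the list file into a set."""
    return {token for line in lines for token in line.split()}


def drop_matching_lines(lines, excluded, column=3):
    """Keep only lines whose field at `column` is not in `excluded`.

    Lines with too few fields are kept unchanged.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) > column and fields[column] in excluded:
            continue  # this line names an excluded code, so skip it
        kept.append(line)
    return kept
```

Using a set for the excluded codes keeps each lookup cheap even when the folder holds many large files.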

Biopython: How to avoid particular amino acid sequences from a protein so as to plot Ramachandran plot?

陌路散爱, posted on 2019-12-22 09:26:05
Question: I have written a Python script to plot the Ramachandran plot of the ubiquitin protein. I am using Biopython and working with PDB files. My script is below:

    import Bio.PDB
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    phi_psi = ([0,0])
    phi_psi = np.array(phi_psi)
    pdb1 ='/home/devanandt/Documents/VMD/1UBQ.pdb'
    for model in Bio.PDB.PDBParser().get_structure('1UBQ',pdb1) :
        for chain in model :
            polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
            for poly
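One way to leave particular residues out of the plot is to filter the angle list before plotting. The sketch below works on plain Python lists so it runs without Biopython; in a Biopython script the inputs could plausibly come from `[res.get_resname() for res in poly]` and `poly.get_phi_psi_list()`, which returns angles in radians with `None` at chain ends. The function name and the default residues to skip (GLY and PRO) are illustrative assumptions, since the snippet does not say which residues the asker wants to exclude.

```python
import math


def select_phi_psi(residue_names, phi_psi, skip=("GLY", "PRO")):
    """Return (phi, psi) pairs in degrees, omitting unwanted residues.

    residue_names: three-letter residue names, one per residue.
    phi_psi: (phi, psi) tuples in radians, parallel to residue_names;
             either angle may be None (for example at chain termini).
    """
    selected = []
    for name, (phi, psi) in zip(residue_names, phi_psi):
        if name in skip:
            continue  # residue type excluded from the plot
        if phi is None or psi is None:
            continue  # angle undefined, nothing to plot
        selected.append((math.degrees(phi), math.degrees(psi)))
    return selected
```

The returned pairs can then be unpacked into x and y sequences for a scatter plot.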

Want to pull a journal title from an RCSB Page using python & BeautifulSoup

房东的猫, posted on 2019-12-13 03:36:22
Question: I am trying to get specific information about the original citing paper in the Protein Data Bank given only the 4-letter PDB ID of the protein. To do this I am using the Python libraries requests and BeautifulSoup. To build the code, I went to the page for a particular protein, in this case 1K48, and also saved the HTML for the page (by hitting command+s and saving the HTML to my desktop). First things to note: 1) The URL for this page is: http://www.rcsb.org/pdb/explore.do?structureId

How to compare two txt files and then apply changes in one of them

折月煮酒, posted on 2019-12-11 19:08:18
Question: I am trying to merge two text (PDB) files. The first (bigger) one contains the full set of data describing the protein; the second contains a very small set of data that changes just a small part (a set of coordinates). Example: basic file (part):

    ATOM 605 CD2 LEU A 92 11.727 14.051 55.011 1.00 75.51 4pxz C
    ATOM 606 N ARG A 93 10.555 10.636 58.260 1.00 62.79 4pxz N
    ATOM 607 CA ARG A 93 11.357 9.429 58.493 1.00 59.89 4pxz C
    ATOM 608 C ARG A 93 10.429 8.207 58.562 1.00 62.83 4pxz C
    ATOM 609 O ARG A 93 10.760 7.168
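A merge of this kind could be sketched as follows. The sketch assumes both files use the standard fixed-width PDB layout (atom name in columns 13-16, residue name in 18-20, chain ID in 22, residue number in 23-26, coordinates in 31-54); the example lines in this listing have had their spacing collapsed, so they would not parse as shown. Matching atoms by name, residue and chain is also an assumption, and the function names are hypothetical.

```python
def is_atom_record(line):
    """True for ATOM/HETATM lines long enough to hold coordinates."""
    return line.startswith(("ATOM", "HETATM")) and len(line) >= 54


def atom_key(line):
    """Identify an atom by atom name, residue name, chain ID and residue number."""
    return (line[12:16], line[17:20], line[21], line[22:26])


def merge_coordinates(base_lines, update_lines):
    """Return base_lines with x, y, z replaced wherever update_lines has the same atom."""
    updates = {
        atom_key(line): line[30:54]
        for line in update_lines
        if is_atom_record(line)
    }
    merged = []
    for line in base_lines:
        if is_atom_record(line):
            new_xyz = updates.get(atom_key(line))
            if new_xyz is not None:
                # Splice only the coordinate columns; keep everything else.
                line = line[:30] + new_xyz + line[54:]
        merged.append(line)
    return merged
```

Atoms that appear only in the smaller file are ignored, and all non-coordinate fields of the bigger file (serial numbers, occupancy, B-factors) are preserved.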

Protein Sequence Alignment from Protein Databank to Cosmic or Uniprot

≡放荡痞女, posted on 2019-12-11 06:40:08
Question: I would like to match up PDB files from the Protein Data Bank to the canonical amino-acid sequences for the protein as displayed in Cosmic or UniProt. Specifically, I need to pull from the PDB file the alpha-carbon atoms of the backbone and their xyz positions. I also need to pull their actual order in the protein's sequence. For structure 3GFT (Kras, UniProt accession number P01116), this is easy: I can just take the ResSeq number. However, for some other proteins, I can't figure out how this
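The extraction part of this task (alpha carbons, their coordinates and their ResSeq numbers) can be sketched with plain string slicing. The sketch assumes standard fixed-width PDB `ATOM` records and ignores alternate locations and multiple models; the function name and the dictionary keys are hypothetical. It only reads the ResSeq field and does not attempt the mapping to UniProt positions that the question goes on to ask about.

```python
def alpha_carbons(lines, chain_id=None):
    """Return one dict per alpha-carbon ATOM record, in file order.

    If chain_id is given, only atoms from that chain are returned.
    """
    atoms = []
    for line in lines:
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16] != " CA ":  # alpha carbon, as laid out in columns 13-16
            continue
        if chain_id is not None and line[21] != chain_id:
            continue
        atoms.append({
            "res_name": line[17:20],
            "chain": line[21],
            "res_seq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms
```

Comparing the atom name against the exact four-character field `" CA "` avoids picking up calcium ions, whose name is left-aligned in the same columns.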

DNA to RNA and Getting Proteins with Perl

二次信任, posted on 2019-12-10 20:33:11
Question: I am working on a project (I have to implement it in Perl, but I am not good at it) that reads DNA and finds its RNA, then divides that RNA into triplets to get the equivalent protein name for each. I will explain the steps: 1) Transcribe the following DNA to RNA, then use the genetic code to translate it to a sequence of amino acids. Example: TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT 2) To transcribe the DNA, first substitute each DNA base for its counterpart (i.e., G for C, C for G, T for A and A for T):
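The steps described so far can be sketched in Python (the asker needs Perl, so this only illustrates the logic). Several things here are assumptions because the snippet is cut off: that T becomes U in the RNA after complementing, that translation starts at the first base rather than at a start codon, and that the standard genetic code applies. Amino acids are reported as one-letter codes with `*` for stop codons, and the function names are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code, ordered by first, second, third base over BASES.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}
# Complement each DNA base and write the result in the RNA alphabet.
DNA_TO_RNA = str.maketrans("GCTA", "CGAU")


def transcribe(dna):
    """Return the complementary RNA for a DNA string (G->C, C->G, T->A, A->U)."""
    return dna.upper().translate(DNA_TO_RNA)


def codons(rna):
    """Split RNA into consecutive triplets, dropping any incomplete tail."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def translate(rna):
    """Translate RNA to one-letter amino acids, '*' marking stop codons."""
    return "".join(CODON_TABLE[codon] for codon in codons(rna))
```

Under these assumptions, `translate(transcribe("TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT"))` gives `SIMQNISGREAT` for the example sequence in the question.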

Biopython: How to avoid particular amino acid sequences from a protein so as to plot Ramachandran plot?

99封情书, posted on 2019-12-06 02:16:43
I have written a Python script to plot the Ramachandran plot of the ubiquitin protein. I am using Biopython and working with PDB files. My script is below:

    import Bio.PDB
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    phi_psi = ([0,0])
    phi_psi = np.array(phi_psi)
    pdb1 ='/home/devanandt/Documents/VMD/1UBQ.pdb'
    for model in Bio.PDB.PDBParser().get_structure('1UBQ',pdb1) :
        for chain in model :
            polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
            for poly_index, poly in enumerate(polypeptides) :
                print "Model %s Chain %s" % (str(model.id), str(chain.id)),
                print