bioinformatics

Extracting only the chains that we need from a PDB file

落爺英雄遲暮 posted on 2021-01-28 07:04:15
Question: I need to extract specific chains from PDB files (sometimes more than one chain). The question "How to extract chains from a PDB file?" is the same as mine, and its accepted answer addresses my problem, but it does not work in Python 3: it raises one error after another. Does anybody know how I can make this work in Python 3, or have other code for the same kind of problem? Thank you in advance.
import os
from Bio import PDB
class ChainSplitter:
    def __init__(self, out_dir=None):
        """ Create parsing and writing
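One way to sidestep Python 2/3 differences in the quoted class is to filter the coordinate records directly. The following is a minimal sketch, not the ChainSplitter approach from the question: it assumes a standard fixed-width PDB file in which the chain identifier sits in column 22 of ATOM, HETATM, ANISOU and TER records, and it ignores multi-model files. The names `extract_chains` and `write_chains` are hypothetical.

```python
def extract_chains(pdb_lines, chain_ids):
    """Return the coordinate lines that belong to the requested chains."""
    wanted = set(chain_ids)
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM", "ANISOU", "TER"):
            # Column 22 (index 21) holds the chain identifier
            # in the fixed-width PDB format.
            if len(line) > 21 and line[21] in wanted:
                kept.append(line)
    kept.append("END\n")
    return kept


def write_chains(in_path, out_path, chain_ids):
    """Read a PDB file and write only the requested chains to out_path."""
    with open(in_path) as handle:
        lines = handle.readlines()
    with open(out_path, "w") as handle:
        handle.writelines(extract_chains(lines, chain_ids))
```

A call such as `write_chains("input.pdb", "chains_AC.pdb", ["A", "C"])` (both file names are placeholders) would keep chains A and C in a single output file. Header records are dropped in this sketch, so a parser-based approach may be preferable when metadata matters.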

How to segment similar looking areas (color wise) inside a image that belong to biosamples with opencv and python?

邮差的信 posted on 2021-01-28 02:03:50
Question: I'm trying to analyze images of Pseudomonas biofilms in order to find some correlation between their growth and distribution and a set of independent variables. I've applied a segmentation to obtain a circular area of interest, and now I am thinking of applying color segmentation based on the image's HSV values to leave only the areas with biofilm. I've been trying to think of a way to completely isolate all the areas that are important. I applied a bitwise_not to the
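The HSV thresholding idea mentioned in the question can be illustrated without OpenCV. The sketch below uses only the standard library's `colorsys` module on a small nested list of RGB pixels; the function name `hsv_mask` and the threshold values are assumptions chosen for illustration, not values taken from the question. Note that `colorsys` expresses hue, saturation and value on a 0 to 1 scale, whereas OpenCV stores hue for 8-bit images in the range 0 to 179, so thresholds do not transfer directly.

```python
import colorsys


def hsv_mask(pixels, h_range, s_min=0.0, v_min=0.0):
    """Build a boolean mask from rows of (r, g, b) tuples in 0-255.

    A pixel is kept when its hue lies inside h_range and its
    saturation and value are at least s_min and v_min (all on 0-1).
    """
    h_lo, h_hi = h_range
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(h_lo <= h <= h_hi and s >= s_min and v >= v_min)
        mask.append(mask_row)
    return mask
```

With OpenCV, the comparable step is usually a conversion with `cv2.cvtColor(..., cv2.COLOR_BGR2HSV)` followed by `cv2.inRange`; the suitable bounds depend on the images and have to be found empirically.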

Unable to use biomaRt package to get Gene Symbols from Entrez IDs

喜欢而已 posted on 2021-01-27 19:06:28
Question: I am using the following code to retrieve gene symbols from Entrez IDs:
library("biomaRt")
ensembl <- useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl", host = "www.ensembl.org")
g <- getBM(c("hgnc_symbol"), filters = "entrezgene", c(entrez), ensembl)
but I get the following error:
Error in value[[3L]](cond): Request to BioMart web service failed. Verify if you are still connected to the internet. Alternatively the BioMart web service is temporarily down.
Traceback: 1. getBM(c

How to concatenate files that have the same beginning of a name?

旧时模样 posted on 2021-01-27 10:46:00
Question: I have a directory with a few hundred *.fasta files, such as:
Bonobo_sp._str01_ABC784267_CDE789456.fasta
Homo_sapiens_cc21_ABC897867_CDE456789.fasta
Homo_sapiens_cc21_ABC893673_CDE753672.fasta
Gorilla_gorilla_ghjk6789_ABC736522_CDE789456.fasta
Gorilla_gorilla_ghjk6789_ABC627190_CDE891345.fasta
Gorilla_gorilla_ghjk6789_ABC117190_CDE661345.fasta
etc.
I want to concatenate the files that belong to the same species, in this case Homo_sapiens_cc21 and Gorilla_gorilla_ghjk6789. Almost every species
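The grouping described in the question could be sketched as follows. This assumes, based only on the example names, that every file name ends in exactly two underscore-separated accession fields (ABC… and CDE…) and that everything before them identifies the species; the function names and the separate output directory are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_key(filename):
    """Species prefix: the name minus ".fasta" and the last two "_" fields."""
    stem = Path(filename).stem
    return stem.rsplit("_", 2)[0]


def concatenate_by_species(in_dir, out_dir):
    """Write one <prefix>.fasta per species into out_dir.

    Returns a dict mapping each prefix to the number of files merged.
    """
    groups = defaultdict(list)
    for path in sorted(Path(in_dir).glob("*.fasta")):
        groups[group_key(path.name)].append(path)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for key, paths in groups.items():
        with (out_dir / f"{key}.fasta").open("w") as out:
            for path in paths:
                text = path.read_text()
                # Make sure consecutive records do not run together.
                if text and not text.endswith("\n"):
                    text += "\n"
                out.write(text)
        merged[key] = len(paths)
    return merged
```

Writing to a directory other than the input directory keeps a second run from picking up the merged files as inputs.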

error filtering data: Faceting variables must have at least one value

纵然是瞬间 posted on 2020-12-15 05:06:50
Question: I am trying to write code using dplyr and a yeast dataset. I read in the data with this code:
gdat <- read_csv(file = "brauer2007_tidy1.csv")
I omitted the NAs using this:
gdat <- na.omit(gdat)
library(ggplot2)
Then I tried to filter some genes by their value in the "symbol" column and used ggplot to make a plot:
filter(gdat, symbol=="QRI7", symbol== "CFT2", symbol== "RIB2", symbol=="EDC3", symbol=="VPS5", symbol=="AMN1" & rate=.05) %>% ggplot(aes(x=rate, y=expression, group=1, colour

Protein sequence from uniprot protein id python

假装没事ソ posted on 2020-12-03 17:59:38
Question: I was wondering if there is a way to get protein sequences from UniProt protein IDs. I checked a few online tools, but they only return one sequence at a time, and I have 5536 values. Is there a package in Biopython to do this?
Answer 1: All UniProt sequences can be accessed at "http://www.uniprot.org/uniprot/" + UniprotID + ".fasta". You can obtain any sequence with:
import requests as r
from Bio import SeqIO
from io import StringIO
cID='P04637'
baseUrl="http://www.uniprot.org
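Because the answer's code is cut off, here is a separate minimal sketch of the same idea using only the standard library instead of requests and Biopython. It assumes the URL pattern quoted in the answer still resolves (UniProt has changed its endpoints over time, so the base URL may need updating); `parse_fasta` and `fetch_sequence` are hypothetical helper names.

```python
from urllib.request import urlopen

# URL pattern as quoted in the answer; treat it as an assumption.
BASE_URL = "http://www.uniprot.org/uniprot/"


def parse_fasta(text):
    """Parse FASTA text into a dict of {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def fetch_sequence(uniprot_id, base_url=BASE_URL):
    """Download one entry as FASTA and return {header: sequence}."""
    with urlopen(base_url + uniprot_id + ".fasta") as response:
        text = response.read().decode("utf-8")
    return parse_fasta(text)
```

For several thousand IDs, `fetch_sequence` could be called in a loop with a short pause between requests; a batch download service, where available, is likely to be kinder to the server.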

Mean square displacement python

半腔热情 posted on 2020-11-29 10:46:28
Question: I have a trajectory file from a simulation with 20,000 frames and 5 ps between consecutive frames. I want to calculate diffusion in two dimensions (the x and y axes), but to do that I first have to calculate the mean square displacement (MSD) of the molecule under study. The MSD measures how far, on average, the molecule moves from its starting position over a given time interval as it explores the system by random walk. I am very new to Python programming and would really appreciate some help getting started on this problem and solving this
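As a starting point, the MSD at a lag of k frames is the squared displacement between frames t and t + k, averaged over all t. For normal diffusion in two dimensions the Einstein relation gives MSD(τ) = 4Dτ, so D can be estimated from the slope of MSD against time. The sketch below is plain Python and assumes the x and y coordinates have already been read from the trajectory into a list of (x, y) pairs and are unwrapped (no periodic-boundary jumps); the function names are hypothetical.

```python
def mean_square_displacement(xy, max_lag=None):
    """MSD for lags 1..max_lag (in frames), averaged over all time origins."""
    n = len(xy)
    if max_lag is None:
        max_lag = n - 1
    msd = []
    for lag in range(1, max_lag + 1):
        count = n - lag
        total = 0.0
        for t in range(count):
            dx = xy[t + lag][0] - xy[t][0]
            dy = xy[t + lag][1] - xy[t][1]
            total += dx * dx + dy * dy
        msd.append(total / count)
    return msd


def diffusion_coefficient_2d(msd, dt):
    """Estimate D from a least-squares line through the origin, MSD = 4*D*t."""
    times = [dt * (i + 1) for i in range(len(msd))]
    slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
    return slope / 4.0
```

With 20,000 frames the double loop becomes slow at large lags, so limiting `max_lag` or vectorising with NumPy is advisable. Here `dt` would be 5 ps, and D comes out in squared length units per ps. Only the roughly linear part of the MSD curve should go into the fit.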