bioinformatics | 易学教程

ImportError: cannot import name _aligners [biopython]

Question: I am working on a bioinformatics project that depends on Biopython. Biopython always gives me the following error: ImportError: cannot import name _aligners. I hope someone can help me with this issue. Thank you! Answer 1: This can occur with Biopython versions >= 1.72 and has been discussed on the Biopython mailing list. The error occurs when you import Biopython while inside the biopython/ source directory. To fix it, move to a directory outside the source tree and run your code from there. If the error still occurs then likely the

Find nucleotides in DNA sequence with Perl

Question: I have a DNA sequence and I want to find the nucleotides at a position chosen by the user. Here is an example: Enter the DNA sequence: ACTAAAAATACAAAAATTAGCCAGGCGTGGTGGCAC (the length of sequence is 33) Enter the position: (12) I expect the result to say that at position 12 the nucleotides are AAA. I have no problem finding the amino acid at the position. Below is the current code I have. print "ENTER THE FILENAME OF THE DNA SEQUENCE:= "; $DNAfilename = <STDIN>; chomp
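One reading of the example above is that the asker wants the three nucleotides that start at the given 1-based position. A minimal Python sketch under that assumption follows; the function name `nucleotides_at` and the default window of 3 are illustrative choices, not part of the original question.

```python
def nucleotides_at(sequence: str, position: int, length: int = 3) -> str:
    """Return `length` nucleotides starting at a 1-based `position`.

    Assumption: "position 12 -> AAA" means a 3-letter window that
    begins at position 12, counting the first base as position 1.
    """
    if position < 1 or position > len(sequence):
        raise ValueError(f"position must be between 1 and {len(sequence)}")
    start = position - 1  # convert the 1-based position to a 0-based index
    return sequence[start:start + length]


dna = "ACTAAAAATACAAAAATTAGCCAGGCGTGGTGGCAC"
print(nucleotides_at(dna, 12))  # AAA
```

Near the end of the sequence the slice simply returns fewer than `length` characters instead of raising an error.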

How to combine intervals data into fewer intervals in R?

Question: I am trying to collapse a series of intervals into fewer, equally meaningful intervals. Consider, for example, this list of intervals: Intervals = list( c(23,34), c(45,48), c(31,35), c(7,16), c(5,9), c(56,57), c(55,58) ) Because the intervals overlap, the same ranges can be described with fewer vectors. Plotting these intervals makes it obvious that a list of 4 vectors would be enough: plot(1,1,type="n",xlim=range(unlist(Intervals)),ylim=c(0.9,1.1)) segments( x0=sapply(Intervals,"[",1), x1=sapply
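The collapse described above can be sketched with the usual sort-then-sweep approach. This is a Python illustration rather than the R answer from the original thread; `merge_intervals` is a hypothetical helper name, and intervals that merely touch or overlap are assumed to belong together.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a minimal list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # No overlap: start a new interval.
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


intervals = [(23, 34), (45, 48), (31, 35), (7, 16), (5, 9), (56, 57), (55, 58)]
print(merge_intervals(intervals))
# [(5, 16), (23, 35), (45, 48), (55, 58)]
```

The result contains four intervals, which matches the count the question reads off the plot.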

Perl script to search for a motif in a multi-FASTA file and print the complete sequence along with the header line

Question: I am able to search for a motif in a multi-FASTA file and print the line containing the motif, but I need to print the complete sequence, along with its header line, for every FASTA record that contains the motif. Please help me, I am just a beginner in Perl. #!/usr/bin/perl -w use strict; print STDOUT "Enter the motif: "; my $motif = <STDIN>; chomp $motif; my $line; open (FILE, "data.fa"); while ($line = <FILE>) { if ($line =~ /$motif/) { print $line; } } Answer 1: Try Bio::DB::Fasta. Instructions on the page
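The line-by-line loop in the question cannot see a motif that spans a line break, and it loses track of which header a line belongs to. A rough Python sketch of the record-based alternative follows. It is not the Bio::DB::Fasta approach from the answer; the helper names and the example records are hypothetical, and the motif is treated as a plain substring rather than a regular expression.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def records_with_motif(lines, motif):
    """Return the records whose full sequence contains `motif`."""
    motif = motif.upper()
    return [(h, s) for h, s in read_fasta(lines) if motif in s.upper()]


# Hypothetical two-record input; "GTTT" spans the line break in seq1.
example = """>seq1 hypothetical record
ACGTACGT
TTGACA
>seq2 hypothetical record
GGGGCCCC
""".splitlines()

for header, sequence in records_with_motif(example, "GTTT"):
    print(header)
    print(sequence)
```

To read from a file such as `data.fa`, the open file object can be passed in place of `example`, since it is also an iterable of lines.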

Rename list of lists using a named list

Question: I'm working with a list that contains other lists, with this structure: library(graph) library(RBGL) library(Rgraphviz) show(tree) $`SO:0001968` $`SO:0001968`$`SO:0001622` $`SO:0001968`$`SO:0001622`$`SO:0001624` $`SO:0001968`$`SO:0001622`$`SO:0001624`$`SO:0002090` [1] 1 $`SO:0001968`$`SO:0001622`$`SO:0001623` $`SO:0001968`$`SO:0001622`$`SO:0001623`$`SO:0002091` [1] 1 $`SO:0001968`$`SO:0001969` $`SO:0001968`$`SO:0001969`$`SO:0002090` [1] 1 $`SO:0001968`$`SO:0001969`$`SO:0002091` [1]

Checking if value in vector is in range of values in different length vector [duplicate]

Question: This question already has answers here: Overlap join with start and end positions (3 answers). Closed 2 years ago. I'm working in R and have a large data frame that contains a vector of genome positions like this: 2655180 2657176 2658869 And a second data frame that has a range of positions and a gene, like this: chr1 100088228 100162167 AGL chr1 107599438 107600565 PRMT6 chr1 115215635 115238091 AMPD1 chr1 11850637 11863073 MTHFR chr1 119958143 119965343 HSD3B2 chr1 144124628
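The lookup described above, testing whether a position falls inside any gene's start and end range, can be illustrated in Python with a plain scan. This is a sketch, not the overlap-join answer the duplicate notice points to. The gene rows are the complete ones shown in the question, the position 11850700 is a made-up value chosen to land inside MTHFR, and everything is assumed to lie on the same chromosome.

```python
# (start, end, gene) rows copied from the question; all on chr1.
gene_ranges = [
    (100088228, 100162167, "AGL"),
    (107599438, 107600565, "PRMT6"),
    (115215635, 115238091, "AMPD1"),
    (11850637, 11863073, "MTHFR"),
    (119958143, 119965343, "HSD3B2"),
]


def genes_at(position, ranges):
    """Return the names of all genes whose range contains `position`."""
    return [name for start, end, name in ranges if start <= position <= end]


# 2655180 comes from the question; 11850700 is a hypothetical position.
for position in (2655180, 11850700):
    print(position, genes_at(position, gene_ranges))
```

A scan like this costs one pass over the ranges per position, so for a large data frame a sorted or indexed structure would likely be preferable.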

Can Biopython perform Seq.find() accounting for ambiguity codes

Question: I want to be able to search a Seq object for a subsequence Seq object while accounting for ambiguity codes. For example, the following should be true: from Bio.Seq import Seq from Bio.Alphabet.IUPAC import IUPACAmbiguousDNA amb = IUPACAmbiguousDNA() s1 = Seq("GGAAAAGG", amb) s2 = Seq("ARAA", amb) # R = A or G print s1.find(s2) If ambiguity codes were taken into account, the answer should be >>> 2 But the answer I get is that no match is found, or >>> -1 Looking at the Biopython source code, it
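One common workaround, sketched here in plain Python without Biopython, is to expand each IUPAC code in the query into a regular-expression character class. The table below lists the standard IUPAC DNA codes; `find_ambiguous` is a hypothetical helper, and only ambiguity in the query is expanded, so ambiguity codes in the target sequence are compared literally.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def find_ambiguous(sequence: str, pattern: str) -> int:
    """Return the 0-based index of the first match, or -1 if none.

    Each letter of `pattern` becomes a character class, so "ARAA"
    is searched as the regex "[A][AG][A][A]".
    """
    regex = "".join("[" + IUPAC_DNA[base] + "]" for base in pattern.upper())
    match = re.search(regex, sequence.upper())
    return match.start() if match else -1


print(find_ambiguous("GGAAAAGG", "ARAA"))  # 2
```

With Seq objects, the same function could be called on `str(s1)` and `str(s2)`.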

Changing the x-axis of seqlogo figures in MATLAB

Question: I'm making a large number of seqlogos programmatically. They are hundreds of columns wide, so running seqlogo normally creates letters that are too thin to see. I only care about a few of these columns (not necessarily consecutive ones): most are noise, but some are highly conserved. I use something like this snippet: wide_seqs = cell2mat(arrayfun(@randseq, repmat(200, [500 1]), 'uniformoutput', false)); wide_seqs(:, [17,30, 55,70,130]) = repmat(['ATCGG'], [500 1])

'StringCut' to the left or right of a defined position using Mathematica

Question: On reading this question, I thought the following problem would be simple using StringSplit. Given the following string, I want to 'cut' it to the left of every "D" such that: (1) I get a List of fragments (with sequence unchanged), and (2) StringJoin @fragments gives back the original string (but it does not matter if I have to reorder the fragments to obtain this). That is, the sequence within each fragment is important, and I do not want to lose any characters. (The example I am interested in is a protein
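The cut described above, splitting to the left of every "D" without dropping any characters, can be sketched in Python with a zero-width lookahead. This is not the Mathematica solution from the thread; the input string is invented because the original one is cut off, `cut_left_of` is a hypothetical name, and splitting on empty matches requires Python 3.7 or later.

```python
import re


def cut_left_of(text: str, marker: str = "D"):
    """Split `text` immediately before each `marker`, keeping all characters."""
    # (?=...) matches the empty string just before the marker, so the
    # marker itself stays at the start of the following fragment.
    pieces = re.split(f"(?={re.escape(marker)})", text)
    # A marker at position 0 produces a leading empty piece; drop it.
    return [piece for piece in pieces if piece]


fragments = cut_left_of("MKDAADDLE")  # hypothetical example string
print(fragments)            # ['MK', 'DAA', 'D', 'DLE']
print("".join(fragments))   # MKDAADDLE
```

Joining the fragments in order reproduces the input, which is the round-trip property the question asks for.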

SeqIO.parse on a fasta.gz

Question: New to coding. New to Python/Biopython; this is my first question online, ever. How do I open a compressed fasta.gz file to extract info and perform calculations in my function? Here is a simplified example of what I'm trying to do (I've tried different ways), and what the error is. The gzip command I'm using doesn't seem to work. with gzip.open("practicezip.fasta.gz", "r") as handle: for record in SeqIO.parse(handle, "fasta"): print(record.id) Traceback (most recent call last): File "<ipython
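The traceback is cut off, so the cause is an assumption here, but a frequent culprit is the file mode: `gzip.open(..., "r")` returns bytes, while text formats such as FASTA are expected to be read from a text-mode handle, which `"rt"` provides. The sketch below shows the difference using only the standard library; it writes its own small temporary file with made-up records and reads the IDs by hand instead of calling Biopython.

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "practicezip.fasta.gz")

    # Create a small gzipped FASTA file with hypothetical records.
    with gzip.open(path, "wt") as handle:
        handle.write(">seq1\nACGT\n>seq2\nGGCC\n")

    # Mode "r" is binary for gzip.open: lines come back as bytes.
    with gzip.open(path, "r") as handle:
        binary_line = handle.readline()

    # Mode "rt" decodes to text: lines come back as str. A handle opened
    # this way is what would be passed to SeqIO.parse(handle, "fasta").
    with gzip.open(path, "rt") as handle:
        ids = [line[1:].split()[0] for line in handle if line.startswith(">")]

print(type(binary_line).__name__)  # bytes
print(ids)                         # ['seq1', 'seq2']
```

If the mode is the problem, changing `"r"` to `"rt"` in the question's snippet should be the only edit needed.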