bioconductor | 易学教程

How to concatenate two DNAStringSet sequences per sample in R?

Question: I have two large DNAStringSet objects. Each contains 2805 entries, and each entry has a length of 201. I want to combine them into a single object that still has 2805 entries, with the sequences joined per sample. I tried s12 <- c(unlist(s1), unlist(s2)), but that created a single large DNAString object with 1127610 elements, which is not what I want. I simply want to combine them per sample. EDIT: Each entry in my DNAStringSet objects
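One possible approach to the per-sample join is xscat() from Biostrings, which concatenates corresponding elements instead of flattening everything into one sequence. The sketch below is not taken from the original thread. It uses two small made-up sets in place of the asker's s1 and s2 and assumes both sets list the samples in the same order.

```r
library(Biostrings)

# Toy stand-ins for the asker's s1 and s2 (made-up sequences and names).
s1 <- DNAStringSet(c(sampleA = "ACGT", sampleB = "GGCC"))
s2 <- DNAStringSet(c(sampleA = "TTTT", sampleB = "AAAA"))

# xscat() pastes element i of s1 to element i of s2,
# so the result has as many entries as each input.
s12 <- xscat(s1, s2)

# Set the names explicitly rather than relying on xscat() to keep them.
names(s12) <- names(s1)

length(s12)  # same number of entries as s1
width(s12)   # each width is width(s1) + width(s2)
```

With the asker's data this would give 2805 entries of width 402, provided the two sets really are aligned by sample.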

Can't get end value of an IRange in R/Bioconductor

Question: I am new to the IRanges package and am having trouble getting the end value of an IRange. I can get the start and width values with no problem, which has me a bit baffled, and my case and spelling of end match the header line. Has anyone else run into this, or can anyone spot what I am doing wrong? Thanks, it is much appreciated!

    library(IRanges)
    > test=IRanges(100645,100664)
    > test
    IRanges of length 1
         start    end width
    [1] 100645 100664    20
    > test@start
    [1] 100645
    > test@width
    [1] 20
    >
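A likely explanation is that an IRanges object stores only start and width as slots, so there is no end slot for @ to reach. The end is computed on request by the end() accessor. A minimal sketch using the same range as in the question:

```r
library(IRanges)

test <- IRanges(100645, 100664)

# Use the accessor functions rather than reaching into slots with @.
start(test)
width(test)
end(test)    # computed as start + width - 1
```

The accessors are also the safer choice in general, since slot layout is an implementation detail of the class.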

Width of the overlapped segment in GenomicRanges package

Question: I'm using GenomicRanges to find which transcripts from one experiment overlap with those from another.

    head(to_ranges1)
       knowngene  chr strand Start   End    Gene
    1 uc001aaa.3 chr1      +  9873 16409 DDX11L1
    2 uc001aac.4 chr1      - 12361 31370  WASH7P
    3 uc001aae.4 chr1      - 12361 21759  WASH7P
    library(GenomicRanges)
    object_one<-with(to_ranges, GRanges(chr, IRanges(Start,End), strand,names=knowngene,Gene=Gene))
    object_two<-with(to_ranges, GRanges(chr, IRanges(Start,End), strand,names=knowngene, Gene=Gene))
    mm<
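One common way to get the width of each overlapped segment is to pair findOverlaps() with pintersect(). The sketch below is an illustration, not the answer from the thread. The first object reuses two coordinate pairs from the table above, while the second object's coordinates are invented.

```r
library(GenomicRanges)

# gr1 reuses coordinates from the excerpt; gr2 is made up for illustration.
gr1 <- GRanges("chr1", IRanges(c(9873, 12361), c(16409, 31370)),
               strand = c("+", "-"))
gr2 <- GRanges("chr1", IRanges(c(15000, 30000), c(20000, 40000)),
               strand = c("+", "-"))

# One row per overlapping (query, subject) pair; strand is respected by default.
hits <- findOverlaps(gr1, gr2)

# Parallel intersection of each overlapping pair gives the shared segment.
shared <- pintersect(gr1[queryHits(hits)], gr2[subjectHits(hits)])

overlap_width <- width(shared)
overlap_width
```

Pass ignore.strand = TRUE to both findOverlaps() and pintersect() if overlaps across strands should count as well.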

Genomic coordinates of HGNC gene names

Question: I want to get the coordinates of the human genes in my list (which consists of HGNC gene IDs) using the GenomicFeatures and TxDb.Hsapiens.UCSC.hg19.knownGene R packages from Bioconductor.

    library(TxDb.Hsapiens.UCSC.hg19.knownGene)
    txdb=(TxDb.Hsapiens.UCSC.hg19.knownGene)
    my_genes = c("INO80","NASP","INO80D","SMARCA1")
    select(txdb, keys = my_genes, columns=c("TXCHROM","TXSTART","TXEND","TXSTRAND"), keytype="GENEID")

However, it doesn't work because txdb doesn't take HGNC identifiers. How can this be solved?
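The GENEID key in this TxDb holds Entrez Gene IDs, so one possible fix is to translate the symbols first. The sketch below assumes the org.Hs.eg.db annotation package is installed. That package is not mentioned in the question and is used here only as one way to do the mapping.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)   # assumption: used here for symbol-to-Entrez mapping

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
my_genes <- c("INO80", "NASP", "INO80D", "SMARCA1")

# Step 1: gene symbols -> Entrez Gene IDs.
sym2eg <- AnnotationDbi::select(org.Hs.eg.db, keys = my_genes,
                                columns = "ENTREZID", keytype = "SYMBOL")

# Step 2: Entrez Gene IDs -> transcript coordinates from the TxDb.
coords <- AnnotationDbi::select(txdb, keys = sym2eg$ENTREZID,
                                columns = c("TXCHROM", "TXSTART", "TXEND", "TXSTRAND"),
                                keytype = "GENEID")

# Step 3: put the symbols back next to the coordinates.
result <- merge(sym2eg, coords, by.x = "ENTREZID", by.y = "GENEID")
head(result)
```

The result has one row per transcript, so a gene with several transcripts appears several times. Calling genes(txdb) and subsetting by Entrez ID would give one range per gene instead.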

ExpressionSet subsetting

Question: I have an ExpressionSet object that I want to subset. For example,

    > str(ESet)
    Formal class 'ExpressionSet' [package "Biobase"]
    .. ..@ assayData :
    .. ..@ phenoData :
    .. .. .. ..$ STATUS : num [1:210] 1 1 1 1 1 1 1 1 1 1 ...
    ....

I want to extract a subset where STATUS==0. I've tried exprs(ESet@phenoData$STATUS==0), but it does not work.

Answer 1: You are almost there. Guessing at your data structure, I think the following should work: exprs(ESet)[, ESet@phenoData$STATUS==0] If you look at this
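In an ExpressionSet the samples are the columns of the expression matrix and the rows of phenoData, so a filter on a phenotype variable belongs in the column position. A minimal sketch with a made-up 3-gene, 4-sample object (the real one has 210 samples):

```r
library(Biobase)

# Toy data: gene and sample names are invented for illustration.
set.seed(1)
mat <- matrix(rnorm(12), nrow = 3,
              dimnames = list(paste0("g", 1:3), paste0("s", 1:4)))
pd <- data.frame(STATUS = c(1, 0, 0, 1), row.names = colnames(mat))
ESet <- ExpressionSet(assayData = mat, phenoData = AnnotatedDataFrame(pd))

# Subset the whole object: expression values and phenoData stay in sync.
sub <- ESet[, ESet$STATUS == 0]

exprs(sub)   # expression matrix for the STATUS == 0 samples only
pData(sub)   # matching phenotype rows
```

Subsetting the ExpressionSet itself, rather than the matrix returned by exprs(), keeps the sample annotation attached for later steps.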

R: when to use setGeneric or export a s4 method in the namespace

Question: I am writing a small R package that I plan to submit to Bioconductor in the future, which is why I decided to try out S4 classes. Unfortunately, I had trouble understanding when I should use setGeneric in my package, and I find the documentation for setGeneric more or less incomprehensible. Concrete example: I created an S4 class called Foo. I defined a method for the [<- operator using setMethod("[","Foo", ...). I defined a method for the as.list function using
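As a rough rule of thumb, setGeneric is needed when no S4 generic exists yet for the function being extended. The sketch below illustrates three cases with a toy Foo class. The slot name items and the function nItems are hypothetical and not taken from the question.

```r
library(methods)

# Toy class; the 'items' slot is an assumption made for this example.
setClass("Foo", slots = c(items = "list"))

# Case 1: `[` is a primitive with built-in S4 dispatch, so no setGeneric().
setMethod("[", "Foo", function(x, i, j, ..., drop = TRUE) {
  new("Foo", items = x@items[i])
})

# Case 2: as.list is an ordinary base function. Promote it to an S4 generic
# explicitly (setMethod would otherwise do this implicitly).
setGeneric("as.list")
setMethod("as.list", "Foo", function(x, ...) x@items)

# Case 3: a brand-new function (hypothetical name) needs its own generic.
setGeneric("nItems", function(object) standardGeneric("nItems"))
setMethod("nItems", "Foo", function(object) length(object@items))

foo <- new("Foo", items = list(1, "a", TRUE))
nItems(foo[1:2])
as.list(foo)

# In a package NAMESPACE these would typically be exported with:
#   exportClasses(Foo)
#   exportMethods("[", "as.list", nItems)
```

For a Bioconductor submission it is also worth checking whether BiocGenerics already defines the generic, in which case importing it avoids creating a competing definition.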

Modify r object with rpy2

Question: I'm trying to use rpy2 to use the DESeq2 R/Bioconductor package in Python. I actually solved my problem while writing my question (using do_slots allows access to the R object's attributes), but I think the example might be useful for others, so here is how I do it in R and how this translates to Python. In R I can create a "DESeqDataSet" from two data frames as follows:

    counts_data <- read.table("long/path/to/file", header=TRUE, row.names="gene")
    head(counts_data)
    ## WT_RT_1 WT_RT_2 prg1_RT_1

Extracting values from IRanges objects in R/Bioconductor

Question: I've imported a UCSC alignability track into R using import.bw() (from the rtracklayer package) but am having trouble accessing the values I need. For example, I want to provide a chromosome and a base position and get back the value at that position. My object is called al100:

    > al100
    RangedData with 21591667 rows and 1 value column across 25 spaces
         space         ranges |       score
      <factor>      <IRanges> |   <numeric>
    1     chr1 [10001, 10014] | 0.002777778
    2     chr1 [10015, 10015] | 0.333333343
    3     chr1 [10016, 10026] | 0
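One way to look up the value at a single position is to build a one-base query range and overlap it with the track. The sketch below assumes the track is held as a GRanges object rather than the RangedData shown above, and rebuilds only the first two rows of the excerpt by hand. The helper name score_at is hypothetical.

```r
library(GenomicRanges)

# First two rows of the excerpt, rebuilt as a GRanges (an assumption;
# the question shows a RangedData object).
al100 <- GRanges("chr1",
                 IRanges(c(10001, 10015), c(10014, 10015)),
                 score = c(0.002777778, 0.333333343))

# Hypothetical helper: return the score(s) covering one base.
score_at <- function(track, chrom, pos) {
  query <- GRanges(chrom, IRanges(pos, pos))
  hits <- findOverlaps(query, track)
  mcols(track)$score[subjectHits(hits)]
}

score_at(al100, "chr1", 10015)
score_at(al100, "chr1", 5)      # no covering range: zero-length result
```

For many lookups at once, a single findOverlaps() call with a multi-range query would avoid repeating the overlap search per position.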

how use matchpattern() to find certain aminoacid in a file with many sequence(.fasta) in R

Question: I have a file (mydata.txt) that contains many exon sequences in FASTA format. I want to find the start ('atg') and stop ('taa', 'tga', 'tag') codons for each DNA sequence (taking the reading frame into account). I tried using matchPattern (a function from the Biostrings R package) to find these codons. As an example, mydata.txt could be:

    >a
    atgaatgctaaccccaccgagtaa
    >b
    atgctaaccactgtcatcaatgcctaa
    >c
    atggcatgatgccgagaggccagaataggctaa
    >d
    atggtgatagctaacgtatgctag
    >e
    atgccatgcgaggagccggctgccattgactag
    file=read
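A possible sketch of the frame-aware search: take the first ATG in each sequence, collect all stop-codon matches, and keep only those whose offset from the start is a multiple of three. The helper name find_orf is hypothetical, and the first three example sequences are typed in directly instead of being read from mydata.txt.

```r
library(Biostrings)

# Three of the example sequences, upper-cased; a real run would use
# readDNAStringSet("mydata.txt") instead.
seqs <- DNAStringSet(toupper(c(
  a = "atgaatgctaaccccaccgagtaa",
  b = "atgctaaccactgtcatcaatgcctaa",
  c = "atggcatgatgccgagaggccagaataggctaa"
)))

stops <- c("TAA", "TGA", "TAG")

# Hypothetical helper: first ATG and first stop codon in the same frame.
find_orf <- function(s) {
  atg <- start(matchPattern("ATG", s))
  if (length(atg) == 0) return(c(start = NA, stop = NA))
  first <- atg[1]
  stop_pos <- sort(unlist(lapply(stops, function(p) start(matchPattern(p, s)))))
  in_frame <- stop_pos[stop_pos > first & (stop_pos - first) %% 3 == 0]
  c(start = first, stop = if (length(in_frame)) in_frame[1] else NA)
}

res <- do.call(rbind, lapply(seq_along(seqs), function(i) find_orf(seqs[[i]])))
rownames(res) <- names(seqs)
res   # positions are 1-based and refer to the first base of each codon
```

Note that sequence c has an in-frame TGA at position 7, well before the TAA at its end, which is exactly the kind of case a frame-blind search would miss or misreport.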

CRAN Package Depends on Bioconductor Package Installing error

Question: I set up the Depends, Suggests and Imports fields of the DESCRIPTION file and finally submitted my package to CRAN. But when the package is installed, only the dependencies hosted on CRAN are installed, not the Bioconductor packages. In addition, there is a package dependency error on Mac OS (see the check log for Mac OS). What could be the problem, and how can I fix it? Kind regards,

Answer 1: There is no mechanism by which install.packages() can install from Bioconductor by default in R ( at least
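Until the repositories include Bioconductor, a user-side workaround is to pass the Bioconductor repository list to install.packages(). The sketch below wraps this in a hypothetical helper, install_with_bioc, and assumes the BiocManager package from CRAN is acceptable to use. It is not part of the original answer.

```r
# Hypothetical helper: install a CRAN package together with any
# Bioconductor dependencies it declares.
install_with_bioc <- function(pkg) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # BiocManager::repositories() returns the Bioconductor repositories
  # for the running R version plus CRAN.
  install.packages(pkg, repos = BiocManager::repositories())
}

# Usage ("myPackage" is a placeholder name):
# install_with_bioc("myPackage")
```

BiocManager::install("myPackage") does much the same in one call. Either way, documenting this in the package README spares users the missing-dependency error.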