bioconductor

How to concatenate two DNAStringSet sequences per sample in R?

拥有回忆 submitted on 2020-01-07 02:03:57
Question: I have two large DNAStringSet objects. Each contains 2805 entries, and each entry has a length of 201. I want to combine them into a single object that still has 2805 entries, where each entry is the combination of the corresponding entries from both. I tried this:
s12 <- c(unlist(s1), unlist(s2))
But that created a single large DNAString object with 1127610 elements, which is not what I want. I simply want to combine them per sample. EDIT: Each entry in my DNAStringSet objects
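Note: a minimal sketch of one way to do the per-sample concatenation asked about here, using xscat() from Biostrings, which joins sequences element by element. The two small sets and their names are made up for illustration; the real objects would have 2805 entries each.

```r
library(Biostrings)

# Toy stand-ins for s1 and s2 (hypothetical data, two entries each).
s1 <- DNAStringSet(c(sample1 = "ACGT", sample2 = "GGCC"))
s2 <- DNAStringSet(c(sample1 = "TTAA", sample2 = "CATG"))

# xscat() concatenates entry i of s1 with entry i of s2,
# so the result has as many entries as each input.
s12 <- xscat(s1, s2)

# Set the names explicitly in case they are not carried over.
names(s12) <- names(s1)

length(s12)  # number of entries
width(s12)   # per-entry length: sum of the two input widths
```

Unlike c(unlist(s1), unlist(s2)), this never flattens the sets into one long sequence.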

Can't get end value of an IRange in R/Bioconductor

穿精又带淫゛_ submitted on 2020-01-05 10:35:47
Question: I am new to the IRanges package and am having trouble getting the end value of an IRange. I can get the start and width values with no problem, which has me a bit baffled, and my case and spelling of end match the header line. Has anyone else run into this, or can anyone spot what I am doing wrong? Thanks, it is much appreciated!
library(IRanges)
> test=IRanges(100645,100664)
> test
IRanges of length 1
     start    end width
[1] 100645 100664    20
> test@start
[1] 100645
> test@width
[1] 20
>
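Note: a short sketch of the accessor-based approach, which is the usual way to read these values. It assumes that the difficulty comes from direct slot access with @: an IRanges object stores start and width, and the end is derived from them, so the accessor functions are the safer route.

```r
library(IRanges)

test <- IRanges(start = 100645, end = 100664)

# Accessor functions work for all three values,
# regardless of how they are stored internally.
start(test)
end(test)
width(test)
```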

Width of the overlapped segment in GenomicRanges package

為{幸葍}努か submitted on 2020-01-04 01:54:04
Question: I'm using GenomicRanges to find which transcripts from one experiment overlap with those from another.
head(to_ranges1)
   knowngene  chr strand Start   End    Gene
1 uc001aaa.3 chr1      +  9873 16409 DDX11L1
2 uc001aac.4 chr1      - 12361 31370  WASH7P
3 uc001aae.4 chr1      - 12361 21759  WASH7P
library(GenomicRanges)
object_one<-with(to_ranges, GRanges(chr, IRanges(Start,End), strand,names=knowngene,Gene=Gene))
object_two<-with(to_ranges, GRanges(chr, IRanges(Start,End), strand,names=knowngene, Gene=Gene))
mm<
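Note: a minimal sketch of how the width of each overlapped segment could be computed, assuming two GRanges objects. The objects gr1 and gr2 and their coordinates are hypothetical toy data. findOverlaps() pairs up overlapping ranges, and pintersect() intersects each pair so that width() gives the shared length.

```r
library(GenomicRanges)

# Hypothetical toy ranges standing in for the two experiments.
gr1 <- GRanges("chr1", IRanges(c(9873, 12361), c(16409, 31370)),
               strand = c("+", "-"))
gr2 <- GRanges("chr1", IRanges(c(12000, 30000), c(14000, 40000)),
               strand = c("+", "-"))

# Pairs of overlapping ranges (strand-aware by default).
hits <- findOverlaps(gr1, gr2)

# Intersect each overlapping pair and measure the shared segment.
shared <- pintersect(gr1[queryHits(hits)], gr2[subjectHits(hits)])
overlap_width <- width(shared)

data.frame(query = queryHits(hits),
           subject = subjectHits(hits),
           overlap_width = overlap_width)
```

Passing ignore.strand = TRUE to findOverlaps() would also pair ranges on opposite strands, if that is what the comparison needs.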

Genomic coordinates of HGNC gene names

为君一笑 submitted on 2019-12-24 10:14:23
Question: I want to get the coordinates of human genes from my list (consisting of HGNC gene IDs) using the GenomicFeatures and TxDb.Hsapiens.UCSC.hg19.knownGene R packages from Bioconductor.
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb=(TxDb.Hsapiens.UCSC.hg19.knownGene)
my_genes = c("INO80","NASP","INO80D","SMARCA1")
select(txdb, keys = my_genes, columns=c("TXCHROM","TXSTART","TXEND","TXSTRAND"), keytype="GENEID")
However, it doesn't work because txdb doesn't take HGNC identifiers. How can this be solved?
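Note: one possible approach, sketched under the assumption that the org.Hs.eg.db annotation package is installed (it is not mentioned in the question). The gene symbols are first translated to Entrez gene IDs, which is what this TxDb uses as GENEID, and the coordinates are looked up with those IDs.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)  # assumption: used here only to map symbols to Entrez IDs

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
my_genes <- c("INO80", "NASP", "INO80D", "SMARCA1")

# Step 1: gene symbol -> Entrez gene ID.
id_map <- AnnotationDbi::select(org.Hs.eg.db,
                                keys = my_genes,
                                columns = "ENTREZID",
                                keytype = "SYMBOL")

# Step 2: Entrez gene ID -> transcript coordinates.
coords <- AnnotationDbi::select(txdb,
                                keys = id_map$ENTREZID,
                                columns = c("TXCHROM", "TXSTART", "TXEND", "TXSTRAND"),
                                keytype = "GENEID")

# Join the two tables so each coordinate row carries its symbol.
result <- merge(id_map, coords, by.x = "ENTREZID", by.y = "GENEID")
head(result)
```

The result has one row per transcript, so a gene with several transcripts appears several times.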

ExpressionSet subsetting

◇◆丶佛笑我妖孽 submitted on 2019-12-24 01:17:03
Question: I have an ExpressionSet object that I want to subset. For example,
> str(ESet)
Formal class 'ExpressionSet' [package "Biobase"]
.. ..@ assayData :..
..@ phenoData :
.. .. .. ..$ STATUS : num [1:210] 1 1 1 1 1 1 1 1 1 1 ...
....
I want to extract a subset where STATUS==0. I've tried:
exprs(ESet@phenoData$STATUS==0)
but it does not work.
Answer 1: You are almost there. Guessing at your data structure, I think the following should work:
exprs(ESet)[ESet@phenoData$STATUS==0,]
If you look at this
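Note: a small sketch of subsetting the ExpressionSet itself rather than only its expression matrix. The toy object below is hypothetical. It relies on the usual ExpressionSet layout, in which features are rows and samples are columns, so a per-sample condition such as STATUS == 0 goes in the column position.

```r
library(Biobase)

# Hypothetical toy ExpressionSet: 5 features x 4 samples.
expr_mat <- matrix(rnorm(20), nrow = 5,
                   dimnames = list(paste0("gene", 1:5), paste0("s", 1:4)))
pheno <- AnnotatedDataFrame(
  data.frame(STATUS = c(1, 0, 0, 1), row.names = colnames(expr_mat))
)
ESet <- ExpressionSet(assayData = expr_mat, phenoData = pheno)

# `$` on an ExpressionSet reads a phenoData column; the logical vector
# selects samples (columns) and keeps phenoData in sync.
ESet0 <- ESet[, ESet$STATUS == 0]

dim(exprs(ESet0))
pData(ESet0)
```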

R: when to use setGeneric or export an S4 method in the namespace

ⅰ亾dé卋堺 submitted on 2019-12-23 09:49:29
Question: I am writing a small R package that I plan to submit to Bioconductor in the future, which is why I decided to try out S4 classes. Unfortunately, I had trouble understanding when I should use setGeneric in my package, and I find the documentation for setGeneric more or less incomprehensible. Concrete example: I created an S4 class called Foo. I defined a method for the [<- operator using setMethod("[","Foo", ...). I defined a method for the as.list function using
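Note: a minimal sketch illustrating the three situations that usually come up, using a class named Foo as in the question. The slot name items and the generic size are hypothetical and chosen only for this example.

```r
library(methods)

# Hypothetical class definition with a single list slot.
setClass("Foo", representation(items = "list"))

# Case 1: `[` is a primitive that already dispatches on S4 classes,
# so setMethod() alone is enough.
setMethod("[", "Foo", function(x, i, j, ..., drop = TRUE) {
  new("Foo", items = x@items[i])
})

# Case 2: as.list() is an ordinary function in base. setGeneric() with just
# the name creates an S4 generic from the existing function.
setGeneric("as.list")
setMethod("as.list", "Foo", function(x, ...) x@items)

# Case 3: a brand-new verb needs its own generic before any method.
setGeneric("size", function(object) standardGeneric("size"))
setMethod("size", "Foo", function(object) length(object@items))

foo <- new("Foo", items = list(a = 1, b = 2, c = 3))
sub <- foo[2:3]
size(sub)
as.list(sub)
```

In a package, classes, generics and methods defined this way are typically made visible through exportClasses(), export() and exportMethods() directives in the NAMESPACE file.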

Modify r object with rpy2

风格不统一 submitted on 2019-12-22 18:23:27
Question: I'm trying to use rpy2 to use the DESeq2 R/Bioconductor package in Python. I actually solved my problem while writing my question (using do_slots gives access to the R object's attributes), but I think the example might be useful for others, so here is how I do it in R and how this translates to Python. In R, I can create a "DESeqDataSet" from two data frames as follows:
counts_data <- read.table("long/path/to/file", header=TRUE, row.names="gene")
head(counts_data)
## WT_RT_1 WT_RT_2 prg1_RT_1

Extracting values from IRanges objects in R/Bioconductor

萝らか妹 submitted on 2019-12-21 23:26:02
Question: I've imported a UCSC alignability track into R using import.bw() (from the rtracklayer package) but am having trouble accessing the values I need. For example, I want to provide a chromosome and a base position and get back the value at that position. My object is called al100:
> al100
RangedData with 21591667 rows and 1 value column across 25 spaces
     space         ranges |       score
  <factor>      <IRanges> |   <numeric>
1     chr1 [10001, 10014] | 0.002777778
2     chr1 [10015, 10015] | 0.333333343
3     chr1 [10016, 10026] | 0
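Note: a rough sketch of a position lookup, assuming the track is held as a GRanges object with a score column rather than the RangedData object shown in the question. The helper score_at and the toy scores are hypothetical.

```r
library(GenomicRanges)

# Hypothetical toy track with made-up scores.
al100 <- GRanges("chr1",
                 IRanges(start = c(10001, 10015, 10016),
                         end   = c(10014, 10015, 10026)),
                 score = c(0.1, 0.5, 0.9))

# Hypothetical helper: return the score of the range covering one base.
score_at <- function(track, chrom, pos) {
  query <- GRanges(chrom, IRanges(start = pos, end = pos))
  subsetByOverlaps(track, query)$score
}

score_at(al100, "chr1", 10015)
score_at(al100, "chr1", 20000)  # no covering range: zero-length result
```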

How to use matchPattern() to find certain amino acids in a file with many sequences (.fasta) in R

喜你入骨 submitted on 2019-12-21 06:27:14
Question: I have a file (mydata.txt) that contains many exon sequences in FASTA format. I want to find the start ('atg') and stop ('taa','tga','tag') codons for each DNA sequence (taking the reading frame into account). I tried using matchPattern (a function from the Biostrings R package) to find these codons. As an example, mydata.txt could be:
>a
atgaatgctaaccccaccgagtaa
>b
atgctaaccactgtcatcaatgcctaa
>c
atggcatgatgccgagaggccagaataggctaa
>d
atggtgatagctaacgtatgctag
>e
atgccatgcgaggagccggctgccattgactag
file=read
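Note: a minimal sketch of an in-frame codon search, assuming the sequences are held in a DNAStringSet (for a file, readDNAStringSet() would produce one). It uses vmatchPattern(), the multi-sequence counterpart of matchPattern(), and keeps only matches that start at positions 1, 4, 7 and so on. The helper in_frame_starts is hypothetical, and only the first two example sequences are used, written in upper case.

```r
library(Biostrings)

# The first two example sequences from the question.
seqs <- DNAStringSet(c(a = "ATGAATGCTAACCCCACCGAGTAA",
                       b = "ATGCTAACCACTGTCATCAATGCCTAA"))

# Hypothetical helper: start positions of `codon` in reading frame 1,
# returned as one integer vector per sequence.
in_frame_starts <- function(seqs, codon) {
  hits <- vmatchPattern(codon, seqs)
  lapply(startIndex(hits), function(s) {
    s <- as.integer(s)   # sequences without matches give a NULL entry
    s[s %% 3L == 1L]     # keep positions 1, 4, 7, ...
  })
}

start_codons <- in_frame_starts(seqs, "ATG")
stop_codons <- lapply(c(TAA = "TAA", TGA = "TGA", TAG = "TAG"),
                      function(codon) in_frame_starts(seqs, codon))

start_codons
stop_codons
```

The frame is fixed to the first base of each sequence here; other frames would need a different remainder in the filter.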

CRAN Package Depends on Bioconductor Package Installing error

て烟熏妆下的殇ゞ submitted on 2019-12-20 18:34:19
Question: I set up the Depends, Suggests and Imports fields of the DESCRIPTION file and finally submitted my package to CRAN. But when the package is installed, only the dependencies hosted on CRAN are installed, not the Bioconductor packages. In addition, there is a package dependency error on Mac OS: check log for Mac OS. What could be the problem, and how can I fix it? Kind regards,
Answer 1: There is no mechanism by which install.packages() can install from Bioconductor by default in R (at least
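Note: a hedged sketch of one common user-side workaround, assuming the BiocManager package from CRAN is acceptable. The helper install_with_bioc and the package name in the usage comment are hypothetical. BiocManager::install() looks in both the CRAN and Bioconductor repositories, so Bioconductor dependencies of a CRAN package can be resolved.

```r
# Hypothetical helper: install a package together with any
# Bioconductor dependencies it declares.
install_with_bioc <- function(pkg) {
  # BiocManager itself lives on CRAN.
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # Searches CRAN and Bioconductor repositories together.
  BiocManager::install(pkg)
}

# Usage (not run; "myCRANpackage" is a placeholder name):
# install_with_bioc("myCRANpackage")
```

An alternative with the same effect is to call install.packages() with repos = BiocManager::repositories().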