Faster way to split a string and count characters using R?


太阳男子 2021-02-01 08:51

I'm looking for a faster way to calculate GC content for DNA strings read in from a FASTA file. This boils down to taking a string and counting the number of times that the let

6 answers

轮回少年 (original poster)

2021-02-01 09:23
I don't know whether it's any faster, but you might want to look at the R package seqinR (http://pbil.univ-lyon1.fr/software/seqinr/home.php?lang=eng). It is an excellent general-purpose bioinformatics package with many methods for sequence analysis. It's on CRAN (which seems to be down as I write this).

GC content would be:
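```
library(seqinr)  # provides s2c() and GC()
# s2c() converts the string into a vector of single characters
mysequence <- s2c("agtctggggggccccttttaagtagatagatagctagtcgta")
GC(mysequence)  # 0.4761905
```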
That example starts from a string; you can also read in a FASTA file using "read.fasta()".
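For comparison, here is a minimal base R sketch that counts G and C without splitting the string into characters. The helper name `gc_content` is hypothetical and not part of seqinR. It assumes the sequences contain only plain nucleotide letters and divides by the full string length, so results may differ from seqinR's GC() on sequences with ambiguous bases or gaps. It has not been benchmarked here, so treat any speed advantage as something to measure on your own data.

```r
# Hypothetical helper (not from seqinR): GC fraction for each string in a character vector.
gc_content <- function(seqs) {
  seqs <- toupper(seqs)                     # treat "g" and "G" alike
  total <- nchar(seqs)                      # length of each sequence
  non_gc <- nchar(gsub("[GC]", "", seqs))   # length after deleting every G and C
  (total - non_gc) / total                  # GC fraction per sequence
}

gc_content("agtctggggggccccttttaagtagatagatagctagtcgta")  # 0.4761905
```

Because nchar() and gsub() are vectorized, the same call accepts a whole character vector of sequences at once, for example `gc_content(c("GGCC", "atat"))`.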