DNA to RNA and Getting Proteins with Perl

Posted by 二次信任 on 2019-12-10 20:33:11

Question


I am working on a project (which I have to implement in Perl, though I am not good at it) that reads DNA and finds its RNA, then divides that RNA into triplets to get the equivalent protein name. I will explain the steps:

1) Transcribe the following DNA to RNA, then use the genetic code to translate it into a sequence of amino acids.

Example:

TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT

2) To transcribe the DNA, first substitute each base with its complement (i.e., G for C, C for G, T for A, and A for T):

TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT
AGTATTATGCAAAACATAAGCGGTCGCGAAGCCACA

Next, remember that each thymine (T) base becomes uracil (U). Hence our sequence becomes:

AGUAUUAUGCAAAACAUAAGCGGUCGCGAAGCCACA
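
The two substitution steps above can be sketched in Perl with the `tr///` operator. This is only a minimal illustration, not code from the original post: the helper names `complement_dna` and `dna_to_rna` are hypothetical, and the input is assumed to contain only the letters A, C, G and T.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace each base with its complement
# (G <-> C, T <-> A), keeping the reading direction unchanged.
sub complement_dna {
    my ($dna) = @_;
    (my $comp = uc $dna) =~ tr/ACGT/TGCA/;
    return $comp;
}

# Hypothetical helper: replace every thymine (T) with uracil (U).
sub dna_to_rna {
    my ($dna) = @_;
    (my $rna = $dna) =~ tr/T/U/;
    return $rna;
}

my $dna  = 'TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT';
my $comp = complement_dna($dna);
my $rna  = dna_to_rna($comp);

print "$comp\n";   # AGTATTATGCAAAACATAAGCGGTCGCGAAGCCACA
print "$rna\n";    # AGUAUUAUGCAAAACAUAAGCGGUCGCGAAGCCACA
```

Both steps could also be merged into a single `tr/ACGT/UGCA/`; they are kept separate here only to mirror the description.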

To use the genetic code, first split the RNA into triplets:

AGU AUU AUG CAA AAC AUA AGC GGU CGC GAA GCC ACA

Then look each triplet (codon) up in the genetic code table. So AGU becomes Serine, which we can write as Ser, or just S. AUU becomes Isoleucine (Ile), which we write as I. Carrying on in this way, we get:

SIMQNISGREAT
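
Splitting into codons and looking each one up could be sketched as follows. This is an illustration under stated assumptions, not the full solution: the hash `%amino_acid` is deliberately partial and holds only the twelve codons that occur in this example, and the helper name `translate_rna` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Partial codon table: ONLY the codons from the example above.
# A real program needs the complete genetic code table.
my %amino_acid = qw(
    AGU S  AUU I  AUG M  CAA Q  AAC N  AUA I
    AGC S  GGU G  CGC R  GAA E  GCC A  ACA T
);

# Hypothetical helper: split the RNA into triplets and map each
# one to its one-letter amino acid code ('?' if not in the table).
sub translate_rna {
    my ($rna) = @_;
    my @codons = $rna =~ /(.{3})/g;   # a trailing 1-2 bases are dropped
    return join '',
        map { exists $amino_acid{$_} ? $amino_acid{$_} : '?' } @codons;
}

print translate_rna('AGUAUUAUGCAAAACAUAAGCGGUCGCGAAGCCACA'), "\n";
# prints SIMQNISGREAT
```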

Here is the protein table (an image in the original post, not reproduced here):

So how can I write this code in Perl? I will edit my question to add the code I have written so far.


Answer 1:


Try the script below. It accepts input on STDIN (or from a file given as a parameter) and reads it line by line. I also presume that "STOP" in the attached image is some stop state. I hope I read everything correctly from that picture.

#!/usr/bin/perl
use strict;
use warnings;

my %proteins = qw/
    UUU F UUC F UUA L UUG L UCU S UCC S UCA S UCG S UAU Y UAC Y UGU C UGC C UGG W
    CUU L CUC L CUA L CUG L CCU P CCC P CCA P CCG P CAU H CAC H CAA Q CAG Q CGU R CGC R CGA R CGG R
    AUU I AUC I AUA I AUG M ACU T ACC T ACA T ACG T AAU N AAC N AAA K AAG K AGU S AGC S AGA R AGG R
    GUU V GUC V GUA V GUG V GCU A GCC A GCA A GCG A GAU D GAC D GAA E GAG E GGU G GGC G GGA G GGG G
    /;

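LINE: while (<>) {
    chomp;

    # Complement each base and turn T into U in one pass
    # (steps 1 and 2 combined): G->C, C->G, T->A, A->U.
    y/GCTA/CGAU/;

    # Walk the line three characters at a time; leftover bases are ignored.
    foreach my $codon (/(...)/g) {
        if (defined $proteins{$codon}) {
            print $proteins{$codon};
        }
        else {
            # Codon is not in the table: presumably a stop codon.
            print "Whoops, stop state?\n";
            next LINE;
        }
    }
    print "\n";
}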
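
The script above treats every codon missing from its table as a possible stop state. In the standard genetic code the stop codons are UAA, UAG and UGA, which are exactly the three codons that table leaves out. A variant that lists them explicitly might look like the sketch below. It rests on assumptions: that the "STOP" entries in the image correspond to these three codons, that translation should simply end at the first stop codon, and that any other unknown triplet is an input error. The names `%codon_table` and `translate_dna` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same table as in the script above, plus the three standard
# stop codons, marked with '*' (an assumption about the image).
my %codon_table = qw(
    UUU F UUC F UUA L UUG L UCU S UCC S UCA S UCG S UAU Y UAC Y UGU C UGC C UGG W
    CUU L CUC L CUA L CUG L CCU P CCC P CCA P CCG P CAU H CAC H CAA Q CAG Q CGU R CGC R CGA R CGG R
    AUU I AUC I AUA I AUG M ACU T ACC T ACA T ACG T AAU N AAC N AAA K AAG K AGU S AGC S AGA R AGG R
    GUU V GUC V GUA V GUG V GCU A GCC A GCA A GCG A GAU D GAC D GAA E GAG E GGU G GGC G GGA G GGG G
    UAA * UAG * UGA *
);

# Hypothetical helper: DNA string in, one-letter protein string out.
sub translate_dna {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/GCTA/CGAU/;   # same mapping as the script above
    my $protein = '';
    for my $codon ($rna =~ /(.{3})/g) {
        my $aa = $codon_table{$codon};
        die "Unknown codon '$codon'\n" unless defined $aa;
        last if $aa eq '*';                 # stop codon ends translation
        $protein .= $aa;
    }
    return $protein;
}

print translate_dna('TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT'), "\n";  # SIMQNISGREAT
print translate_dna('TACATTGGG'), "\n";  # RNA is AUG UAA CCC, prints M
```

Separating stop codons from unknown triplets lets malformed input (for example a line containing the letter N) fail loudly instead of being reported as a stop state.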


Source: https://stackoverflow.com/questions/5382442/dna-to-rna-and-getting-proteins-with-perl
