E. coli results show that numerous sORFs were overlooked in initial annotation

sRNAs lack common statistical signals in their primary sequence that might be exploited by reliable detection algorithms. Thus, the genome-wide annotation of sRNAs has turned out to be a more complex and demanding problem than expected. Bacterial genes average approximately 1000 nucleotides in sequenced genomes. Annotation of sORFs is difficult because they are "buried" in an enormous pile of short random open reading frames, and their small size makes them unfavorable targets for random mutagenesis. To maintain a balance between underprediction and overprediction, gene prediction usually relies on arbitrary cut-offs, such as a minimum ORF length of 100 codons. This means that many sORFs are not identified, including many with important functions, such as intercellular signals, intracellular toxins, and kinase inhibitors. Systematic analyses of the prevalence of sORFs have been performed in yeast and E. coli, and the results show that numerous sORFs were overlooked in initial annotation.
Shigella species are Gram-negative, non-sporulating, facultative anaerobes that cause bacillary dysentery, a disease that remains a major worldwide health problem. They are sub-grouped into four species: Shigella dysenteriae, Shigella flexneri, Shigella boydii, and Shigella sonnei. However, multilocus enzyme electrophoresis, multilocus sequence typing, and comparative genomic hybridization suggest that Shigella diverged from E. coli in several independent events, which means that it may not constitute a separate genus. Results from several Shigella genome sequencing projects suggest that many sRNAs and sORFs were overlooked during initial annotation. Huang et al. reported that the numbers of sRNA genes in S. dysenteriae, S. flexneri, S. boydii, and S. sonnei were 33, 40, 35, and 38, respectively. However, these results were incomplete: the majority were identified in E. coli K12 on the basis of conservation, meaning that sRNAs unique to Shigella were missed.
Therefore, we performed a systematic analysis of sRNAs in Shigella. No previous reports exist of global experimental approaches for sRNA and sORF identification in Shigella. Here we present a combined bioinformatic and experimental approach for finding sRNAs and sORFs in Shigella.
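To illustrate how a minimum-length cut-off of the kind described above hides sORFs, the following minimal Python sketch scans the forward strand of a toy sequence for ATG-to-stop ORFs and applies a codon threshold. The function name `find_orfs`, the toy sequence, and the relaxed 10-codon threshold are illustrative assumptions and are not the pipeline used in this study; real gene finders also consider the reverse strand, alternative start codons, and coding statistics.

```python
# Illustrative sketch only: a forward-strand ORF scan with a length cut-off.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons):
    """Return (start, end, codons) tuples for ATG..stop ORFs on the forward strand.

    'codons' counts coding codons and excludes the stop codon.
    In each frame, only the first ATG after the previous stop is used.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOPS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None
    return orfs

# Toy sequence (hypothetical): a 30-codon sORF, a 2-nt spacer, then a 120-codon ORF.
short_orf = "ATG" + "GCT" * 29 + "TAA"
long_orf = "ATG" + "GCT" * 119 + "TGA"
toy = short_orf + "CC" + long_orf

conventional = find_orfs(toy, min_codons=100)  # typical annotation cut-off
relaxed = find_orfs(toy, min_codons=10)        # hypothetical relaxed cut-off

print("100-codon cut-off:", conventional)
print("10-codon cut-off: ", relaxed)
```

With the 100-codon threshold only the long ORF is reported, whereas the relaxed threshold also recovers the 30-codon sORF. On a real genome a relaxed threshold would in addition return a large number of random short ORFs, which is why sequence-based prediction alone is insufficient and needs experimental support.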