The long march overcomes these challenges by extending the average contig length and significantly increasing the target sequence coverage obtained from high-throughput short-read sequencing technologies, without the cost of obtaining more reads per sample or the high error rate of directly extending read lengths. High-throughput sequencing platforms generally require the addition of adapters to the ends of DNA fragments. The long march uses repeated cycles of Type IIS restriction enzyme cleavage and adapter ligation to allow extended sequencing of each library amplicon without loss of gene expression information. We have demonstrated the utility of the long march in the context of transcriptome resequencing, as well as in the context of clinical specimen metagenomics. We have also provided a theoretical framework for the application of the long march to de novo genome assembly. The long march protocol capitalizes on amplicon library redundancies resulting from biases introduced during sample preparation. These redundancies typically result in wasteful sequencing of multiple identical short reads derived from the ends of identical amplicons. For the Plasmodium falciparum and HBV samples described here, the long march extended the amount of genome coverage within a dataset of a fixed number of reads, even when that dataset was relatively small. This extension in genome coverage stems from narrowing the dynamic range of individual nucleotide coverage, since redundant reads from the initial libraries were distributed over a longer distance after the libraries were marched. In metagenomic analysis, short-read redundancy can obscure the identities of the organisms present in the sample. Characterization of microbial diversity and function from metagenomic sequence data depends on the identification of homology to known biological sequences.
Longer contigs permit more effective detection of genetic homology to known sequences by use of BLASTN or TBLASTX. The availability of greater coverage and longer contigs from the long march improves the likelihood of successful alignment and thus discovery of both known and novel organisms in a heterogeneous metagenomic sample. The ability to assemble overlapping reads into reliable contigs is also crucial for de novo genome sequencing applications. With standard amplicon libraries, assembly relies on chance to produce reads with sufficient overlap, and short-read datasets pose particular challenges by limiting the amount of overlap obtainable between any two reads. The long march allows read overlaps to be biased toward lengths sufficient for accurate assembly but also conservative enough to promote contig growth. Informed choice of restriction enzyme allows adjustment of the procedure's step size to facilitate accurate assembly of a predicted number of unique sequences.
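
The trade-off between read overlap and contig growth can be illustrated with a small arithmetic sketch. It assumes an idealized model that is not taken from the protocol itself: every march advances the sequencing start by exactly one fixed step (the distance at which the chosen Type IIS enzyme cuts), and every read has the same length. The function names and the values for read length and step size are hypothetical and chosen only for illustration.

```python
def march_layout(read_len, step, n_marches):
    """Return 0-based, half-open (start, end) intervals covered by the
    initial read and by the read obtained after each march.

    Assumption: each march shifts the read start by exactly `step` bases.
    """
    if not 0 < step <= read_len:
        raise ValueError("step must be positive and no longer than the read")
    if n_marches < 0:
        raise ValueError("n_marches must be non-negative")
    return [(i * step, i * step + read_len) for i in range(n_marches + 1)]


def read_overlap(read_len, step):
    """Bases shared by two consecutive marched reads."""
    return read_len - step


def contig_span(read_len, step, n_marches):
    """Total bases covered by one amplicon after n_marches marches."""
    return read_len + n_marches * step


# Hypothetical example: 36-base reads, a 20-base step, three marches.
layout = march_layout(36, 20, 3)
print(layout)
print("overlap per step:", read_overlap(36, 20))
print("contig span:", contig_span(36, 20, 3))
```

Under this model, a smaller step gives larger overlaps between consecutive reads (safer assembly) but slower contig growth, while a larger step does the reverse. This is the adjustment that the choice of restriction enzyme makes available.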