Reptile is a program written in C++ for correcting sequencing errors in short reads from next-generation sequencing platforms. Reptile has several favorable properties:
Memory efficiency. Reptile can process input data larger than main memory. For instance, processing a 160x-coverage (3.8 GB) Illumina dataset for E. coli requires only ~1 GB of memory, which is readily available on a desktop computer.
High speed. Processing Illumina data for a microbe typically takes 0.5 to 2 hours, depending on the number and quality of the reads.
Handles reads containing non-ACGT characters and reads of unequal length.
Makes simple use of quality score information.
Reptile was developed by Xiao Yang, Karin Dorman, and Srinivas Aluru.
Note: the default values of the program parameters are dataset dependent, i.e., they vary from dataset to dataset and hence are not “fixed” or “standard”.
The calculation of these parameters can be automated, but currently many of them must be set manually using the method explained in the paper, which assumes no knowledge of the reference genome. In general, the default parameters are chosen based on histograms of the dataset under consideration, such as those of quality scores and tile occurrences.
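As a rough illustration of the kind of histograms mentioned above, the following Python sketch tallies quality scores and k-mer multiplicities for a small set of reads. It is not part of Reptile and does not reproduce the parameter selection method from the paper. The function names are hypothetical, the input is assumed to be strict four-line FASTQ with a Phred offset of 33 (older Illumina data often uses 64), and plain k-mers are used as a simplified stand-in for tiles.

```python
from collections import Counter
from itertools import islice


def quality_histogram(fastq_lines, offset=33):
    """Count how often each Phred quality score occurs.

    Assumes strict 4-line FASTQ records and the given ASCII offset
    (both are assumptions of this sketch, not Reptile settings).
    """
    hist = Counter()
    # In 4-line FASTQ, the quality string is the 4th line of each record.
    for qual in islice(fastq_lines, 3, None, 4):
        hist.update(ord(c) - offset for c in qual.strip())
    return hist


def kmer_multiplicity_histogram(reads, k):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous (non-ACGT) characters such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return Counter(counts.values())


# Tiny made-up example: two reads, one containing an ambiguous base.
fastq = [
    "@r1", "ACGTAC", "+", "IIII#I",
    "@r2", "ACGTNC", "+", "III##I",
]
qhist = quality_histogram(fastq)
khist = kmer_multiplicity_histogram(fastq[1::4], k=3)

for score in sorted(qhist):
    print(f"quality {score}: {qhist[score]} bases")
for mult in sorted(khist):
    print(f"{khist[mult]} distinct 3-mers occur {mult} time(s)")
```

On real data, inspecting where such histograms separate low-quality bases from high-quality ones, or rare k-mers from frequent ones, gives a starting point for choosing thresholds. The actual procedure for setting Reptile's parameters is the one described in the paper and the readme.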
Release 1.0 (download): includes a simple documentation (readme) file, all source files, and a preprocessing Perl script.
Release 1.1 (download, Aug 2010): includes a simple documentation (readme) file, a release note, and all source files.