discoSNP runs in two steps: (1) detection of putative SNPs from the read datasets; (2) filtering and ranking of these candidates based on coverage and base quality. Because it relies on the GATB-core library, the first step can handle very large datasets of billions of reads with a reasonable amount of memory: processing the mouse datasets required less than 6 GB of RAM. In comparison, the NIKS, KissSNP and Bubbleparse tools exceeded the memory limit on a server with 512 GB of RAM.
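To make the second step more concrete, the following is a minimal Python sketch of filtering and ranking candidate SNPs by coverage and base quality. It is an illustration only and not discoSNP's actual implementation: the `Candidate` record, the `filter_and_rank` function, the thresholds (`min_coverage`, `min_quality`) and the ranking key are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one putative SNP (not a discoSNP data structure)."""
    name: str
    coverage: Tuple[int, int]  # read coverage of the two allele paths
    mean_quality: float        # mean Phred quality at the variant position


def filter_and_rank(candidates: List[Candidate],
                    min_coverage: int = 4,
                    min_quality: float = 20.0) -> List[Candidate]:
    """Drop weakly supported candidates, then sort the rest best-first.

    Assumed rule: keep a candidate if at least one allele reaches
    min_coverage and the mean base quality reaches min_quality.
    Assumed ranking: total coverage first, then mean quality.
    """
    kept = [
        c for c in candidates
        if max(c.coverage) >= min_coverage and c.mean_quality >= min_quality
    ]
    return sorted(kept,
                  key=lambda c: (sum(c.coverage), c.mean_quality),
                  reverse=True)


# Made-up candidates, for illustration only.
candidates = [
    Candidate("snp_a", (12, 9), 35.0),
    Candidate("snp_b", (2, 1), 38.0),    # removed: coverage too low
    Candidate("snp_c", (30, 25), 12.0),  # removed: base quality too low
    Candidate("snp_d", (40, 3), 30.0),
]
ranked = filter_and_rank(candidates)
print([c.name for c in ranked])  # ['snp_d', 'snp_a']
```

Separating the filter from the sort keeps the two criteria independent, so the thresholds can be tuned without touching the ranking key.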
References
- Wong K., Bumpstead S., Van Der Weyden L., Reinholdt L. G., Wilming L. G., Adams D. J., Keane T. M. (2012) Sequencing and characterization of the FVB/NJ mouse genome. Genome Biol. 13(8):R72
- Uricaru R., Rizk G., Lacroix V., Quillery E., Plantard O., Chikhi R., Lemaitre C., Peterlongo P. (2015) Reference-free detection of isolated SNPs. Nucl. Acids Res. 43(2):e11