MindTheGap performs detection and assembly of DNA insertion variants in NGS read datasets with respect to a reference genome. It is designed to call insertions of any size, whether they are novel or duplicated, homozygous or heterozygous in the donor genome. It takes as input a set of reads and a reference genome. It outputs two sets of FASTA sequences: one is the set of breakpoints of detected insertion sites, the other is the set of assembled insertions for each breakpoint.
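Because both outputs are FASTA files, a short script is enough for a first look at the results. The following is a minimal sketch, not part of MindTheGap itself: the function names `read_fasta` and `insertion_lengths` and the record names in the example are hypothetical, and the exact header layout written by MindTheGap is not assumed here.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect them piece by piece.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def insertion_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA header to the length of its sequence."""
    return {header: len(seq) for header, seq in read_fasta(lines)}


# Hypothetical records standing in for an assembled-insertions file.
example = """>insertion_1
ACGTACGT
ACGT
>insertion_2
TTGA
""".splitlines()

print(insertion_lengths(example))
```

To run it on a real result file, pass an open file handle instead of the example lines, for instance `insertion_lengths(open("insertions.fasta"))`, where the file name is a placeholder for whatever output name was chosen.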
Since version 2.1.0, MindTheGap can also be used as a genome assembly finishing tool: it can fill the gaps between a set of input contigs without any prior knowledge of their relative order and orientation. This feature is available in the Fill module with the option -contig.
A fuller description, the source code, and released versions are available on GitHub: https://github.com/GATB/MindTheGap
Reference
Guillaume Rizk, Anaïs Gouin, Rayan Chikhi and Claire Lemaitre. (2014) MindTheGap: integrated detection and assembly of short and long insertions. Bioinformatics, 30(24):3451-3457.
Ready-to-use executable
MindTheGap is available as a binary for immediate use on Linux and MacOSX platforms, with the following requirements:
Mac OS X 10.9 or above (Intel 64-bit processors): Download
Linux running on Intel or AMD 64-bit processors (kernel 2.6.32 or above, GLIBCXX_3.4.13 or above): Download
For all other platforms or configurations, or if the above binaries fail to run on your computer, download the source code and compile it.
Source Code
MindTheGap is written entirely in C++. Download
Documentation
How to use MindTheGap: see the README file.
The algorithms in pictures: see the ECCB’14 poster, mindthegap_poster.pdf
Some results
Since publication, the software has been greatly improved in running time as well as in recall and precision. Have a look at the new results here.
License
MindTheGap binaries and source code are covered by the Affero GPL version 3 license.