MindTheGap performs detection and assembly of DNA insertion variants in NGS read datasets with respect to a reference genome. It is designed to call insertions of any size, whether they are novel or duplicated, homozygous or heterozygous in the donor genome. It takes as input a set of reads and a reference genome. It outputs two sets of FASTA sequences: one is the set of breakpoints of detected insertion sites, the other is the set of assembled insertions for each breakpoint.
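
Because both outputs are plain FASTA, they can be inspected with a few lines of scripting. The following is a minimal Python sketch, not part of MindTheGap itself, that reads FASTA records and reports the length of each sequence, for instance to get a quick overview of the sizes of assembled insertions. The function names and the record headers in the example are hypothetical; the header format actually written by MindTheGap is not assumed here.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequence_lengths(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (header, sequence length) for every FASTA record, in file order."""
    return [(header, len(seq)) for header, seq in read_fasta(lines)]


# Hypothetical records standing in for an assembled-insertions FASTA file.
example = [
    ">insertion_1",
    "ACGTAC",
    "GGT",
    ">insertion_2",
    "TTGA",
]
for name, length in sequence_lengths(example):
    print(f"{name}\t{length}")
```

To run this on a real output file, pass an open file handle instead of the in-memory list, for example `sequence_lengths(open(path))` with `path` pointing to the FASTA file of interest.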

Since version 2.1.0, MindTheGap can also be used as a genome assembly finishing tool: it can fill the gaps between a set of input contigs without any prior knowledge of their relative order and orientation. This feature is available in the Fill module with the option -contig.

A more detailed description, the source code and released versions are available on GitHub: https://github.com/GATB/MindTheGap


Guillaume Rizk, Anaïs Gouin, Rayan Chikhi and Claire Lemaitre. (2014) MindTheGap: integrated detection and assembly of short and long insertions. Bioinformatics, 30(24):3451-3457.

Ready-to-use executable

MindTheGap is available as a binary for immediate use on Linux and MacOSX platforms, with the following requirements:

MacOS-X 10.9 or above (Intel 64-bit processors).
Linux running on Intel or AMD 64-bit processors (kernel 2.6.32 or above, GLIBCXX_3.4.13 or above).

For all other platforms or configurations, or if the binaries above fail to run on your computer, you should download the source code and compile it.

Source Code

MindTheGap is written entirely in C++. The source code can be downloaded from the GitHub repository: https://github.com/GATB/MindTheGap


How to use MindTheGap: full instructions are in the README file.

The algorithms in pictures: see the ECCB’14 poster (mindthegap_poster.pdf).

Some results

Since publication, the software has been greatly improved in running time as well as in recall and precision. Have a look at the new results here.


MindTheGap binaries and source code are covered by the Affero GPL version 3 license.
