New GATB-CORE version 1.0.6

Version 1.0.6 of the GATB-CORE library is now available.

This version provides:

  • a 2x to 4x speed-up of the k-mer counting and graph construction phases (optimizations based on minimizers and improved Bloom filters). GATB’s k-mer counter now uses techniques from KMC2 and achieves running times competitive with KMC2.
  • the ability to store arbitrary information associated with each k-mer of the graph, enabled by a minimal perfect hash function (which costs only 2.61 bits of memory per k-mer)
  • an improved API with new possibilities (bank and k-mer management)
  • many new snippets showing how to use the library.
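
To give an idea of what "optimizations based on minimizers" means, here is a minimal Python sketch. It is not GATB-CORE code and does not use the library's API: the function names `minimizer` and `partition_kmers` are hypothetical, and the minimizer is taken to be the lexicographically smallest m-mer of a k-mer, which is simpler than the ordering used by KMC2. The point is only that consecutive k-mers tend to share a minimizer, so they can be grouped into the same partition and counted together.

```python
from collections import defaultdict


def minimizer(kmer: str, m: int) -> str:
    """Return the lexicographically smallest m-mer of a k-mer (simplified definition)."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def partition_kmers(seq: str, k: int, m: int) -> dict:
    """Group every k-mer of seq by its minimizer (illustrative, not GATB's scheme)."""
    buckets = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        buckets[minimizer(kmer, m)].append(kmer)
    return dict(buckets)


if __name__ == "__main__":
    # Each partition can then be counted independently, e.g. one per thread or file.
    for mini, kmers in partition_kmers("ACGTACGT", k=5, m=3).items():
        print(mini, kmers)
```

In this toy input, three of the four 5-mers fall into the same partition, which is the locality that a minimizer-based counter exploits.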

The library can be downloaded here.

The reference library API is available here.

Tools based on the library can be found here.

Here, we compare versions 1.0.5 and 1.0.6 on ERR599057 (Tara Oceans, 32 Gbp) with 8 cores and 4 GB of memory:

                         v1.0.5                 v1.0.6
---------------------------------------------------------
DSK                    109 min 29 sec       27 min 34 sec
Bloom                    2 min 37 sec        2 min 48 sec
Debloom                 55 min 43 sec       11 min 29 sec
Branching                2 min 52 sec        4 min 33 sec
total  (time & cpu)    173 min (186%)       48 min (334%)
max mem                6972 MB              4777 MB
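
The per-step speed-ups implied by the table can be worked out with a few lines of Python. This is only arithmetic on the figures reported above; the helper names (`to_seconds`, `speedup`, `TIMINGS`) are made up for this example.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a 'min sec' timing from the table into seconds."""
    return minutes * 60 + seconds


# (v1.0.5, v1.0.6) wall-clock timings, copied from the table above.
TIMINGS = {
    "DSK": (to_seconds(109, 29), to_seconds(27, 34)),
    "Bloom": (to_seconds(2, 37), to_seconds(2, 48)),
    "Debloom": (to_seconds(55, 43), to_seconds(11, 29)),
    "Branching": (to_seconds(2, 52), to_seconds(4, 33)),
    "total": (to_seconds(173), to_seconds(48)),
}


def speedup(old: float, new: float) -> float:
    """Ratio old/new: values above 1 mean v1.0.6 is faster."""
    return old / new


if __name__ == "__main__":
    for step, (old, new) in TIMINGS.items():
        print(f"{step:10s} x{speedup(old, new):.2f}")
    # Peak memory, in MB, compared the same way.
    print(f"{'max mem':10s} x{speedup(6972, 4777):.2f}")
```

The gain comes from the DSK and Debloom steps, which are roughly 4x and 5x faster; Bloom and Branching are slightly slower in 1.0.6, but they account for only a small share of the total time.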
