From reads to De Bruijn graph

In short, GATB-Core takes a set of reads as input (FASTA or FASTQ files) and builds a De Bruijn graph representing these reads. The graph is saved in an HDF5 file. A tool (see GATB-TOOLS) then loads it into memory to perform various genome analysis tasks.
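
To make the idea of "representing reads as a De Bruijn graph" concrete, here is a minimal sketch in Python. It is not GATB-Core code and does not use its API: the function names `kmers` and `build_de_bruijn` are hypothetical, and the sketch assumes a node-centric graph in which each distinct k-mer is a node and two nodes are linked when the last k-1 characters of one equal the first k-1 characters of the other. It ignores reverse complements, sequencing errors, file parsing, and memory efficiency, all of which a real tool has to handle.

```python
from collections import defaultdict


def kmers(read, k):
    """Yield every substring of length k (k-mer) of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Build a toy De Bruijn graph (illustrative only, not GATB-Core's structure).

    Returns a dict mapping each distinct k-mer to the sorted list of k-mers
    that can follow it, i.e. whose (k-1)-prefix equals its (k-1)-suffix.
    """
    # Nodes: the set of distinct k-mers seen in any read.
    nodes = set()
    for read in reads:
        nodes.update(kmers(read, k))

    # Index nodes by their (k-1)-prefix so successors can be looked up directly.
    by_prefix = defaultdict(set)
    for node in nodes:
        by_prefix[node[:-1]].add(node)

    # Edges: node -> every node whose prefix matches this node's suffix.
    return {node: sorted(by_prefix.get(node[1:], ())) for node in nodes}


if __name__ == "__main__":
    graph = build_de_bruijn(["ACGTAC", "GTACG"], k=3)
    for node in sorted(graph):
        print(node, "->", graph[node])
```

In this toy example the two overlapping reads collapse into four shared nodes, which illustrates why the graph is a compact summary of redundant read data.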


One distinctive feature of GATB-Core is the De Bruijn graph structure it uses to represent reads. Historically, this structure comes from the Minia software. Read more about GATB-Core concepts here. GATB-Core offers state-of-the-art performance:

                          Bacterial dataset    Whole human dataset    Large (meta-)genome (10-20 Gbp)
Graph construction time   few minutes          6 hours                1-3 days
Memory usage              < 1 GB               < 10 GB                < 100 GB
