The heart of GATB is GATB-Core, a high-performance C++ library with a low memory footprint.
GATB-Core natively provides the following operations:
- Reads handling
- K-mer
- de Bruijn graph
Other optimized data structures
In addition to the de Bruijn graph data structure, GATB-Core provides several other data structures that can be useful for general-purpose development:
- Open-Addressing Hash Table
- Linked-List Hash Table
- Bloom Filters, in several flavors (basic, cache-optimized, optimized for k-mer neighbors), accessible through BloomFactory
- Minimal Perfect Hash Function (BBHash)
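To make the "basic" Bloom filter flavor concrete, here is a minimal sketch using only the C++ standard library. It is not GATB-Core's implementation and does not use BloomFactory: the class name `SimpleBloom`, its constructor parameters, and the double-hashing scheme are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Illustrative basic Bloom filter (hypothetical class, NOT the GATB-Core API).
class SimpleBloom {
public:
    SimpleBloom(std::size_t nbBits, std::size_t nbHashes)
        : bits_(nbBits, false), nbHashes_(nbHashes) {
        assert(nbBits > 0 && nbHashes > 0);
    }

    // Set the nbHashes_ bit positions derived from the item.
    void insert(const std::string& item) {
        for (std::size_t i = 0; i < nbHashes_; ++i) bits_[index(item, i)] = true;
    }

    // May return true for an item that was never inserted (false positive),
    // but never returns false for an item that was inserted.
    bool contains(const std::string& item) const {
        for (std::size_t i = 0; i < nbHashes_; ++i)
            if (!bits_[index(item, i)]) return false;
        return true;
    }

private:
    // Double hashing: position_i = (h1 + i * h2) mod nbBits.
    std::size_t index(const std::string& item, std::size_t i) const {
        std::uint64_t h1 = std::hash<std::string>{}(item);
        std::uint64_t h2 = fnv1a(item) | 1;  // force an odd step
        return static_cast<std::size_t>((h1 + i * h2) % bits_.size());
    }

    // 64-bit FNV-1a, used here as the second hash function.
    static std::uint64_t fnv1a(const std::string& s) {
        std::uint64_t h = 14695981039346656037ULL;
        for (unsigned char c : s) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    std::vector<bool> bits_;
    std::size_t nbHashes_;
};
```

A typical use would be to construct `SimpleBloom bf(1024, 3)`, call `bf.insert("ACGT")` for each k-mer, and then query with `bf.contains(...)`. The cache-optimized and neighbor-optimized flavors mentioned above change how bit positions are chosen; they are not covered by this sketch.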
The GATB-Core library serves as a layer to develop tools and pipelines to decipher NGS data.
Since GATB-Core is a software library, its audience is mainly developers who want to create software for custom NGS data analysis tasks. Example uses include assembly tools, de novo variant detection, read error correction, and read compression.
From a developer's point of view, the GATB-Core library provides APIs for creating, loading, and traversing de Bruijn graphs, counting k-mers, etc. These APIs are intended to be simple to use and should make it easy to develop new software.
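As a rough picture of what k-mer counting and de Bruijn graph traversal involve, the following standard-library sketch counts k-mers in a set of reads and lists the successors of a node. It deliberately does not use the GATB-Core API: `KmerCounts`, `countKmers`, and `successors` are hypothetical names, k-mers are stored as plain strings, and reverse complements are ignored, all of which are simplifying assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper type: maps each k-mer to its number of occurrences.
using KmerCounts = std::unordered_map<std::string, std::size_t>;

// Count every k-mer (substring of length k) found in the reads.
KmerCounts countKmers(const std::vector<std::string>& reads, std::size_t k) {
    assert(k > 0);
    KmerCounts counts;
    for (const std::string& read : reads) {
        // Reads shorter than k contribute nothing: the loop body never runs.
        for (std::size_t i = 0; i + k <= read.size(); ++i)
            ++counts[read.substr(i, k)];
    }
    return counts;
}

// Successors of a node in the implicit de Bruijn graph: drop the first
// nucleotide, append one of A/C/G/T, and keep the k-mers that were observed.
std::vector<std::string> successors(const KmerCounts& counts,
                                    const std::string& kmer) {
    assert(!kmer.empty());
    std::vector<std::string> out;
    for (char c : std::string("ACGT")) {
        std::string next = kmer.substr(1) + c;
        if (counts.count(next) != 0) out.push_back(next);
    }
    return out;
}
```

For the reads `ACGTAC` and `CGTA` with k = 3, `countKmers` finds four distinct k-mers (`ACG`, `CGT`, `GTA`, `TAC`), and `successors(counts, "ACG")` yields only `CGT`. A string-keyed hash map is far less memory-efficient than a dedicated library, so this only illustrates the concepts.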
GATB-Core is available through an open-source C++ API. A wrapper for Python 3 (pyGATB) is also available.
You can download GATB-Core as an archive containing the library and the header files, or clone the Git repository from GitHub if you need the source code.
The best way to learn how to use GATB-Core as a developer is to start with the tutorials, then turn to the reference documentation for further details.