BMC Bioinformatics. 2020 Jul 6;21(1):285.
doi: 10.1186/s12859-020-03623-1.

Optical map guided genome assembly

Miika Leinonen et al. BMC Bioinformatics.

Abstract

Background: The long reads produced by third generation sequencing technologies have significantly improved the results of genome assembly, but genome-wide assemblies still cannot be produced from read data alone. Optical mapping data, for example, has therefore been used to further improve genome assemblies, but it has mostly been applied in a post-processing stage after contig assembly.

Results: We propose OPTICALKERMIT, which directly integrates genome-wide optical maps into contig assembly. We show how genome-wide optical maps can be used to localize reads on the genome, and we then adapt the Kermit method, which originally incorporated genetic linkage maps into the miniasm assembler, to use this information in contig assembly. Our experimental results show that incorporating genome-wide optical maps into the contig assembly of miniasm increases NGA50 while the number of misassemblies decreases or stays the same. Furthermore, when compared to the Canu assembler, OPTICALKERMIT produces an assembly with almost three times higher NGA50 and a lower number of misassemblies on real A. thaliana reads.

Conclusions: OPTICALKERMIT successfully incorporates optical mapping data directly into contig assembly of eukaryotic genomes. Our results show that this is a promising approach to improve the contiguity of genome assemblies.

Keywords: Genome assembly; Optical mapping.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Example DNA sequence and its optical map corresponding to the XhoI restriction enzyme. XhoI recognizes ‘CTCGAG’ sites and cleaves the sequence after the first C nucleotide of each site. The lengths of the resulting fragments are then measured to compose the optical map.
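The digestion described in the Fig. 1 caption can be illustrated with a minimal in silico sketch. It assumes an idealized, error-free, linear, single-stranded input; the function name `optical_map` and its parameters are hypothetical and are not taken from OPTICALKERMIT. Real optical maps carry sizing errors and missing or spurious cut sites, which this sketch ignores.

```python
def optical_map(sequence: str, site: str = "CTCGAG", cut_offset: int = 1) -> list[int]:
    """Return the fragment lengths obtained by cutting `sequence` at every
    occurrence of `site`. `cut_offset` is the number of bases of the site
    that stay on the left fragment (1 = cut after the first C of CTCGAG)."""
    seq = sequence.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)      # coordinate of the cut
        pos = seq.find(site, pos + 1)      # look for the next site
    bounds = [0] + cuts + [len(seq)]
    # Fragment lengths are the distances between consecutive boundaries.
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence with two XhoI sites (hypothetical data, for illustration only).
toy = "AAAA" + "CTCGAG" + "TTTTTT" + "CTCGAG" + "GG"
fragments = optical_map(toy)
print(fragments)
```

The fragment lengths always sum to the length of the input sequence, which is a convenient sanity check for this kind of simulation.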
Fig. 2
OPTICALKERMIT assembly workflow.
Fig. 3
Contig coloring example illustrating four different fragment alignment cases. Case 1: Contig fragment a aligns to reference fragment B and is colored with the color of B. Case 2: Contig fragment b aligns to reference fragments C and D. It is split into two fragments b1 and b2 whose total length is equal to the length of b, keeping the proportions of C and D. Fragment b1 is colored with the color of fragment C, and b2 with the color of D. Case 3: Contig fragments c and d align to reference fragment E. Fragments c and d are merged into one fragment cd, which is colored with the color of fragment E. Case 4: Contig fragments e and f align to reference fragments F and G. Fragments e and f are transformed into two fragments ef1 and ef2 with the total length of e and f, keeping the proportions of fragments F and G. Fragment ef1 is colored with the color of fragment F, and ef2 with the color of G.
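The four cases in the Fig. 3 caption share one rule: the total length of the aligned contig fragments is preserved and redistributed in proportion to the reference fragments they align to. The sketch below illustrates that rule only. The function `color_fragments` and its data layout are hypothetical and are not the OPTICALKERMIT implementation.

```python
def color_fragments(contig_lengths, reference_fragments):
    """Redistribute the total length of the aligned contig fragments over
    the reference fragments, keeping the reference proportions.

    contig_lengths:      lengths of the contig fragments in the alignment
    reference_fragments: list of (color, length) pairs they align to
    Returns a list of (color, length) pairs for the recolored contig.
    """
    total = sum(contig_lengths)
    ref_total = sum(length for _, length in reference_fragments)
    if ref_total <= 0:
        raise ValueError("reference fragments must have positive total length")
    return [(color, total * length / ref_total)
            for color, length in reference_fragments]


# Hypothetical lengths, chosen only to illustrate the four cases.
case1 = color_fragments([950], [("B", 1000)])                    # one to one
case2 = color_fragments([900], [("C", 1000), ("D", 2000)])       # split
case3 = color_fragments([400, 600], [("E", 1100)])               # merge
case4 = color_fragments([500, 700], [("F", 1000), ("G", 3000)])  # merge, then split
print(case1, case2, case3, case4)
```

Handling all four cases with one proportional rule keeps the sketch short. An actual implementation would also need to decide which fragments belong to the same alignment, which is outside the scope of this example.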
Fig. 4
Simplified example of the two different cases of read coloring when a reasonably strong alignment with a colored contig is found. Case 1: Left-hand side coloring is based on the aligning segment only. Case 2: Right-hand side coloring is based on the aligning segment extended with the lengths of the non-aligning sections of the read.
