1 Liolios K, Mavromatis K, Tavernarakis N, Kyrpides NC (2008) The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 36: 475–479.

2 Vincens P, Buffat L, André C, Chevrolat JP, Boisvieux JF, et al. (1998) A strategy for finding regions of similarity in complete genome sequences. Bioinformatics 14: 715–725.

3 Kurtz S, Schleiermacher C (1999) REPuter: Fast computation of maximal repeats in complete genomes. Bioinformatics 15(5): 426–427.

4 Ferragina P, Grossi R (1999) The string B-tree: A new data structure for string search in external memory and its applications. J Assoc Comput Mach 46: 236–280.

5 Choi JH, Cho HG (2002) Analysis of common k-mers for whole genome sequences using SSB-tree. Genome Inform 13: 30–41.

6 Marçais G, Kingsford C (2011) A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27(6): 764–770.

7 Karp RM, Rabin MO (1987) Efficient randomized pattern-matching algorithms. IBM J Res Dev 31(2): 249–260.

8 Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, et al. (2007) Ensembl 2007. Nucleic Acids Res (Database issue) 35: 610–617.

9 Stoesser G, Baker W, van den Broek AE, Camon E, Hingamp P, et al. (2000) The EMBL Nucleotide Sequence Database. Nucleic Acids Res 28: 19–23.

10 Khil PP, Camerini-Otero RD (2005) Molecular features and functional constraints in the evolution of the mammalian X chromosome. Crit Rev Biochem Mol Biol 40(6): 313–330.