The completeness and contamination values were calculated using the CheckM tool. According to CheckM2, the genome is 98.62% complete with 7.42% contamination. Notably, the housekeeping genes that drive the contamination estimate are each present in two copies located adjacently on the same assembled contig. This suggests that these genes are genuinely duplicated in the genome rather than derived from a contaminating organism.
Of the genes that MiGA detected as duplicated, most are encoded on the same contig. We functionally annotated them against the TrEMBL and NCBI databases, and in all cases the best hits were to Izemoplasmataceae genomes.
Essential genes detected: 103/106
Completeness: 97.2%
Contamination: 10.4%
Duplicated genes encoded on the same contig, both annotated as Izemoplasma:
2 × Ribosomal_L23 (ribosomal protein L23)
2 × Ribosomal_L6 (ribosomal protein L6)
2 × Ribosomal_L4 (ribosomal protein L4/L1 family)
2 × TIGR00060 (ribosomal protein L18)
2 × TIGR01021 (ribosomal protein S5)
2 × TIGR01071 (ribosomal protein L15)
2 × TIGR03263 (guanylate kinase)
2 × TIGR03594 (ribosome-associated GTPase EngA)
Duplicated genes located on different contigs but affiliated with Izemoplasma:
2 × tRNA-synt_1d (tRNA synthetase class I, R)
3 × TIGR00436 (GTP-binding protein Era)
Because these latter genes occur on different contigs, they are the only ones that could plausibly represent contamination. We therefore estimate that only two genes (one of them present in three copies) correspond to contamination, representing approximately 2.9% contamination.
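The arithmetic behind these percentages can be sketched as follows. This is only an illustrative reconstruction: it assumes that contamination is the number of extra marker copies divided by the number of essential genes, which reproduces the 97.2% and 10.4% values listed above but is not confirmed here to be the exact formula used by MiGA. The variable and function names are hypothetical. The adjusted estimate is shown with both possible denominators, since the text does not state which one was used.

```python
# Marker counts copied from the lists above.
ESSENTIAL_TOTAL = 106      # essential genes in the marker set
ESSENTIAL_DETECTED = 103   # essential genes detected in the genome

# Copy number of each duplicated marker, grouped by contig arrangement.
same_contig = {
    "Ribosomal_L23": 2, "Ribosomal_L6": 2, "Ribosomal_L4": 2,
    "TIGR00060": 2, "TIGR01021": 2, "TIGR01071": 2,
    "TIGR03263": 2, "TIGR03594": 2,
}
different_contigs = {"tRNA-synt_1d": 2, "TIGR00436": 3}


def extra_copies(copies):
    """Count copies beyond the first for every marker."""
    return sum(n - 1 for n in copies.values())


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumed formula: detected essential genes / all essential genes.
completeness = percent(ESSENTIAL_DETECTED, ESSENTIAL_TOTAL)

# Assumed formula: all extra copies / all essential genes.
all_extra = extra_copies(same_contig) + extra_copies(different_contigs)
contamination_all = percent(all_extra, ESSENTIAL_TOTAL)

# Adjusted estimate: count only extra copies found on different contigs.
cross_extra = extra_copies(different_contigs)
adjusted_vs_detected = percent(cross_extra, ESSENTIAL_DETECTED)
adjusted_vs_total = percent(cross_extra, ESSENTIAL_TOTAL)

print(f"Completeness: {completeness}%")
print(f"Contamination (all duplicates): {contamination_all}%")
print(f"Adjusted contamination (per detected genes): {adjusted_vs_detected}%")
print(f"Adjusted contamination (per total genes): {adjusted_vs_total}%")
```

Under these assumptions, the three extra copies on different contigs give about 2.9% when divided by the 103 detected genes and about 2.8% when divided by all 106 essential genes.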