Metagenomic sequencing reads were trimmed for adapter sequences and quality using sickle. The filtered metagenomic reads from 2018 were assembled using the IDBA_UD assembler (Peng et al. 2012), and the reads from 2019-2020 were assembled using metaSPAdes (Nurk et al. 2017). Contigs greater than 2.5 kb were retained, and sequencing reads from all samples were cross-mapped against each resulting assembly using Bowtie2 (Langmead and Salzberg 2012). The resulting differential coverage profiles were filtered at a 95% read identity cutoff and then used for genome binning with MetaBAT2 (Kang et al. 2019), VAMB (Nissen et al. 2021) and MaxBin2 (Wu et al. 2016). The resulting genome bins were assessed for completeness and contamination using CheckM2 (Parks et al. 2015) and were manually curated using taxonomic profiling with GGKBase (www.ggkbase.berkeley.edu). An initial taxonomy was assigned to the genome bins with GTDB-Tk v2.3.0 (Chaumeil et al. 2019) and further validated with phylogenetic trees of single-copy marker genes. We also downloaded all publicly available Asgard genomes from BV-BRC (www.bv-brc.org). We used CheckM2 to estimate genome completeness.
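The contig length filter and the read identity cutoff described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that 2.5 kb means 2,500 bases, and that read identity is defined as 1 minus the edit distance (for example, the NM tag in a SAM record) divided by the aligned length. The function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(lines: Iterable[str], min_len: int = 2500) -> Dict[str, str]:
    """Keep contigs strictly longer than min_len bases (assumed: 2.5 kb = 2,500 bp)."""
    return {name: seq for name, seq in read_fasta(lines) if len(seq) > min_len}


def read_identity(aligned_len: int, edit_distance: int) -> float:
    """Identity as 1 - edit_distance / aligned_len (an assumed definition)."""
    if aligned_len <= 0:
        raise ValueError("aligned_len must be positive")
    return 1.0 - edit_distance / aligned_len


def passes_identity(aligned_len: int, edit_distance: int, cutoff: float = 0.95) -> bool:
    """True if the read meets the identity cutoff (95% by default)."""
    return read_identity(aligned_len, edit_distance) >= cutoff
```

Both functions accept any iterable of text lines, so an open file handle can be passed directly. Reads that fail `passes_identity` would be excluded before coverage is computed for each contig.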
The manually curated genomes were reconstructed de novo from high-quality Illumina metagenomic data as described previously (Chen et al. 2020). From soil samples taken at various depths, we recovered draft Illumina-based metagenome-assembled genomes (MAGs) corresponding to Atabeyarchaeia-1, Atabeyarchaeia-2 and Freyarchaeia. Curation involved identifying and removing obvious chimeric regions, which were indicated by abrupt changes in GC content or by insufficient Illumina read mapping support. Regions with imperfect read support under stringent mapping (allowing no single nucleotide polymorphisms, SNPs) were corrected by mapping reads at a reduced stringency threshold (allowing up to 3% SNPs), followed by manual curation of the consensus sequence, including insertion, deletion, or substitution of individual base pairs. Contig ends were extended using unplaced Illumina reads. High read coverage was interpreted as indicative of the genomic termini. A genome was deemed complete when it displayed uninterrupted support from Illumina reads. Genome completeness was finally assessed by examining the cumulative GC skew and by confirming alignment with known complete genomes from related taxa. Average amino acid identity (AAI) of the new genomes was calculated using the AAI: Average Amino acid Identity calculator tool (http://enve-omics.ce.gatech.edu/aai/) and CompareM (v.0.0.23) with the ‘aai_wf’ workflow at default settings (https://github.com/dparks1134/CompareM). Replichores of complete genomes were predicted from the GC skew and cumulative GC skew calculated by the iRep package (gc_skew.py).
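To make the GC skew calculation concrete, the following is a minimal Python sketch. It is not the iRep gc_skew.py implementation. It assumes the usual definition of skew as (G - C) / (G + C) within a sliding window, with the cumulative skew as the running sum across windows. The window size, step size, and function names are illustrative assumptions. The sketch also adopts the common convention that the cumulative skew minimum lies near the replication origin and the maximum near the terminus, which together delimit the two replichores.

```python
from typing import List, Tuple


def gc_skew_windows(seq: str, window: int = 1000, step: int = 10) -> Tuple[List[int], List[float]]:
    """Return window midpoints and (G - C) / (G + C) for each sliding window."""
    seq = seq.upper()
    positions: List[int] = []
    skews: List[float] = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without G or C are assigned a skew of zero.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
        positions.append(start + window // 2)
    return positions, skews


def cumulative_skew(skews: List[float]) -> List[float]:
    """Running sum of the per-window skew values."""
    total, out = 0.0, []
    for value in skews:
        total += value
        out.append(total)
    return out


def predict_origin_terminus(positions: List[int], cumulative: List[float]) -> Tuple[int, int]:
    """Assumed convention: origin at the cumulative minimum, terminus at the maximum."""
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    i_max = max(range(len(cumulative)), key=cumulative.__getitem__)
    return positions[i_min], positions[i_max]
```

This sketch treats the sequence as linear. For a circular genome, windows would also need to wrap around the end of the sequence, and the predicted origin and terminus should be checked against a plot of the cumulative skew.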