Soil health is defined as the ability of a soil to function effectively and support ecosystem services. It is currently assessed using a suite of physical, chemical, and biological measurements, which can be combined into a comprehensive score such as the Soil Management Assessment Framework (SMAF) developed by the USDA. For a metric or score to be a reliable indicator of soil health, it should be relevant to soil functions, responsive to perturbations, practical for widespread use, and informative for management decisions. As scientific capacity to assess soil features advances, frameworks like SMAF can evolve to provide more precise and actionable insights for sustainable soil management. For example, with the rapid increase in the ability to assess soil microbiomes, there is interest in incorporating expanded microbial perspectives into soil health assessments12. Many existing soil health metrics, such as the beta-glucosidase assay, are derived from microbial processes, but these indicators often lack the nuance to fully capture complex soil microbial processes. This proposal seeks to evaluate and discover microbial indicators within a soil health framework. The overarching hypothesis is that bulk surface soil microbiomes are an underutilized resource for soil health assessments. The primary goal of this project is to enhance soil health management by incorporating microbiome composition and functional gene content into soil health assessments. The research will use next-generation sequencing (NGS) to evaluate microbiome responses, alongside traditional soil health measurements and SMAF scores, within the Mountain West of the United States. Sequencing is supported by a USDA postdoctoral fellowship awarded to Laura Mason in 2025, and the work is performed alongside Drs. Megan Machmuller and Lexi Firth (IN-RICHES: https://www.inrichsoil.com/).
Sample collection: Samples were collected from IN-RICHES research sites and COSH (Colorado Soil Health) program producer sites in the summer of 2024. Research sites are replicated experimental plots led by interested and trained producers, ARS members, and university researchers; COSH samples are single "one-off" samples collected by producers participating in the CDA STAR program.
Laura Mason and Chris Chorpenning sampled the IN-RICHES research sites in CO, collecting a composite sample of three cores at each sampling point used by the soil health team (Machmuller, Firth, Rebecca Even, Steve Bleckner) and flash-freezing it in the field. IN-RICHES sites out of state (UT, WY, ID, NM, MT) were sampled by the soil health team or by the lead researcher at the site. The composite core obtained for soil health was subsampled, and roughly 5 g of soil was added to a 50 mL tube containing 10 mL of RNAlater. Samples were transported on ice and shipped to CSU. COSH samples, as stated above, were collected by producers. Producers collected several shovelfuls of soil from across the sampling plot into a clean bucket, mixed the soil with a gloved hand, and subsampled about a tablespoon of soil into 10 mL of RNAlater using a plastic spoon that had been cleaned with an alcohol pad. Samples were shipped to CSU SPUR and then delivered to CSU.
Soil health metrics are being collected by Wilma Trujio (CSU SPUR; COSH samples) and the Machmuller/Cotrufo labs (IN-RICHES research sites).
DNA was extracted by Laura Mason using the Zymo microprep fecal soil kit. DNA was eluted in 35-55 µL, quantified via Qubit, and quality-checked on a gel; samples are stored at -20 °C.
Sequencing libraries were prepared by Jessica Henley and Lady Grant between December 2024 and January 2025. Metagenomes were then sequenced by Jessica Henley at the CSU Next Generation Sequencing Genomics Core (as part of a test of the new user core facility). Jessica provided the following information on methods: "DNA samples were quantified using a Qubit 4 and then processed using Illumina's DNA Prep kit, following the manufacturer's protocol. DNA inputs were all in the 25 - 50ng range, allowing for 6 cycles at the BLT PCR step (I remember Kelly asking about staying below 7 cycles). Pooled libraries were sequenced on an Illumina NextSeq 2000 using an XLEAP-SBS P4 300-cycle reagent kit with a 2% phiX control."
Raw reads were processed for bacterial and archaeal content with SingleM 0.18.3, using the singlem pipe command with default parameters, per instructions from Ben Woodcroft. The resulting taxonomic profiles were concatenated and exported as a text file to R, where tidyverse was used to 1) convert the profiles into an abundance table (coverages as counts) and 2) transform the “counts” into relative abundances by XXX (personal conversation with Ben Woodcroft). This relative abundance table was used for all downstream processing. To assess the fungal community, raw reads were trimmed using Sickle (version). BBDuk (version) was used to remove poly-G sequences from the reverse reads, and quality was assessed again. Kraken 2 (v2.9) was used to identify fungal sequences against a custom database of 1600+ fungal genomes collected from Ensembl (v 67) and FungiDB (version), per (cite that paper). Read counts output by Kraken 2 in the “report” file were re-estimated in Bracken v 2.9, which redistributes reads that Kraken 2 assigned at higher taxonomic ranks in order to estimate abundance at a chosen rank. The resulting files were individually converted into abundance tables using the kreport_to_krona.py script in KrakenTools. The abundance tables were concatenated and exported to R for downstream processing. In R, tidyverse was used to remove all taxa that appeared fewer than 3 times in the dataset and to convert the counts to relative abundances. This relative abundance table was used in all further statistical tests.
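The final filtering and normalization step for the fungal table could be sketched as follows. This is an illustration in Python rather than the R/tidyverse code actually used, and it rests on two assumptions not stated in the text: that "appeared fewer than 3 times" means a taxon was detected in fewer than three samples, and that relative abundance is computed per sample after filtering. The function name, the min_samples parameter, and the toy counts are hypothetical. It does not cover the SingleM table, whose transformation is still unspecified.

```python
from collections import Counter


def filter_and_normalize(counts, min_samples=3):
    """Drop rare taxa, then convert counts to per-sample relative abundances.

    counts: {sample_id: {taxon: read_count}}
    Assumption: a taxon "appears" in a sample when its count is > 0, and it is
    kept only if it appears in at least `min_samples` samples.
    """
    # Number of samples in which each taxon is present.
    prevalence = Counter(
        taxon
        for taxa in counts.values()
        for taxon, n in taxa.items()
        if n > 0
    )
    kept = {taxon for taxon, k in prevalence.items() if k >= min_samples}

    relative = {}
    for sample, taxa in counts.items():
        filtered = {t: n for t, n in taxa.items() if t in kept and n > 0}
        total = sum(filtered.values())
        # A sample left with no reads is returned empty instead of dividing by zero.
        relative[sample] = (
            {t: n / total for t, n in filtered.items()} if total else {}
        )
    return relative


# Toy data (hypothetical): Rare_sp is detected in only one sample and is dropped.
toy_counts = {
    "S1": {"Fusarium": 30, "Mortierella": 10, "Rare_sp": 5},
    "S2": {"Fusarium": 20, "Mortierella": 20},
    "S3": {"Fusarium": 5, "Mortierella": 15, "Rare_sp": 0},
}
rel = filter_and_normalize(toy_counts)
```

Normalizing after filtering means each sample still sums to 1; if the real pipeline normalizes before filtering, the proportions will differ slightly.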
Prior to conducting any statistical tests that bridged both datasets, samples in the Research dataset were subsampled with a random number generator so that only one replicate of each treatment was used (per Dr. Eric Gilland, CSU statistics dept). This resulted in 161 samples. For both communities, alpha diversity metrics (Shannon’s diversity, evenness, and richness) were calculated in vegan, and Kruskal-Wallis tests were used to assess differences in alpha diversity metrics by regenerative management practice, dataset, cropping system, and location. MaAsLin2 was used to find organisms enriched by regenerative practice after removing taxa with less than 0.05% abundance and 10% prevalence across the dataset; Location Name was used as the random term. Differences in beta diversity for each community were tested by PERMANOVA using the adonis2 function in vegan and visualized using NMDS. Soil health metrics were divided into low, medium, and high groups (per Wilhelm et al 2022, 2023), and indicspecies was used to identify bioindicators of the low, medium, high, and super high groups (Wilhelm et al 2023). Partial least squares regression was used to determine whether any taxa were predictive of any soil health metrics, and Spearman’s correlations were used to relate individual taxa to each of the shared soil health metrics.
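As a rough illustration of the first two steps above (keeping one random replicate per treatment, then computing alpha diversity), the sketch below uses plain Python instead of R and vegan. It makes several assumptions: evenness is taken to be Pielou's J (Shannon diversity divided by the natural log of richness), Shannon diversity uses the natural log as vegan's diversity() does by default, and the sample IDs, treatment labels, and seed are hypothetical.

```python
import math
import random
from collections import defaultdict


def pick_one_replicate(samples, seed=0):
    """Return {treatment: sample_id}, choosing one replicate at random.

    samples: iterable of (sample_id, treatment) pairs.
    A fixed seed keeps the selection reproducible.
    """
    by_treatment = defaultdict(list)
    for sample_id, treatment in samples:
        by_treatment[treatment].append(sample_id)
    rng = random.Random(seed)
    return {t: rng.choice(sorted(ids)) for t, ids in sorted(by_treatment.items())}


def alpha_diversity(abundances):
    """Return (richness, Shannon H, Pielou's J) for one sample.

    abundances: counts or relative abundances; zeros are ignored.
    """
    present = [a for a in abundances if a > 0]
    total = sum(present)
    richness = len(present)
    shannon = -sum((a / total) * math.log(a / total) for a in present)
    # J is undefined when fewer than two taxa are present.
    evenness = shannon / math.log(richness) if richness > 1 else float("nan")
    return richness, shannon, evenness


# Hypothetical replicates: three for one treatment, two for another.
samples = [
    ("A1", "cover_crop"), ("A2", "cover_crop"), ("A3", "cover_crop"),
    ("B1", "no_till"), ("B2", "no_till"),
]
chosen = pick_one_replicate(samples, seed=42)

# A perfectly even four-taxon community: H = ln(4), J = 1.
richness, shannon, evenness = alpha_diversity([0.25, 0.25, 0.25, 0.25])
```

In the real dataset the grouping key would presumably combine site and treatment, so that one replicate is kept per treatment at each location.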