Publication Date
2023
Document Type
Dissertation/Thesis
First Advisor
Swingley, Wesley D.
Degree Name
M.S. (Master of Science)
Legacy Department
Department of Biological Sciences
Abstract
Machine learning and network analyses are powerful modern tools that can process and map out connections among large amounts of ecological data from complex environmental communities. Random forests, an ensemble machine learning algorithm, are particularly powerful because they can capture complex patterns in data while remaining easily interpretable. These tools are especially useful in experimental settings where different types of data are collected. The aim of this study was to demonstrate the utility of machine learning models and network analyses for analyzing diverse ecological data from dynamic plant-soil microbial communities in a prairie ecosystem. Our study system is an experimental prairie maintained at the Morton Arboretum in Lisle, Illinois, which provides the opportunity to understand the relationships between soil microbes, soil chemistry, and prairie plants over four sampling years. The soil microbial groups that characterize each sampling year were identified using feature importance from random forests. Similarly, random forests were used to identify microbial taxa that differ between monoculture plots and polyculture plots. Microbial interactions across the different sampling years, and their interactions with plant and edaphic variables, were visualized using network analyses. The results of the random forest classification were compared against the constructed microbial networks. While the random forest models were able to pick up patterns differentiating soil microbial communities across sampling years, they were unable to find patterns differentiating soil microbial communities in monoculture plots from those in polyculture plots. Network analysis showed that microbial networks differed across sampling years, with older samples having more established network hubs. Although network analysis showed no associations between microbial networks and plant data, network modules associated with soil organic matter were found.
The tools employed here, if used in early analysis, can direct land managers and ecologists to specific bacterial groups and prairie variables to target for statistical analyses or manipulation.
Recommended Citation
Oku, Ali Eastman, "The Role of Machine Learning and Network Analyses in Understanding Microbial Composition in An Experimental Prairie" (2023). Graduate Research Theses & Dissertations. 7171.
https://huskiecommons.lib.niu.edu/allgraduate-thesesdissertations/7171
Extent
91 pages
Language
eng
Publisher
Northern Illinois University
Rights Statement
In Copyright
Rights Statement 2
NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.
Media Type
Text
Included in
Artificial Intelligence and Robotics Commons, Bioinformatics Commons, Ecology and Evolutionary Biology Commons