Publication Date


Document Type


First Advisor

Swingley, Wesley D.

Degree Name

M.S. (Master of Science)

Legacy Department

Department of Biological Sciences


Machine learning and network analyses are powerful modern tools that can process large amounts of ecological data from complex environmental communities and map the connections within them. Random forests, an ensemble machine learning algorithm, are particularly powerful because they can capture complex patterns in data while remaining easily interpretable. These tools are especially useful in experimental settings where different types of data are collected. The aim of this study was to demonstrate the utility of machine learning models and network analyses for analyzing diverse ecological data from dynamic plant-soil microbial communities in a prairie ecosystem. The study system is an experimental prairie maintained at the Morton Arboretum in Lisle, Illinois, which provides the opportunity to understand the relationships between soil microbes, soil chemistry, and prairie plants over four sampling years. Soil microbial groups that characterize each sampling year were identified using feature importance from random forests. Random forests were likewise used to identify microbial taxa that differ between monoculture and polyculture plots. Microbial interactions across sampling years, and their associations with plant and edaphic variables, were visualized using network analyses. The results of the random forest classification were compared against the constructed microbial networks. While the random forest models detected patterns differentiating soil microbial communities across sampling years, they were unable to find patterns differentiating communities in monoculture plots from those in polyculture plots. Network analysis showed that microbial networks differed across sampling years, with older samples having more established network hubs. Although network analysis showed no associations between microbial networks and plant data, network modules associated with soil organic matter were found.
The tools employed here, if used in early analysis, can direct land managers and ecologists to specific bacterial groups and prairie variables to target for statistical analyses or manipulation.
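
As a rough illustration of the network idea described above, the sketch below builds a co-occurrence network from taxon abundances and reports its hubs. The abstract does not state how the thesis networks were constructed, so everything here is an assumption made for illustration: edges are drawn where the absolute Pearson correlation between two taxa meets a threshold, hubs are simply the highest-degree nodes, and the function names (`pearson`, `build_network`, `degrees`, `hubs`), the threshold of 0.6, and the toy abundance values are all hypothetical rather than taken from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def build_network(abundance, threshold):
    """Return the set of taxon pairs whose |correlation| meets the threshold.

    `abundance` maps a taxon name to its abundances across samples.
    The threshold rule is an assumed, simplified edge criterion.
    """
    edges = set()
    for a, b in combinations(sorted(abundance), 2):
        if abs(pearson(abundance[a], abundance[b])) >= threshold:
            edges.add((a, b))
    return edges


def degrees(abundance, edges):
    """Count how many edges touch each taxon."""
    deg = {taxon: 0 for taxon in abundance}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def hubs(deg):
    """Return the taxa with the highest non-zero degree (an assumed hub definition)."""
    top = max(deg.values())
    return sorted(t for t, d in deg.items() if d == top and d > 0)


# Invented toy data: four taxa measured in four samples.
toy = {
    "taxon_A": [2, 0, 4, 2],
    "taxon_B": [2, 0, 2, 0],
    "taxon_C": [0, 0, 2, 2],
    "taxon_D": [3, 0, 0, 3],
}

toy_edges = build_network(toy, threshold=0.6)
toy_degrees = degrees(toy, toy_edges)
print(sorted(toy_edges))
print(hubs(toy_degrees))
```

In this toy case taxon_A correlates with both taxon_B and taxon_C while those two are uncorrelated with each other, so taxon_A is the only hub. Real microbial abundance data are compositional and sparse, so a practical analysis would likely need a correlation measure and significance filtering suited to that kind of data.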


91 pages




Northern Illinois University

Rights Statement

In Copyright

Rights Statement 2

NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.

Media Type