Enhancing Coffee Leaf Rust Detection Using DenseNet201: A Comprehensive Analysis of the Mbozi and Public Datasets in Songwe, Tanzania
Keywords:
Coffee Leaf Rust (CLR) Detection, Dataset Quality, DenseNet201, Image Quality, Machine Learning (ML)
Abstract
Coffee Leaf Rust (CLR) is a devastating fungal disease that threatens coffee production worldwide, undermining economies and farmers' livelihoods. Conventional CLR detection relies either on physical inspection, which is tedious, time-consuming, and subject to human error, or on machine-learning (ML) models trained on poorly collected datasets. This study evaluates the performance of the DenseNet201 model on three datasets: Mbozi, Public, and Combined (a merger of the Mbozi and Public datasets). Machine Learning Theory guided the research. The study's objectives are to assess the influence of dataset quality on CLR detection, to analyze the Mbozi and Public datasets using DenseNet201, and to enhance robustness by merging the two datasets. Leaves representing different levels of CLR severity were collected from multiple coffee farms using systematic sampling. The Mbozi dataset, sourced from high-resolution images captured in coffee plantations in Tanzania's Songwe region, was analyzed for quality under controlled conditions, covering environmental factors, image clarity, resolution, labeling consistency, and class balance, based on data completeness, image quality score, visual inspection, and model performance. DenseNet201 was trained and validated on each dataset. It achieved its highest accuracy on the Mbozi dataset, with a training accuracy of 98.72% and a validation accuracy of 97.65%, demonstrating the importance of consistent image quality and accurate annotations. In contrast, the Public dataset suffered from inconsistencies in resolution and labeling, resulting in lower training and validation accuracies of 96.86% and 96.42%, respectively. The Combined dataset, which integrated the strengths of both datasets, generalized better, with a training accuracy of 97.48% and a validation accuracy of 97.49%, balancing the need for high-quality images with environmental variability.
The study shows that high-quality, consistently labeled images, such as those in the Mbozi dataset, improve the speed and accuracy of CLR detection. It recommends that future models integrate regionally relevant, high-resolution datasets to perform robustly under real-world agricultural conditions, giving coffee farmers timely disease-intervention tools for better production management and economic stability in coffee-growing regions.
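The generalization claim in the abstract can be checked against the reported figures by computing the gap between training and validation accuracy for each dataset. The sketch below is illustrative only: the accuracies are those quoted in the abstract, while the helper name `generalization_gap` and the use of the train-minus-validation gap as a rough indicator are assumptions made here, not part of the study's stated method.

```python
# Accuracies (%) as quoted in the abstract: (training, validation).
reported = {
    "Mbozi": (98.72, 97.65),
    "Public": (96.86, 96.42),
    "Combined": (97.48, 97.49),
}


def generalization_gap(train_acc, val_acc):
    """Training minus validation accuracy, in percentage points.

    Hypothetical helper for illustration; a smaller gap is only a rough
    hint that the model is not overfitting its training data.
    """
    return round(train_acc - val_acc, 2)


gaps = {name: generalization_gap(t, v) for name, (t, v) in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:+.2f} percentage points")
```

On these figures the Combined dataset has the smallest gap (about -0.01 points) and the Mbozi dataset the largest (about 1.07 points), which is consistent with the abstract's description of the Combined dataset as generalizing better despite its lower peak accuracy. A gap taken from a single train/validation split is a coarse indicator and does not by itself establish robustness.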
Copyright (c) 2025 Adrian Jackob Karia, Juma Said Ally, Stanley Leonard

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.