Team:NAU-CHINA/Software

Software

Detection of Wheat Scab Based on Feature Engineering and Machine Learning

In this study, we propose a method that automatically detects whether wheat is infected with scab. It is based on feature engineering and machine learning.

The method has two parts. In part one, we process each image and extract features from it. In part two, we use the features extracted in part one to train a machine learning model, which is then used to detect wheat scab. Details are given below.

Workflow of our method

We first compress each image and filter the color image. The image is then binarized to segment a single wheat image from the background. The original image is multiplied by the segmented image to obtain a color image of the wheat alone, from which the features are extracted. In this way, we processed more than 110 images and extracted more than 45 features from each image.
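The binarize-and-multiply step can be sketched as follows. This is a minimal illustration, not the team's implementation: the channel-average grayscale conversion, the fixed threshold of 50, the 2x2 sample image and the function names are all assumptions made for the example.

```python
def to_gray(color):
    """Average the three channels (an assumed, simplistic grayscale conversion)."""
    return [[sum(pixel) / 3 for pixel in row] for row in color]


def binarize(gray, threshold):
    """Return a 0/1 mask: 1 where the grayscale value exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


def apply_mask(color, mask):
    """Multiply every RGB pixel by its mask value, which zeroes the background."""
    return [
        [tuple(c * m for c in pixel) for pixel, m in zip(color_row, mask_row)]
        for color_row, mask_row in zip(color, mask)
    ]


# Hypothetical 2x2 image: two bright "wheat" pixels on a dark background.
image = [
    [(200, 180, 90), (10, 12, 8)],
    [(5, 5, 5), (210, 190, 100)],
]
mask = binarize(to_gray(image), threshold=50)  # threshold value is an assumption
wheat = apply_mask(image, mask)
```

Only the pixels selected by the mask keep their color, so features computed afterwards should ignore the zeroed background pixels.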

The first-, second- and third-order moments of each color channel are extracted in five color spaces: RGB, HSB, HSI, YCbCr and Lab. These moments make up the features mentioned above.

The problem then becomes a binary classification problem in machine learning. We tried SVM (Support Vector Machine), Decision Tree, Random Forest and other models to train our classifier; all of them are supervised learning models. Taking the size of our sample space into account, as well as the good performance of SVM on small samples, we finally chose SVM as our supervised learning model. Our sample space is defined as follows. In this definition, xⁱₙ represents 'the n-th feature of the i-th sample', and yⁱ represents 'the label of the i-th sample'.
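The per-channel feature extraction can be sketched as below. The sketch assumes the first-, second- and third-order statistics are the usual color moments (mean, standard deviation, and the cube root of the third central moment). It covers only RGB and HSB, the two spaces Python's standard `colorsys` module can produce (its HSV is the same model as HSB). The function names and sample pixels are hypothetical.

```python
import colorsys
import math


def color_moments(values):
    """Return (mean, standard deviation, skewness) of one channel's values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    # Signed cube root, so the result keeps the sign of the third central moment.
    skewness = math.copysign(abs(third) ** (1 / 3), third)
    return mean, math.sqrt(variance), skewness


def features_from_pixels(pixels):
    """Build 2 color spaces x 3 channels x 3 moments = 18 features.

    `pixels` is a list of (r, g, b) tuples with components in [0, 1].
    """
    hsb = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]
    features = []
    for space in (pixels, hsb):
        for channel in zip(*space):  # iterate channel by channel
            features.extend(color_moments(channel))
    return features


# Hypothetical wheat pixels (background already removed).
pixels = [(0.8, 0.7, 0.3), (0.9, 0.8, 0.4), (0.7, 0.6, 0.2)]
features = features_from_pixels(pixels)
```

With all five color spaces, the same loop would give 5 x 3 x 3 = 45 moment features per image.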

The data set S is randomly divided into k packages. One package is chosen as the test set Stest, and the remaining k-1 packages form the training set Strain. Each package serves as the test set in turn. This procedure is called 'K-Fold Cross Validation'.
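The splitting procedure can be sketched in a few lines. This is an illustrative version only: the function name, the fixed seed and the 110 integer stand-ins for samples are assumptions, not details of the team's code.

```python
import random


def k_fold_splits(samples, k, seed=None):
    """Yield (train, test) pairs; each of the k folds is the test set exactly once."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # A stride of k gives k folds whose sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


samples = list(range(110))  # stand-ins for 110 labelled feature vectors
splits = list(k_fold_splits(samples, k=5, seed=0))
```

Repeating the call with different seeds gives the repeated cross validation used in the evaluation, because each seed produces a different random partition.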

Each sample in the training set and the test set has the form < xⁱ₁, xⁱ₂, xⁱ₃, ..., xⁱₙ, yⁱ >, where yⁱ is the correct label. In the test set there is an additional symbol, ypredict, which is the class predicted by our learning model. We evaluate our supervised learning model with the ROC curve, precision, recall and F1 score, which are commonly used in machine learning. Finally, we train an SVM model and run 5-fold cross validation 10 times. The results are as follows.

Precision    Recall    F1 Score
0.97         0.89      0.93

Average evaluation values over the 10 runs
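For reference, precision, recall and F1 score can be computed from the true labels and ypredict as in the sketch below. The label encoding (1 for scab, 0 for healthy) and the sample label lists are hypothetical and do not reproduce the reported numbers.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = scab, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_predict = [1, 1, 1, 0, 0, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_predict)
```

Here three of the four scab samples are found and one healthy sample is flagged by mistake, so all three scores equal 0.75.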

One of the ROC curves

Overall, our model performs well: with an average precision of 0.97, recall of 0.89 and F1 score of 0.93 over the 10 runs, it detects wheat scab accurately.