Abstract:
To identify the geographical origin of soybeans accurately and quickly, the characteristics of soybeans from different countries were studied using near-infrared spectroscopy (NIRS), principal component analysis (PCA) and artificial neural networks (ANN), and an identification model for the producing areas of imported soybeans was established. Near-infrared reflectance spectra of 166 soybean samples from Argentina, Brazil, Uruguay and the United States were collected, and twelve outlier spectra were eliminated using a box plot. The original spectral data were preprocessed by multiplicative scatter correction (MSC), standard normal variate (SNV), Savitzky-Golay (SG) smoothing and other methods; the best result was obtained by combining 3-point SG smoothing with MSC. PCA was used to compress the spectra, and the first ten principal components (PC1 to PC10) had a cumulative variance contribution of 99.966%. These ten principal components were used as input vectors and the four producing areas as target vectors. Recognition models were built with support vector machine (SVM), k-nearest neighbor (KNN) and back-propagation artificial neural network (BP-ANN) algorithms. The BP-ANN model performed best: its overall discrimination accuracy on the test set was 95.65%, and its discrimination accuracies for soybean samples from Argentina, Brazil, Uruguay and the United States were 100%, 100%, 80% and 100%, respectively. The ANN model can identify the origin of soybeans imported from different countries.
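
The preprocessing and compression steps named in the abstract (3-point SG smoothing, MSC, then PCA down to ten components) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `sg_smooth_3pt`, `msc` and `pca_scores`, the synthetic spectra and the choice of 200 spectral points are all hypothetical. Only the sample count of 154 (166 minus 12 outliers) comes from the abstract. The abstract does not state the SG polynomial order, the edge handling, or the order in which SG and MSC were applied. The sketch assumes polynomial order 1 (which makes 3-point SG a 3-point moving average), repeated edge values, and SG before MSC.

```python
import numpy as np


def sg_smooth_3pt(spectra):
    """3-point Savitzky-Golay smoothing along the wavelength axis.

    Assumption: polynomial order 1 (the abstract does not state it). For a
    3-point window this is a plain 3-point moving average. The end points
    are handled by repeating the edge values, which is also an assumption.
    """
    padded = np.pad(spectra, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum ~ slope * reference + intercept, then undo both terms.
        slope, intercept = np.polyfit(reference, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected


def pca_scores(X, n_components=10):
    """Return PCA scores and cumulative explained-variance ratios via SVD."""
    centered = X - X.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    variance_ratio = singular_values**2 / np.sum(singular_values**2)
    scores = centered @ components[:n_components].T
    return scores, np.cumsum(variance_ratio)[:n_components]


# Synthetic stand-in data (NOT the paper's spectra): 154 samples x 200 points.
rng = np.random.default_rng(0)
wavelength_axis = np.linspace(0.0, 1.0, 200)
base = 0.5 + 0.3 * np.sin(6.0 * wavelength_axis)   # hypothetical band shape
gain = rng.uniform(0.8, 1.2, size=(154, 1))        # multiplicative scatter
offset = rng.uniform(-0.1, 0.1, size=(154, 1))     # additive baseline shift
raw = gain * base + offset + rng.normal(0.0, 0.005, size=(154, 200))

# Assumed order: smooth first, then scatter-correct, then compress.
preprocessed = msc(sg_smooth_3pt(raw))
scores, cumulative_variance = pca_scores(preprocessed, n_components=10)
print(scores.shape, cumulative_variance[-1])
```

In a workflow like the one the abstract describes, the `scores` matrix (one row of ten values per sample) would serve as the input to the SVM, KNN or BP-ANN classifier, with the four origins as class labels.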