A Random Forest Algorithm for Predicting Crop Yield in Hilly Regions of North East India
The North-Eastern Region (NER) of India, despite its wealth of valuable natural resources, is struggling in the agricultural sector. The sector is beset by multiple problems that depress agricultural output, such as poor soils, land and water degradation, the low-potential areas of eastern India, and marketing and finance issues. The region has been unable to make full use of its resources because of its diverse natural setting, a lack of proper attention, and the absence of meaningful yield prediction. Because crop yield prediction is one of the critical factors in agricultural practice, there is a need to generate relevant, area-specific information that supports high yields. Farmers in the region are unaware of the challenges and opportunities in agriculture owing to a lack of adequate information; they therefore need information on expected crop yield before sowing in order to improve yields. Given the advantages of accurate crop yield prediction and the ability of data mining techniques to extract patterns from large data sets, this paper first examines the major factors influencing crop production and then determines how accurately random forest, a data mining technique, can estimate crop productivity in the hilly region of North East India, in pursuit of advancing industry sustainability. In experiments on test data sets, the method achieves high classification accuracy and is well suited to current big-data settings in which data patterns change gradually over time.