My Harvest | Devpost

Inspiration

Before we even got to the hackathon, our team had, ironically, already ruled out challenge #1. We didn't see how gamification could be used to improve performance - we assumed that combine drivers wouldn't want to optimize their machine's performance through an app.

When we arrived at the hackathon and listened to the presentations, we realized that we had completely misinterpreted the prompt. The issue isn't that combine drivers don't want to improve the performance of their machines; it's that they might not know how. Through apps such as AgCommand and Smart Connect, the driver has access to a large number of parameters that affect machine performance. For example, they can see the speed of the vehicle, the moisture levels of the grain, and the outside air temperature. However, not all of these parameters are relevant to combine performance, and it isn't clear how the driver can use this information to optimize performance. The result is a cluttered, confusing dashboard that is difficult for anyone to navigate.

What it does

My Harvest uses the same information that AGCO's existing apps use, but simplifies the user interface by determining which parameters are most important and providing real-time recommendations on how to use that information to improve performance.

Using the dataset provided to us, we identified the parameters that mattered most for predicting a few key measures of performance. For instance, we found that capacity average was highly correlated with grain loss. With that information, we decided which parameters would be most important for the driver to see. To create our dashboard, we combined those parameters with ones that our analysis did not rank as highly but that still matter to the combine driver, such as total fuel.
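As a rough illustration of the correlation check described above, here is a minimal sketch in pandas. The column names and values are made up for the example and are not taken from the AGCO dataset; the idea is only to show how parameters could be ranked by the strength of their correlation with a performance measure.

```python
import pandas as pd

# Hypothetical sample: names and numbers are illustrative, not from the real dataset.
df = pd.DataFrame({
    "capacity_avg": [10.0, 12.0, 14.0, 16.0, 18.0],
    "grain_loss":   [1.0, 1.3, 1.9, 2.2, 2.8],
    "total_fuel":   [5.0, 4.0, 6.0, 5.5, 4.5],
})

# Pearson correlation of every parameter with the performance measure.
correlations = df.corr()["grain_loss"].drop("grain_loss")

# Rank parameters by the absolute strength of the correlation.
ranked = correlations.abs().sort_values(ascending=False)
print(ranked)
```

A ranking like this is only a first pass, since correlation captures linear relationships and says nothing about causation.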

The dashboard not only shows measurements for these parameters, it also calculates optimal ranges for them by applying the model we built from the dataset to real-time information. It displays those ranges in a way that is satisfying for the user when they hit a measurement that is optimal for the machine's performance - for instance, meters that show a "green" reading for an optimal fan speed.

How we built it

The heart of our project is the model we built to determine the most important features for combine performance. We did all of our data analysis in Python and used pandas dataframes to hold the data. Originally, the dataset was arranged so that each row was a recorded event for a change in some parameter value. We reorganized it so that each row contained every parameter value for an instant in time, essentially pivoting the data we were given.
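The reshaping step can be sketched roughly as follows. This is a minimal example, assuming the event log has one row per change with hypothetical `timestamp`, `parameter`, and `value` columns; the real dataset's column names and layout may differ.

```python
import pandas as pd

# Hypothetical event log: one row per recorded change of a single parameter.
events = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2019-02-02 10:00:00", "2019-02-02 10:00:00",
        "2019-02-02 10:00:05", "2019-02-02 10:00:10",
    ]),
    "parameter": ["ground_speed", "fan_speed", "ground_speed", "fan_speed"],
    "value": [5.0, 900.0, 5.5, 950.0],
})

# Pivot from long to wide: one row per timestamp, one column per parameter.
# aggfunc="last" keeps the latest value if a parameter repeats within a timestamp.
wide = events.pivot_table(
    index="timestamp", columns="parameter", values="value", aggfunc="last"
).sort_index()
print(wide)
```

Cells where a parameter did not change at a given timestamp come out as NaN, so the wide table still needs a rule for filling those gaps before it can be used for training.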

Using this data, we trained a Random Forest Classifier with scikit-learn and JMP Statistical Analysis to predict various performance metrics - the three we focused on were grain loss rotor, yield, and fuel rate. We found this method produced very accurate results, with r-squared values of up to 0.927. After training this model, we were able to extract the features that the model depended on most to make its predictions.
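A minimal sketch of this train-then-rank workflow in scikit-learn is shown below. It makes several assumptions: the data is synthetic, the feature names are hypothetical, and it uses `RandomForestRegressor` instead of a classifier, because r-squared is a regression metric and the targets described here are continuous. It is not the team's actual training code.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wide table (hypothetical feature names).
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "capacity_avg": rng.uniform(0, 1, n),
    "fan_speed": rng.uniform(0, 1, n),
    "air_temp": rng.uniform(0, 1, n),
})
# Synthetic target: strong dependence on capacity_avg, weak on fan_speed,
# none on air_temp, plus a little noise.
y = 3.0 * X["capacity_avg"] + 0.5 * X["fan_speed"] + rng.normal(0, 0.05, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# r-squared on held-out rows, then impurity-based feature importances.
r2 = r2_score(y_test, model.predict(X_test))
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(round(r2, 3))
print(importances)
```

Scoring on a held-out split rather than on the training rows gives a more honest r-squared, and the sorted importances are what would drive the choice of dashboard parameters.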

After we had an idea of these most important features, we were able to start fleshing out the design for the app's user interface. We decided to focus on this aspect rather than creating a prototype for the app due to time constraints. We created this mockup using Sketch.

Challenges we ran into

Our biggest obstacles occurred during our pre-processing steps for the dataset. Initially, we spent hours operating on the false assumption that the information recorded in each row of the dataset was purely instantaneous and not continuous. After talking with AGCO experts, we realized that a row was only recorded when the value of a parameter changed - at all other times, we could assume that a parameter's value was the same as it had been after its last recorded change. After we altered our pre-processing steps to account for this, the r-squared values for the Random Forest Classifiers we trained increased drastically.
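The fix described above amounts to carrying each parameter's last known value forward in time. A minimal sketch with pandas, assuming a wide table with hypothetical `fan_speed` and `ground_speed` columns where NaN means "no change recorded at this timestamp":

```python
import numpy as np
import pandas as pd

# Hypothetical wide table: NaN marks timestamps where a parameter did not change.
wide = pd.DataFrame(
    {
        "fan_speed": [900.0, np.nan, 950.0, np.nan],
        "ground_speed": [5.0, 5.5, np.nan, np.nan],
    },
    index=pd.to_datetime([
        "2019-02-02 10:00:00", "2019-02-02 10:00:05",
        "2019-02-02 10:00:10", "2019-02-02 10:00:15",
    ]),
)

# Forward-fill: each gap takes the most recent earlier value in the same column.
filled = wide.ffill()
print(filled)
```

Rows that come before a parameter's first recorded event would still be NaN after a forward fill, so in practice they would need to be dropped or handled separately.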

Accomplishments that we're proud of

The accomplishment we're most proud of is how we restructured the dataset. Our use of existing machine learning and data science tools such as pandas, scikit-learn, and JMP was highly effective. We combined our knowledge of these machine learning algorithms with our intuition about what the model should predict to debug our model and our data structures.

What we learned

Half of our team focused on developing an iOS app, something they had no prior experience with, so they spent a lot of time learning how to use Swift and Xcode. Even though we decided we didn't have time to make a working prototype, this was a very valuable learning experience. We also learned a lot about how to restructure datasets and how to best use scikit-learn and pandas to produce an effective model.

What's next for My Harvest

Our biggest next step would be to develop the actual iOS app. While we were disappointed that we didn't have time to do this, we recognized that without access to a real-time updating database it would have been difficult to make a good prototype in under 24 hours anyway. We also hope to fine-tune our model so that it provides better ranges for the operating values of important parameters.

Built With

jmp
numpy
pandas
python
randomforestclassifier
sci-kit-learn
sketch

Submitted to

AGCO Accelerator Hackathon

Created by

I created the backend algorithm and data parser. This involved knowing the intricacies of how the model worked, evaluating different model choices, and understanding model training best practices.

Jatin Mathur
Dhvanil Popat
goeln
Jackie Oh

Updates

Jackie Oh started this project — Feb 03, 2019 12:59 PM EST
