Crop scouting is an important process in farming. It involves assessing pest pressure (typically from insects) and crop performance to evaluate the risk posed by pest infestations, weeds, disease, and other problems observed in the field. Regular crop scouting during the growing season helps farmers make timely, informed decisions to protect their crop yields.

Historically, crop scouting would be carried out by human scouts. These people would be responsible for walking through vast crop fields and documenting their findings. This is a time-consuming and costly method that often results in late detection of disease and pests.

Thanks to advanced artificial intelligence (AI) agrotech solutions, however, manual scouting has been replaced by a range of methods that are quicker, cheaper, and more accurate.

What is Agrotech?

Agrotech (agricultural technology) solutions help farmers scout their fields for problems such as disease or pests. There is a range of different solutions available, and many turn low-cost commercial drones into digital crop scouts by using powerful AI-backed platforms.

In a 20-minute walk, a crop scout might be able to check 150 potato plants. In a 20-minute flyover, however, a robust agrotech solution could cover 10,000.

To improve the power and accuracy of its platform, one agrotech provider recently sought the help of a data labeling platform:

The challenge: Cleaning up a massive amount of data

The agrotech provider had amassed a huge number of aerial images of potato fields, and it needed to identify a tiny pest that’s endemic in this environment: the Colorado potato beetle.

This type of beetle becomes active in spring, around the same time as potato plants grow out of the ground. The beetles feed on the potato plant’s leaves and can completely defoliate the plants. Potato plants can usually withstand infestations early in the season, so farmers who act quickly can limit the damage.

Since only 2% of these images had actual pests in them, the company’s internal data scientists found it extremely difficult to label them. As a result, a huge amount of time was wasted reviewing images that didn’t contain a single pest, time that could have been better spent on other tasks.

The solution: Dynamic judgments

In tackling this problem, our primary challenge was cleaning up the data and detecting which images contained beetles. We experimented with two distinct approaches:

Approach 1. Ask one group of users to put a dot on each beetle they see, then ask two other groups of users to mark each beetle with a bounding box (bb) and classify it.

Approach 2. Simply ask users, “Do you see a beetle in the image?” and then ask them to mark beetles only on the images where a beetle was reported.

In the end, we opted for the second approach.

Firstly, the obvious advantage of this approach was that a yes/no question is a much simpler task, which led to a higher engagement rate and sped up the project.

Secondly, it’s much easier to aggregate judgments by using the answers to a yes/no question as opposed to a graphic annotation.
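To illustrate why yes/no answers are easy to aggregate, here is a minimal Python sketch of a majority vote. The function name `aggregate_yes_no` and the sample answers are hypothetical and are not taken from the Tasq platform; the point is only that a binary answer can be counted directly, whereas bounding boxes from different users would first have to be matched to one another before they could be compared.

```python
from collections import Counter


def aggregate_yes_no(judgments):
    """Return (majority_answer, agreement) for a list of "yes"/"no" answers.

    Hypothetical helper for illustration only. Agreement is the share of
    judgments that match the majority answer.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    counts = Counter(judgments)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(judgments)


# Four users looked at the same image; three saw no beetle.
print(aggregate_yes_no(["no", "no", "yes", "no"]))  # ('no', 0.75)
```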

Overall, the second approach improved the quality of the aggregated answer because it was easier to compare multiple judgments and reach a final result. Combined with our dynamic judgments feature, which stops collecting judgments once a defined agreement level is reached, it allowed us to cut the required judgments by 30%, making the whole process 30% faster and cheaper.
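The idea behind dynamic judgments can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's actual implementation: the function name and the `min_judgments`, `max_judgments`, and `agreement_level` values are hypothetical, since the article does not say which thresholds were used.

```python
def collect_dynamic_judgments(judgment_stream, min_judgments=3,
                              max_judgments=7, agreement_level=0.8):
    """Consume "yes"/"no" answers until the agreement level is reached.

    All thresholds are assumed values for illustration.
    Returns (label, agreement, judgments_used).
    """
    yes = no = 0
    for answer in judgment_stream:
        if answer == "yes":
            yes += 1
        else:
            no += 1
        total = yes + no
        agreement = max(yes, no) / total
        # Stop early once enough users agree with each other.
        if total >= min_judgments and agreement >= agreement_level:
            break
        # Never ask for more than the maximum number of judgments.
        if total >= max_judgments:
            break
    total = yes + no
    if total == 0:
        raise ValueError("no judgments were received")
    # A tie falls back to "no" in this sketch.
    label = "yes" if yes > no else "no"
    return label, max(yes, no) / total, total


# Unanimous answers stop at the minimum of three judgments.
print(collect_dynamic_judgments(["no", "no", "no", "yes", "no"]))
# ('no', 1.0, 3)

# Mixed answers need more judgments before agreement reaches 0.8.
print(collect_dynamic_judgments(["yes", "no", "yes", "yes", "yes", "no", "yes"]))
# ('yes', 0.8, 5)
```

With these assumed settings, clear-cut images finish after three judgments and only ambiguous ones consume the full budget, which is where the saving in required judgments comes from.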

The process: Break down each dataset and create micro-tasks

By breaking down each dataset into millions of micro-tasks, our data labelers were able to identify whether a pest was present in each image. Thanks to this process, the agrotech platform was able to vastly improve the accuracy of its pest detection ML model.

At the peak of this engagement, our data labeling experts were using the Tasq platform to label a huge number of images per day. This meant that the agrotech platform was able to increase its data labeling speed by a factor of more than 30: instead of reviewing tens of thousands of images, its own experts were able to focus on the 2% of images where a pest was actually present.

The result: A superior ML model

Throughout this project, vast datasets were processed at lightning speed by hundreds of thousands of individual data labeling experts thanks to the power of the Tasq data labeling platform.

The result was a high-quality dataset and a superior ML model, achieved much more quickly and at a lower cost than anything possible with other providers.