Data Validation

How reliable is our input data? Do field geometries truly represent our cropped area? Crop monitoring and data analysis can only be as good as the input data on which they are based. Field boundaries are dynamic and need maintenance. Errors in data capture or incorrect data declarations do happen and need to be detected. Incorrect data leads to erroneous production status information, misguided decisions or unjustified payments.

Our Data Validation Dashboard can help you answer these questions and understand your data quality:

Data Validation Dashboard

It is based on the agknowledge API products of the Data Validation package.

The Data Validation package provides reliable tools for detecting data errors, in particular faulty geometries or wrongly declared crops. It identifies two or more crops growing on the same field at the same time, or multiple crops grown in sequence within one season. It also supports a generic validation of crop types.

With this functionality the following benefits can be achieved:

  • increase data quality
  • detect faulty field boundaries
  • identify multiple crops on one field
  • validate declared crop types
  • make production status information more reliable
  • identify false declarations
  • detect food fraud

Service: Land Use Homogeneity

This method was developed by the European Commission's Joint Research Centre (JRC) and calculates the signal-to-noise ratio for all observations in the defined monitoring period. If the parcel is heterogeneous for the majority of observations within the period, the signal-to-noise ratio will be very low and the parcel will be flagged.

Sample of an inhomogeneous parcel (right)
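
The homogeneity check can be illustrated with a minimal sketch. It is not the JRC implementation or the agknowledge API: it assumes that the signal-to-noise ratio of one observation is the mean of the pixel values inside the parcel divided by their standard deviation, and that a parcel is flagged when more than half of its observations fall below a threshold. The function names and the threshold of 5.0 are hypothetical.

```python
from statistics import mean, pstdev


def observation_snr(pixel_values):
    """Signal-to-noise ratio of one observation (assumed: mean / std dev)."""
    spread = pstdev(pixel_values)
    if spread == 0:
        return float("inf")  # perfectly uniform parcel
    return abs(mean(pixel_values)) / spread


def is_heterogeneous(observations, snr_threshold=5.0):
    """Flag a parcel when most observations fall below the SNR threshold.

    observations: one list of pixel values per acquisition date.
    snr_threshold: hypothetical cut-off, chosen for illustration only.
    """
    low = sum(1 for pixels in observations
              if observation_snr(pixels) < snr_threshold)
    return low > len(observations) / 2


# Illustrative values only: a uniform parcel and one split into two land uses.
uniform = [0.70, 0.72, 0.71, 0.69]
mixed = [0.20, 0.80, 0.20, 0.80]
print(is_heterogeneous([uniform, uniform, uniform]))
print(is_heterogeneous([mixed, mixed, uniform]))
```

A parcel that contains two land uses shows a large spread of pixel values on most dates, so its ratio stays low throughout the monitoring period, while a single cloudy or noisy date alone does not trigger the flag.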

Service: Crop Verification

This service compares the crop development of the parcel in question with the expected development of such a crop in that area. Based on the degree of similarity, the crop type is confirmed or rejected. The service derives the expected crop development from the data set provided and identifies outliers. It therefore assumes that the crop type of most parcels is labelled correctly.

The service comes with a generic configuration. Through training of a custom machine learning model the algorithm can be tuned to specific crops and regional conditions.
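
The comparison behind Crop Verification can be sketched as follows. This is an illustration under stated assumptions, not the service's algorithm: each parcel is represented by a vegetation index time series on common dates, the expected development is taken as the per-date median over all parcels declared with the same crop, and similarity is measured by the root-mean-square error. The function names and the 0.15 tolerance are hypothetical.

```python
from statistics import median


def expected_profile(profiles):
    """Per-date median across all parcels declared with the same crop."""
    return [median(values) for values in zip(*profiles)]


def rmse(a, b):
    """Root-mean-square error between two equally long series."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def verify_crop(parcels, max_rmse=0.15):
    """Return parcel id -> True (confirmed) or False (rejected).

    parcels: dict of parcel id -> index values on the same dates.
    max_rmse: hypothetical tolerance, chosen for illustration only.
    """
    reference = expected_profile(list(parcels.values()))
    return {pid: rmse(profile, reference) <= max_rmse
            for pid, profile in parcels.items()}


# Illustrative values only: three similar parcels and one that deviates.
declared_wheat = {
    "a": [0.20, 0.50, 0.80, 0.40],
    "b": [0.25, 0.55, 0.75, 0.35],
    "c": [0.20, 0.50, 0.80, 0.45],
    "d": [0.70, 0.30, 0.20, 0.60],
}
print(verify_crop(declared_wheat))
```

Because the reference is a median, it stays representative only while most parcels carry the correct label, which mirrors the assumption stated above.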

Service: Crop Cycle Detection

Incorrect analytics data may also result from multiple crops grown in succession within the period defined by the seeding and harvest dates. The crop cycle functionality identifies such multiple cultivations, which may also indicate a wrong crop type declaration.

Sample of multiple crops in sequence
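
One simple way to illustrate crop cycle detection is to count growth peaks in a vegetation index time series between seeding and harvest. The sketch below is an assumption made for illustration, not the method used by the service: it treats the input as NDVI-like values, counts a cycle each time the index rises above a high threshold, and requires the index to drop below a low threshold before the next cycle can be counted. The function name and both thresholds are hypothetical.

```python
def count_crop_cycles(index_series, high=0.6, low=0.3):
    """Count growth cycles in a vegetation index series.

    A cycle is counted when the index reaches `high`; the next cycle can
    only be counted after the index has fallen to `low` or below.
    Both thresholds are hypothetical values for illustration.
    """
    cycles = 0
    armed = True  # ready to count the next rise above `high`
    for value in index_series:
        if armed and value >= high:
            cycles += 1
            armed = False
        elif not armed and value <= low:
            armed = True
    return cycles


# Illustrative values only.
single_crop = [0.2, 0.4, 0.7, 0.8, 0.6, 0.3, 0.2]
two_crops = [0.2, 0.5, 0.8, 0.4, 0.2, 0.5, 0.75, 0.3]
print(count_crop_cycles(single_crop))
print(count_crop_cycles(two_crops))
```

Using two thresholds instead of one avoids counting small fluctuations around a single cut-off as separate cultivations. A count above one within a declared season would mark the parcel for review.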

Link: Data Validation API documentation