Data are the foundation for decision making in any enterprise, but the complexity of their realization and their importance as a “currency” in the flow of information are overlooked by many practitioners. This is particularly true of precision agriculture, which was catapulted into importance by the sheer volume of geo-referenced, field-level data. Understanding data is the first step in harnessing the power of precision agriculture.
The concept of data as currency is intriguing because it embodies the exchange of something that has value and ownership. Before delving into the question of ownership, it may be informative to discuss some technical aspects of data, especially from the perspective of their derivation for use in decision making. While a technical discussion may seem dry on the first read, it does emphasize the steps necessary to realize “good” data. After the discussion, the reader may better appreciate the dictum that a decision is only as good as the data it is based on.
Numbers become data when they are reported with units. When we say a grower produced 100 bushels per acre, it is explicitly understood that a crop was measured in terms of yield over an area and implicitly understood that the yield was accumulated over a growing season. In fact, a more complete expression for the units of production would be 100 bushels per acre per growing season. To be useful as information for decision making, data must be reported in some quantity over space and time. The finer the spatial and temporal resolutions of data, the more information we have about some phenomenon such as crop production.
Data, especially in the form of recorded numbers, are considered “raw” and must be “cleaned” before being accepted as credible information. That data are first considered raw upon collection can easily be appreciated by anyone who has worked with yield monitors. As a harvester moves and turns across a field at different speeds, the amount of yield recorded by a monitor can vary greatly. In fact, it is very common to find unrealistically high and low yield values due to the recording limitations of a monitor. The measured yields, which include these extreme values, are considered “raw” data. These raw data must be post-processed to remove the extreme values or “outliers.” The post-processed data are considered a “clean” source of information for decision making.
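The cleaning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which rule is used to flag outliers, so the interquartile-range fence below, the function name clean_yields, the multiplier k, and the sample readings are all hypothetical.

```python
from statistics import quantiles

def clean_yields(raw_yields, k=1.5):
    """Drop readings outside the interquartile-range fences.

    The IQR rule and k=1.5 are assumptions made for illustration;
    the article does not prescribe a specific outlier test.
    """
    q1, _, q3 = quantiles(raw_yields, n=4)  # first and third quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [y for y in raw_yields if low <= y <= high]

# Hypothetical raw monitor readings in bushels per acre,
# including two unrealistic values (0 and 250).
raw = [98, 102, 101, 0, 99, 97, 250, 103, 100, 96]
clean = clean_yields(raw)
print(clean)
```

With these sample readings, the two unrealistic values fall outside the fences and are dropped, while the plausible readings are kept.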
Hitting The Target
As part of the measurement process, every datum has an associated “accuracy” and “precision,” which quantify the degree to which a number is close to the “true” value and the degree to which it is repeatable, respectively. The popular archery analogy may help to explain the two terms. An archer shoots a number of arrows at a target consisting of a bull’s-eye (true value) surrounded by rings. If arrows are clustered close together right on the bull’s-eye of the target, then the archer has high precision and high accuracy. If arrows are clustered close together on the target but centered on an outer ring far from the bull’s-eye, then the archer has high precision and low accuracy. If arrows are far apart on different rings but centered on the bull’s-eye of the target, then the archer has low precision and high accuracy. The accuracy of a recording device can be improved by the process of calibration, but precision is an inherent limitation due to design. The precision and accuracy of a datum are expressed mathematically as a number plus or minus an “uncertainty interval” at a stated level of confidence. In the example of the yield monitor, the harvested yield of a crop could be 100 +/- 5 bushels per acre at a 95% confidence level.
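The archery analogy can be put into numbers with a short Python sketch. It treats accuracy as the offset of the average reading from the true value and precision as the spread of repeated readings. The function name, the true value of 100, and both sets of readings are hypothetical and chosen only to mirror two of the archer's cases.

```python
from statistics import mean, stdev

def accuracy_and_precision(readings, true_value):
    """Return (bias, spread) for repeated readings of one quantity.

    bias   -> accuracy: how far the average lands from the true value
    spread -> precision: how tightly the readings cluster together
    """
    bias = mean(readings) - true_value
    spread = stdev(readings)
    return bias, spread

TRUE_YIELD = 100.0  # hypothetical "bull's-eye" in bushels per acre

# Tight cluster on an outer ring: high precision, low accuracy.
tight_off_center = [109.8, 110.1, 110.0, 109.9, 110.2]
# Scattered but centered on the bull's-eye: low precision, high accuracy.
wide_centered = [90.0, 110.0, 95.0, 105.0, 100.0]

bias_tight, spread_tight = accuracy_and_precision(tight_off_center, TRUE_YIELD)
bias_wide, spread_wide = accuracy_and_precision(wide_centered, TRUE_YIELD)
print(f"tight cluster: bias {bias_tight:.1f}, spread {spread_tight:.2f}")
print(f"wide cluster:  bias {bias_wide:.1f}, spread {spread_wide:.2f}")
```

In this sketch, calibration would correspond to subtracting the measured bias from each reading. The spread is unchanged by that correction, which is consistent with precision being a limitation of the device itself.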
Knowledge about the uncertainty of a datum is important for decision making. The larger the uncertainty interval associated with a datum at a given confidence level, the greater the risk in a decision. For a given uncertainty interval, the lower the confidence level associated with a datum, the greater the risk in a decision. A production manager relying on monitored yields clearly wants a small uncertainty interval with a high level of confidence when making decisions. For typical crop production decisions, the confidence level is usually set at 90% or higher when determining an uncertainty interval associated with a datum. The manager is then guided by the uncertainty interval when making a decision using the datum.
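One way to arrive at a statement such as "100 +/- 5 bushels per acre at a 95% confidence level" is sketched below in Python. It is a minimal sketch, not a description of how any particular yield monitor reports uncertainty. It assumes the cleaned readings are independent and roughly normally distributed, and it uses a normal approximation for the mean. The function name and the sample readings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def uncertainty_interval(readings, confidence=0.95):
    """Return (estimate, half_width) for the mean of the readings.

    Assumes a normal approximation; with only a handful of readings
    a Student's t multiplier would normally replace z.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 at 95%
    half_width = z * stdev(readings) / sqrt(len(readings))
    return mean(readings), half_width

# Hypothetical cleaned yield readings in bushels per acre.
readings = [98, 102, 101, 99, 97, 103, 100, 96, 104, 100, 99, 101]

estimate, half_95 = uncertainty_interval(readings, confidence=0.95)
_, half_90 = uncertainty_interval(readings, confidence=0.90)
print(f"{estimate:.1f} +/- {half_95:.1f} bushels per acre at 95% confidence")
print(f"{estimate:.1f} +/- {half_90:.1f} bushels per acre at 90% confidence")
```

For the same readings, the 90% interval comes out narrower than the 95% interval. A narrower interval obtained only by lowering the confidence level does not reduce the risk in a decision. It only restates the same data with less assurance.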
It is apparent from the previous discussion that the realization of “good” data takes some effort. However, once the effort becomes routine, a manager will have confidence in the data and subsequently in the decisions made with the data. If the procedure for recording and reporting data follows a standard, then the realized data can be shared among different farming operations. Before discussing this sharing of data as the “currency” of precision agriculture, the issue of ownership needs to be addressed.
A Continuing Debate
The question of data ownership has been debated since the beginning of precision agriculture. There are several positions on ownership. They range from the view that growers own the data because the data were collected on their farms to the view that companies own the data because the data were incorporated into their services. Some argue that whoever pays for the data owns the data. Others argue that whoever collects the data owns the data. The question of ownership ultimately comes down to the agreement made between the grower and the person or entity collecting and/or generating data to support decision making. To avoid any misunderstanding, this agreement should be explicit and in writing. The agreement should specify the rights of individuals or other entities as to the collection, storage, and use of data. It should also address any backup, security, and privacy issues.
As new equipment and information technologies are introduced into precision agriculture, the volume of data realized in a production cycle will continue to grow at an increasing rate. The sharing of these data as a “currency” across communities will not only benefit the participants but also contribute to the information flow within the industry.