Interpreting Yield Results – Data Variability and Accumulation

Measured performance (yield) of any product results from the combined effects of the product's genetics and the environment in which it is tested. Yield trials involve many variables that can contribute to yield performance, and average yields can change as more data accumulates across locations. Larger quantities of yield data will likely give a clearer picture of actual yield potential, provided the datasets are cleaned and sorted with data preparation software. Prepared this way, the data can more easily be turned into information on past yields, on the variables that may have affected those yields, and into predictions of future yields.

Variability of Observations

Most genetic traits that contribute to yield are quantitative traits, which means they are controlled by multiple genes that each contribute a certain percentage to the overall characteristic. Observations of any quantitative trait, such as height of an individual or yield for a plot or strip trial, follow a bell-shaped curve. This variation is due to the interaction of environment and genetics, as the environment can have an effect on each of these genes independently and in different ways. Most observations cluster or fall close to the mean, but some observations, usually 5% or less, appear to be very different from the mean. These values are not wrong or incorrect, but just part of the natural variation seen in any population. For example, consider height for men in the United States. If the average height is 70 inches, most men will be between 66 and 74 inches tall, but a few will be much taller and a few others will be much shorter. Yield measurements of any particular product will follow a similar pattern.
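The bell-curve description above can be made concrete with a short calculation. The sketch below assumes heights follow a normal distribution with the 70-inch mean from the example and a standard deviation of 2 inches. The standard deviation is chosen here purely for illustration, since the article does not state one. Under that assumption, 66 to 74 inches is the mean plus or minus two standard deviations, which leaves a little under 5% of observations outside the range.

```python
from statistics import NormalDist

# The 70-inch mean comes from the example; the standard deviation is an
# assumption made for illustration only.
MEAN_HEIGHT_IN = 70.0
ASSUMED_SD_IN = 2.0

def share_within(low, high, mean=MEAN_HEIGHT_IN, sd=ASSUMED_SD_IN):
    """Fraction of a normal population falling between low and high."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)

inside = share_within(66, 74)
print(f"Between 66 and 74 inches: {inside:.1%}")
print(f"Outside that range:       {1 - inside:.1%}")
```

The same function could be applied to plot yields by substituting a yield mean and standard deviation, if those are known for the trial.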

When two products are compared, the yield observations for each fall into a bell-shaped curve around that product's mean. Even when the means of Product A and Product B differ, the two curves overlap. As a result, a specific observation for Product B may be higher than one for Product A, although the overall mean of Product A is higher.

This may be a response to the specific environment, or it may just be due to chance. Some environments may favor one product over another, resulting in a higher plot yield for Product B than for Product A even though Product A is the better overall performer. Environmental factors such as excess moisture, drought, or disease may favor Product B, while in a different environment the opposite might be true, depending on each product's response to its environment. Some differences may also be due to experimental error; for example, plant populations may differ slightly because of germination or planting variations. As with any statistical calculation, the more observations that are evaluated, the higher the confidence that the calculated mean represents the true mean of the population.
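The point that more observations give more confidence in the mean can be illustrated with the standard error of the mean, which shrinks with the square root of the number of observations. The sketch below uses a hypothetical true yield of 200 bu/acre and a hypothetical plot-to-plot standard deviation of 15 bu/acre. Neither value comes from the article, and real plot yields are only approximately normal and independent.

```python
import math
import random
from statistics import mean

TRUE_MEAN = 200.0   # hypothetical true yield, bu/acre
PLOT_SD = 15.0      # hypothetical plot-to-plot standard deviation, bu/acre

def standard_error(sd, n):
    """Standard error of a mean of n independent observations."""
    return sd / math.sqrt(n)

def simulated_mean(n, rng):
    """Mean of n simulated plot yields drawn from a normal distribution."""
    return mean(rng.gauss(TRUE_MEAN, PLOT_SD) for _ in range(n))

rng = random.Random(1)  # fixed seed so the run is repeatable
for n in (4, 16, 64, 256):
    print(f"n={n:>3}  simulated mean={simulated_mean(n, rng):6.1f}  "
          f"standard error={standard_error(PLOT_SD, n):4.2f}")
```

Quadrupling the number of plots halves the standard error, so early averages based on a handful of plots can sit well away from the value they eventually settle on.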

The Importance of Having All the Data

Because of the inherent variability of observations, the initial data reported may not give a clear picture of a product's actual average performance. This can be seen by plotting the percentage of accumulated data against the rank correlation among entries in a yield plot. As the harvest season begins and only limited field data has been collected, the correlation is low. As the season progresses and yield data accumulates, the correlation strengthens and moves closer to a value of 1. Towards the end of the harvest season, the correlation increases dramatically to over 0.90, because the large quantity of accumulated data gives a better estimate of a product's true yield potential and its rank among the other field plot entries. Thus, with very little data accumulated, the ranking of products within one yield plot can differ widely from their rankings across other plots. As data accumulates, a product's rank in the plot will likely become more consistent and provide a better estimate of its true yield potential.
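The relationship between accumulated data and rank correlation can be explored with a small simulation. The sketch below is not the analysis behind the article's figures. It assumes 20 hypothetical products, 100 hypothetical locations, normally distributed noise, and made-up yield values, and it measures the Spearman rank correlation between product rankings based on the first k locations and the rankings based on all locations.

```python
import random

def ranks(values):
    """Rank positions (1 = lowest); assumes no exact ties among the values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman(x, y):
    """Spearman rank correlation using the no-ties formula."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def correlation_by_locations(n_products=20, n_locations=100, sd=15.0, seed=1):
    """Correlation of partial-data ranks with final ranks as locations accumulate."""
    rng = random.Random(seed)
    # Hypothetical true yields (bu/acre) and noisy per-location observations.
    true_yield = [rng.gauss(200.0, 5.0) for _ in range(n_products)]
    data = [[rng.gauss(t, sd) for _ in range(n_locations)] for t in true_yield]
    final = [sum(row) / n_locations for row in data]
    out = {}
    for k in (5, 25, 50, 75, 100):
        partial = [sum(row[:k]) / k for row in data]
        out[k] = spearman(partial, final)
    return out

for k, rho in correlation_by_locations().items():
    print(f"{k:>3} of 100 locations reported: rank correlation = {rho:.2f}")
```

Changing the assumed noise level (`sd`) or the spread of true yields shows how quickly, or slowly, rankings stabilize under different conditions.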

The more data that is accumulated on a given product, the more stable its ranking among multiple products becomes. If two products have equal yield potential, the odds of either one winning a given comparison are 50:50, like the odds of heads or tails when you toss a coin. If you toss a coin 10 times, you will not necessarily get five heads and five tails, and the same is true of a yield trial. If one product is superior, it will win more frequently, but a win is still not assured in every test. Even a product with a 4 bu/acre overall yield advantage will likely win only about 75% of the head-to-head comparisons in a yield plot. Conversely, the product that actually yields 4 bu/acre less can win about 25% of the time.
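The link between a yield advantage and a winning percentage depends on how variable the plot-level differences are. As a rough sketch, assume the difference between two products in a single comparison is normally distributed around the true advantage. That modelling assumption is made here and is not stated in the article. Under it, the 4 bu/acre and 75% figures above imply a standard deviation of the difference of about 5.9 bu/acre, and the same model gives the win probability for other advantages.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def win_probability(advantage, sd_of_difference):
    """Chance the better product wins one comparison, assuming normal differences."""
    return STD_NORMAL.cdf(advantage / sd_of_difference)

def implied_sd(advantage, win_rate):
    """Standard deviation of the difference implied by an advantage and a win rate."""
    return advantage / STD_NORMAL.inv_cdf(win_rate)

sd = implied_sd(4.0, 0.75)  # the 4 bu/acre and 75% figures from the text
print(f"Implied SD of the plot-level difference: {sd:.1f} bu/acre")
for adv in (0.0, 2.0, 4.0, 8.0):
    print(f"advantage {adv:4.1f} bu/acre -> win probability {win_probability(adv, sd):.0%}")
```

With no advantage the model returns the 50% coin-toss odds, and even a large advantage never reaches a 100% win rate.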


No product, even one that is truly superior, will win every yield plot. Over many tests, industry-leading products typically win only 60 to 65% of head-to-head comparisons. Environmental factors, genetic potential, and test variability all contribute to yield differences across test plot sites. Yield ranks among entries in compiled datasets can also change with the number of tests and the geographical location of the plots. The more data and comparisons that are assembled and examined, the better the picture of yield performance. This more robust picture increases the confidence one can place in picking a winning product.

Having data that is representatively distributed across the geography in which the product is marketed is also critical for identifying true yield potential. If the data reported so far represents only 75% of the geography in which the product is sold, and the remaining 25% will come from a very different geography, the final result could be quite different. This matters most as yields start to be reported, because geographical and seasonal stress differences can significantly influence harvest timing and therefore which areas are represented in the early data.

