Agricultural Knowledge Discovery Dashboard

DMINE is funded through NOAA grant #NA15OAR4310145.


Agricultural Commodity Loss Analysis - Climate Lag Correlations

The Agriculture Knowledge Discovery Dashboard is a research-focused interactive application that provides a step-by-step methodology for knowledge discovery and machine learning prediction in agriculture. This portion of the dashboard provides regional analysis of agricultural insurance loss in relation to climate, allowing a user to evaluate correlations based on differing time-lagged associations.

Overview and Approach

The Agriculture Knowledge Discovery Dashboard is a research-focused interactive application that provides a set of dashboards for knowledge discovery and machine learning prediction in the area of agricultural crop insurance. The dashboard is meant for scientists and other more sophisticated users who want to examine crop insurance data compared to climate at varying spatial and temporal scales.

This set of dashboards is laid out to step through differing organizations of crop insurance data, as well as to examine temporal climate relationships using a time lag design matrix. A Models tab is provided to examine the predictive relationships of climate to insurance loss.

In this overview, we provide a general review of the data, and the process/steps to be used for examination.


Palouse Climate Correlations with Mean Seasonal Insurance Loss: Provides an ability to match up the highest correlated values between insurance loss and individual climate variables, for the Palouse region of the Inland Pacific Northwest, using data from 1989-2015. Data is compared using the water year (October - June) as a basis for aggregating climate and mean loss values for individual commodities.

Southern Idaho Climate Correlations with Mean Seasonal Insurance Loss: Provides an ability to match up the highest correlated values between insurance loss and individual climate variables, for the Southern Idaho region, using data from 1989-2015. Data is compared using the water year (October - June) as a basis for aggregating climate and mean loss values for individual commodities.
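As a rough illustration of the water-year aggregation described above, the sketch below assigns each monthly record to an October - June season and averages loss per commodity. The record layout, the function names, and the convention of labeling a season by the calendar year in which it ends are all assumptions made for this example; they are not taken from the dashboard's code.

```python
from collections import defaultdict
from statistics import mean

# Assumption: the season runs October through June and is labeled by the
# calendar year in which it ends (October 2000 - June 2001 -> 2001).
SEASON_MONTHS = {10, 11, 12, 1, 2, 3, 4, 5, 6}


def water_year(year, month):
    """Return the season label for a month, or None for July-September."""
    if month not in SEASON_MONTHS:
        return None
    return year + 1 if month >= 10 else year


def seasonal_mean_loss(records):
    """Average monthly loss per (commodity, water year).

    `records` is an iterable of (year, month, commodity, loss) tuples.
    This layout is hypothetical, not the RMA file format.
    """
    buckets = defaultdict(list)
    for year, month, commodity, loss in records:
        label = water_year(year, month)
        if label is not None:
            buckets[(commodity, label)].append(loss)
    return {key: mean(values) for key, values in buckets.items()}


# Made-up records for illustration only.
records = [
    (2000, 10, "WHEAT", 100.0),
    (2001, 3, "WHEAT", 300.0),
    (2001, 8, "WHEAT", 900.0),  # outside the October - June season, skipped
    (2001, 11, "WHEAT", 50.0),
]
result = seasonal_mean_loss(records)
```

Here the August record is dropped because it falls outside the season, and the November 2001 record is counted toward the following season.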

Agricultural Systems in the Pacific Northwest

Agricultural systems and their products are essential components of our society. In 2014, the U.S. agricultural sector created a gross output of more than $835 billion and had an employee base of approximately 750,000 people. With roughly 2 million farms in the U.S., averaging about 435 acres each, total grain production alone was $436 million (USDA Economic Research Service, 2014). For the state of Washington, agricultural production value exceeded $10 billion, with over 160,000 jobs, making up 13-15 percent of the state’s economy each year (WSDA drought report, 2015). Similarly, the Washington forestry support industries generated over $1.8 billion in total economic impacts across the same time frame.

Methodology for Agricultural Systems Analysis 

Our approach uses data extraction and transformation techniques in R and Python to organize and filter data for use in machine learning predictive models. Our models are made available through data dashboards as well as application programming interfaces (APIs). Our data dashboards allow a user to review and predict outcomes for a particular area, with our initial efforts focusing on agricultural systems, specifically insurance commodity loss related to climate.

Below is a step-by-step description of each analysis tab:

1) Data Selection. We are examining agricultural commodity loss information, which is collected at a monthly temporal scale and a county spatial scale. Farmers file claims with their cropping agent, which kicks off a process of validation and payment distribution. The farmer provides the total acreage of the claim, the commodity, and the cause of the damage. This information is recorded and assembled by the USDA Risk Management Agency (RMA). For this analysis, we have subset the national dataset to examine just data from 1989-2015 for the three-state region of Washington, Oregon, and Idaho. The code and processes will eventually be applied nationwide. (Step 1)

2) Processing and Transformation. We aggregate the data by differing factors so that it can be viewed and examined dynamically in differing ways. We also generate some basic transformed variables, such as loss per acre, mean loss, etc. (Step 2)

3) Exploratory Data Analysis (EDA) is a primary component of this agricultural dashboard. As noted in the Overview tab, we have broken out our analysis into several steps that allow for examination of the data at the state and county level, by month, through animation, in comparison to climate variables, and through modeling of the relationships between climate and commodity loss. (Step 3)

5) Climatic correlation analysis is performed as part of the "Climate Impacts" tab. Here we attempt to find the best correlation between differing climate variables and commodity loss by constructing a temporal matrix of months in relation to loss and plotting the correlations between values.

6) Modeling provides several approaches to examining climate and commodity loss relationships, with the goal of generating a tuned model that would allow for prediction.
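The processing and transformation step above can be sketched roughly as follows. This is a minimal illustration, not the dashboard's own code: the field names (`state`, `county`, `acres`, `loss`, and so on), the `aggregate_claims` helper, and the sample claims are all hypothetical, chosen only to show grouping by arbitrary factors and deriving loss per acre and mean loss.

```python
from collections import defaultdict


def aggregate_claims(claims, keys):
    """Group claims by `keys`, then derive total loss, mean loss per claim,
    and loss per acre for each group."""
    totals = defaultdict(lambda: {"loss": 0.0, "acres": 0.0, "claims": 0})
    for claim in claims:
        group = tuple(claim[k] for k in keys)
        totals[group]["loss"] += claim["loss"]
        totals[group]["acres"] += claim["acres"]
        totals[group]["claims"] += 1

    summary = {}
    for group, t in totals.items():
        summary[group] = {
            "total_loss": t["loss"],
            "mean_loss": t["loss"] / t["claims"],
            # Guard against groups that report zero acres.
            "loss_per_acre": t["loss"] / t["acres"] if t["acres"] else None,
        }
    return summary


# Made-up claims for illustration only (hypothetical field names and values).
claims = [
    {"state": "WA", "county": "Whitman", "year": 2015, "month": 7,
     "commodity": "WHEAT", "cause": "Drought", "acres": 100.0, "loss": 5000.0},
    {"state": "WA", "county": "Whitman", "year": 2015, "month": 8,
     "commodity": "WHEAT", "cause": "Drought", "acres": 300.0, "loss": 7000.0},
    {"state": "ID", "county": "Latah", "year": 2015, "month": 7,
     "commodity": "BARLEY", "cause": "Drought", "acres": 50.0, "loss": 1000.0},
]

by_county = aggregate_claims(claims, ["state", "county", "year", "commodity"])
by_month = aggregate_claims(claims, ["year", "month"])
```

Passing a different list of keys regroups the same claims, which is one simple way to support viewing the data by state, county, month, or commodity.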


Case Example Area: Climate Impacts & Agricultural Systems
Summary: Economic crop loss has a close relationship to food resilience and security. Under this premise, we have been developing a case scenario example of data mining and machine learning to explore agricultural commodity loss and its relationship to drought and water scarcity.


Palouse Climate Impact Correlations Analysis

This Palouse regional analysis compares agricultural commodity loss, in varying forms, to climate in an aggregated manner, using a matrix of monthly climate variables for comparison.

Comparing Palouse climate variables to agriculture, using a design matrix 

USDA county-level insurance claims attributed to drought were aggregated on a seasonal water-year basis (October through June) for a set of select commodities (wheat, barley, and apples) and compared to all combinations of the previous years’ monthly individual climate variables. Loss data were transformed using a cube root function for normality. For each commodity and each county, the set of months with the highest climate variable correlation with insurance loss was selected and assembled with the associated insurance loss data (e.g., for precipitation within Whitman County, WA, the window ending in September and extending back three months, June/July/Aug/Sept, had the highest correlation with seasonal mean Whitman wheat/drought loss claims).
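The lag search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dashboard's implementation: the function names, the data layout, and the synthetic numbers are hypothetical; each window is restricted to a single calendar year and aggregated by its mean; and the "highest" correlation is taken to mean the largest absolute Pearson correlation.

```python
import math


def cube_root(x):
    """Real cube root that also handles negative values."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lag_matrix(climate, loss):
    """Correlate cube-root loss with every (end month, span) climate window.

    climate: {year: [12 monthly values, January first]}
    loss:    {year: seasonal mean loss}
    Returns {(end_month, span_in_months): correlation}.
    """
    years = sorted(set(climate) & set(loss))
    y = [cube_root(loss[yr]) for yr in years]
    matrix = {}
    for end in range(1, 13):
        for span in range(1, end + 1):
            # Mean of the `span` months ending with month `end`.
            x = [sum(climate[yr][end - span:end]) / span for yr in years]
            matrix[(end, span)] = pearson(x, y)
    return matrix


def best_window(matrix):
    """Return ((end_month, span), r) with the largest absolute correlation."""
    return max(matrix.items(), key=lambda item: abs(item[1]))


# Synthetic monthly precipitation, for illustration only.
climate = {
    2010: [3, 4, 2, 5, 1, 2, 1, 0, 1, 4, 5, 3],
    2011: [2, 5, 3, 4, 2, 4, 3, 2, 3, 3, 4, 5],
    2012: [4, 3, 4, 3, 3, 1, 0, 1, 2, 5, 3, 4],
    2013: [5, 2, 5, 2, 4, 3, 2, 3, 4, 2, 5, 2],
    2014: [3, 4, 1, 5, 2, 5, 4, 4, 3, 4, 2, 3],
}
# Synthetic loss built so that drier June-September means give larger losses.
loss = {yr: (10.0 - sum(vals[5:9]) / 4.0) ** 3 for yr, vals in climate.items()}

matrix = lag_matrix(climate, loss)
window, r = best_window(matrix)
```

In this layout the key `(9, 4)` corresponds to the June/July/Aug/Sept window from the Whitman County example. Extending the search to months from the previous calendar year would require concatenating consecutive years before slicing.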



Southern Idaho Climate Impact Correlations Analysis

This Southern Idaho regional analysis compares agricultural commodity loss, in varying forms, to climate in an aggregated manner, using a matrix of monthly climate variables for comparison.

Comparing Southern Idaho climate variables to agriculture, using a design matrix 

USDA county-level insurance claims attributed to drought were aggregated on a seasonal water-year basis (October through June) for a set of select commodities (wheat, barley, and apples) and compared to all combinations of the previous years’ monthly individual climate variables. Loss data were transformed using a cube root function for normality. For each commodity and each county, the set of months with the highest climate variable correlation with insurance loss was selected and assembled with the associated insurance loss data (e.g., for precipitation within Whitman County, WA, the window ending in September and extending back three months, June/July/Aug/Sept, had the highest correlation with seasonal mean Whitman wheat/drought loss claims).