Agricultural Knowledge Discovery Dashboard
The Agricultural Knowledge Discovery Dashboard is a research-focused interactive application that provides a step-by-step methodology for knowledge discovery and machine learning prediction in agriculture.
The Agricultural Knowledge Discovery Dashboard is a research-focused interactive application that provides a set of dashboards for knowledge discovery and machine learning prediction in the area of agricultural crop insurance. The dashboard is meant for scientists and other advanced users who want to examine crop insurance data in relation to climate at varying spatial and temporal scales. The dashboards are laid out to step through differing organizations of crop insurance data, and to examine temporal climate relationships using a time lag design matrix. A Models tab is provided to examine the predictive relationships between climate and insurance loss. This overview gives a general review of the data and of the steps used to examine it.
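The time lag design matrix mentioned above is not specified further in this overview. As a minimal sketch, assuming it means one row per month holding the current climate value and the values from the preceding months, it could be built as follows. The function name, the max_lag parameter, and the precipitation values are hypothetical and used only for illustration.

```python
def lag_design_matrix(series, max_lag):
    """Return rows [x_t, x_(t-1), ..., x_(t-max_lag)] for every t with full history."""
    if max_lag < 0:
        raise ValueError("max_lag must be non-negative")
    rows = []
    # Start at max_lag so that every row has a complete set of lagged values.
    for t in range(max_lag, len(series)):
        rows.append([series[t - lag] for lag in range(max_lag + 1)])
    return rows


# Hypothetical monthly precipitation values (mm), not real observations.
monthly_precip = [10.0, 12.5, 8.0, 3.0, 0.5, 1.0]

# Each row pairs a month with the two months before it.
design = lag_design_matrix(monthly_precip, max_lag=2)
```

Each row can then be matched to the insurance loss for the same month, so that a model sees both current and earlier climate conditions.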
Agricultural systems and their products are essential components of our society. In 2014, the U.S. agricultural sector created a gross output of more than $835 billion and had an employee base of approximately 750,000 people. Across roughly 2 million U.S. farms, with an average size of about 435 acres, total grain production alone was $436 million (USDA Economic Research Service, 2014). For the state of Washington, agricultural production value exceeded $10 billion, with over 160,000 jobs, making up 13-15 percent of the state’s economy each year (WSDA drought report, 2015). Similarly, the Washington forestry support industries generated over $1.8 billion in total economic impacts across the same time frame.
Our approach uses data extraction and transformation techniques in R and Python to organize and filter data for use in machine learning predictive models. Our models are exposed through data dashboards as well as application programming interfaces (APIs). The dashboards allow a user to review and predict outcomes for a particular area, with our initial efforts focusing on agricultural insurance commodity loss related to climate.
Below is a step-by-step description of each analysis tab:
1) Data Selection. We examine agricultural commodity loss information, which is collected at a monthly temporal scale and a county spatial scale. Farmers file claims with their crop agent, which starts a process of validation and payment distribution. The farmer provides the total acreage of the claim, the commodity, and the cause of the damage. This information is recorded and assembled by the USDA Risk Management Agency (RMA). For this analysis, we have subset the national dataset to 1989-2015 for the three-state region of Washington, Oregon, and Idaho. The code and processes will eventually be applied nationwide. (Step 1)
2) Processing and Transformation. We aggregate the data by differing factors to allow dynamic viewing and examination of the data in differing ways. We also generate some basic transformed variables, such as loss per acre and mean loss. (Step 2)
3) Exploratory Data Analysis (EDA). EDA is a primary component of this agricultural dashboard. As noted in the Overview tab, we have broken our analysis into several steps that allow examination of the data at the state and county level, by month, through animation, by comparison to climate variables, and through modeling of the relationships between climate and commodity loss. (Step 3)
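Steps 1 and 2 above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the dashboard's actual code: the record layout, field names, and sample values are hypothetical and do not reflect the RMA file format.

```python
from collections import defaultdict

# Study region and period named in the text.
STATES = {"WA", "OR", "ID"}
FIRST_YEAR, LAST_YEAR = 1989, 2015

# Hypothetical claim records; field names and values are illustrative only.
claims = [
    {"state": "WA", "county": "Whitman", "year": 2015, "commodity": "WHEAT",
     "damage_cause": "Drought", "acres": 200.0, "loss": 50000.0},
    {"state": "WA", "county": "Whitman", "year": 2015, "commodity": "WHEAT",
     "damage_cause": "Heat", "acres": 100.0, "loss": 10000.0},
    {"state": "ID", "county": "Latah", "year": 2001, "commodity": "BARLEY",
     "damage_cause": "Hail", "acres": 50.0, "loss": 5000.0},
    {"state": "MT", "county": "Cascade", "year": 2010, "commodity": "WHEAT",
     "damage_cause": "Drought", "acres": 300.0, "loss": 60000.0},
    {"state": "OR", "county": "Umatilla", "year": 1985, "commodity": "WHEAT",
     "damage_cause": "Freeze", "acres": 80.0, "loss": 8000.0},
]


def select_claims(records):
    """Step 1: keep only claims inside the study region and period."""
    return [r for r in records
            if r["state"] in STATES and FIRST_YEAR <= r["year"] <= LAST_YEAR]


def aggregate(records, keys):
    """Step 2: total the claims per group and derive mean loss and loss per acre."""
    totals = defaultdict(lambda: {"loss": 0.0, "acres": 0.0, "claims": 0})
    for r in records:
        group = tuple(r[k] for k in keys)
        totals[group]["loss"] += r["loss"]
        totals[group]["acres"] += r["acres"]
        totals[group]["claims"] += 1
    summary = {}
    for group, t in totals.items():
        summary[group] = dict(t)
        summary[group]["mean_loss"] = t["loss"] / t["claims"]
        # Guard against groups with no recorded acreage.
        summary[group]["loss_per_acre"] = t["loss"] / t["acres"] if t["acres"] else None
    return summary


selected = select_claims(claims)
by_county = aggregate(selected, keys=("state", "county"))
by_cause = aggregate(selected, keys=("damage_cause",))
```

Passing the grouping keys as a parameter mirrors the idea of viewing the same claims by differing factors, such as county, commodity, or damage cause, without rewriting the aggregation.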
Insurance Commodity Loss in Relationship to the Cause of Damage
The state-level exploratory data analysis section provides access to crop loss and county damage levels for the three-state region of Washington, Oregon, and Idaho, from 1989 to 2015. The loss data has been adjusted for inflation using 2015 consumer price indexing for agricultural commodities.
Our initial exploratory comparisons examine commodity loss in relation to county and state. In addition, we examine the cause of damage reported for claims by county and year, as well as claim distribution by month and year. Insurance claims of agricultural commodity loss are quite spatially and temporally extensive. The USDA summarizes claims at a county and monthly level, and provides data going back to 1980.
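The inflation adjustment described above is commonly done by rescaling each nominal loss by the ratio of the base-year index to the index for the claim year. The sketch below assumes that approach. The index values are placeholders chosen for easy arithmetic, not real consumer price data, and the function name is hypothetical.

```python
BASE_YEAR = 2015

# Placeholder price index values, NOT real consumer price data.
price_index = {1989: 50.0, 2000: 80.0, 2015: 100.0}


def to_base_year_dollars(nominal_loss, year, index=price_index, base_year=BASE_YEAR):
    """Rescale a nominal dollar loss from `year` into base-year dollars."""
    if year not in index or base_year not in index:
        raise KeyError("no price index value for the requested year")
    return nominal_loss * index[base_year] / index[year]


# A $1,000 loss recorded in 1989, expressed in 2015 dollars under the placeholder index.
adjusted_1989 = to_base_year_dollars(1000.0, 1989)
adjusted_2015 = to_base_year_dollars(1000.0, 2015)
```

A loss in the base year is unchanged, which gives a quick check that the index table and base year are wired up correctly.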
County Level Insurance Commodity Loss Exploratory Data Analysis
In this step we continue to explore variations in the agricultural data by looking at county-level claim loss and counts, as well as claim counts at an annual level. The data covers 1989 to 2015 for the three-state region of Washington, Oregon, and Idaho.
The agricultural commodity loss analysis by county provides a breakdown of loss and claim counts by county for a range of different factors, including damage cause, commodity type, and year. The tabs above can be described as follows:
1) County level claims and loss by month. This tab provides individual county data on a monthly basis.
2) County level multi-year loss by damage cause, commodity type, and year. Data is segregated by differing factors, so a user can parse the data in the structure that is most appropriate.
Agricultural Commodity Animation
To visually examine all commodities and damage causes over time, we have animated commodities against damage causes for the period 1989-2015, for Washington, Oregon, and Idaho. Each animation has a map of loss, as well as the total claim counts and loss in dollars.
In this animation analysis, more than 2.8 million commodity claims filed in Idaho, Oregon, and Washington between 1989 and 2015 have been assembled by their documented damage cause, commodity, month, and year. Keep in mind that a farmer files a claim and, after verification by a crop agent, that claim is documented as caused by a particular damage category (e.g. drought, heat, freeze, hail, cold weather, declining prices, failed irrigation supply, etc.). The provided animation shows crop loss in dollars, as well as the commodity claim counts and the claim counts by county for each month. If there were no claims for a particular commodity in a state, then no animation was generated and that commodity will not be listed in the pulldown menu on the left.
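The assembly of claims into animation frames can be illustrated with a small grouping routine. This is a sketch under assumptions: the record fields, the sample values, and the build_frames helper are hypothetical, and the dashboard's own rendering code is not shown in this overview.

```python
from collections import defaultdict


def build_frames(records, commodity, damage_cause):
    """Group matching claims into one frame per (year, month), in time order."""
    losses = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for r in records:
        if r["commodity"] != commodity or r["damage_cause"] != damage_cause:
            continue
        when = (r["year"], r["month"])
        losses[when][r["county"]] += r["loss"]
        counts[when][r["county"]] += 1
    frames = []
    # Sorting the (year, month) tuples yields chronological order.
    for year, month in sorted(losses):
        when = (year, month)
        frames.append({
            "year": year,
            "month": month,
            "loss_by_county": dict(losses[when]),
            "claims_by_county": dict(counts[when]),
            "total_loss": sum(losses[when].values()),
            "total_claims": sum(counts[when].values()),
        })
    return frames


# Hypothetical claim records, for illustration only.
records = [
    {"commodity": "WHEAT", "damage_cause": "Drought", "county": "Whitman",
     "year": 2015, "month": 7, "loss": 100.0},
    {"commodity": "WHEAT", "damage_cause": "Drought", "county": "Adams",
     "year": 2015, "month": 7, "loss": 50.0},
    {"commodity": "WHEAT", "damage_cause": "Drought", "county": "Whitman",
     "year": 2014, "month": 8, "loss": 30.0},
    {"commodity": "WHEAT", "damage_cause": "Hail", "county": "Whitman",
     "year": 2015, "month": 7, "loss": 999.0},
]

wheat_drought = build_frames(records, "WHEAT", "Drought")
apples_hail = build_frames(records, "APPLES", "Hail")
```

An empty result, as for the apples and hail combination here, corresponds to the case where no animation is generated and the commodity is left out of the menu.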
National Ag Stats Service Analysis
The National Agricultural Statistics Service (NASS) provides survey and census data on agricultural commodities. In this data review, we examine differing commodity totals (production, area harvested, sales, and yield).
The DMINE commodity predictive model provides impact scores for a location, based on optimized models that are run on a regular basis. The models contain meteorological and climatological data that are used over extended periods of time to predict crop commodity loss outcomes. Over time we optimize our models to attempt to get better predictions.
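This overview does not state which model family DMINE uses. Purely as a stand-in to show the shape of a climate-to-loss prediction, the sketch below fits an ordinary least-squares line relating one lagged climate value to loss per acre. The function name and all data values are hypothetical.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: lagged precipitation (mm) against loss per acre ($).
lagged_precip = [1.0, 2.0, 3.0, 4.0]
loss_per_acre = [10.0, 8.0, 6.0, 4.0]

slope, intercept = fit_line(lagged_precip, loss_per_acre)

# Predicted loss per acre for a new lagged precipitation value of 5.0 mm.
predicted = intercept + slope * 5.0
```

In this made-up example loss falls as precipitation rises, which is the kind of relationship a drought-related damage cause might show. A production model would use many predictors and would be validated on held-out data.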