Excerpt – Data Analytics: The Value of Predictive Models and Enhanced Traceability

Artificial intelligence and machine learning seem to be the new hype, following data analytics and big data. There’s an assumption that ‘enough’ of the ‘right’ data is available to make decisions. There is a difference between a ‘model for insight’ and a ‘model for prediction.’ Food fraud is a relatively rare event, so prediction models have limits.

I’ve been extremely fortunate to have traveled widely and to have met many great thought leaders along the way. Several people, presentations, and moments were key to either identifying an idea or reinforcing a research direction. One of those moments was at the 2008 Society for Risk Analysis meeting, where I heard and met Bob Ross, a risk assessment thought leader from the US Department of Homeland Security. After his presentation, we had a fantastic discussion about his Homeland Security risk assessments and their application to emerging product counterfeiting and food fraud issues.

Vulnerability or Risk – Models for Insight or Models for Prediction

I kept falling into the trap of thinking that we needed, or could even create, a probabilistic risk assessment. Throughout my PhD research, we constantly worked with large data sets and high-level statistical analysis. Bob helped me see that product counterfeiting and food fraud are rare events (that did not seem novel to me) and that predicting future rare events makes “little or no… quantitatively reliable sense” (that was novel to me and encouraged the focus on vulnerabilities).

But that doesn’t mean that the risk assessor just gives up. The emphasis from Dr. Ross was to take the ‘givens’ that you have and do the best you can. The rare event, like product counterfeiting or food fraud, “is preceded by a chain of individually more likely developments that create intent, capability, and opportunity.” These developments could include changing market conditions or vulnerabilities based on system weaknesses.

The food fraud vulnerability assessments shifted from “models for prediction” to “models for insight.”

Analysis in the Social Sciences

Understanding how social science research works was another insight that proved incredibly important in directing the research. Most of the early food fraud research was conducted by food scientists, whose field is categorized under the physical sciences. The physical sciences offer a level of precision or finality: you can test, with great precision and accuracy, for the physical presence or absence of a chemical or molecule.

Dr. Ross worked on terrorism, which is a matter of human behavior rather than the physical sciences. Food fraud and crime are also human behavior. He stated that “predictive social sciences models pose even greater challenges than predictive models in the physical sciences.”

Excerpt from Food Fraud Prevention (Spink, 2019, pages 167-168):

Sidebar: “It Is Simply Not Possible to Validate Predictive Models of Rare Events That Have Not Occurred”

At the 2008 Society for Risk Analysis annual meeting, Robert G Ross of the US Department of Homeland Security presented “Observations on the Importance of Risk Communication in Managing Homeland Security Risk” (Ross 2009; also Ross 2006a, b, 2007). He discussed “models for insight versus models to predict.” He recommended using a range of risk models to provide a wide range of insight into these unique vulnerabilities.

Regarding probabilistic risk assessment and more advanced quantitative risk assessments, he made several key points that apply to food fraud prevention (JASON 2009):

“[It] is simply not possible to validate (evaluate) predictive models of rare events that have not occurred, and unvalidated models cannot be relied upon.”
There is a “…distinction between models for probabilistic risk assessment on long timescales… versus specific point prediction of individual rare events.”
“It is not a realistic goal to anticipate and prevent all rare events, but it may be possible to make rare events rarer and to reduce their effect.”
“A rare event is preceded by a chain of individually more likely developments that create intent, capability, and opportunity. Intervention may be possible at many points in that chain.”
“There are two principal problems in applying quantitative models to the anticipation of rare events. One problem is that rare events are rare. There will necessarily be little or no previous data from which to extrapolate future expectations in any quantitatively reliable sense, or to evaluate any model.”
“In the extreme, how can the probability of an event that has never been seen or may never even have been imagined be predicted?”
“An additional difficulty is that rare event assessment is largely a question of human behavior, in the domain of the social sciences, and predictive social sciences models pose even greater challenges than predictive models in the physical sciences. Reliable models for ameliorating rare events will need to address smaller, well-defined, testable pieces of the larger problem.”

For rare events, this insight establishes what cannot be expected from traditional probabilistic risk assessment. It encourages a shift in focus from predicting the exact incident to considering the wide range of known factors, variables, or vulnerabilities. Ross’s presentation encouraged using many different types of assessments and focusing on “models for insight” rather than “models for prediction,” that is, risk-informed versus risk-based decision-making.
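The data-scarcity point in the quotations above can be illustrated with a small calculation. The sketch below is not from the book or from Ross’s presentation. It assumes, as a deliberate simplification, that each inspection is an independent check with the same unknown fraud rate, and it asks how high that rate could still plausibly be after a run of inspections that found nothing. The function names and the inspection counts are hypothetical.

```python
def zero_event_upper_bound(n_checks: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the per-check event rate when zero events
    were observed in n_checks checks.

    Assumption (illustrative only): checks are independent and share one
    constant event rate, which real fraud behavior may well violate.
    """
    if n_checks <= 0:
        raise ValueError("n_checks must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Largest rate p for which seeing zero events is still plausible:
    # solve (1 - p) ** n_checks = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_checks)


def rule_of_three(n_checks: int) -> float:
    """Common shortcut for the 95% bound above: roughly 3 / n."""
    return 3.0 / n_checks


# Hypothetical inspection counts, each with zero incidents found.
for n in (10, 50, 500):
    exact = zero_event_upper_bound(n)
    print(f"{n:>4} clean checks: rate could still be up to {exact:.2%} "
          f"(rule of three: {rule_of_three(n):.2%})")
```

Under these assumptions, 50 clean inspections still leave room for a fraud rate of roughly 5.8 percent, so a record of "no incidents" from a small sample says little about the true rate. The calculation bounds a long-run rate only; it does not attempt the point prediction of an individual event that the quotations warn against.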

One argument is that once there is ‘enough’ of the ‘right’ information or data, we can apply artificial intelligence or machine learning. The next question, then, is how to define the threshold for ‘enough’ and what counts as the ‘right’ data. For now, and for the assessments needed to make business decisions at this exact moment, we can still focus on vulnerabilities.

Takeaway Points

Statistically, food fraud is a rare event with very little historical data.
Quantitative, statistical predictions are very difficult to make with very little historical data.
Leveraging criminology and risk theory, it is possible to identify developments that “create intent, capability, and opportunity.” That wide range of strong or weak ‘signals’ can help reduce vulnerability.