Reliance on Incomplete or Uncertain Data Sets is an Achilles Heel of Enhanced Traceability, Artificial Intelligence and Machine Learning

Can you trust your data? How much do you need to trust your data? What is the error rate in your data set (e.g., credit card companies expect a 1-3% fraud rate)? The answers depend on how you will act on the findings. Relying 100% on a data set to authenticate or approve a product is dangerous.

Data and information are crucial for measuring and managing business operations, and food fraud prevention is no exception. Data analytics, big data, traceability, and systems such as artificial intelligence or machine learning will help food fraud prevention. The key is understanding what data we have in relation to our specific problems and resource allocation decisions.

Data and More Data

Having data – and more data – is excellent. The more data we have, the more insight we can gain into a situation or problem. However, there are often small, incomplete, or uncertain data sets for emerging topics or relatively infrequent events (e.g., food fraud, product counterfeiting, and other rapidly emerging fraud cases).

A 1% error rate in a data set (i.e., the data is 99% accurate or complete) is not necessarily good or bad; it depends on the decision the data supports. If you are authenticating a vial of adrenaline to restart a heart after open heart surgery, a 1% error rate is NOT ok. If you are trying to pick which shipping container to randomly check at a border crossing, then a 1% error rate is ok. If you accept that a shipment of a food ingredient is tree nut allergen-free, then a 1% error rate is NOT ok.
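The three cases above can be made concrete with a little arithmetic. The sketch below is illustrative only: the shipment volume and the per-decision tolerances are hypothetical numbers chosen for the example, not figures from any real program or standard.

```python
# Illustrative only: the volume and tolerances below are hypothetical assumptions.

def expected_errors(items: int, error_rate: float) -> float:
    """Expected number of wrong records or decisions at a given error rate."""
    return items * error_rate


def is_acceptable(error_rate: float, tolerance: float) -> bool:
    """A data set is fit for a decision only if its error rate is within
    the tolerance that the decision can bear."""
    return error_rate <= tolerance


ERROR_RATE = 0.01  # the 1% error rate discussed in the text

# Hypothetical tolerances per decision (assumed for illustration).
DECISION_TOLERANCE = {
    "pick a container for a random border check": 0.05,
    "accept a shipment as tree nut allergen-free": 0.0001,
}

if __name__ == "__main__":
    # At 1%, roughly 100 of every 10,000 records are expected to be wrong.
    print(expected_errors(10_000, ERROR_RATE))
    for decision, tolerance in DECISION_TOLERANCE.items():
        verdict = "ok" if is_acceptable(ERROR_RATE, tolerance) else "NOT ok"
        print(f"{decision}: {verdict}")
```

The same 1% figure passes one decision and fails the other, which is why the decision has to be defined before the data set is judged.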

The food industry has shown tremendous interest in traceability and product tracking and, more recently, in predictive analytics and the use of artificial intelligence (“A.I.”) or machine learning. Several published projects address data usage for food fraud prevention (some are listed in the references below).

BEFORE starting to analyze data sets, there are two key questions:

What problem are you managing?
What decision are you supporting?

What Problem are you Managing?

There is a saying that “general information generally helps, and specific information specifically helps.” There is more value in an assessment that specifically and directly supports a resource allocation.

From the Food Fraud Prevention textbook (2019): “The field of Decision Sciences emphasizes: (1) The need to be very specific in defining the question that is being asked and (2) To focus on the process or method of gathering information and supporting that final decision.”

‘Food fraud’ is a general problem. Melamine in whey protein powder procured from spot suppliers is a specific problem.

What Decision are you Supporting?

When making resource allocation decisions, it is important to drill down to enough data to make an actual decision. That decision addresses a specific problem and includes: countermeasures or control systems that can be explained (what they are and how they reduce a specific food fraud vulnerability); the confidence that the resource allocation will actually help; how to measure success; AND why this resource allocation is better than any other option.

In our article “Food Fraud Data Collection Needs Survey” (2019), we listed some uses of data sets (shown here from general to specific). All of these are legitimate uses, but each implies a very different project goal:

Overview of the problem—General
Overview of the problem—Detailed
Negative list/Black list (including “Early Warning System” for known concerns)
Food Fraud Vulnerability Assessment—Current state
Food Fraud Vulnerability Assessment—Fraud opportunity
Product fraud incident clustering (general review of data sets)
Ongoing suspicious activity scouting/Horizon scanning
Criminal prosecution of a supplier—use as evidence during an investigation or court case

How to Start – Assess the Problem

From the Food Fraud Prevention textbook (Spink, 2019):

“Assess the Vulnerability: In every – emphasis on ‘each and every time’ – a new incident or problem that is identified, it should be run through a systematic review that includes:

(1) A review of the suspicious activity (such as using the Food Fraud Suspicious Activity Report method (FFSAR)),
(2) Conduct a vulnerability assessment (including supply chain mapping to identify where and how the vulnerability exists) and
(3) Then plot the problem on a corporate risk map.”

Once the problem has been clearly defined and confirmed as a priority, then:

“(4) the supply chain mapping and corporate risk map can help support resource allocation decision making.”

Do You Have Enough Data for this Decision or Assessment?

At this point, you have enough information and clarity about your resource allocation decision to judge whether you have enough of the right data to answer your specific question.
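One way to put a number on "enough data" for rare events is a zero-failure sampling calculation. The sketch below is a standard statistical shortcut, not a method taken from the references cited here. It assumes independent, randomly drawn samples and a test that never misses a problem; both assumptions are optimistic when the adversary is actively avoiding detection.

```python
import math


def samples_needed(max_rate: float, confidence: float = 0.95) -> int:
    """Number of consecutive clean test results needed to claim, at the given
    confidence, that the true problem rate is below max_rate.

    Assumes independent random samples and a perfect test (illustrative).
    """
    # P(zero hits in n samples | rate = max_rate) = (1 - max_rate) ** n
    # Solve (1 - max_rate) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_rate))


def upper_bound_rate(clean_samples: int, confidence: float = 0.95) -> float:
    """Largest problem rate still consistent with zero hits in clean_samples."""
    return 1 - (1 - confidence) ** (1 / clean_samples)


if __name__ == "__main__":
    print(samples_needed(0.01))    # 299 clean results to support "below 1%"
    print(samples_needed(0.001))   # 2995 clean results to support "below 0.1%"
    print(round(upper_bound_rate(30), 3))  # 30 clean results: rate could still be ~9.5%
```

With only 30 clean results, a problem rate of nearly 10% cannot be ruled out, which illustrates how quickly a small data set runs out of power for infrequent events.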

Conducting this assessment will help you guide your I.T. and data suppliers to meet your needs more precisely.

Takeaway Points

There is often not enough of the right data to answer your specific question.
The food fraud opportunity is rapidly evolving and emerging because the human adversary is clandestine, stealthy, intelligent, resilient, innovative, often well-funded, actively seeking to avoid detection, and often very patient.
Data and information are very important and helpful, but we must be realistic about hype and about overreliance on what could be incomplete or uncertain data sets.

References (among others):

I am grateful and honored to work with a wide range of remarkable colleagues and co-authors on the related topics.

Spink, John, & Fejes, Zoltan (Lev) (2012). Review of Reports Quantifying the Economic Impact of Counterfeiting and Piracy, International Journal of Comparative and Applied Criminal Justice, Volume 36, Number 4, p. 1-23, URL: http://www.tandfonline.com/doi/full/10.1080/01924036.2012.726320
Spink, John, Elliott, Christopher, Dean, Moira, & Speier-Pero, Cheri (2019). Food Fraud Data Collection Needs Survey, NPJ Science of Food, 3(1), Pages 1-8, URL: https://www.nature.com/articles/s41538-019-0036-x
Frera, Massimo, Elahi, Selvarani, Woolfe, Mark, Crew, Sterling, & Spink, John (2021). Has COVID-19 caused a significant increase in observed food fraud incidents? Institute of Food Science + Technology, URL: https://ifst.onlinelibrary.wiley.com/pb-assets/assets/26891816/FAN%20COVID%2019%20food%20fraud%20article%20-%20final%20(1)-1608332369637.pdf
GAO, Government Accountability Office. (2010). INTELLECTUAL PROPERTY: Observations on Efforts to Quantify the Economic Effects of Counterfeit and Pirated Goods, URL: http://www.gao.gov/new.items/d10423.pdf