Trusted AI in Healthcare: Avoiding the Risks of Black Box AI Solutions
AI in Healthcare is Powerful but Skepticism Abounds
Artificial Intelligence (AI) is delivering tremendous value in Healthcare. For example, recent news from Google Health and DeepMind, reported in the journal Nature (“International evaluation of an AI system for breast cancer screening”), details how their AI solution can potentially spot cancer earlier and more accurately than radiologists in the United States and the United Kingdom. But skepticism abounds.
Promising as Google Health/DeepMind’s AI-based breast cancer detection solution is, there are significant “trust issues,” including a lack of transparency around study data and models, the black box nature of many AI solutions, potential biases, and a lack of explainability. More serious trust issues relate to model performance and the models’ ability to deliver proper diagnoses or outperform clinicians. Risks include unnecessary procedures, misdiagnosis, loss of human life, and lawsuits resulting from negligence.
The Need for Trusted AI Solutions
Spending on Artificial Intelligence (AI) is expected to more than double, from $35 billion in 2019 to $79 billion in 2022, according to IDC forecasts, reflecting the enormous potential societal benefits of AI. Yet broad adoption of AI systems will come not from the benefits alone but from the ability to trust these dynamically evolving digital systems. Trust is the foundation of all digital systems; without it, artificial intelligence and machine learning systems cannot deliver on their potential value.
Ultimately, Healthcare organizations and end users need to be able to answer key questions about these black box AI solutions, such as:
- How did the system predict what it predicted? (Explanation)
- Has the AI system or model been unfair to a particular group? (Fairness/Bias)
- How easily can the model be fooled? (Robustness)
- If a person got an unfavorable outcome from the model(s), what can they do to change that? (Counterfactuals)
- Can the solution provide relevant results for different business and IT stakeholders? (e.g. Clinician End Users, Risk Executives, Line of Business Owners, Data Scientists/IT)
How to Measure & Manage AI Trust
To trust an AI system, we must have confidence in its decisions. We need to know that a decision is reliable and fair, that it is transparent, and that it cannot be tampered with. However, most of the data and models behind automated decisioning today — ML algorithms, statistical models, and rules — are black boxes that often function in opaque, invisible ways for their developers as well as for consumers and regulators.
To help businesses take the first step toward building and maintaining a trustworthy and responsible AI solution, CognitiveScale has developed solutions that identify and score five different types of business risk:
- Bias & Fairness: Trusted AI systems ensure that the data and models in use are representative of the real world and that the AI models are free of algorithmic biases that skew decision making and lead to reasoning errors and unintended consequences. A Fairness Score can be obtained by comparing the burdens that the model imposes on different segments of the population, using the Gini index, a well-known measure of inequality. Segments can be based on ethnicity, gender, race, age, combinations of criteria, etc., and need to be specified by the user based on domain needs. The “burden” of a group represents the average amount of change required by members of that group to reach a favorable outcome, so lower numbers indicate more preferential treatment (see the first sketch after this list).
- Explainability: AI systems built using Trusted AI principles and software understand stakeholder concerns about decision interpretability and provide business process, algorithmic, and operational transparency so human users can understand and trust decisions. Explainability is also represented as a number between 0 and 100, indicating the typical complexity of a counterfactual explanation. Complexity is determined by the number of attributes participating in the explanation for each record in the test data; counterfactual explanations tend to be shorter for more explainable models. The score for an instance is a non-linear but monotonically decreasing function of the number of attributes that need to change: if only a single change (the minimum possible) is required, the score is 100, and if more than five attributes need to change, the score is 0 (see the second sketch after this list).
- Robustness: As with other technologies, cyber-attacks can penetrate and fool AI systems. Trusted AI systems provide the ability to detect and protect against adversarial attacks while understanding how data quality issues impact system performance. Robustness is a measure of how well a model retains a specific outcome given perturbations to the data, whether due to adversarial attacks or natural statistical variation. Represented as a number between 0 and 100, it is obtained using the Normalized Counterfactual Explanation-based Robustness (NCER) score. A higher number indicates that larger perturbations are needed, on average, to significantly change a decision, and hence a more robust model (see the third sketch after this list).
- Data Quality: Data is the fuel that powers AI. AI systems built using Trusted AI principles give users visibility into data drift and data poisoning while ensuring data validity and fit, along with the legal justification to use and process the data.
- Compliance: Trusted AI systems take a holistic approach to design, implementation, and governance, ensuring that AI systems operate within the boundaries of local, national, and industry regulation and are built and controlled in a compliant and auditable manner.
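To make the burden-based Fairness Score concrete, here is a minimal Python sketch. It assumes each record’s counterfactual “burden” (the amount of change needed to reach a favorable outcome) has already been computed; the function names and the final mapping from Gini index to a 0–100 score are illustrative assumptions, not CognitiveScale’s implementation.

```python
# Illustrative sketch only: one way to compute a burden-based Fairness Score.
# Assumes each record already has a counterfactual "burden" -- the amount of
# change needed to flip it to a favorable outcome.
from collections import defaultdict

def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    vals = sorted(values)
    n, total = len(vals), sum(vals)
    if n == 0 or total == 0:
        return 0.0
    # Standard formulation based on the rank-weighted sum of sorted values.
    weighted = sum((i + 1) * v for i, v in enumerate(vals))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n

def fairness_score(burdens, groups):
    """burdens[i]: change needed for record i to reach a favorable outcome.
    groups[i]: protected segment of record i (e.g. an age band or gender)."""
    per_group = defaultdict(list)
    for b, g in zip(burdens, groups):
        per_group[g].append(b)
    # Average burden per segment: lower means more preferential treatment.
    group_burdens = [sum(v) / len(v) for v in per_group.values()]
    # Hypothetical mapping: equal burdens (Gini = 0) yield a score of 100.
    return round((1.0 - gini(group_burdens)) * 100)

# Example: group B needs roughly twice the change group A does.
print(fairness_score([0.2, 0.3, 0.5, 0.6], ["A", "A", "B", "B"]))  # -> 81
```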
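The explainability score can be sketched in the same spirit, from the length of each record’s counterfactual explanation. The endpoints (a single attribute scores 100, more than five attributes score 0) come from the description above; the specific non-linear curve in between is an assumption chosen to satisfy those endpoints.

```python
# Illustrative sketch only: scoring explainability from counterfactual length.

def instance_explainability(num_attributes_changed):
    """Score one record from the size of its counterfactual explanation."""
    k = num_attributes_changed
    if k <= 1:
        return 100.0  # minimal possible explanation
    if k > 5:
        return 0.0    # explanation too complex to be useful
    # Hypothetical non-linear, monotonically decreasing interpolation.
    return 100.0 * (1.0 - (k - 1) / 5.0) ** 2

def explainability_score(counterfactual_lengths):
    """Aggregate over the test set: mean of per-record scores (0-100)."""
    scores = [instance_explainability(k) for k in counterfactual_lengths]
    return sum(scores) / len(scores)

print(explainability_score([1, 2, 3, 6]))  # (100 + 64 + 36 + 0) / 4 = 50.0
```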
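Finally, a rough sketch of an NCER-style robustness estimate. The NCER score is described above only at a high level, so the code below approximates the minimal perturbation per record by the normalized distance to a pre-computed counterfactual; the mapping onto a 0–100 scale against an expected upper bound is an assumption.

```python
# Illustrative sketch only: an NCER-style robustness estimate. Approximates
# the minimal perturbation per record by the (pre-computed) normalized
# distance to its nearest counterfactual.
import numpy as np

def ncer_robustness(X, X_cf, feature_ranges, max_expected=0.5):
    """X: test records; X_cf: their minimal counterfactuals (same shape).
    feature_ranges: per-feature (max - min), used to normalize perturbations."""
    # Normalized distance from each record to its counterfactual: the
    # smallest relative perturbation that flips the model's decision.
    deltas = np.abs(X - X_cf) / feature_ranges
    per_record = np.linalg.norm(deltas, axis=1)
    # Larger average perturbation needed => harder to flip => more robust.
    mean_perturbation = per_record.mean()
    # Hypothetical mapping onto 0-100 against an expected upper bound.
    return float(np.clip(mean_perturbation / max_expected, 0.0, 1.0) * 100)

# Toy example: two records whose counterfactuals sit a small distance away.
X = np.array([[0.2, 0.4], [0.6, 0.1]])
X_cf = np.array([[0.3, 0.4], [0.6, 0.3]])
ranges = np.array([1.0, 1.0])
print(ncer_robustness(X, X_cf, ranges))  # -> 30.0
```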
Summary: Only Trusted AI Will Deliver Value
According to a recent survey by Dimensional Research, nearly eight out of 10 enterprise organizations currently engaged in AI and ML report that projects have stalled due to issues of data quality and model confidence. Trust in AI solutions will become a serious issue as more clinical AI applications try to move from “lab to live,” or development to production.
Trusted AI solutions will help accelerate time to value from AI by removing the black box barrier and driving AI to generate clinical and business value that, by PwC’s projection, is expected to cross $13 trillion globally by 2030. There is certainly tremendous value for AI in cancer detection and treatment. Only trusted AI solutions that address bias, fairness, explainability, robustness, data quality, and compliance will be able to optimize that value.