Sunday, June 30, 2019

Interpretable AI: Reasoning about Why (...and why a solution is the right solution)


In the Sherlock Holmes novels, Conan Doyle’s hero is said to use his deductive power to infer by whom and how a crime was committed. He gathers the facts and then proceeds to deduce their logical conclusion. Ideally, given rules [A→B, B→C, …Y→Z] and fact (antecedent) A, Z can be deduced by applying the rules transitively. But in each of his cases there are gaps, not just in facts, but in available explanations. He therefore has to propose new explanations, since much of the crime occurred without witnesses. He applies abductive rather than deductive reasoning to infer, or abduce, the cause and explanation for a given set of resulting facts.



Sherlock’s favorite phrase is “Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth”. This is about finding explanations for the result B, beginning with an open set of antecedents {A1, A2, A3, …}; it is not simple deduction from A to B to C. And if all explanations but one are found impossible (possibly by deduction, possibly by other means), then the remaining one must be the answer. But what if the case isn’t so discrete? What if your elimination reduces to a set of several solutions, not just one (e.g., different but overlapping genotypes)? Then you must find the most likely explanation using some other means, like posterior (Bayesian) probabilities. This is precisely what abductive reasoning is all about: finding the best explanation, or set of best explanations, rather than deducing the exact correct solution!
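
A minimal sketch of this elimination-then-ranking idea in Python; the explanation names, priors, and likelihoods below are made up purely for illustration:

# Sketch: eliminate impossible explanations, then rank the survivors by posterior probability.
# All numbers are hypothetical.

# P(explanation) — prior belief in each candidate explanation
priors = {"E1": 0.50, "E2": 0.30, "E3": 0.15, "E4": 0.05}

# P(evidence | explanation) — zero means the evidence rules that explanation out
likelihoods = {"E1": 0.0, "E2": 0.6, "E3": 0.4, "E4": 0.0}

# Unnormalized posteriors: P(E) * P(evidence | E)
joint = {e: priors[e] * likelihoods[e] for e in priors}

# "Eliminate the impossible": drop candidates with zero posterior mass
survivors = {e: p for e, p in joint.items() if p > 0}

# Normalize over the surviving candidates and rank them
total = sum(survivors.values())
posteriors = {e: p / total for e, p in survivors.items()}
for e, p in sorted(posteriors.items(), key=lambda kv: -kv[1]):
    print(f"{e}: posterior = {p:.2f}")

When more than one explanation survives elimination, the ranking by posterior is the output of the abduction, not a single deduced answer.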


Truth table for A→B:

A      B      A→B
True   True   True
True   False  False
False  True   True
False  False  True

Let’s break this down further. Logically, deduction is sound if the implications (rules) used are all sound. For the implication A→B, modus ponens states exactly: if (A→B) & A, then B. That is, given the rule and knowing the antecedent A is true, B must be true if everything is sound. However, the converse inference (if (A→B) & B, then A) is not necessarily true (see the truth table: B can be true while A is false). On the other hand, predicting A from B is also not necessarily false; it simply isn’t always sound, and is therefore referred to as the fallacy of the converse. Yet when this reasoning is carried out over the complete set of possible explanations, ranked by which is most probable, it can be used to infer plausible explanations. This is at the heart of abductive reasoning, which has been cited by many [] as what scientists frequently apply. Researchers using Bayesian inference to propose explanations or mechanisms are indeed applying abduction.
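
To make the asymmetry concrete, here is a small brute-force check in Python (nothing specific to the examples above; it just tests both inference patterns against the truth table):

from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication A→B: false only when A is true and B is false."""
    return (not a) or b

# Modus ponens: in every row where (A→B) and A hold, B must hold.
modus_ponens_sound = all(
    b for a, b in product([True, False], repeat=2) if implies(a, b) and a
)

# Converse ("affirming the consequent"): does (A→B) and B force A?
converse_sound = all(
    a for a, b in product([True, False], repeat=2) if implies(a, b) and b
)

print(modus_ponens_sound)  # True  — deduction is sound
print(converse_sound)      # False — inferring A from B is not guaranteed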
Abductive reasoning sets a high bar for tools that support it, since it is not only a matter of being able to proffer a few different explanations for a phenomenon B, but of having sufficient coverage of all possible explanations so that the best ones can be ranked (using posterior probabilities), which often requires knowing the sum of (almost) all probabilities. I recommend this as the high bar for what we have been calling knowledge bases. One can argue that practical knowledge should include verifiable explanations, or the ability to find such explanations, for evidence-based discovery. The guidance for this should be openly discussed and agreed on soon, given the large number of recent knowledge graph/base offerings, some of which may not meet this requirement. Knowledge systems should serve both human queries and machine-driven interrogation and inference. Currently there are no well-defined objectives for their use, making recommendations and selection by enterprises and institutions very ambiguous.
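
One way to make the coverage requirement concrete is a sketch like the following, where candidate explanations retrieved from a knowledge base are ranked by posterior only after checking how much prior mass they account for; the function, structure, and numbers are hypothetical, not any particular system’s API:

def rank_with_coverage(priors, likelihoods, coverage_threshold=0.95):
    """priors: {explanation: P(explanation)}; likelihoods: {explanation: P(evidence | explanation)}.

    Rank candidate explanations by posterior, but warn if the candidate set
    covers too little prior mass for the ranking to be trusted.
    """
    covered = sum(priors.values())
    if covered < coverage_threshold:
        print(f"Warning: candidates cover only {covered:.0%} of prior mass; "
              "the best explanation may be missing from the knowledge base.")
    joint = {c: priors[c] * likelihoods.get(c, 0.0) for c in priors}
    total = sum(joint.values()) or 1.0
    return sorted(((c, p / total) for c, p in joint.items()), key=lambda cp: -cp[1])

# Hypothetical example: three candidate mechanisms retrieved from a knowledge base
print(rank_with_coverage({"M1": 0.4, "M2": 0.3, "M3": 0.1},
                         {"M1": 0.2, "M2": 0.7, "M3": 0.5}))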
In addition, the AI/ML community needs to address the relevance and benefits of using such knowledge resources, and whether further alignment (e.g., APIs) is needed. Specifically, knowledge systems could be used to address the interpretability of AI solutions, injecting both context and non-technical access to their overall benefits. So far, the AI community often views knowledge as something an AI system finds but is not itself required to take advantage of, while those developing knowledge graphs view their semantic forms as their own interpretation of AI and feed learning systems transformed data from within the graph. Both views are unproductive; the real benefits will emerge from considering how the two technologies can be more fundamentally integrated.


Examples of logical inference

Deduction
  1. All oncogenes have the potential of becoming mutated and driving oncogenesis.
  2. Gene W is an oncogene.
∴ Gene W can cause cancer if mutated.


Abduction
  1. All oncogenes have the potential of becoming mutated and driving oncogenesis.
  2. Gene Y is observed to be mutated in a tumor.
∴ Gene Y is an oncogene.
→ NO! Counter-example: an altered Gene Y can also affect oncogenesis if it is a tumor suppressor. It could even be passive and simply incidental.
→ However, if one continues to see an association of Y mutants in similar classes of tumors, the mutations appear to be gain-of-function, and they are rarer in other cases, then the proposition may be verified.


Induction
  1. Genes W, X, Y, Z are RTKs.
  2. Gene Y is observed to be mutated in 20% of lung carcinomas.
∴ RTKs can drive oncogenesis in the lung when mutated.
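
A toy sketch contrasting the deduction and abduction patterns above; the gene names and the single rule are placeholders taken from these examples, not real annotations:

# Toy knowledge: the single rule "oncogene(X) → can_drive_cancer_when_mutated(X)"
oncogenes = {"W", "X"}  # placeholder set of known oncogenes

def deduce_can_drive_cancer(gene):
    """Deduction (modus ponens): a known oncogene satisfies the rule's antecedent,
    so the consequent follows necessarily."""
    return gene in oncogenes

def abduce_explanations(gene):
    """Abduction: enumerate hypotheses that could explain observing this gene
    mutated in a tumor; none is proven, they are candidates to be ranked."""
    return [
        f"{gene} is an oncogene (gain-of-function mutation)",
        f"{gene} is a tumor suppressor (loss-of-function mutation)",
        f"the {gene} mutation is a passenger and simply incidental",
    ]

print(deduce_can_drive_cancer("W"))  # True: follows necessarily from the rule
print(abduce_explanations("Y"))      # competing explanations, not a conclusion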