If we let the unconscious cognitive processes constitute a filter, we find that it shapes not only how the brain collects data, but also which hypotheses it directs evidence acquisition toward. That is, even when we are aware of a whole set of hypotheses, we tend to gather information about only one of them. In the literature, this is known as “confirmation bias”. It has been explained in various ways, for example that we feel good about being right and that we value consistency, but it is perhaps most elegantly accounted for with reference to Bayes.
According to the German psychologist Gerd Gigerenzer, scientific theories tend to be metaphors of the tools used to discover and justify them (the “tools-to-theories heuristic”): the tools change the very questions we ask. For example, the cognitive revolution was fueled by the advent of computers and statistical techniques, which soon became theories of mind. Thus, the brain is often thought of as a homunculus statistician that collects data and plugs them iteratively into Bayes’ theorem. And as already stated, the brain cannot collect all kinds of data. A completely Bayesian mind is impossible, because the number of computations required for optimization (i.e. finding the best hypothesis) grows exponentially with the number of variables.
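To get a feel for the scaling, here is a back-of-the-envelope count; framing the problem as deciding, for each of n binary variables, whether it is relevant or not is purely an illustrative assumption:

```python
# Back-of-the-envelope: if each of n binary variables can independently be
# relevant or irrelevant to a hypothesis, an exhaustive search for the best
# hypothesis has to score 2**n candidates against the data.
for n in (10, 20, 40, 80):
    print(f"{n:>2} variables -> {2**n:,} candidate hypotheses to score")
```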
The brain must therefore allocate computational resources dynamically, based on how promising a hypothesis seems. By limiting the sample size (through the bounds of working memory and the attentional window), we become more likely to detect correlations, because in small samples contingencies tend to appear exaggerated. Presumably, these benefits outweigh the dangers of confirmation bias.
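A small simulation sketches why; the true correlation of 0.3, the 0.5 detection threshold and the particular sample sizes are illustrative assumptions rather than figures from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_RHO = 0.3                                  # a modest "real" contingency
COV = [[1.0, TRUE_RHO], [TRUE_RHO, 1.0]]

def exaggeration_rate(n, trials=10_000, threshold=0.5):
    """How often a sample of size n shows a correlation well above the true one."""
    hits = 0
    for _ in range(trials):
        x, y = rng.multivariate_normal([0.0, 0.0], COV, size=n).T
        if np.corrcoef(x, y)[0, 1] > threshold:
            hits += 1
    return hits / trials

for n in (7, 20, 100):                          # 7 is roughly working-memory-sized
    print(f"n = {n:>3}: observed r > 0.5 in {exaggeration_rate(n):.0%} of samples")
```

The small samples do not change the underlying contingency; they simply make strong-looking correlations far more common, which is what makes them easier to detect.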
In Bayesian terms, confirmation bias would mean that we fail to take the likelihood ratio P(D|H1)/P(D|H2) into account, particularly when H1 and H2 are each other’s negations. Because our evidence acquisition is partial, we over-weight the prior (our preconceptions) for the hypothesis we favor a priori, and over-weight the likelihood for the hypothesis we initially disbelieve, making us too conservative about the former and too eager about the latter in our belief revisions.
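Written in odds form, Bayes’ theorem makes the role of this ratio explicit: P(H1|D)/P(H2|D) = [P(D|H1)/P(D|H2)] × [P(H1)/P(H2)], that is, posterior odds equal the likelihood ratio times the prior odds. Ignoring the first factor on the right-hand side leaves the update to be carried by the prior odds alone, which is exactly the pattern of partial revision described above.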
Because this tendency persists over time, the favored hypothesis is positively reinforced, leading to what Gestalt psychologists called “fixedness” and “mental set”: the inability to think “outside the box”, which has been implicated in many clinical conditions, from depression to paranoia. Iterated Bayesian inference is like a self-modifying filter in which our belief in a hypothesis is continually revised in light of incoming data. The posterior probabilities correspond to the widths of the filter’s meshes: the coarser the mesh, the more receptive we are to the hypothesis, and the greater its impact on future interpretations.
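As a toy illustration of how such a filter can become self-reinforcing, consider the following sketch. The coin scenario, the particular down-weighting rule (an exponent on the likelihoods that shrinks as confidence grows) and all parameter values are assumptions made for the sake of the example, not a model proposed here:

```python
# Two hypotheses about a coin: H1 says "biased 0.8 toward heads", H2 says "fair".
P_HEADS = {"H1": 0.8, "H2": 0.5}

def update(p_h1, heads, weight=1.0):
    """One step of Bayes' rule. A weight below 1 is applied, as an exponent, to
    the likelihoods whenever the data speak against H1, shrinking their impact;
    weight = 1.0 recovers the ordinary, even-handed update."""
    l1 = P_HEADS["H1"] if heads else 1 - P_HEADS["H1"]
    l2 = P_HEADS["H2"] if heads else 1 - P_HEADS["H2"]
    if l1 < l2:                                  # disconfirming evidence for H1
        l1, l2 = l1 ** weight, l2 ** weight      # compress the likelihood ratio
    return p_h1 * l1 / (p_h1 * l1 + (1 - p_h1) * l2)

# A perfectly balanced stream of heads and tails: what a fair coin (H2) delivers on average.
data = [True, False] * 100

p_even_handed = p_biased = 0.7                   # both agents start out favoring H1
for heads in data:
    p_even_handed = update(p_even_handed, heads)
    # The biased agent's "mesh" for counter-evidence narrows as its confidence grows:
    p_biased = update(p_biased, heads, weight=1 - p_biased)

print(f"even-handed agent: P(H1) = {p_even_handed:.3f}")   # driven toward 0 by the data
print(f"biased agent:      P(H1) = {p_biased:.3f}")        # driven toward 1: self-reinforcing
```

The even-handed agent is eventually corrected by the balanced data, while the agent that filters counter-evidence in proportion to its confidence talks itself into near-certainty about the wrong hypothesis.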
Our biased evidence collection is most evident in the type of experiment pioneered by Peter Wason in 1960. Subjects are shown a triplet of numbers, such as “2, 4, 6”, that conforms to a rule the experimenter has in mind, and they try to infer that rule by proposing further triplets and being told whether each one conforms. What subjects tend to do is fixate on a hypothesis such as “numbers increasing by two” and then look only for confirmatory evidence, asking about, for example, “4, 6, 8”. As a consequence, they never discover a more general rule of which their confirmatory triplets are merely a subset, such as “any even numbers” or “numbers in increasing order”.
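The logic of the task can be made concrete with a small sketch; the particular hidden rule, hypothesis and probe triplets below are illustrative stand-ins rather than Wason’s exact protocol:

```python
def hidden_rule(t):
    """The experimenter's rule; here assumed to be: any strictly increasing triple."""
    a, b, c = t
    return a < b < c

def working_hypothesis(t):
    """The subject's hypothesis: consecutive numbers increasing by two."""
    a, b, c = t
    return b - a == 2 and c - b == 2

# A confirmatory strategy probes only triplets the hypothesis already predicts.
for probe in [(4, 6, 8), (10, 12, 14), (20, 22, 24)]:
    answer = hidden_rule(probe)        # the experimenter's verdict
    print(probe, "->", "yes" if answer else "no", "| predicted:", working_hypothesis(probe))
# Every answer is "yes" and every "yes" was predicted, so the subject sees no
# reason to revise, even though the hidden rule is far more general.
```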
By analogy, if you want to know whether a person is still alive, you immediately check the pulse rather than, say, whether the eyelids are closed, because the pulse is the better differentiator. We want evidence that surprises us, which in information-theoretic terms has high self-information. It would have been more rational, and more efficient, if subjects instead asked about triplets that distinguish the working hypothesis from the more general candidates, such as “4, 5, 7”, in order to, as Richard Feynman put it, “prove yourself wrong as quickly as possible”. Instead, we are like the drunkard looking for our keys under the streetlight, because that is where we can see.
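The same point can be cast in terms of expected information. A minimal calculation, which assumes the subject is split 50/50 between the narrow hypothesis and the broader rule from the sketch above, and that each rule answers a probe deterministically, shows why the confirmatory probe is worth nothing and the discriminating one a full bit:

```python
from math import log2

def binary_entropy(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def expected_information_gain(p_h1, h1_says_yes, h2_says_yes):
    """Expected reduction (in bits) of our uncertainty about which rule holds,
    when each candidate rule answers the probe deterministically."""
    if h1_says_yes == h2_says_yes:
        return 0.0                 # both rules agree, so the answer teaches nothing
    return binary_entropy(p_h1)    # the answer singles out one rule

# H1: "increasing by two", H2: "any increasing numbers", belief split 50/50.
print(expected_information_gain(0.5, h1_says_yes=True, h2_says_yes=True))    # probing (4, 6, 8): 0.0 bits
print(expected_information_gain(0.5, h1_says_yes=False, h2_says_yes=True))   # probing (4, 5, 7): 1.0 bits
```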