Lest we forget, you too are a system pervaded by hierarchies. Like all other systems, the human observer has a surface that defines discontinuities in information flow, and can be parsed into input channels, output channels, and an internal system mediating signals between the two – for example by attenuating, delaying, and averaging them through internal surfaces. Low-frequency entities operate with “long-term windows”, on which recent inputs are averaged out against pre-existing information, while high-frequency entities are responsive to short-term happenings. A fly, for example, has a much shorter time-lag filter for integrating experience than the human it sits on – it sees the lamp light as discontinuous flickering, while human sensations are much more averaged over time. Analogously, a judge has a slower integrating filter than a policeman, allowing him to see broader patterns.
Observation can be regarded as the interface sliding between material and perceptual contributions. Two filters are therefore operative: that of information flowing from the observed (leading to statistical patterns in, for example, the retinal optic array), and that of the observer’s input channels (e.g. human attention). As with all filters, perception involves a tradeoff between “grain” (the threshold between the recorded and the unrecorded – blurring an image coarsens the grain) and “extent” (the threshold between the recorded and the undifferentiated background – zooming in narrows the extent). To capture a pattern (a system surface, i.e. a hierarchical level), measurement must be of appropriate width. Global warming, for example, will not appear in hourly measurements, nor in data averaged over a billion years. By analogy, if you fish with too fine-meshed a net, you will haul in a lot of undesired sea contents along with the fish, but if the mesh is too coarse, you will capture nothing. Observation therefore always has an a priori, criterion-driven component.
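The grain/extent tradeoff can be made concrete with a small sketch. A slow trend hidden in fast fluctuations only stands out when the averaging window is neither too short (noise dominates) nor as long as the record itself (too few points remain). The signal, noise level, and window sizes below are invented for illustration:

```python
import random

random.seed(0)

# Synthetic record: a slow upward trend buried in fast fluctuations.
n = 10_000
series = [0.0005 * t + random.gauss(0, 1.0) for t in range(n)]

def smoothed_range(data, window):
    """Average the series in non-overlapping windows (the 'grain'),
    then report the spread of the window means and how many remain."""
    means = [sum(data[i:i + window]) / window
             for i in range(0, len(data) - window + 1, window)]
    return max(means) - min(means), len(means)

for w in (1, 10, 500, 5000):
    spread, k = smoothed_range(series, w)
    print(f"window={w:5d}  points={k:5d}  spread={spread:.2f}")
```

At the finest grain the spread is dominated by the fluctuations; at an intermediate grain it shrinks toward the size of the underlying trend, which the averaging has made visible; at the coarsest grain only a couple of points survive. The appropriate width lies between the extremes, just as with the hourly versus billion-year measurements above.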
How does the perceptual system select an appropriate grain and extent to differentiate system surfaces – the exploitable patterns buried amidst the stochastic flux? It is a problem that should remind you of the tradeoff in Holland’s “multiplier effect”, and how, by the inevitable accumulation of error, natural selection causes a genome to be tenuously poised at an optimum between a state of over-generality and over-specificity. An analogous principle must reasonably operate in the perceptual system, to carve the world into categories abstract enough to be useful without being vacuous, neither too fine-grained nor too coarse-grained. How best can we characterize this hierarchical mechanism?
The cognitive scientist Douglas Hofstadter, in the book “Fluid Concepts and Creative Analogies” (1995), has simulated this computationally in an artificial-intelligence program able to flexibly draw analogies between strings of letters. An analogy between two strings can take an infinite number of shapes: the analogy between ABC and FGH is quite straightforward, whereas the analogy between BAC and GFH is more subtle. Both, however, make sense only with reference to alphabetical consecutiveness (while the analogy between HHH and PPP does not). We may regard “alphabetical consecutiveness” as one of many concepts, or paths in a phase space of possible connections to explore; BAC/GFH lies deeper down this path than ABC/FGH, and thus requires more computation. Analogy-making thus embodies the central goal of perception: inducing patterns from the input stream of data that allow us to predict our environment.
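A toy measure can illustrate why BAC lies “deeper” than ABC. The function below and its depth scores are invented here, not taken from Hofstadter’s actual program: recognizing BAC as a consecutive run requires one extra operation, a reordering, before the successor relation becomes visible.

```python
def consec_depth(s):
    """Toy depth score: how much work is needed to see s as a run of
    alphabetically consecutive letters. 1 = consecutive as given,
    2 = consecutive only after reordering, None = the concept of
    consecutiveness never applies."""
    idx = [ord(c) - ord('A') for c in s]
    if all(y - x == 1 for x, y in zip(idx, idx[1:])):
        return 1                      # ABC: successors left to right
    srt = sorted(idx)
    if all(y - x == 1 for x, y in zip(srt, srt[1:])):
        return 2                      # BAC: consecutive once reordered
    return None                       # HHH: no consecutive run at all

for s in ("ABC", "FGH", "BAC", "GFH", "HHH", "PPP"):
    print(s, consec_depth(s))
```

On this toy score ABC and FGH come out at depth 1, BAC and GFH at depth 2, and HHH and PPP fall outside the concept entirely – mirroring the ordering of the three analogies in the text.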
Hofstadter solves this with a mechanism he calls the “parallel terraced scan”. First, the whole space of potential pathways is explored randomly and unfocusedly – that is, cheaply and noncommittally. As probes accumulate, this information is used to assess how promising each pathway seems, and resources are allocated in proportion, so that successive stages are increasingly focused and computationally expensive, making the scan “terraced” (or hierarchical). At no point, however, does the system neglect exploring other possibilities: the path chosen can, after all, turn out to be a dead end. If any of the other paths is found sufficiently promising, it will compete with the current viewpoint, and may ultimately override the positive feedback of the first. Hofstadter uses the metaphor of an ant colony: scout ants make random forays into the forest, reporting to the chief ant how strong the scent of food is, so the chief allocates more scouts in that direction but makes sure that some scouts continue to wander around unconcerned, should the path prove fruitless.
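The staged allocation can be sketched in a few lines. This is a minimal caricature, not Copycat itself: the three pathways, their hidden promise values, the noise level, and the budget schedule are all invented for illustration.

```python
import random

random.seed(1)

# Hidden promise of each candidate pathway; the scan must discover
# these from cheap, noisy probes.
true_promise = {"A": 0.2, "B": 0.5, "C": 0.9}

est = {p: 0.0 for p in true_promise}   # running mean of probe results
n = {p: 0 for p in true_promise}       # probes spent per pathway

def probe(path):
    """One cheap, noisy look down a pathway."""
    return true_promise[path] + random.gauss(0, 0.3)

for stage in range(5):
    budget = 10 * (stage + 1)          # successive stages cost more
    reserve = max(2, budget // 5)      # scouts that keep wandering
    best = max(est, key=est.get)
    # Simplification: all focused probes go to the single current best
    # pathway; Copycat spreads them in proportion to estimated promise.
    plan = [best] * (budget - reserve) + \
           [random.choice(list(true_promise)) for _ in range(reserve)]
    for path in plan:
        n[path] += 1
        est[path] += (probe(path) - est[path]) / n[path]  # incremental mean
    print(f"stage {stage}: " +
          ", ".join(f"{p}={est[p]:.2f}" for p in true_promise))

print("settled on:", max(est, key=est.get))
```

With enough probes the scan typically settles on the most promising pathway, while the reserved random scouts guarantee that an initially overlooked path can still compete with, and override, the current favorite.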
Simon, incidentally, describes human problem-solving in a similar manner. Tackling that last tricky question in the exam booklet is a matter of discovering a sequence of transformations that makes transparent the path from what you have to the goal state. While this involves trial and error and random exploration, the search is selective, based on some rule of thumb for how promising a strategy seems upon evaluation – an analytical resting-point, or an intermediate subassembly in watchmaker terminology.
As we have seen, hierarchy theory illuminates many aspects of human cognition. But it was also promised at the outset that hierarchy theory has important implications for education. The next section will argue that explanation is necessarily hierarchical in nature.