Some scientific fields, including the life and social sciences, make use of surrogate measurements or surrogate test environments to make sweeping assertions and incremental steps forward. Easily measured phenomena are used in place of more representative ones to indicate a target event. This type of analysis exists within a larger class of fallacious, inductive approaches to experimentation. In less admirable situations, these may be wielded to push a pet theory or wrangle unwarranted funding or clout. In even more extreme cases, such approaches are taken to accelerate drugs through clinical trials, with less than desirable outcomes.
An example of a surrogate observation: I see smoke coming through a chimney; therefore, someone is cooking. Then again, the house may be burning down. Then again, the dryer exhaust may be ported to the chimney. Then again, the chimney in question may merely silhouette an adjacent house’s chimney that is emitting the smoke. You see the issue.
Note, I do not completely condemn such observations in discourse or private research, as they can be useful and informative in a pinch, particularly in bleeding-edge research. Although, even in those contexts, the shortcomings should be actively conceded, and that concession sustained. Further, it can be argued that every measurement is a surrogate or inductive measurement, but that is a different discussion and is not productive here. Their use is mostly problematic in public, potentially massively influential channels, or when those wielding them stand to reap substantial financial gains, as it might be easier to ignore poor experimental design in such situations. I also do not mean to target one field or entity for being more or less cavalier with such measurements; I am addressing it more generally as a pervasive phenomenon in human thought that is more detrimental in some domains than in others.
While many erroneous conclusions drawn from surrogate observations are harmless, unchecked use in science – particularly in clinical/pre-clinical domains – can have disastrous outcomes[1]. Irresponsible and frankly lazy invocation[2] of such tools can be found in the annals of sociology and psychology. This type of analysis can take many subtle forms with varying degrees of validity, some more warranted than others. Of all technical disciplines, sociology seems to be one of the heaviest practitioners of surrogate measurements; further, they are commonly used in combination with plausibility, dogma, and narrative in place of logical or evidence-based formulations.
Even big, dumb, unverifiable concepts like the “law of least mental effort” are common in the associated literature; what could such a law mean, given our impoverished conception of the human brain? These seemingly colloquial concepts are first engendered through intuition, accepted in key circles, then “confirmed” in biased, black-box studies that are indistinguishable from rudimentary polls. Some are not even formally “confirmed”; nonetheless, they permeate the field and become a priori truth, perpetuated through incestuous, referential affirmation in the literature. I present this as empirical information, firsthand and secondhand. One example is comprehensively demonstrative:
“The ‘law of least mental effort’ clearly has intuitive appeal, in part from the strong analogical relationship between mental and physical effort (for discussion, see Eisenberger, 1992). It also makes sense from a normative perspective, since a bias against mental effort would steer cognition toward more efficient tasks (see Botvinick, 2007), and might preserve limited cognitive resources (see Muraven & Baumeister, 2000). Remarkably however, despite its widespread application, the ‘law of least mental effort’ appears never to have been subjected to a direct experimental test.”[2]
Over time, surrogate measurements become gold standard and dogma, with a frequency that increases with the abstractness of the field. Psychology bad, sociology worse. The caveats that previously accompanied them fade. Some or all experts in the field will undoubtedly understand the shortcomings of the approach; however, this is of no concern. The public message and impact will have been administered: in some cases shifting public sentiment and policy, in others directly damaging individuals’ health. The course of the field is affected regardless, for reasons of precedent and intrigue. If nothing else, such outcomes serve as a distraction from rigorously uncovered truth.
These types of measurements have no place at the table with the scientific method, at least not in its ideal form (although this form may be rare). Solutions exist; simple statistical approaches can be used to determine how closely the observed variable represents the target event, what sample size is needed, and so on. However, this can only get a researcher to “these data are meaningful”, and no further. It does not afford the researcher a free-form narrative describing causation. Even such statistical approaches are uncommon in studies, replaced instead by norms such as carrying out experiments in triplicate rather than at representative sample sizes. These practices become more apparent in hard problems such as human consciousness and behavior, where metaphysics and storytelling operate under the guise of science.
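As a rough illustration of what such “simple statistical approaches” could look like, the sketch below checks how strongly a surrogate tracks its target on a validation sample and estimates the sample size needed to detect that relationship at all. It is a minimal, hypothetical example: the function names, thresholds, and the assumption of a linear (Pearson) relationship are mine, not a prescription.

```python
# Hypothetical sanity check for a surrogate measurement, assuming a validation
# sample exists in which both the surrogate (z) and the target (x) were measured.
import numpy as np
from scipy import stats

def surrogate_strength(z, x):
    """Pearson correlation between surrogate and target, with an approximate
    95% confidence interval via the Fisher z-transformation."""
    r, p = stats.pearsonr(z, x)
    se = 1.0 / np.sqrt(len(z) - 3)
    lo = np.tanh(np.arctanh(r) - 1.96 * se)
    hi = np.tanh(np.arctanh(r) + 1.96 * se)
    return r, p, (lo, hi)

def required_n(r_expected, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation of r_expected
    with a two-sided test, again via the Fisher z approximation."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return int(np.ceil(((z_a + z_b) / np.arctanh(r_expected)) ** 2 + 3))

# A surrogate explaining only ~25% of the target's variance (r ~ 0.5) already
# needs about 30 paired observations just to show that it correlates at all.
print(required_n(0.5))
```

Nothing here is exotic; the point is only that the question “how well does Z stand in for X, and at what sample size?” is answerable before the surrogate is put to work, and answering it still licenses nothing beyond “these data are meaningful”.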
Acknowledging surrogate observations as an issue is not mere quibbling propped up by scarce anecdotes; rather, it is a more fundamental issue of reasoning and thought, crudely derivable without examples, if you will afford me the time.
There exists a sleight of hand, a subtle logical slip, in the application of surrogate observations. Let’s entertain the following problem:
- I would like to know “X”
- “Y” unequivocally informs me of “X”
- “Y” is not conveniently observable for reasons of cost, time, or shortcomings of current technologies
- “Z” is observable
- “Z” has been shown to correlate/covary with “X”
- “Z” is a convenient measurement
CONCLUSION: We will measure “Z” to get information about “X”.
This decision is almost presupposed, obvious, and irrefutable in some situations. After “Z” is chosen by me, a precedent is set, and others can now freely observe “Z” without much thought or ridicule. All of this because measuring “Y” is impractical and, as it were, not productive. “Y” disappears from the set of options for characterizing “X”. Over time, within the field, “Z” becomes “X”.
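To make the slip concrete, here is a toy simulation, entirely of my own construction and not drawn from any study cited here: a surrogate Z that tracks the target X only because both share a common driver. Change the conditions under which Z is generated, and the correlation that justified measuring Z in the first place disappears, while anyone who has quietly let Z become X is none the wiser.

```python
# Toy simulation of a surrogate that "works" only while a shared driver holds.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Regime 1: a latent factor drives both the target X and the surrogate Z.
latent = rng.normal(size=n)
x = latent + rng.normal(scale=0.5, size=n)   # the thing we actually care about
z = latent + rng.normal(scale=0.5, size=n)   # the thing that is easy to measure
print("regime 1, corr(Z, X):", round(np.corrcoef(z, x)[0, 1], 2))   # ~0.8

# Regime 2: a new population, instrument, or intervention now moves Z
# independently of the latent factor; X itself is unchanged.
z_new = rng.normal(size=n)
print("regime 2, corr(Z, X):", round(np.corrcoef(z_new, x)[0, 1], 2))   # ~0.0
```

The numbers are invented; the structure of the failure is not.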
With this thought in mind, a cursory survey of vaguely technical fields may frighten you.
My recommendation: in the face of sparse data, impoverished models, or poor measurement technologies, make no assertions or musings. Acknowledge their shortcomings and develop better ones. To do anything different is pure mysticism.
And you thought this was a godless era.
1. Echt DS, Liebson PR, Mitchell LB, Peters RW, Obias-Manno D, Barker AH, et al. Mortality and morbidity in patients receiving encainide, flecainide, or placebo: The Cardiac Arrhythmia Suppression Trial. N Engl J Med. 1991;324:781–788.
2. Kool W, McGuire JT, Rosen ZB, Botvinick MM. Decision making and the avoidance of cognitive demand. J Exp Psychol Gen. 2010;139(4):665–682.