The antidote to misuse of mathematics and junk data

Depending on whom you ask, mathematics is perceived as anything from an esoteric discipline with little relevance to everyday life to a collection of magical rituals and tools that shape the operations of human cultures. In an age of exponentially increasing data volumes, public perception has shifted towards the latter view.

On the one hand, it is nice to see a greater appreciation for the role of mathematics. On the other hand, the growing use of mathematical techniques has led to a set of cognitive blind spots in human society:

Blind use of mathematical formalisms – magical rituals
Blind use of second hand data – unvalidated inputs
Blind use of implicit assumptions – unvalidated assumptions
Blind use of second hand algorithms – unvalidated software
Blind use of terminology – implicit semantic integration
Blind use of numbers – numbers with no sanity checks

Construction of formal models is no longer the exclusive domain of mathematicians, physical scientists, and engineers. Large and fast-flowing data streams from very large networks of devices and sensors have popularised the discipline of data science, which is mostly practised within corporations, within constraints dictated by business imperatives, and mostly without external and independent supervision.

The most worrying aspect of corporate data science is the power that corporations can wield over the interpretation of social data, and the corresponding lack of power of those who produce and share social data. The power imbalance between corporations and society is facilitated by the six cognitive blind spots, which affect the construction of formal models and their technological implementations in multiple ways:

Magical rituals lead to a lack of understanding of algorithm convergence criteria and limits of applicability, to suboptimal results, and to invalid conclusions. Examples: naive use of frequentist statistical techniques and incorrect interpretations of p-values by social scientists, or naive use of numerical algorithms by developers of machine learning algorithms.
Unvalidated inputs open the door for poor measurements and questionable sampling techniques. Examples: use of data sets collected by a range of different instruments with unspecified characteristics, or incorrect priors in Bayesian probabilistic models.
Unvalidated assumptions enable the use of speculative causal relationships and simplistic assumptions about human nature, and create a platform for ideological bias. Examples: many economic models rest on outdated assumptions about human behaviour, and consciously ignore evidence from other disciplines that conflicts with established economic dogma.
Unvalidated software can produce invalid results, contradictions, and unexpected error conditions. Examples: outages of digital services from banks and telecommunications service providers are often treated as unavoidable, and computational errors sometimes cost hundreds of millions of dollars or hundreds of lives.
Unvalidated semantic links between mathematical formalisms, data, assumptions and software facilitate further bias and spurious complexity. Examples: Many case studies show that formalisation of semantic links and systematic elimination of spurious complexity can reduce overall complexity by factors between 3 and 20, whilst improving computational performance.
Unvalidated numbers can allow order-of-magnitude mistakes and obvious data patterns to go undetected. Example: without adequate visual representations, even simple numbers can be very confusing for a numerically challenged audience.
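
To make the last blind spot concrete, here is a minimal sketch of the kind of sanity check that catches order-of-magnitude mistakes before a number reaches a model or a report. It is an illustration only: the function name `flag_magnitude_outliers`, the one-order-of-magnitude threshold, and the sample readings are assumptions made for this example, not a method taken from any particular tool.

```python
import math
from statistics import median


def flag_magnitude_outliers(values, max_orders=1.0):
    """Return (index, value) pairs whose order of magnitude differs from the
    median order of magnitude by more than `max_orders` powers of ten.

    Hypothetical helper for illustration. Zeros are skipped because they
    have no defined order of magnitude.
    """
    magnitudes = [math.log10(abs(v)) for v in values if v != 0]
    if not magnitudes:
        return []
    typical = median(magnitudes)
    flagged = []
    for i, v in enumerate(values):
        if v == 0:
            continue
        if abs(math.log10(abs(v)) - typical) > max_orders:
            flagged.append((i, v))
    return flagged


# Made-up readings: two of them look like unit or decimal-point slips.
readings = [12.1, 11.8, 12.4, 1240.0, 12.0, 0.0119]
print(flag_magnitude_outliers(readings))  # [(3, 1240.0), (5, 0.0119)]
```

A check like this does not tell you which value is right. It only makes sure that a suspicious number is looked at by a human rather than silently absorbed into a result.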

Whilst a corporation may not have an explicit agenda for creating distorted and dangerously misleading models, the mechanics of financial economics create an irresistible temptation to optimise corporate profit by systematically shifting economic externalities into cognitive blind spots. A similar logic applies to government departments that have been tasked to meet numerically specified objectives.

Mathematical understanding and numerical literacy are becoming increasingly important, but it is unrealistic to assume that the majority of the population will become sufficiently proficient in mathematics and statistics to validate and critique the formal models employed by corporations and governments. Transparency, including open science, open data, and open source software, is emerging as an essential tool for independent oversight of cognitive blind spots:

Mathematicians must be able to review the formalisms that are being used
Statisticians must be able to review measurement techniques and input data sources
Scientists and experts from disciplines relevant to the problem domain must be able to review assumptions
Software engineers familiar with the software tools that are being used must be able to review software implementations
Mathematicians with an understanding of category theory, model theory, denotational semantics, and conceptual modelling must be able to review semantic links between mathematical formalisms, terminology, data, assumptions, and software
Mathematicians and statisticians must be able to review data representations

In a zero marginal cost society, transparency allows scarce and highly specialised mathematical knowledge to be used for the benefit of society. It is very encouraging to note the similarity in knowledge sharing culture between the mathematical community and the open source software community, and to note the decreasing relevance of opaque closed source software.

The more society depends on decisions made with the help of mathematical models, the more important it becomes that these decisions adequately accommodate the concrete needs of individuals and local communities, and that the language used to reason about economics remains understandable and allows economic goals to be articulated in simple terms.

Jorn Bettin

Knowledge archaeologist by day and neurodivergent anthropologist by night
