Are you a model builder or a story teller?

Posted by Jorn Bettin

Have you ever wondered why “storytelling” is such a trendy topic? If this question bothers you and makes you uncomfortable, your perspective on human affairs and your cognitive lens are rather unusual.

Once upon a time, in the 1970s, model building gained popularity and nearly caught up with the older art of storytelling. But half a century later the popularity of model building is back to what it was in the 1960s, and according to the data analysis by Google Ngram “storytelling” has reached new heights, the word being used twice as often as the word “agile”.

The “art of storytelling” and “agile software” are strong contenders for being the catchphrase of the current millennium. It is not surprising that the latter term was not in circulation in the 20th century, but it is perhaps somewhat surprising that the “art of storytelling” was pretty much non-existent before the 20th century.

What has happened to model building?

It seems that model building has given way to machine learning and “artificial intelligence” in recent years.

Model building is linked to the success of the scientific method. Researchers create, validate, and refine models to improve our level of understanding about various aspects of the world we live in, and to articulate their understanding in formal notations that facilitate independent critical analysis.

What is the usefulness of models that are only understandable for very few humans?

The scientific revolution undoubtedly led to a better understanding of some aspects of the world we live in by enabling humans to create more and more complex technologies. But it also created new levels of ignorance about externalities that went hand in hand with the development of new technologies, fuelled by specific economic beliefs about efficiency and abstractions such as money and markets.

In the early days of the industrial revolution modelling was concerned with understanding and mastering the physical world, resulting in progress in engineering and manufacturing. Over the last century formal model building was found to be useful in more and more disciplines, across all the natural sciences, and increasingly as well in medicine and the social sciences, especially in economics.

With 20/20 hindsight it becomes clear that there is a significant lag between model building and the identification of externalities that are created by systematically applying models to accelerate the development and roll-out of new technologies.

Humans are biased to thinking they understand more than they actually do, and this effect is further amplified by technologies such as the Internet, which connects us to an exponentially growing pool of information. New knowledge is being produced faster than ever whilst the time available to independently validate each new nugget of “knowledge” is shrinking, and whilst the human ability to learn new knowledge at best remains unchanged – if it is not compromised by information overload.

Those who engage in model building face the challenge of either diving deep into a narrow silo, to ensure an adequate level of understanding of a particular niche domain, or restricting their activity to modelling the dependencies between subdomains and coordinating the model building of domain experts across a number of silos. As a result:

Many models are only understandable for their creators and a very small circle of collaborators.
Each model integrator can only be effective at bridging a very limited number of silos.
The assumptions associated with each model are only known and understood locally, some of the assumptions remain tacit knowledge, and assumptions may vary significantly between the models produced by different teams.
Many externalities escape early detection, as there is hardly anyone or any technology continuously looking for unexpected results and correlations across deep chains of dependencies between subdomains.

When the translation of new models into new applications and technologies is not adequately constrained by the level to which models can be independently validated and by application of the precautionary principle, potentially catastrophic surprises are inevitable.

Does it make sense to talk about models that are not understandable for any human?

Good models are not only useful, they are also understandable and have explanatory power – at least for a few people. Additionally, from the perspective of a mathematician, many of the most highly valued models also conform to an aesthetic sense of beauty, by surfacing a surprising degree of symmetry, by bringing non-intuitive connections into focus, and simply by their level of compactness.

Scientific model building is a balancing act between simplicity and usefulness. An overly complex model is less easy to understand and therefore less easy to integrate with complementary models, and an over-simplified model may be easy to work with, but may be so limited in its scope of applicability that it becomes useless.

What is not widely recognised beyond the mathematical community is that the so-called models generated by machine learning algorithms / artificial intelligence systems are not human understandable, for the same reasons that the physical representations of knowledge within a human brain are not understandable by humans. Making any sense of knowledge representations in a brain requires not only highly specialised scanning technologies but also non-trivial visualisation technologies – and the resulting pictures only give us a very crude and indirect understanding of what a person experiences and thinks about at any given moment.

Do correlation models without explanatory power qualify as models? There are many useful applications of machine learning, but if the learning does not result in models that are understandable, then the results of machine learning should perhaps be referred to as digital correlation maps to avoid confusion with models that are designed for human consumption. Complex correlation maps can be visualised in ways similar to the results of brain scans, and the level of insight that can be deduced from such visualisations is correspondingly limited.

It is not yet clear how to construct conscious artificial intelligence systems, i.e. systems that can not only establish correlations between data streams, but that are also capable of developing conceptual models of themselves and their environment that can be shared with and can be understood by humans. In particular current machine learning systems are not able to explain how they arrive at specific conclusions.

The limitations of machine learning highlight what is being lost by neglecting model building and by leaving modelling entirely to individual experts working in deep and narrow silos. Model validation and integration have largely been replaced with over-simplified storytelling – the goal has shifted from improving understanding to applying the tools of persuasion.

What’s the story with storytelling?

The art of storytelling is linked to the rise of marketing and persuasive writing. Edward Bernays was one of the original shapers of the logic of marketing:

Bernays’ vision was of a utopian society in which individuals’ dangerous libidinal energies, the psychic and emotional energy associated with instinctual biological drives that Bernays viewed as inherently dangerous given his observation of societies like the Germans under Hitler, could be harnessed and channelled by a corporate elite for economic benefit. Through the use of mass production, big business could fulfil the cravings of what Bernays saw as the inherently irrational and desire-driven masses, simultaneously securing the niche of a mass production economy (even in peacetime), as well as sating what he considered to be dangerous animal urges that threatened to tear society apart if left unquelled.

Bernays touted the idea that the “masses” are driven by factors outside their conscious understanding, and therefore that their minds can and should be manipulated by the capable few. “Intelligent men must realize that propaganda is the modern instrument by which they can fight for productive ends and help to bring order out of chaos.”

The conscious and intelligent manipulation of the organized habits and opinions of the masses is an important element in democratic society. Those who manipulate this unseen mechanism of society constitute an invisible government which is the true ruling power of our country. …In almost every act of our daily lives, whether in the sphere of politics or business, in our social conduct or our ethical thinking, we are dominated by the relatively small number of persons…who understand the mental processes and social patterns of the masses. It is they who pull the wires which control the public mind.

Propaganda was portrayed as the only alternative to chaos.

The purpose of storytelling is the propagation of beliefs and emotions.

What is the usefulness of stories if they do nothing to improve our level of understanding of the world we live in?

Sure, if stories help to increase the number of shared beliefs within a group, then the people involved may understand more about the motivations and behaviours of the others within the group. But at the same time, in the absence of building improved models about the non-social world, the behaviour of the group easily drifts into more and more abstract realms of social games, making the group increasingly blind to the effects of their behaviours on outsiders and on the non-social world.

Stories are appealing and hold persuasive potential because their role in cultural transmission is the result of gene-culture co-evolution in tandem with the human capability for symbolic thought and spoken language. In human culture stories are involved in two functions:

Transmission of beliefs that are useful for the members of a group. Shared beliefs are the catalyst for improved collaboration.
Deception in order to protect or gain social status within a group or between groups. In the framework of contemporary competitive economic ideology deception is often referred to as marketing.

Storytelling thus is a key element of cultural evolution. Unfortunately cultural evolution fuelled by storytelling is a terribly slow form of learning for societies, even though storytelling is an impressively fast way for transmitting beliefs to other individuals. Not entirely surprisingly some studies find the prevalence of psychopathic traits in the upper echelons of the corporate world to be between 3% and 21%, much higher than the 1% prevalence in the general population.

Storytelling with the intent of deception enables individuals to reap short-term benefits for themselves to the longer-term detriment of society

The extent to which deceptive storytelling is tolerated is influenced by cultural norms, by the effectiveness of institutions and technologies entrusted with the enforcement of cultural norms, and by the level of social inequality within a society. The work of the disciples of Edward Bernays ensured that deceptive storytelling has become a highly respected and valued skill.

However, simply focusing on minimising deception is no fix for all the weaknesses of storytelling. When a society with highly effective norm enforcement insists on rules and behavioural patterns that create environmental or social externalities, some of which may be invisible from within the cultural framework, deception can become a vital tool for those who suffer as a result of the externalities.

Furthermore, even in the absence of intentional deception, the maintenance, transmission, and uncritical adoption of beliefs via storytelling can easily become problematic if beliefs held in relation to the physical and living world are simply wrong. For example some people and cultures continue to hold scientifically untenable beliefs about the causes of specific diseases.

All political and economic ideologies rely on storytelling

Human societies are complex adaptive systems that can’t be described by any simple model. More precisely, it is not possible to develop long-range and detailed predictive models for social and economic behaviour. However, in much the same way that extensive sensor networks and modern computing technology allow the development of useful short-range weather forecasts, it is possible to use social and economic data to look for externalities and attempts at corruption.

Nothing stands in the way of monitoring the results of significant social and economic changes with a level of diligence that is comparable to the diligence expected from researchers when conducting scientific experiments in the medical field. Of course the pharmaceutical industry also has a reputation for colourful storytelling, and the healthcare sector is not spared from ethical corruption and the tools of marketing. But at least the healthcare sector is heavily regulated, academic research is an integral part of the sector, and independent validation of results is part of the certification process for all new products and treatments.

One has to wonder why economic and social policies are not subject to a comparable level of independent oversight. The model of governance in modern democracies typically includes a separation of power between legislature, executive, and judiciary, but the question is whether effective separation of power can be maintained over decades and centuries.

Human societies and social structures are far from static. Concepts such as the nation state are only a couple of hundred years old and the lifespan of economic bubbles and the structures created within such bubbles is measured in years rather than centuries. And yet, many people and institutions are incapable of considering possible economic or social arrangements that lie outside consumerism and the cultural norms that currently dominate within a particular nation state. Cultural inertia is beneficial for societies whenever the environment in which they are embedded is highly stable, but it becomes problematic when the environment is undergoing rapid change.

Historically a rapidly changing environment used to be associated with local wars or local natural disasters such as extended periods of drought or earthquakes. The industrial revolution has significantly shifted the main triggers of rapid change:

Improvements in technology, hygiene and medicine have facilitated significant population growth and ushered in a new geological era – the Anthropocene. Human activity is changing the physical environment faster than ever before
Machine powered technology has enabled wars of unprecedented scale, speed, and levels of destructiveness
The paradigm of growth based economics fuelled by interest bearing debt and aggressive marketing dominates on all continents in most societies, and facilitates global economic bubbles
Carbon emissions and other physical externalities of modern economic activity have no physical boundaries

Given this context it is extremely tempting for professional politicians within government and corporations to subscribe to the elitist logic of Edward Bernays and to exploit storytelling for local or personal gains. An alien observer of human societies would probably be amazed that some humans (and large organisations) are given a platform for virtually unlimited storytelling at a scale that affects hundreds of millions or even billions of people, and that delusional and misleading stories are let loose on the population of a species that is the local champion of cultural transmission on this planet.

Within growth based economics the effectiveness of marketing can never be good enough. Desperate corporations are hoping machine learning algorithms can take storytelling to yet another level. High frequency trading is one example of “successful” automated marketing, where algorithms try to trick each other into believing stories that are beyond human comprehension.

End of story?

If we continue to believe that the world is shaped exclusively by human delusions, then the human story may come to a fairly unspectacular end rather soon. It also won’t help us if we focus on building technologies that provide even more powerful delusions.

If there is anything that has led to significant improvements in human well-being and life expectancy in the last thousand years it would undoubtedly have to be model building and the scientific method. The power tools of systematic experimentation and modelling facilitated much of what we call progress but they also facilitated dangerous social games at a planetary scale.

Just as medical science no longer relies on unsubstantiated stories, the stories that we tell each other in business, government, and academic administration need to be subjected to critical analysis, and the public needs to be made aware of the evidence (or lack thereof) that underpins the claims of politicians and executives in the corporate world, so that experiments are clearly identified, and most importantly, that experiments are carefully monitored and subjected to independent review before being sold as solutions.

In this context lessons can be learned from the fast moving world of digital technology. On the positive side the software development community is acutely aware of the need to conduct experiments; on the negative side, outside a few life critical industries, the lack of rigour when conducting experiments in the development and deployment of new software solutions is embarrassing. In the software development community conducting multiple independent experiments is generally considered a waste of time, and the interests of financial investors determine the kinds of “solutions” that receive funding:

All human artefacts are technology. But beware of anybody who uses this term. Like “maturity” and “reality” and “progress”, the word “technology” has an agenda for your behaviour: usually what is being referred to as “technology” is something that somebody wants you to submit to.

“Technology” often implicitly refers to something you are expected to turn over to “the guys who understand it.” This is actually almost always a political move. Somebody wants you to give certain things to them to design and decide. Perhaps you should, but perhaps not.
– Ted Nelson, a pioneer of information technology, philosopher, and sociologist who coined the terms hypertext and hypermedia in 1963.

The software industry is an interesting economic subsystem for observing human social behaviour at large scale. Today this sector is interwoven with virtually all other economic subsystems and even with the most common tools that we use for communicating with each other.

David Graeber has analysed the phenomenon of “bullshit jobs” in detail.

“In the year 1930, John Maynard Keynes predicted that technology would have advanced sufficiently by century’s end that countries like Great Britain or the United States would achieve a 15-hour work week. There’s every reason to believe he was right. In technological terms, we are quite capable of this. And yet it didn’t happen. Instead, technology has been marshalled, if anything, to figure out ways to make us all work more. In order to achieve this, jobs have had to be created that are, effectively, pointless. Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it. …”

Silicon Valley innovation pop-culture?

Students of software engineering and computer science are often attracted by the idea of “innovation” and by the prospect of exciting creative work, contributing to the development of new services and products. The typical reality of software development has very little if anything to do with innovation and much more with building tools that support David Graeber’s “bullshit jobs” and Edward Bernays’ elitist “utopia” of conscious manipulation of the habits and opinions of the masses by a small number of “leaders” suffering from narcissistic personality disorder.

The culture within the software development community is shaped much less by mathematics and scientific knowledge about the physical world than by the psychology of persuasion – and an anaemic conception of innovation based on social popularity and design principles that encourage planned obsolescence. A few years ago Alan Kay, a pioneer of object-oriented programming and windowing graphical user interface design, observed:

It used to be the case that people were admonished to “not re-invent the wheel”. We now live in an age that spends a lot of time “reinventing the flat tire!”

The flat tires come from the reinventors often not being in the same league as the original inventors. This is a symptom of a “pop culture” where identity and participation are much more important than progress. … In the US we are now embedded in a pop culture that has progressed far enough to seriously hurt places that hold “developed cultures”. This pervasiveness makes it hard to see anything else, and certainly makes it difficult for those who care what others think to put much value on anything but pop culture norms.

Mainstream software development practices are geared towards dealing with the characteristics of big ball of mud architectures and the reality of the curse of software maintenance.

Do we need a better language for model building?

Making model building accessible to a wider audience may require developing a cognitively simple visual language for articulating resource and information flows in living and economic systems in a format that is not influenced by any particular economic ideology.

Many of the languages of mathematics already make use of visual concept graphs. Digital devices open up the possibility of highly visual languages and user interfaces that enable everyone to create concept graphs that are formal in a mathematical sense, understandable for humans, and easily processable by software tools. The only formal foundations needed for implementing such a visual language system are axioms from model theory, category theory, and domain theory.

In terms of usability, a formal software-mediated visual language system that takes into consideration human cognitive limits has the potential to:

Improve the speed and quality of knowledge transfer between human domain experts
Improve the speed and quality of knowledge transfer between human domain experts and software tools
Facilitate innovative approaches to extracting human understandable semantics from informal textual artefacts, in a format that is easily processable by software tools
Facilitate innovative approaches to unsupervised machine learning that deliver results in a format that is compatible with familiar representations used by human domain experts, enabling the construction of knowledge repositories capable of receiving inputs from:
- human domain experts
- informal textual sources of human knowledge
- machine learning systems

All scientists, engineers, and technologists are familiar with a language that is more expressive and less ambiguous than spoken and written language. The language of concept graphs with highly domain and context-specific iconography regularly appears on white boards whenever two or more people from different disciplines engage in collaborative problem solving. Such languages can easily be formalised mathematically and can be used in conjunction with rigorous validation by example / experiments.
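As a rough illustration of what "formal in a mathematical sense" and "easily processable by software tools" could mean for a whiteboard-style concept graph, here is a minimal sketch in Python. It is not the visual language system discussed above: the class names, the `kind` values, and the forest and sawmill example are all hypothetical and chosen only to show that a graph of named concepts and labelled flows can be checked mechanically.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A node in the graph: a named concept with a domain-specific kind."""
    name: str
    kind: str  # hypothetical kinds, e.g. "resource" or "process"


@dataclass(frozen=True)
class Flow:
    """A directed, labelled edge between two concepts."""
    source: str
    target: str
    label: str


@dataclass
class ConceptGraph:
    concepts: dict[str, Concept] = field(default_factory=dict)
    flows: list[Flow] = field(default_factory=list)

    def add_concept(self, name: str, kind: str) -> None:
        self.concepts[name] = Concept(name, kind)

    def add_flow(self, source: str, target: str, label: str) -> None:
        self.flows.append(Flow(source, target, label))

    def dangling_flows(self) -> list[Flow]:
        """Return flows whose source or target is not a declared concept."""
        return [
            f for f in self.flows
            if f.source not in self.concepts or f.target not in self.concepts
        ]

    def downstream(self, name: str) -> set[str]:
        """Return every concept reachable from `name` by following flows."""
        reached: set[str] = set()
        frontier = [name]
        while frontier:
            current = frontier.pop()
            for f in self.flows:
                if f.source == current and f.target not in reached:
                    reached.add(f.target)
                    frontier.append(f.target)
        return reached


# Illustrative example: a tiny resource-flow model.
graph = ConceptGraph()
graph.add_concept("forest", "resource")
graph.add_concept("sawmill", "process")
graph.add_concept("furniture", "resource")
graph.add_flow("forest", "sawmill", "timber")
graph.add_flow("sawmill", "furniture", "planks")
graph.add_flow("sawmill", "river", "waste water")  # "river" was never declared

print(graph.dangling_flows())
print(graph.downstream("forest"))
```

In this toy model the flow into the undeclared "river" is flagged by `dangling_flows`, which loosely mirrors the earlier point about externalities: a dependency that nobody has modelled only becomes visible once the assumptions are written down in a form that tools can inspect.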

Model building and digital correlation maps can go hand in hand

Machine learning need not result in opaque systems that are as difficult to understand as humans, and a formal visual language may represent the biggest breakthrough for improving the understanding between humans since the development of spoken language.

… And storytelling and social transmission need not result in a never ending sequence of psychopathic social games if we get into the habit of explicitly tagging all stories with the available supporting evidence, so that untested ideas and attempts at corruption become easier to identify.

In all domains where decisions and actions may have significant impact on others and on the environment we live in, adopting a more autistic mindset in relation to human stories may improve human decision making. In the Asch conformity experiment, autists were found to resist changing their spontaneous judgement to an array of graphic lines despite social pressure to change by conforming to the erroneous judgement of an authoritative confederate.

Mathematics – the language of explanation and validation

Paul Lockhart describes mathematics as the art of explanation. He is correct. Mathematical proofs are the one type of storytelling that is committed to being entirely open regarding all assumptions and to systematically exploring all the possible implications of specific sets of assumptions. Foundational mathematical assumptions are usually referred to as axioms.

Formal proofs are parametrised formal stories (sequences of reasoning steps) that explore the possibilities of entire families of stories and their implications. Mathematical beauty is achieved when a complex family of stories can be described by a small elegant formal statement. Complexity does not melt away accidentally. It is distilled down to its essence by finding a “natural language” (or “model”) for the problem space represented by a family of formal stories.

A useful model encapsulates all relevant commonalities of the problem space – it provides an explanation that is understandable for anyone who is able to follow the reasoning steps leading to the model.

The more parameters and relationships between parameters come into play, the more difficult it typically is to uncover cognitively simple models that shed new light onto a particular problem space and the underlying assumptions. If a particular set of formal assumptions is found to have a correspondence in the physical or living world, the potential for positive and negative technological innovation can be profound.

Whether the positive or negative potential prevails is determined by the motivations, political moves, and stories told by those who claim credit for innovation.

Any hope of progress beyond stories?

From within a large organisation culture is often perceived as static or very slow moving locally, and changes in the environment are perceived as dynamic and fast moving. This is an illusion. It is easy to lose sight of the bigger picture.

Outside the context of “work” the people within a large organisation are part of many other groups and part of the rapidly evolving “external” context. The larger an organisation, the greater the inertia of the internal organisational culture, and the faster this culture disconnects from the external cultural identities of employees, customers, and suppliers.

The resulting cognitive dissonance manifests itself in terms of low levels of employee engagement, high levels of mental illness, and the increasingly short life expectancy of large corporations. Group identities and concepts such as intelligence and success are cultural constructs that are subject to evolutionary pressure and phase transitions.

Marketing may well become a taboo in the not-too-distant future

Over the next few weeks, to my knowledge, there are at least four dedicated conferences on the topics of redefining intelligence, new economics, and cultural evolution.

Conference on Interdisciplinary Innovation and Collaboration

Melbourne, Australia – 2 September 2017
Auckland, New Zealand – 16 September 2017

These events are part of the quarterly CIIC unconference series, addressing challenges that go beyond the established framework of research in industry, government and academia. The workshops in September will build on the results from earlier workshops to explore the essence of humanity and how to construct organisations that perform a valuable function in the living world.

The historic record of societies and large organisations being aware of the limitations of their culture is highly unimpressive. Redefining intelligence is our chance to break out of self-destructive patterns of behaviour. It is a first step towards a better understanding of the positive and negative human potential within the ecological context of the planet.

More information on CIIC and the theme for the upcoming unconference:

Registration: https://ciic.s23m.com/registration/

Thanks to Pete Rive and Arthur Shelley at AUT and RMIT for providing CIIC with superb venues!

New Economy Conference

Brisbane, Australia – 1-3 September 2017

Building on the inaugural 2016 conference held in Sydney, the 2017 gathering invites people to come together to share stories of success, address challenges and join the broader movement so we can continue working together to build a ‘new’ economic system. The 2017 New Economy Conference will bring together hundreds of people and organisations to launch powerful new collective strategies for creating positive social and economic change, to achieve long term, liveable economies that fit within the productive capacity of a healthy environment.

More information on NENA and the Building the New Economy conference:

Registration: https://www.trybooking.com/book/event?eid=281640

Thanks to Donnie Maclurcan and Tirrania Suhood for making me aware of this event!

Inaugural Cultural Evolution Society Conference

Jena, Germany – 13 – 15 September 2017

The Cultural Evolution Society supports evolutionary approaches to culture in humans and other animals. The society welcomes all who share this fundamental interest, including in the pursuit of basic research, teaching, or applied work. We are committed to fostering an integrative interdisciplinary community spanning traditional academic boundaries from across the social, psychological, and biological sciences, and including archaeology, computer science, economics, history, linguistics, mathematics, philosophy and religious studies. We also welcome practitioners from applied fields such as medicine and public health, psychiatry, community development, international relations, the agricultural sciences, and the sciences of past and present environmental change.

More information on CES and the related conference:

Registration: https://mi.conventus.de/online/imces-2017.do

Thanks to Joe Brewer and his team for coordinating this unique event!

The antidote to misuse of mathematics and junk data

Posted by Jorn Bettin

Depending on who you ask, the perceptions of mathematics range from an esoteric discipline that has little relevance to everyday life to a collection of magical rituals and tools that shape the operations of human cultures. In an age of exponentially increasing data volumes, the public perception has increasingly shifted towards the latter perspective.

On the one hand it is nice to see a greater appreciation for the role of mathematics, and on the other hand the growing use of mathematical techniques has led to a set of cognitive blind spots in human society:

Blind use of mathematical formalisms – magical rituals
Blind use of second hand data – unvalidated inputs
Blind use of implicit assumptions – unvalidated assumptions
Blind use of second hand algorithms – unvalidated software
Blind use of terminology – implicit semantic integration
Blind use of numbers – numbers with no sanity checks

Construction of formal models is no longer the exclusive domain of mathematicians, physical scientists, and engineers. Large and fast flowing data streams from very large networks of devices and sensors have popularised the discipline of data science, which is mostly practiced within corporations, within constraints dictated by business imperatives, and mostly without external and independent supervision.

The most worrying aspect of corporate data science is the power that corporations can wield over the interpretation of social data, and the corresponding lack of power of those that produce and share social data. The power imbalance between corporations and society is facilitated by the six cognitive blind spots, which affect the construction of formal models and their technological implementations in multiple ways:

Magical rituals lead to a lack of understanding of algorithm convergence criteria and limits of applicability, to suboptimal results, and to invalid conclusions. Examples: Naive use of frequentist statistical techniques and incorrect interpretations of p-values by social scientists, or naive use of numerical algorithms by developers of machine learning algorithms.
Unvalidated inputs open the door for poor measurements and questionable sampling techniques. Examples: use of data sets collected by a range of different instruments with unspecified characteristics, or incorrect priors in Bayesian probabilistic models.
Unvalidated assumptions enable the use of speculative causal relationships, simplistic assumptions about human nature, and create a platform for ideological bias. Examples: many economic models rest on outdated assumptions about human behaviour, and consciously ignore evidence from other disciplines that conflicts with established economic dogma.
Unvalidated software can produce invalid results, contradictions, and unexpected error conditions. Examples: outages of digital services from banks and telecommunications service providers are often treated as unavoidable, and computational errors sometimes cost hundreds of millions of dollars or hundreds of lives.
Unvalidated semantic links between mathematical formalisms, data, assumptions and software facilitate further bias and spurious complexity. Examples: Many case studies show that formalisation of semantic links and systematic elimination of spurious complexity can reduce overall complexity by factors between 3 and 20, whilst improving computational performance.
Unvalidated numbers can enable order of magnitude mistakes and obvious data patterns to remain undetected. Example: Without adequate visual representations, even simple numbers can be very confusing for a numerically challenged audience.
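As an illustration of the last blind spot, a sanity check on orders of magnitude can be sketched in a few lines of Python. This is a minimal sketch, not a method prescribed in this post: the function names, the threshold `max_gap`, and the sensor readings are all hypothetical.

```python
import math

def order_of_magnitude(value: float) -> int:
    """Return the base-10 order of magnitude of a non-zero number."""
    return math.floor(math.log10(abs(value)))

def flag_magnitude_outliers(values, max_gap=2):
    """Return the values whose order of magnitude differs from the
    median order of magnitude by more than max_gap (an assumed threshold)."""
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        return []
    orders = sorted(order_of_magnitude(v) for v in nonzero)
    median_order = orders[len(orders) // 2]
    return [v for v in nonzero
            if abs(order_of_magnitude(v) - median_order) > max_gap]

# Hypothetical sensor readings; one value was probably recorded in the wrong unit.
readings = [12.1, 11.8, 12.4, 12200.0, 11.9]
suspicious = flag_magnitude_outliers(readings)
print(suspicious)
```

A check like this does not validate the data, it only brings candidates for human review to the surface before the numbers flow into a model.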

Whilst a corporation may not have an explicit agenda for creating distorted and dangerously misleading models, the mechanics of financial economics create an irresistible temptation to optimise corporate profit by systematically shifting economic externalities into cognitive blind spots. A similar logic applies to government departments that have been tasked to meet numerically specified objectives.

Mathematical understanding and numerical literacy are becoming increasingly important, but it is unrealistic to assume that the majority of the population will become sufficiently proficient in mathematics and statistics to be able to validate and critique the formal models employed by corporations and governments. Transparency, including open science, open data, and open source software, is emerging as an essential tool for independent oversight of cognitive blind spots:

Mathematicians must be able to review the formalisms that are being used
Statisticians must be able to review measurement techniques and input data sources
Scientists and experts from disciplines relevant to the problem domain must be able to review assumptions
Software engineers familiar with the software tools that are being used must be able to review software implementations
Mathematicians with an understanding of category theory, model theory, denotational semantics, and conceptual modelling must be able to review semantic links between mathematical formalisms, terminology, data, assumptions, and software
Mathematicians and statisticians must be able to review data representations

In a zero marginal cost society, transparency allows scarce and highly specialised mathematical knowledge to be used for the benefit of society. It is very encouraging to note the similarity in knowledge sharing culture between the mathematical community and the open source software community, and to note the decreasing relevance of opaque closed source software.

The more society depends on decisions made with the help of mathematical models, the more important it becomes that these decisions adequately accommodate the concrete needs of individuals and local communities, and that the language used to reason about economics remains understandable, and enables the articulation of economic goals in simple terms.

The big human battle of this century

Posted by Jorn Bettin

The big human battle of this century is going to be the democratisation of data and all forms of knowledge, and the introduction of digital government with the help of free and open source software.

Whilst undoubtedly the reaction of the planet to the explosion of human activities with climate change and other symptoms is the largest change process that has ever occurred in human history in the physical realm, the exponential growth of the Internet of Things and digital information flows is triggering the largest change process in the realm of human organisation that societies have ever experienced.

The digital realm

Sensor networks and pervasive use of RFID tags are generating a flood of data and lively machine-to-machine chatter. Machines have replaced humans as the most social species on the planet, and this must inform the approach to the development of healthy economic ecosystems.

Sensors that are part of the Internet of Things

When data scientists and automation engineers collaborate with human domain experts in various disciplines, machine-generated data is the magic ingredient for solving the hardest automation problems.

In domains such as manufacturing and logistics the writing is on the wall. Introduction of self-driving vehicles and just a few more robots on the shop floor will eliminate the human element in the social chatter at the workplace within the next 10 years.
The medical field is being revolutionised by the downward spiral of the cost of genetic analysis, and by the development of medical robots and medical devices that are hooked up to the Internet, paving the way for machine learning algorithms and big data to replace many of the interactions with human medical professionals.
The road ahead for the provision of government services is clearly digital. It is conceivable that established bureaucracies can resist the trend to digitisation for a few years, but any delay will not prevent the inevitability of automation.

The social implications

Data driven automation leads to an entirely new perspective on the purpose of the education system and on the role of work and employment in society.

Large global surveys show that more than 70% of employees are disengaged at work. It is mainly in manufacturing that automation directly replaces human labour. In many other fields the shift in responsibilities from humans to machines initially goes hand in hand with the invention of new roles and loss of a clear purpose.

Traditional work is being transformed into a job for a machine. Exceptions are few and far between.

Data that is not sufficiently accessible is only of very limited value to society. The most beneficial and disruptive data driven innovations are those that result from the creative combination of data sets from two or more different sources.

It is unrealistic to assume that the most creative minds can be found via the traditional channel of employment, and it is unrealistic that such minds can achieve the best results if data is locked up in organisation-specific or national silos.

The most valuable data is data that has been meticulously validated, and that is made available in the public domain. It is no coincidence that software, data, and innovation are increasingly produced in the public domain. Jeremy Rifkin describes the emergence of a third mode of commons-based digitally networked production that is distinct from the property- and contract-based modes of firms and markets.

The education system has a major role to play in creating data literate citizen-scientists-innovators.

The role of economics

It is worthwhile remembering the origin of the word economics. It used to denote the rules for good household management. On a planet that hosts life, household management occurs at all levels of scale, from the activities of single cells right up to processes that involve the entire planetary ecosystem. Human economics are part of a much bigger picture that always included biological economics and that now also includes economics in the digital realm.

To be able to reason about economics at a planetary level the planet needs a language for reasoning about economic ecosystems, only some of which may contain humans. Ideally such a language should be understandable by humans, but must also be capable of reaching beyond the scope of human socio-economic systems. In particular the language must not be coloured by any concrete human culture or economic ideology, and must be able to represent dependencies and feedback loops at all levels of scale, as well as feedback loops between levels of scale, to enable adequate representation of the fractal characteristic of nature.

The digital extension of the planetary nervous system

In biology the use of electrical impulses for communication is largely confined to communication within individual organisms, and communication between organisms is largely handled via electromagnetic waves (light, heat), pressure waves (sound), and chemicals (key-lock combinations of molecules).

The emergence of the Internet of Things is adding to the communication between human made devices, which in turn interact with the local biological environment via sensors and actuators. The impact of this development is hard to overestimate. The number of “tangible” things that might be computerized is approaching 200 billion, and this number does not include large sensor networks that are being rolled out by scientists in cities and in the natural environment. Scientists are talking about trillion-sensor networks within 10 years. The number of sensors in mobile devices is already more than 50 billion.

Compared to chemical communication channels between organisms, the speed of digital communication is orders of magnitude faster. The overall effect of equipping the planet with a ubiquitous digital nervous system is comparable to the evolution of animals with nervous systems and brains – it opens up completely new possibilities for household management at all levels of scale.

The complexity of the Internet of Things that is emerging on the horizon over the next decade is comparable to the complexity of the human brain, and the volume of data flows handled by the network is orders of magnitude larger than anything a human brain is able to handle.

The global brain

Over the course of the last century, starting with the installation of the first telegraph lines, humans have embarked on the journey of equipping the planet with a digital electronic brain. To most human observers this effort has only become reasonably obvious with the rise of the Web over the last 20 years.

Human perception and human thought processes are strongly biased towards time scales that range from those that matter to humans on a daily basis to the time scale of a human lifetime. Humans are largely blind to events and processes that occur in sub-second intervals and to processes that are sufficiently slow. Similarly, human perception is strongly biased towards living and physical entities that are comparable to the physical size of humans, plus or minus two orders of magnitude.

As a result of their cognitive limitations and biases, humans are challenged to understand non-human intelligences that operate in the natural world at different scales of time and different scales of size, such as ant colonies and the behaviour of networks of plants and microorganisms. Humans need to take several steps back in order to appreciate that intelligence may not only exist at human scales of size and time.

The extreme loss of biodiversity that characterises the anthropocene should be a warning, as it highlights the extent of human ignorance regarding the knowledge and intelligence that evolution has produced over a period of several billion years.

It is completely misleading to attempt to attach a price tag to the loss of biodiversity. Whole ecosystems are being lost – each such loss is the loss of a dynamic and resilient living system of accumulated local biological knowledge and wisdom.

Just like an individual human is a complex adaptive system, the planet as a whole is a complex adaptive system. All intelligent systems, whether biological or human created, contain representations of themselves, and they use these representations to generate goal directed behaviour. Examples of intelligent systems include not only individual organisms, but also large scale and long-lived entities such as native forests, ant colonies, and coral reefs. The reflexive representations of these systems are encoded primarily in living DNA.

From an external perspective it nearly seems as if the planetary biological brain, powerful but slow, thinking in chemical and biological signals over thousands of years, has shaped the evolution of humans for the specific purpose of developing and deploying a faster thinking global digital brain.

It is delusional to think that humans are in control of what they are creating. The planet is in the process of teaching humans about their role in its development, and some humans are starting to respond to the feedback. Feedback loops across different levels of scale and time are hard for humans to identify and understand, but that does not mean that they do not exist.

The global digital brain is still under development, not unlike the brain of a human baby before birth. All corners of the planet are being wired up and connected to sensors and actuators. The level of resilience of the overall network depends on the levels of decentralisation, redundancy, and variability within the network. A hierarchical structure of subsystems as envisaged by technologist Ray Kurzweil is influenced by elements of established economic ideology rather than by the resilient neural designs found in biology. A hierarchical global brain would likely suffer from recurring outages and from a lack of behavioural plasticity, not unlike the Cloud services from Microsoft and Amazon that define the current technological landscape.
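The dependence of resilience on decentralisation and redundancy can be made concrete with a toy comparison. The sketch below is purely illustrative: the two seven-node topologies (a hub-and-spoke hierarchy and a ring) and the function name are assumptions, and neither is a model of any real network or Cloud service.

```python
from collections import deque

def largest_component_after_failure(edges, failed):
    """Size of the largest group of nodes that can still reach each other
    once the node `failed` has been removed from the network."""
    nodes = {n for edge in edges for n in edge} - {failed}
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        if a != failed and b != failed:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search over one connected component.
        queue, size = deque([start]), 0
        seen.add(start)
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best

# Hypothetical hierarchy: node 0 is the hub, nodes 1..6 hang off it.
hierarchy = [(0, i) for i in range(1, 7)]
# Hypothetical decentralised ring: every node has two neighbours.
ring = [(i, (i + 1) % 7) for i in range(7)]

hierarchy_survivors = largest_component_after_failure(hierarchy, failed=0)
ring_survivors = largest_component_after_failure(ring, failed=0)
print(hierarchy_survivors, ring_survivors)
```

In this toy case the loss of the hub leaves every remaining node of the hierarchy isolated, whereas the ring loses one node and the other six stay connected.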

Global thinking

The ideology of economic globalisation is dominated by simplistic and flawed assumptions. In particular the concepts of money and globally convertible currencies are no longer helpful and have become counter-productive. The limitations of the monetary system are best understood by examining the historic context in which money and currencies were invented, which predates the development of digital networks by several thousand years. At the time a simple and crude metric in the form of money was the best technology available to store information about economic flows.

As the number of humans has exploded, and as human societies have learned to harness energy in the form of fossil fuels to accelerate and automate manufacturing processes, the old monetary metrics have become less and less helpful as economic signals. In particular the impact of economic externalities that are ignored by the old metrics, both in the natural environment as well as in the human social sphere, is becoming increasingly obvious.

The global digital brain allows flows of energy, physical resources, and economic goods to be tracked in minute detail, without resorting to crude monetary metrics and assumptions of fungibility that open the door to suppressing inconvenient externalities.

A new form of global thinking is required that is not confined to the limited perspective of financial economics. The notions of fungibility and capital gains need to be replaced with the notions of collaborative economics and zero-waste cycles of economic flows.

Metrics are still required, but the new metrics must provide a direct and undistorted representation of flows of energy, physical resources, and economic goods. Such highly context specific metrics enable computational simulation and optimisation of zero-waste economics. Their role is similar to the role of chemical signalling substances used by biological organisms.

Global thinking requires the extension of a zero-waste approach to economics to the planetary level – leaving no room for any known externalities, and encouraging continuous monitoring to detect unknown externalities that may be affecting the planetary ecosystem.

The future of human economics

The real benefits of the global digital brain will be realised when massive amounts of machine generated data become accessible in the public domain in the form of disruptive innovation, and are used to solve complex optimisation problems in transportation networks, distributed generation and supply of power, healthcare, recycling of non-renewable resources, industrial automation, and agriculture.

Five years ago Tim O’Reilly predicted a war for control of the Web. The hype around big data has led many organisations to forget that the Web, and social media in particular, is already saturated with explicit and implicit marketing messages, and that there is an upper bound to the available time (attention) and money for discretionary purchases. A growing list of organisations is fighting over a very limited amount of potential revenue, unable to see the bigger picture of global economics.

Over the next decade one of the biggest challenges will be the required shift in organisational culture, away from simplistic monetisation of big data, towards collaboration and extensive data and knowledge sharing across disciplines and organisational boundaries. The social implications of advanced automation across entire economic ecosystems, and a corresponding necessary shift in the education system need to be addressed.

The future of humans

Human capabilities and limitations are under the spotlight. How long will it take for human minds to shift gears, away from the power politics and hierarchically organised societies that still reflect the cultural norms of our primate cousins, and from myopic human-centric economics, towards planetary economics that recognise the interconnectedness of life across space and time?

The future of democratic governance could be one where people vote for human understandable open source legislation that is directly executable by intelligent software systems. Corporate and government politicians will no longer be deemed an essential part of human society. Instead, any concentration of power in human hands is likely to be recognised as an unacceptable risk to the welfare of society and the health of the planet.

Earth

Humans have to ask themselves whether they want to continue to be useful parts of the ecosystem of the planet or whether they prefer to take on the role of a genetic experiment that the planet switched on and off for a brief period in its development.

Big data blah $ blah $ blah $

Posted by Jorn Bettin

Why does LinkedIn feed me with big data hype from 2011?

Big data: The next frontier for innovation, competition, and productivity

By only talking about dollar metrics, potential big data intelligence is turned into junk data science.

blunt abstraction of native domain metrics into dollars is a source of junk data

All meaningful automation, quality, energy efficiency, and resilience metrics are obliterated by translating into dollars. Good business decisions are made by understanding the implications of domain-specific metrics:

Level of automation
Volume of undesirable waste
Energy use
Reliability and other quality of service attributes

Any practitioner of Kaizen knows that sustainable cost reductions are the result of improvements in concrete metrics that relate directly to the product that is being developed or the service that is being delivered. The same domain expertise that is useful for Kaizen can be combined with high quality external big data sources to produce insights that enable radical innovation.

Yes, often the results have a highly desirable effect on operating costs or sales, but the real value can only be understood in terms of native domain metrics. The healthcare domain is a good example. Minimising the costs of high quality healthcare is desirable, but only when patient outcomes and quality of care are not compromised.

When management consultants only talk about results in dollars, there is a real danger of only expressing goals in financial terms. This then leads down the slippery slope of tinkering with outcomes and accounting procedures until the desirable numbers are within range. By the time experts start to ask questions about outcomes, and the missing native domain metrics expose reductions in operational costs as a case of cutting corners, it is too late.

Before believing a big data case study, always look beyond the dollars. If in doubt, survey customers to confirm claims of improved outcomes and customer satisfaction. The referenced McKinsey article does not encourage corner cutting, but it fails to highlight the need for setting targets in native domain metrics, and it distracts the reader with blunt financial metrics.

Let’s talk semantics. Do you know what I mean?

Posted by Jorn Bettin

Over the last few years the talk about search engine optimisation has given way to hype about semantic search.

context matters

The challenge with semantics is always context. Any useful form of semantic search would have to consider the context of a given search request. At a minimum the following context variables are relevant: industry, organisation, product line, scientific discipline, project, and geography. When this context is known, a semantic search engine can realistically tackle the following use cases:

Looking up the natural language names or idioms that are in use to refer to a specific concept
Looking for domain knowledge; i.e. looking for all concepts that are related to a given concept
Investigating how a word or idiom is used in other industries, organisations, products, research projects, geographies; i.e. investigating the variability of a concept across industries, organisations, products, research projects, and geographies
Looking up all the instances where a concept is used in Web content
Investigating how established a specific word or idiom is in the scientific community, to distinguish between established terminology and fashionable marketing jargon
Looking up the formal names that are used in database definitions, program code, and database content to refer to a specific concept
Looking up all the instances where a concept is used in database definitions, program code, and database content

These use cases relate to the day-to-day work of many knowledge workers. The following presentation illustrates the challenges of semantic search and it contains examples that illustrate how semantic search based on concepts differs from search based on words.

Do you know what I mean?

The current semantic Web is largely blind to the context parameters of industry, organisation, product line, scientific discipline, and project. Google, Microsoft, and other developers of search engines consider a fixed set of filter categories such as geography, time of publication, application, etc. and apply a more or less secret sauce to deduce further context from a user’s preferences and browsing history. This approach is fundamentally flawed:

Each search engine relies on an idiosyncratic interpretation of user preferences and browsing history to deduce the values of further context variables, and the user is only given limited tools for influencing the interpretation, for example via articulating “likes” and “dislikes”
Search engines rely on idiosyncratic algorithms for translating filters, and “likes” and “dislikes” into search engine semantics
Search engines are unaware of the specific intent of the user at a given point in time, and without more dynamic and explicit mechanisms for a user to articulate intent, relying on a small set of filter categories, user’s preferences, and browsing history is a poor choice

The weak foundations of the “semantic Web”, which evolved from a keynote from Tim Berners-Lee in 1994, compound the problem:

“Adding semantics to the Web involves two things: allowing documents which have information in machine readable forms, and allowing links to be created with relationship values.”

Subsequently developed W3C standards are the result of design by committee with the best intentions.

All organisations that have high hopes for turning big data into gold should pause for a moment, and consider the full implication of “garbage in, garbage out” in their particular context. Ambiguous data is not the only problem. Preconceived notions about semantics are another big problem. Implicit assumptions are easily baked into analytical problem statements, thereby confining the space of potential “insights” gained from data analysis to conclusions that are consistent with preconceived interpretations of so-called metadata.

The root cause of the limitations of state-of-the-art semantic search lies in the following implicit assumptions:

Text / natural language is the best mechanism for users to articulate intent, i.e. a reliance on words rather than concepts
The best mechanism to determine context is via a limited set of filter categories, user preferences, and via browsing history

words vs concepts

Semantic search will only improve if and when Web browsers rely on explicit user guidance to translate words into concepts before executing a search request. Furthermore, to reduce search complexity, a formal notion of semantic equivalence is essential.

semantic equivalence

Lastly, the mapping between labels and semantics depends significantly on linguistic scope. For example the meaning of solution architecture in organisation A is typically different from the meaning of solution architecture in organisation B.

linguistic scope
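The three ideas above (translating words into concepts, semantic equivalence, and linguistic scope) can be sketched as a lookup that resolves a label only in combination with its scope. This is a minimal sketch under stated assumptions: the vocabulary entries, the concept identifiers, and the function names are all hypothetical, and a real system would need explicit user guidance to build and maintain such a mapping.

```python
from typing import Optional

# Hypothetical vocabulary: (linguistic scope, label) -> concept identifier.
VOCABULARY = {
    ("organisation A", "solution architecture"): "concept:system-integration-design",
    ("organisation B", "solution architecture"): "concept:product-configuration",
    ("organisation B", "integration blueprint"): "concept:system-integration-design",
}

def resolve(scope: str, label: str) -> Optional[str]:
    """Translate a word or idiom into a concept within a given linguistic scope.
    Returns None when the label is unknown in that scope."""
    return VOCABULARY.get((scope, label.strip().lower()))

def semantically_equivalent(scope_a: str, label_a: str,
                            scope_b: str, label_b: str) -> bool:
    """Two labels are treated as equivalent only if both resolve to the same concept."""
    concept_a = resolve(scope_a, label_a)
    concept_b = resolve(scope_b, label_b)
    return concept_a is not None and concept_a == concept_b

# The same words in two organisations do not denote the same concept ...
same_words = semantically_equivalent(
    "organisation A", "solution architecture",
    "organisation B", "solution architecture")
# ... whereas different words may.
different_words = semantically_equivalent(
    "organisation A", "solution architecture",
    "organisation B", "integration blueprint")
print(same_words, different_words)
```

A search that operates on the resolved concept identifiers rather than on the raw words avoids matching content from a scope in which the same label means something else.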

If the glacial speed of innovation in mainstream programming languages and tools is any indication, the main use case of semantic search is going to remain:

User looks for a product with features x, y, and z

The other use cases mentioned above may have to wait for another 10 years.

Governance of Big Data Cloud Formations – Cyclone Alert

Posted by Jorn Bettin

The shift of business applications and business data into the Cloud has led to the following challenges:

The physical locations at which data is stored, and the physical locations through which data travels are increasingly unknown to the producers and consumers of data.
Data ownership and the responsibility of data custodianship is increasingly impossible to determine, as deep Web service supply chains transect multiple contracts and jurisdictional boundaries.
Local (national) privacy legislation is increasingly impossible to enforce.
The control over the integration points between a specific pair of Cloud services is migrating away from the thousands and millions of organisations whose data is being integrated to a few handfuls of vendors that specialise in connecting the specific pair of Cloud services.
Correspondingly the responsibility for the robustness and reliability of system integration solutions is shifting to a small number of proprietary Cloud services.

The centralised and constrained Web of today

The structure of the Web of today artificially imposes the same constraints on the digital realm that apply in the physical realm.

Centralised and hierarchical control of the Web creates a whole number of avoidable problems. Netizens, and especially the younger generation of digital natives, are using the digital realm as an extension of their brain. The value of the digital realm to human society is not found in the technology that is being used, the value is found in the information, knowledge and insights that flow, evolve, and multiply in the digital realm. To be very clear, Web technology is fully commoditised. There is very little intrinsic value in the mundane software that powers the services from Google, Facebook, Microsoft, and other providers of Cloud platforms. The digital realm is currently owned and controlled by a small number of corporations, which is increasingly incompatible with its use value:

Digital knowledge as a personal brain extension
Unlimited on-demand communication between any number of netizens
A public tool for tracing information flows and for independent validation of scientific knowledge
A globally accessible interface to technologies that operate in the physical realm

Leaving these functions in the hands of a small number of corporations is not in the interest of society.

The decentralised Web we should aim for

It is time to acknowledge the commoditisation of digital technology, to decentralise control of the Web, and to provide digital technology as a public utility to all netizens, without any artificial constraints or interference.

What are the implications for governments and governance?

The governance challenge consists of:

Protecting personal freedom in the digital realm
Sustainable management of limited resources in the physical realm
Integration of social and ecological concerns in the interest of the inhabitants of the biosphere

Important first steps that can be undertaken today to address the governance challenge are outlined here.

The story of life is language

Posted by Jorn Bettin

This post is a rather long story. It attempts to connect topics from a range of domains, and the insights from experts in these domains. In this story my role is mainly the one of an observer. Over the years I have worked with hundreds of domain experts, distilling the essence of deep domain knowledge into intuitive visual domain-specific languages. If anything, my work has taught me the skill to observe and to listen, and it has made me concentrate on the communication across domain boundaries – to ensure that desired intent expressed in one domain is sufficiently aligned with the interpretations performed in other domains.

The life of language and the language of life can’t be expressed in written words. Many of the links contained in this story are essential, and provide extensive background information in terms of videos (spoken language, intonation, unconscious body language, conscious gestures), and visual diagrams. To get an intuitive understanding of the significance of visual communication, once you get to the end of the story, simply imagine none of the diagrams had been included.

Drawing Hands, 1948, by the Dutch artist M. C. Escher

It may not be evident on the surface, but the story of life started with language, hundreds of millions of years ago – long before humans were around, and it will continue with language, long after humans are gone.

The famous Drawing Hands lithograph from M. C. Escher provides a very good analogy for the relationship between life and language – the two concepts are inseparable, and one recursively gives rise to the other.

At a fundamental level the language of life is encoded in a symbol system of molecular fragments and molecules – in analogy to an alphabet, words, and sentences.

The language of life

TED – Craig Venter on creating synthetic life

Over the last two decades molecular biologists and chemists have become increasingly skilled at reading the syntax of the genetic code; and more recently scientists have started to work on, and have successfully prototyped, techniques to write the syntax of the genetic code. In other words, humans now have the tools to translate bio-logical code into digital code as well as the tools to translate digital code back into bio-logical code. The difference between the language of biology and the language of digital computers is simply one of representation (symbolic representations are also called models). Unfortunately, neither the symbols used by biology (molecules), nor the symbols used by digital computers (electric charges), are directly observable via the cognitive channels available to humans.

However, half a century of software development has not only led to convoluted and unmaintainable legacy software, but also to some extremely powerful tools for translating digital representations into visual representations that are intuitive for humans to understand. We no longer need to deal with mechanical switches or punch cards, and modern user interfaces present us with highly visual information that goes far beyond the syntax of written natural language. These visualisation tools, taken together with the ability to translate bio-logical code into digital code, provide humans with a window into the fundamental language of life – much more impressive in my view than the boring magical portals dreamed up by science fiction authors.

TED – Bonnie Bassler on how bacteria communicate

The language of life is highly recursive. It turns out that even the smallest single-celled life forms have developed higher-level languages to communicate – not only within their species, but even across species. At the spatial and temporal scale that characterises the life of bacteria, the symbol system used consists of molecules. What is fascinating is that scientists have not only decoded the syntax (the density of molecular symbols surrounding the bacteria), but have also begun to decode the meaning of the language used by bacteria, for example, in the case of a pathogen, communication that signals when to attack the host.

The biological evidence clearly shows, in a growing number of well-researched examples, that the development of language does not require any “human-level” intelligence. Instead, life can be described as an ultra-large system of elements that communicate via various symbol systems. Even though the progress in terms of discovering and reading symbol systems is quite amazing, scientists are only scratching the surface in terms of understanding the meaning (the semantics) of biological symbol systems.

Language systems invented by humans

From muddling to modelling

Semantics is the most fascinating touch point between biology and the mathematics of symbol systems. In terms of recursion, mathematics seems to have found a twin in biology. Unfortunately, computer scientists, and software development practitioners in particular, for a long time have ignored the recursive aspect of formal languages. As a result, the encoding of the software that we use today is much more verbose and complex than it would need to be.

From code into the clouds

Nevertheless, over the course of a hundred years, the level of abstraction of computer programming has slowly moved upwards. The level of progress is best seen when looking at the sequence of the key milestones that have been reached to date. Not unlike in biology, more advanced languages have been built on top of simpler languages. In technical terms, the languages of biology and all languages invented by humans, from natural language to programming languages, are codes. The dictionary defines code as follows:

Code is a system of signals used to send messages
Code is a system of symbols used for the purpose of identification or classification
Code is a set of conventions governing behaviour

Sets – the foundation of biological and digital code

Mathematically, all codes can be represented with the help of sets and the technique of recursion. But, as with the lowest-level encoding of digital code in terms of electric charges, the mathematical notation for sets is highly verbose, and quickly reaches human cognitive limits.

The mathematical notation for sets predates modern computers, and was invented by those who needed to manually manipulate sets at a conceptual level, for example as part of a mathematical proof. Software programming and communication in natural language both involve so many sets that a representation in the classical mathematical notation for sets is impractical.
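
To get a feel for how quickly pure set notation becomes unwieldy, here is a minimal Python sketch. It uses the standard von Neumann encoding of natural numbers (0 is the empty set, and n + 1 is n together with the set containing n). That encoding is my choice of illustration and is not taken from this post; the function names `encode` and `render` are likewise illustrative.

```python
def encode(n: int) -> frozenset:
    """Encode n as nested sets: 0 = {}, and n + 1 = n united with {n}."""
    current = frozenset()
    for _ in range(n):
        current = current | frozenset([current])
    return current


def render(s: frozenset) -> str:
    """Write a nested set in classical brace notation, shortest members first."""
    members = sorted((render(m) for m in s), key=lambda text: (len(text), text))
    return "{" + ", ".join(members) + "}"


# Print each number, the length of its written form, and the form itself
# for as long as it stays readable.
for n in range(6):
    text = render(encode(n))
    print(n, len(text), text if n < 4 else "(too long to read comfortably)")
```

In this encoding the written form roughly doubles in length with every step, so even the number five already needs dozens of braces. The recursion is simple; the notation is what reaches human cognitive limits.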

The importance of high-quality representation of symbols is often under-rated. A few thousand years ago humans realised the limitation of encoding language in sounds, and invented written language. The notation of written language minimises syntactical errors, and, in contrast to spoken language, allows reliable communication of sequences of words across large distances in space and time.

The challenge of semantics

The impossibility of communicating desired intent

Software development professionals are becoming increasingly aware of the importance of notation, but interpretation (inferring the semantics of a message) remains an ongoing challenge. Adults and even young children, once they have developed a theory of mind, know that others may sometimes interpret their messages in a surprising way. It is somewhat less obvious that all sensory input received by the human brain is subject to interpretation, and that our own perception of reality is limited to an interpretation.

The curse of software maintenance

Interpretation is not only a challenge in communication between humans, it is as much a challenge for communication between humans and software systems. Every software developer knows that it is humanly impossible to write several hundred lines of non-trivial program code without introducing unintended “errors” that will lead to a non-expected interpretation by the machine. Still, writing new software requires much less effort than understanding and changing existing software. Even expert programmers require large amounts of time to understand software written by others.

The challenge of digital waste

We have only embarked down the road of significant dematerialisation of artefacts in the last few years, but I am somewhat concerned about the semantic value of many of the digital artefacts that are now being produced at a mind-boggling rate. I am coming to think of it as digital waste – worse than noise. The waste includes the time spent producing and consuming artefacts and the associated use of energy.

Sharpening your collaborative edge

Of particular concern is the production of meta-artefacts (for example the tools we use to produce digital artefacts, and higher-level meta-tools). The user interfaces of Facebook, Google+ and other tools look reasonable at a superficial level; just don’t look under the hood. As a result, we produce the digital equivalent of the Pacific Garbage Patch. Blinded by shiny new interfaces, humanity sees the digital ocean as infinite, and embarks on yet another conquest …

Today’s collaboration platforms not only rely on a central point of control, they are also ill-equipped for capturing deep knowledge and wisdom – there is no semantic foundation, and the tools are very limited in their ability to facilitate a shared understanding within a community. The ability to create digital artefacts is not enough; we need the ability to create semantic artefacts in order to share meaningful information.

How does life (the biological system of the planet) collectively interpret human activities?

TED – Naomi Klein : Addicted to risk

As humans we are limited to the human perspective, and we are largely unaware of the impact of our ultra-large scale chemical activities on the languages used by other species. If biologists have only recently discovered that bacteria heavily rely on chemical communication, how many millions of other chemical languages are we still completely unaware of? And what is the impact of disrupting chemical communication channels?

Scientists may have the best intentions, but their conclusions are limited to the knowledge available to them. To avoid potentially fatal mistakes and misunderstandings, it is worthwhile to tread carefully, and to invest in better listening skills. Instead of deafening the planet with human-made chemicals, how about focusing our energies on listening to, and attempting to understand, the trillions of conversations going on in the biosphere?

Gmodel – The Semantic Database

At the same time, we can work on the development of symbolic codes that are superior to natural language for sharing semantics, so that it becomes easier to reach a shared understanding across the boundaries of the specialised domains we work in. We now have the technology to reduce semantic communication errors (the difference between intent and interpretation) to an extent that is comparable to the reduction of syntactic communication errors achieved with written language. If we continue to rely too heavily on natural language, we are running a significant risk of ending the existence of humanity due to a misunderstanding.

Life is language

Life and languages continuously evolve, whether we like it or not. Life shapes us, and we attempt to shape life. We are part of a dynamic system with increasingly fast feedback loops.

Life interprets languages, and languages interpret life.

Language is life.

Software evolves like culture, like language, like genes

Posted by Jorn Bettin

Software continuously evolves, whether we like it or not. Software shapes us, and we attempt to shape software, as part of a dynamic system with increasingly fast feedback loops. Today The Australian covers two interesting complementary topics relating to software:

1. Cloud computing round table with six of Australia’s top CIOs

If you take the time to listen to the conversation, the following concepts stick out: social, sharing, digital artefacts, digital natives, trust, privacy, security, mobile, risks, transactions, insurance; and also: simplification, modularity, standardisation, outsourcing, lock-in, low cost, and scalability.

Quite a lot of concepts, hopes, expectations – all looking forward to systems that are easier and more convenient to use. And yet, a look into the bowels of any software-intensive business reveals a different here and now, characterised by a range of systems that vary in age from less than a year to more than four decades, and …

an explosion of standards (1.1MB pdf);

… strong coupling within and between systems (the pictures below are the result of tool-based analysis of several millions of lines of production-grade software code);

The complexity inherent in large software artefacts

… and a shift in effort and costs from software creation to software maintenance that has caught many organisations by surprise (from Capers Jones, The economics of software maintenance in the twenty first century, February 2006).

Focus of software development professionals in the US, and percentage of software professionals as part of the total US population

The statistics shouldn’t really be a surprise, at least not if software is understood for what it really is: a culture, a language, a pool of genes.

Big changes to software are comparable to changes in culture, language, and genes; they require interactions between many elements, they involve unpredictable results, and they cannot be achieved with brute force – big changes take generations, literally. Which brings us to the second topic mentioned in The Australian today:

2. A pair of articles on the longevity of legacy software

It is important for humans to learn to live in a plurality of software cultures, and to realise that embracing a new software culture is different from buying a new car. An old car is easily sold and forgotten, but old software culture stays around alongside the new arrivals.

No one is in control, mistakes happen on this planet

Posted by Jorn Bettin

As humans we heavily rely on intuition and on our personal mental models for making many millions of subconscious decisions and a much smaller number of conscious decisions on a daily basis. All these decisions involve interpretations of our prior experience and the sensory input we receive. It is only in hindsight that we can realise our mistakes. Learning from mistakes involves updating our mental models, and we need to get better at it, not only personally, but as a society:

The mathematical foundation of being wrong
The direct implications for human interactions (external perspective)
The emotional impact (a wonderful talk on the internal perspective)
The lessons for creating new lines of business (the commercial perspective that applies to all start-ups)

Whilst we will continue to interact heavily with humans, we increasingly interact with the web – and all our interactions are subject to the well-known problems of communication. One of the more profound characteristics of ultra-large-scale systems is the way in which the impact of unintended or unforeseen behaviours propagates through the system.

The most familiar example is that of software viruses, which have spawned an entire industry. Just as in biology, viruses will never completely go away. It is an ongoing fight of empirical knowledge against undesirable pathogens that is unlikely to ever end, because both opponents are evolving their knowledge after each new encounter based on the experience gained.

Similar to viruses, there are many other unintended or unforeseen behaviours that propagate through ultra-large-scale systems. Only on some occasions do these behaviours result in immediate outages or misbehaviours that are easily observable by humans.

Sometimes it can take hours, weeks, or months for downstream effects to aggregate to the point where some component generates an explicit error and a human observer is alerted. In many cases it is not possible to trace the root cause or causes, and the so-called fix consists of correcting the visible part of the downstream damage.

Take the recent tsunami and the destroyed nuclear reactors in Japan. How far is it humanly and economically possible to fix the root causes? Globally, many nuclear reactor designs have weaknesses. What trade-off between risk levels (also including a contingency for risks that no one is currently aware of) and the cost of electricity are we prepared to make?

Addressing local sources of events that lead to easily and immediately observable error conditions is a drop in the bucket of potential sources of serious errors. Yet this is the usual limit of scope that organisations apply to quality assurance, disaster recovery etc.

The difference between the web and a living system is fading, and our understanding of the system is limited, to say the least. A sensible approach to failures and system errors is increasingly comparable to the one used in medicine to fight diseases – the process of finding out what helps is empirical, and all new treatments are tested for unintended side-effects over an extended period of time. Still, all the tests only lead to statistical data and interpretations, not to absolute guarantees. In the life sciences no honest scientist can claim to be in full control. In fact, no one is in full control, and it is clear that no one will ever be in full control.

Traditional management practices strive to avoid any semblance of “not being in full control”. Organisations that are ready to admit that they operate within the context of an ultra-large-scale system have a choice between:

conceding they have lost control internally, because their internal systems are so complex, or
regaining a degree of internal understandability by simplifying internal structures and systems, enabled by shifting to the use of external web services – which also does not establish full control.

Conceding the unavoidable loss of control, or being prepared to pay extensively for effective risk reduction measures (one or two orders of magnitude in cost) amounts to political suicide in most organisations.

The impossibility of communicating desired intent

Posted by Jorn Bettin

Communication relies on interpretation of the message by the recipient

Communication of desired intent can never be fully achieved. It would require a mind-meld between two individuals or between an individual and a machine.

The meaning (the semantics) propagated in a codified message is determined by the interpretation of the recipient, and not by the desired intent of the sender.

In the example on the right, the tree envisaged in the mind of the sender is not exactly the same as the tree resulting from the interpretation of the decoded message by the recipient.
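
The same effect is easy to reproduce with machines. The following minimal Python sketch is an illustration of my own and is not taken from the tree example: a hypothetical sender writes a date day-first, and a hypothetical recipient decodes the identical characters month-first. The message arrives intact, yet the meaning differs.

```python
from datetime import date, datetime

# Hypothetical conventions, chosen for illustration only.
SENDER_FORMAT = "%d/%m/%Y"     # the sender thinks day/month/year
RECIPIENT_FORMAT = "%m/%d/%Y"  # the recipient assumes month/day/year


def encode(intended: date) -> str:
    """Codify the sender's intent as a text message."""
    return intended.strftime(SENDER_FORMAT)


def interpret(message: str) -> date:
    """Decode the message using the recipient's own convention."""
    return datetime.strptime(message, RECIPIENT_FORMAT).date()


intent = date(2011, 3, 4)            # 4 March 2011 in the sender's mind
message = encode(intent)             # the codified message
interpretation = interpret(message)  # what the recipient understands

print("message:       ", message)
print("intent:        ", intent.isoformat())
print("interpretation:", interpretation.isoformat())
```

No transmission error occurs at any point. The mismatch lies entirely in the recipient's interpretation, which is why checking the syntax of a message says little about whether its meaning was shared.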

To understand the practical ramifications of interpretation, consider the following realistic example of communication in natural language between an analyst, a journalist, and a newspaper reader:

Communication of desired intent and interpretation

1. intent

Reiterate that recurring system outages at the big four banks are to be expected for at least 10 years whilst legacy systems are incrementally replaced
Indicate that an unpredictable and disruptive change will likely affect the landscape in banking within the next 15 years
Explain that similarly, 15 years ago, no one was able to predict that a large percentage of the population would be using Gmail from Google for email
Suggest that overseas providers of banking software or financial services may be part of the change and may compete against local banks
Indicate that local banks would find it hard to offer robust systems unless they each doubled or tripled their IT upgrade investments

2. interpretation

Bank customers must brace themselves for up to 15 years of pain
The big four banks would take 10 years to upgrade their systems and another five to stabilise those platforms
Local banks would struggle to compete against newer and nimbler rivals, which could sweep into Australia and compete against them
Local banks would find it hard to offer robust systems unless they each doubled or tripled their IT upgrade investments

3. intent (extrapolated from the differences between 1. and 2.)

Use words and numbers that maximise the period during which banking system outages are to be expected
Emphasise the potential threats to local banks and ignore irrelevant context information

4. interpretation

The various mental models that are constructed in the minds of readers who are unaware of 1.

Adults and even young children (once they have developed a theory of mind) know that others may sometimes interpret their messages in a surprising way. It is somewhat less obvious to realise that all sensory input received by the human brain is subject to interpretation, and that our own perception of reality is limited to an interpretation.

Next, consider an example of communication between a software user, a software developer (coder), and a machine, which involves both natural language and one or more computer programming languages:

Communication of desired intent including interpretation by a machine

1. intent

Request a system that is more reliable than the existing one
Simplify a number of unnecessarily complex workflows by automation
Ensure that all of the existing functionality is also available in the new system

2. interpretation

Redevelop the system in newer and more familiar technologies that offer a number of technical advantages
Develop a new user interface with a simplified screen and interaction design
Continue to allow use of the old system and provide back-end integration between the two systems

3. intent

Copy code patterns from another project that used some of the same technologies to avoid surprises
Deliver working user interface functionality as early as possible to validate the design with users
In the first iterations of the project continue to use the existing back-end, with a view to redeveloping the back-end at a later stage

4a. interpretation (version deployed into test environment)

Occasional run-time errors caused by subtle differences in the versions of the technologies used in this project and the project from which the code patterns were copied
Missing input validation constraints, resulting in some operational data that is considered illegal when accessed via the old system
Occurrences of previously unencountered back-end errors due to the processing of illegal data

4b. interpretation (version deployed into production environment)

Most run-time errors caused by subtle differences in the versions of the technologies have been resolved
Since no one fully understands all the validation constraints imposed by the old system (or since some constraints are now deemed obsolete), the back-end system has been modified to accept all operational data received via the new user interface
The back-end system no longer causes run-time errors but produces results (price calculations etc.) that in some cases deviate from the results produced by the old version of the back-end system

In the example above it is likely that not only the intent in step 3. but also the intent in step 1. is codified in writing. The messages in step 1. are codified in natural language, and the messages in step 3. are codified in programming languages. Written codification in no way reduces the risk of interpretations that deviate from the desired intent. In any non-trivial system the interpretation of a specific message may depend on the context, and the same message in a different context may result in a different interpretation.

Every software developer knows that it is humanly impossible to write several hundred lines of non-trivial program code without introducing unintended “errors” that will lead to a non-expected interpretation by the machine. Humans are even quite unreliable at simple data entry tasks. Hence the need for extensive input data validation checks in software that directly alert the user to data that is inconsistent with what the system interprets as legal input.
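
As a minimal sketch of such a check, the Python snippet below validates a single order record before it is passed on to a back-end. The field names (`quantity`, `unit_price`) and the rules are hypothetical; they stand in for whatever the system at hand interprets as legal input.

```python
import math


def validation_errors(raw: dict) -> list:
    """Return human-readable problems; an empty list means the input is legal."""
    errors = []

    # Hypothetical rule: quantity is a whole number greater than zero.
    try:
        if int(raw.get("quantity", "")) <= 0:
            errors.append("quantity must be greater than zero")
    except (TypeError, ValueError):
        errors.append("quantity must be a whole number")

    # Hypothetical rule: unit_price is a finite, non-negative number.
    try:
        price = float(raw.get("unit_price", ""))
        if not math.isfinite(price) or price < 0:
            errors.append("unit_price must be a finite, non-negative number")
    except (TypeError, ValueError):
        errors.append("unit_price must be a number")

    return errors


# Report every problem to the user at once instead of failing downstream.
for problem in validation_errors({"quantity": "three", "unit_price": "-1"}):
    print("Please correct:", problem)
```

Constraints like these are exactly what went missing in step 4a of the example above, where the new user interface let through data that the old system considered illegal.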

There is no justification whatsoever to believe that the risks of mismatches between desired intent and interpretation are any less in the communication between user and software developer than in the communication between software developer and machine. Yet, somewhat surprisingly, many software development initiatives are planned and executed as if there is only a very remote chance of communication errors between users and software developers (coders).

In a nutshell, the entire agile manifesto for software development boils down to the recognition that communication errors are an unavoidable part of life, and for the most part, they occur despite the best efforts and intentions from all sides. In other words, the agile manifesto is simply an appeal to stop the highly wasteful blame culture that saps time, energy and money from all parties involved.

The big problem with most interpretations of the agile manifesto is the assumption that it is productive for a software developer to directly translate the interpretation 2. of desired user intent 1. into an intent 3. expressed in a general purpose linear text-based programming language. This assumption is counter-productive since such a translation bridges a very large gap between user-level concepts and programming-language-level concepts. The semantic identities of user-level concepts contained in 1. end up being fragmented and scattered across a large set of programming-language-level concepts, which gets in the way of creating a shared understanding between users and software developers.

In contrast, if the software developer employs a user-level graphical domain-specific modelling notation, there is a one-to-one correspondence between the concepts in 1. and the concepts in 3., which greatly facilitates a shared understanding – or avoidance of a significant mismatch between the desired intent of the user 1. and the interpretation by the software developer 2. The domain-specific modelling notation provides the software developer with a codification 3. of 1. that can be discussed with users and that simultaneously is easily processable by a machine. In this context the software developer takes on the role of an analyst who formalises the domain-specific semantics that are hidden in the natural language used to express 1.

Jorn Bettin

Knowledge archaeologist by day and neurodivergent anthropologist by night
