Software evolves like culture, like language, like genes

Software continuously evolves, whether we like it or not. Software shapes us and we attempt to shape software, as part of a dynamic system with increasingly fast feedback loops. Today The Australian covers two interesting and complementary topics relating to software:

1. Cloud computing round table with six of Australia’s top CIOs

If you take the time to listen to the conversation, the following concepts stick out: social, sharing, digital artefacts, digital natives, trust, privacy, security, mobile, risks, transactions, insurance; and also: simplification, modularity, standardisation, outsourcing, lock-in, low cost, and scalability.

  1. VIDEO: Cloud computing roundtable part one
  2. VIDEO: Cloud computing roundtable part two
  3. VIDEO: Cloud computing roundtable part three
  4. VIDEO: Cloud computing roundtable part four
  5. VIDEO: Cloud computing roundtable part five

Quite a lot of concepts, hopes, expectations – all looking forward to systems that are easier and more convenient to use. And yet, a look into the bowels of any software-intensive business reveals a different here and now, characterised by a range of systems that vary in age from less than a year to more than four decades, and …


an explosion of standards (1.1MB pdf);


… strong coupling within and between systems (the pictures below are the result of tool-based analysis of several million lines of production-grade software code; a toy sketch of this kind of dependency analysis follows the figures below);

The complexity inherent in large software artefacts

… and a shift in effort and costs from software creation to software maintenance that has caught many organisations by surprise (from Capers Jones, The economics of software maintenance in the twenty first century, February 2006).

Focus of software development professionals in the US, and percentage of software professionals within the total US population
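To make the phrase "tool-based analysis" a little more concrete, here is a toy sketch of dependency extraction – my own illustration, not the tooling behind the pictures above, and limited to Python's standard library. It builds an import graph for the Python files under a directory and reports fan-out (how many modules a file depends on) and fan-in (how many files depend on it), which is the raw material for coupling pictures of the kind shown above.

```python
# Toy illustration of dependency extraction: build an import graph for the
# Python files under a directory and report fan-out and fan-in per module.
# This is only a sketch of the principle; the coupling pictures above come
# from far more sophisticated, multi-language analysis tools.
import ast
from collections import defaultdict
from pathlib import Path


def import_graph(root: str) -> dict[str, set[str]]:
    """Map each module (by file stem) to the top-level modules it imports."""
    graph: dict[str, set[str]] = defaultdict(set)
    for path in Path(root).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                graph[path.stem].update(a.name.split(".")[0] for a in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                graph[path.stem].add(node.module.split(".")[0])
    return graph


def coupling_report(graph: dict[str, set[str]]) -> None:
    """Print efferent (fan-out) and afferent (fan-in) coupling per module."""
    fan_in: dict[str, int] = defaultdict(int)
    for src, targets in graph.items():
        for dst in targets:
            fan_in[dst] += 1
    for module in sorted(graph):
        print(f"{module}: fan-out={len(graph[module])}, fan-in={fan_in[module]}")


if __name__ == "__main__":
    coupling_report(import_graph("."))  # point at any Python code base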

The maintenance statistics shouldn’t really be a surprise, at least not if software is understood for what it really is: a culture, a language, a pool of genes.

Big changes to software are comparable to changes in culture, language, and genes; they require interactions between many elements, they involve unpredictable results, and they cannot be achieved with brute force – big changes take generations, literally. Which brings us to the second topic mentioned in The Australian today:

2. A pair of articles on the longevity of legacy software

  1. Old mainframe systems not extinct
  2. Demand for mainframe language skills remains strong

It is important for humans to learn to live in a plurality of software cultures, and to realise that embracing a new software culture is different from buying a new car. An old car is easily sold and forgotten, but an old software culture stays around alongside the new arrivals.

Poll on current priorities of IT organisations in the financial sector

As part of research on the banking sector, I have set up a poll on LinkedIn on the following question:

Which of the following objectives is currently the most relevant for IT organisations in the financial sector?

  • Improving software and data quality
  • Outsourcing new application development
  • Outsourcing legacy software maintenance
  • Improving time to market of new products
  • Reducing IT costs

The poll is intended as a simple pulse-check on IT in banking, and I’ll make the results available on this blog.

Please contribute here on LinkedIn, in particular if you work in banking or are engaged in IT projects for a financial institution. Additional observations and comments are welcome, for example insights relating to banks in a particular country or geography.

No one is in control, mistakes happen on this planet

As humans we heavily rely on intuition and on our personal mental models for making many millions of subconscious decisions and a much smaller number of conscious decisions on a daily basis. All these decisions involve interpretations of our prior experience and the sensory input we receive. It is only in hindsight that we can realise our mistakes. Learning from mistakes involves updating our mental models, and we need to get better at it, not only personally, but as a society:

Whilst we will continue to interact heavily with humans, we increasingly interact with the web – and all our interactions are subject to the well-known problems of communication. One of the more profound characteristics of ultra-large-scale systems is the way in which the impact of unintended or unforeseen behaviours propagates through the system.

The most familiar example is that of software viruses, which have spawned an entire industry. Just as in biology, viruses will never completely go away. The fight of empirical knowledge against undesirable pathogens is unlikely to ever end, because both opponents update their knowledge after each new encounter, based on the experience gained.

Similar to viruses, there are many other unintended or unforeseen behaviours that propagate through ultra-large-scale systems. Only on some occasions do these behaviours result in immediate outages or misbehaviours that are easily observable by humans.

Sometimes it can take hours, weeks, or months for downstream effects to aggregate to the point where some component generates an explicit error and a human observer is alerted. In many cases it is not possible to trace down the root cause or causes, and the so-called fix consists in correcting the visible part of the downstream damage.
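As a contrived illustration of this kind of propagation (my example, not one from the article): in the toy Python pipeline below, the root cause is an ingestion step that silently coerces missing readings to zero. The corrupted values pass through an aggregation step unnoticed, and an explicit error only appears in a downstream billing step, far from where the damage originated.

```python
# Contrived example: the root cause (silent mishandling of missing data)
# only surfaces much later, in a downstream component.

def ingest(raw_records):
    # Root cause: missing readings are silently coerced to 0.0 instead of
    # being rejected or flagged for investigation.
    return [float(r) if r not in (None, "") else 0.0 for r in raw_records]

def aggregate(values):
    # Propagation: the corrupted zeros are quietly blended into an average.
    return sum(values) / len(values)

def bill_customer(average_usage):
    # Only here does an explicit error appear – and the "fix" is typically
    # applied at this point rather than at the root cause upstream.
    if average_usage <= 0:
        raise ValueError("implausible usage figure reached billing")
    return round(average_usage * 0.30, 2)

readings = [None, "", None, ""]   # an upstream feed has silently failed
print(bill_customer(aggregate(ingest(readings))))   # the error surfaces here
```

In a real ultra-large-scale system the same pattern plays out across many components, organisational boundaries, and much longer time scales, which is what makes tracing the root cause so difficult.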

Take the recent tsunami and the destroyed nuclear reactors in Japan. How far is it humanly and economically possible to fix the root causes? Globally, many nuclear reactor designs have weaknesses. What trade-off between risk levels (also including a contingency for risks that no one is currently aware of) and the cost of electricity are we prepared to make?

Addressing the local sources of events that lead to easily and immediately observable error conditions covers only a small fraction of the potential sources of serious errors. Yet this is the usual limit of the scope that organisations apply to quality assurance, disaster recovery, and so on.

The difference between the web and a living system is fading, and our understanding of the system is limited, to say the least. A sensible approach to failures and system errors is increasingly comparable to the one used in medicine to fight diseases – the process of finding out what helps is empirical, and all new treatments are tested for unintended side-effects over an extended period of time. Still, all the tests only lead to statistical data and interpretations, not absolute guarantees. In the life sciences no honest scientist can claim to be in full control. In fact, no one is in full control, and it is clear that no one ever will be.

Traditional management practices strive to avoid any semblance of “not being in full control”. Organisations that are ready to admit that they operate within the context of an ultra-large-scale system have a choice between:

  • conceding they have lost control internally, because their internal systems are so complex, or
  • regaining a degree of internal understandability by simplifying internal structures and systems, enabled by shifting to the use of external web services – which also does not establish full control.

Conceding the unavoidable loss of control, or being prepared to pay extensively for effective risk reduction measures (one to two orders of magnitude more in cost), amounts to political suicide in most organisations.