Friday, November 09, 2018

The Phi Value of Integrated Information Theory Might Not Be Stable Across Small Changes in Neural Connectivity

In learning and in forgetting, the amount of connectivity between your neurons changes. Throughout your life, neurons die and grow. Through all of this, the total amount of conscious experience you have, at least in your alert, attentive moments, seems to stay roughly the same. You don't lose a few neural connections and with them 80% of your consciousness. The richness of our stream of experience is stable across small variations in the connectivity of our neurons -- or so, at least, it is plausible to think.

One of the best known theories of consciousness, Integrated Information Theory, purports to model how much consciousness a neural system has by means of a value, Φ (phi), a mathematically complicated measure of how much "integrated information" the system possesses. The higher the Φ, the richer the conscious experience; the lower the Φ, the thinner the experience. Integrated Information Theory is subject to some worrying objections (and here's an objection by me, which I invite you also to regard as worrying). Today I want to highlight a concern different from these: the apparent failure of Φ to be robust to small changes in connectivity.

The Φ of any particular informational network is difficult to calculate, but the IIT website provides a useful tool. You can play around with networked systems of 4, 5, or 6 nodes (above 6, the computation time to calculate Φ becomes excessive). Prefab systems are available to download, with Φ values ranging from less than 1 to over 15. It's fun!
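
If you would rather script this than click around, the same sort of computation can be run with PyPhi, the open-source Python package released by the IIT group (as far as I know, the web tool performs essentially the same calculation). Here is a minimal sketch for a toy three-node network; the particular gates, the chosen state, and the little-endian row ordering of the transition matrix are my choices, following PyPhi's documented conventions as I understand them.

    import numpy as np
    import pyphi  # reference implementation of IIT 3.0 (pip install pyphi)

    # Toy deterministic network: A = OR(B, C), B = AND(A, C), C = XOR(A, B).
    def next_state(state):
        a, b, c = state
        return (int(b or c), int(a and c), a ^ b)

    # State-by-node transition probability matrix, rows ordered little-endian
    # (node 0 is the least significant bit), as PyPhi expects by default.
    n = 3
    tpm = np.array([next_state(tuple((i >> j) & 1 for j in range(n)))
                    for i in range(2 ** n)])

    # Connectivity matrix: cm[i][j] = 1 iff node i feeds into node j.
    cm = np.array([[0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 0]])

    network = pyphi.Network(tpm, cm=cm)
    state = (1, 0, 0)  # current on/off values of A, B, C
    subsystem = pyphi.Subsystem(network, state, (0, 1, 2))
    print(pyphi.compute.phi(subsystem))  # the system's "big Phi"

The combinatorial search over mechanisms and partitions is what makes the calculation blow up as nodes are added, which is presumably why the web tool stops at six.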

But there are two things you might notice, once you play around with the tool for a while:

First, it's somewhat hard to create systems with Φ values much above 1. Slap 5 nodes together and connect them any which way, and you're likely to get a Φ value between 0 and 1.

Second, if you tweak the connections of the relatively high-Φ systems, even just a little, or if you change a logical operator from one operation to another (e.g., XOR to AND), you're likely to cut the Φ value by at least half. In other words, the Φ value of these systems is not robust across small changes.

To explore the second point more systematically, I downloaded the "IIT 3.0 Paper Fig. 17 Specialized majority" network, which, when all 5 nodes are lit, has a Φ value of 10.7. (A node's being "lit" means it has a starting value of "on" rather than "off".) I then tweaked the network in every way that it was possible to tweak it by changing exactly one feature. (Due to the symmetry of the network, this was less laborious than it sounds.) Turning off any one node reduces Φ to 2.2. Deleting any one node reduces Φ to 1. Deleting one connection, altering its direction (if unidirectional), or changing it from unidirectional to bidirectional or vice versa, always reduces the system's Φ to a value ranging from 2.6 to 4.8. Changing the logic function of one node has effects that are sometimes minor and sometimes large: Changing any one node from MAJ to NOR reduces Φ all the way down to 0.4, while changing any one node to MIN increases Φ to 13.0. Overall, most ways of introducing one minimal perturbation into the system reduce Φ by at least half, and some reduce it by over 90%.
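
For anyone who wants to run this kind of sweep programmatically rather than by hand in the web interface, here is a rough sketch of the simplest perturbation, flipping each node's starting value one at a time and recomputing Φ. The helper functions are mine; to reproduce the numbers above you would load the prefab network's own transition and connectivity matrices rather than the toy network from the earlier sketch.

    import pyphi

    def phi_of(network, state, n):
        """Big Phi of the whole n-node system in the given state."""
        subsystem = pyphi.Subsystem(network, state, tuple(range(n)))
        return pyphi.compute.phi(subsystem)

    def single_flip_phis(network, state, n):
        """Flip each node's initial value in turn and recompute Phi."""
        results = {}
        for i in range(n):
            flipped = tuple(1 - s if j == i else s for j, s in enumerate(state))
            try:
                results[i] = phi_of(network, flipped, n)
            except Exception:
                # PyPhi may reject a flipped state that is unreachable under
                # the network's own dynamics; record those as None.
                results[i] = None
        return results

    # Example, using the 3-node toy network built in the earlier sketch:
    # print(phi_of(network, (1, 0, 0), 3))
    # print(single_flip_phis(network, (1, 0, 0), 3))

Deleting nodes, rerouting connections, and swapping logic gates can be scripted in the same spirit, by editing the connectivity matrix and rebuilding the transition matrix before recomputing Φ.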

To confirm that the "Specialized majority" network was not unusual in this respect, I attempted a similar systematic one-feature tweaking of "CA Paper Fig 3d, Rule 90, 5 nodes". The 5-node Rule 90 network, with all nodes in the default unlit configuration, has a Φ of 15.2. The results of perturbation are similar to the results for the "Specialized majority" network. Light any one node of the Rule 90 network and Φ falls to 1.8. Delete any one arrow and Φ also falls to 1.8. Change any one arrow from bidirectional to unidirectional and Φ falls to 4.8. Change the logic of one node and Φ ranges anywhere from a low of 1.8 (RAND, PAR, and >2) to a high of 19.2 (OR).
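
For concreteness, here is how a network of that kind can be reconstructed in PyPhi, on the assumption (mine; the prefab's exact wiring may differ in detail) that the five nodes sit on a ring and each node's next value is the XOR of its two neighbors, which is what Rule 90 amounts to on a periodic five-cell lattice.

    import numpy as np
    import pyphi

    n = 5  # five cells on a ring

    # Rule 90: each cell's next state is the XOR of its two neighbors.
    def rule90_next(state):
        return tuple(state[(i - 1) % n] ^ state[(i + 1) % n] for i in range(n))

    # State-by-node TPM, rows in little-endian order as PyPhi expects.
    tpm = np.array([rule90_next(tuple((i >> j) & 1 for j in range(n)))
                    for i in range(2 ** n)])

    # Each node is fed by its two ring neighbors (bidirectional arrows).
    cm = np.zeros((n, n), dtype=int)
    for i in range(n):
        cm[(i - 1) % n][i] = 1
        cm[(i + 1) % n][i] = 1

    network = pyphi.Network(tpm, cm=cm)
    state = (0, 0, 0, 0, 0)  # the all-unlit configuration discussed above
    subsystem = pyphi.Subsystem(network, state, tuple(range(n)))
    print(pyphi.compute.phi(subsystem))  # slow: five nodes already takes a while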

These two examples, plus what I've seen in my unsystematic tweaking of other prefab networks, plus my observations about the difficulty of casually constructing a five-node system with Φ much over 1, suggest that, in five-node systems at least, a high Φ value requires a highly specific structure whose Φ is not robust to minor perturbations. Small tweaks can easily reduce Φ by half or more.

It would be bad for Integrated Information Theory, as a theory of consciousness, if this high degree of instability in systems with high Φ values scales up to large systems, like the brain. The loss of a few neural connections shouldn't make a human being's Φ value crash down by half or more. Our brains are more robust than that. And yet I'm not sure that we should be confident that the mathematics of Φ has the requisite stability in large, high-Φ systems. In the small networks we can measure, at least, it is highly unstable.

ETA November 10:

Several people have suggested to me that Phi will be more stable to small perturbations as the size of the network increases. I could see how that might be the case (which is why I phrased the concluding paragraph as a worry rather than as a positive claim). If Phi, like entropy, depended in some straightforward way on the small contributions of many elements, that would likely be so. But the mathematics of Phi relies heavily on discontinuities and threshold concepts. I exploit this fact in my earlier critique of the Exclusion Postulate, in which I show that a very small change in the environment of a system, without any change interior to the system, could cause that system to fall instantly from arbitrarily high Phi to zero.

If anyone knows of a rigorous, rather than handwavy, attempt to show that Phi in large systems is stable over minor perturbations, I would be grateful if you pointed it out!

14 comments:

David Duffy said...

I would expect small systems to exhibit exactly this kind of behaviour, given that the various measures of integration quantify the degradation of the system arising from (minimally) partitioning it.

You may have read Engel and Malone [2018]
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0205335
where phi correlates with a group's collective intelligence.

Jim Cross said...

I am not a fan of IIT, but the instability of phi could be mitigated in actual living systems with redundant and/or alternative connections.

Also, instability might be a feature, not a bug. Several people have pointed out that the brain operates close to chaos. This near-chaos may be critical to the potential for novelty in neural circuits.

SelfAwarePatterns said...

I'm with Jim in not being a fan of IIT. Rather than an actual scientific theory of how consciousness works, it strikes me as an attempt to figure out the recipe for a ghost in the machine. Since I don't see any evidence for a ghost, naturalistic or otherwise, theorizing about how it forms strikes me as pointless. Not that information integration isn't crucial; it's just not sufficient.

All that said, given that last point, I do think phi, or something like it, could conceivably be useful in determining whether a particular brain is in a conscious state. But it's just a measure of how much a particular system is currently integrating information, not of whether that system feels, has memory, perception, attention, imagination, self-awareness, or any of the other qualities we usually associate with consciousness.

Anonymous said...

IIT, with its underlying mathematical architecture, is a quintessential attempt at expressing a Phi "Value" using discrete, binary systems. Value is linear in nature; therefore, within any hierarchy, Value comes first. Discrete systems of thought are problematic because they do not correspond to Value; they seek to suppress Value and even exclude it.

In my current models, the Value Axiom corresponds to what we know about quantum systems and the information contained within them. Quantum systems underwrite the classical world and quantum information differs strongly from classical information which is epitomized by the bit, which is a binary, discrete form of information. A unit of quantum information is known as a qubit. A qubit is the smallest possible unit of quantum information. Unlike classical information which is binary and discrete, a qubit is characterized as continuous-valued. It is because of this property of being continuous-valued that it becomes impossible to measure value precisely. This qualitative property or Value corresponds to the ubiquitous nature of Value which is linear and can never be contained within discrete, binary forms of expression. Fundamentally, as an Objective Reality, Value can only be experienced or expressed.

Eric Schwitzgebel said...

Thanks for the comments, folks! I've got lots of feedback from various sources, which I'll need to digest more fully after the weekend. Back soon!

David Duffy said...

https://arxiv.org/abs/1806.09373
They tweak 2 parameters in a 2-node autoregressive model and measure 8 different types of integrated information measure (Fig 3). The different measures, at least, are all over the place.

There are a few recent papers on IIT for large homogeneous networks on the arXiv.

Re SAP's comments, even if IIT has absolutely nothing to do with consciousness, there is still plenty of interest in it as a measure of network and causal complexity and in how it applies to autopoietic entities.

David Duffy said...

Further to small networks: my understanding of these matters is quite shallow, but if you take the Causal Density measure of integration (see the Mediano paper cited above) to suggest a straightforward relationship to transfer entropy, then
https://arxiv.org/abs/1803.04217
shows a large spread of integration measure values across all possible 3-node setups.

Selection has pruned out the less useful ones in this context of gene regulatory networks. I am partial to the enactivist/Fristonian idea that this extends up to nervous systems too.

Eric Schwitzgebel said...

Thanks, David -- that's very helpful!

Lee: I wonder how much discreteness vs continuity matters at the quantum level. Discreteness at a fine enough grain might be practically similar to continuity, yes?

Jim and SelfAware: Redundancy will in some situations have a very negative impact on Phi, so it's not clear how the math of redundancy will work out for high-Phi networks, I think. Also, I like the idea of "information integration", but I do wonder whether Phi as officially calculated captures what we might want a measure of information integration to capture.

Anonymous said...

Eric,

Discreteness at a fine enough grain can certainly be utilized to “model” the continuity of a linear system, albeit those discrete systems will be constrained by the inherent nature of their discreteness. Nevertheless, one is still confronted by the question of hierarchy. What comes first, a linear system or a discrete system? I think that question can be answered easily enough by definition; a discrete system has determinate boundaries which assert a beginning and an end, whereas a linear system is an indeterminate continuum asserting no beginning or end.

A discrete system will not accommodate a linear system; discreteness seeks to suppress linear systems and even exclude them. Whereas a linear system will accommodate and include discrete systems. In conclusion: a linear system at the quantum level, one that is continuous-valued, underwrites and will accommodate discrete systems at the classical level. So yes, discreteness vs continuity does matter at the quantum level, because the architecture of discreteness that we observe and experience within our phenomenal world corresponds to a linear continuum which underwrites that phenomenal realm of discreteness.

Anonymous said...

Referring to the previous post I have one final comment: "A discrete system will not accommodate a linear system, discreteness seeks to suppress linear systems and even exclude them." It should be noted that the true system, the real system, and the one system which we should be the most concerned with is our present construction of systematic thought itself, rationality itself. For reasoning and rationality is a discrete, binary system, bringing with it the inherent limitations built into the architecture of that model, including the intrinsic predisposition to reject linear systems such as Value.

Thanks......

Eric Schwitzgebel said...

Thanks for your continuing comments, Lee!

I agree that a continuous ("linear") system can in a certain sense contain and model a discrete system while the reverse is not true. I'm not convinced, though, that we can know -- or that it matters for anything we care about -- whether reality at the tiniest microscopic level is discrete or continuous. It seems to me that both reasoning and emotional feeling can be implemented by either sort of system. I guess I'm still not sure why that wouldn't be so.

Anonymous said...

I am in agreement with your brief assessment Eric; and yes, only a noumenalist would even dare to care. As a science, theoretical physicists and philosophers might care about an accurate description of the structure of reality, nevertheless, that goal will never be achieved because our current construction of systematic thought itself is a discrete, binary system; and that system does not and will not accommodate a linear system.

I'm hedging my bets that the noumenal realm is a linear system of some kind, and since that system underwrites our phenomenal reality, that system will be residing in open sight right under our noses. I can guarantee that this linear system will be elegantly beautiful and simplistic in nature. Once discovered, that system will also become the "universal constant" that stands alone at the center of motion and form, the integral source of causality, the very substance of what tends.

Thanks...

Filip Sondej said...

Nice experiment!
You're right that it's hard to extrapolate how phi would behave in larger systems. Here, the changes may be so dramatic because changing one connection in a network with only 10 connections really IS a significant change. Computing exact phi on large networks takes forever, but there are several ways to approximate it. We could use them to experiment on much larger networks. Although then the results could be dismissed on the grounds that "it's only an approximation, it's not REAL phi" :(

It's a pity that the current formulation of IIT only allows discrete changes (two nodes are either connected or not). That's unrealistic, because real synapses are continuous - they can pass signals on strongly, weakly, or anything in between. I understand that IIT is still quite young, so I hope that in the future it'll be fixed to work continuously.

If that happens, we would be able to perform much more conclusive experiments. The changes we make to the network could then be arbitrarily small - e.g. changing the weight of a single connection by 0.00001. Then, if we could find an arbitrarily small change to some connection that changed phi significantly, it would be a strong argument that this formulation of IIT is wrong. This principle could even be formalized mathematically, similarly to how it's done for continuous functions.
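
For instance (a rough sketch: write W for the matrix of connection weights, Φ(W) for the phi it yields, and ‖·‖ for whatever norm we use to measure the size of a tweak), we could demand ordinary continuity:

    \forall W,\ \forall \varepsilon > 0,\ \exists \delta > 0:\quad
    \lVert W' - W \rVert < \delta \;\Longrightarrow\; \lvert \Phi(W') - \Phi(W) \rvert < \varepsilon

or, more demandingly, a Lipschitz-style bound |Φ(W') - Φ(W)| ≤ K‖W' - W‖ for some constant K.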

As others pointed out, there are many alternative formulations of IIT. Maybe I'm just ignorant and there already exists one that allows continuous changes. If you, or anybody else, have heard of one, please let me know!

Arnold said...

...that intentional formulation is relative too, when observation/measurement is constant...

Isn't meaning infinite...