Characterizing consciousness is a profound scientific problem (Koch et al.). Examples include disorders of consciousness (Laureys; Casali et al.). Here, we address the phenomenon of structured experience from an information-theoretic perspective. Science strives to provide simple models that describe observable phenomena and produce testable predictions. In line with this, we offer here the elements of a theory of consciousness based on algorithmic information theory (AIT).
AIT studies the relationship between computation, information, and algorithmic randomness (Hutter), providing a definition for the information of individual objects (data strings) that goes beyond statistics (Shannon entropy). Furthermore, we argue that brains, agents, and cognitive systems can be identified with special patterns embedded in mathematical structures enabling computation and compression.
A brief summary of what we may call the Kolmogorov theory of consciousness (KT) is as follows. Brains are model builders and compressors of information for survival. Cognition and phenomenal consciousness arise from modeling, compression, and data tracking using models. Then we shift to the objective view: what kind of mathematical structures connecting the concept of information with experience could describe the above? We argue that the proper framework is provided by AIT and the concept of algorithmic (Kolmogorov) complexity.
AIT brings together information theory (Shannon) and computation theory (Turing) in a unified way and provides a foundation for a powerful probabilistic inference framework (Solomonoff). These three elements, together with Darwinian mechanisms, are crucial to our theory, which places information-driven modeling in agents at its core. To make the discussion more concrete, we briefly discuss cellular automata (CAs).
CAs, as universal Turing machines (TMs), can instantiate embedded sub-TMs and provide an example of how complex-looking, entropic data (chatter) can be produced by simple, iterative rules. We return to the subjective and hypothesize that structured, graded, and multidimensional experience arises in agents that have access to simple models.
These models are instantiated on computational substrates such as recurrent neural networks (RNNs) and are presumably found by successful agents through interaction with a complex-looking world governed by simple rules.
Finally, based on the prior items, we shift to empirical application. We focus instead on understanding how structured experience is shaped by the algorithmic characteristics of the models that brains (or other systems) build with simplicity as a guiding principle. We aim to link the properties of models with those of experience, such as uniqueness, unity, and strength.
This is an ambitious and challenging program. In the Discussion section, we address some limitations and open questions.
The definition of a universal TM (Turing) provides the mathematical foundation for computation and information theory and hence plays a key role in KT. Although our starting point is mathematical, it is readily linked to physics. The field of physics is guided by the notion that some simple laws dictate this evolution.
Although the specific choice of a physical theory is not of immediate concern for us, KT is certainly aligned with the idea that the universe is isomorphic to (or can be fully described by) such a mathematical structure, and that organisms are examples of special complex patterns embedded in it with the interesting property of being capable of modeling parts of the universe. CAs are mathematical structures defined on a cell grid with simple local interaction rules (Wolfram), and they encapsulate many of the fundamental aspects of physics (spatiotemporal homogeneity, locality, and recursion).
They can be used to formalize the concepts of computation, information, and emergence of complex patterns, and have attracted a great deal of interest because they capture two basic aspects of many natural systems: (i) they evolve according to local, homogeneous rules and (ii) they can exhibit rich behavior even with very simple rules.
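To make the two properties above concrete, here is a minimal sketch of a one-dimensional (elementary) CA in Python. The function names, the ring-shaped grid, and the choice of Rule 110 as the default example are assumptions made for illustration, not details taken from the text.

```python
def step(cells, rule):
    """Apply one synchronous update of an elementary CA on a ring."""
    n = len(cells)
    # Bit k of `rule` is the new state for the neighborhood whose
    # (left, center, right) values spell the binary number k.
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(rule, width=64, steps=32):
    """Evolve from a single live cell and return all generations."""
    cells = [0] * width
    cells[-1] = 1  # the initial configuration is the only input
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)  # same local rule applied at every cell
        history.append(cells)
    return history
```

Printing each row of `run(110)` with `"#"` for 1 and `"."` for 0 shows an irregular, growing pattern produced by an 8-bit rule table, which is the sense in which very simple local rules can yield rich behavior.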
A rule specifies, for the next iteration (dynamics), the value at a given location from its prior value and that of its neighbors (state). Surprisingly, some of these rules have been shown to produce universal computers, such as Rule 110 (Cook). The initial configuration of the CA provides the program (Wolfram). CAs can produce highly entropic data, with power law behavior (Kayama; Mainzer and Chua; Ninagawa).

Neural networks (NNs) represent another important paradigm of computation with a direct application in cognitive neuroscience and machine learning.
Feedforward networks have been shown to be able to approximate any reasonable function (Cybenko; Hornik). Remarkably, if the function to be approximated is compositional (recursive), then a hierarchical, feedforward network requires less training data than one with a shallow architecture to achieve similar performance (Mhaskar et al.).
Recurrence in NNs thus enables universal modeling. There is increasing evidence that the brain implements such deep, recursive, hierarchical networks (see, e.g., Taylor et al.).

In this section, we attempt to formalize our ideas.

Definition 1. A model of a dataset is a program that generates (or, equivalently, compresses) the dataset efficiently.

As discussed in Ruffini, this definition of model is equivalent to that of a classifier or generating function: NNs and other classifiers can be seen to essentially instantiate models.
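The link between modeling and compression in Definition 1 can be illustrated with a small sketch. It assumes a hypothetical dataset (a quantized parabola plus small noise), uses `zlib` as a stand-in for a general-purpose compressor, and treats a one-line function as the "model"; none of these choices come from the text.

```python
import random
import struct
import zlib


def packed_size(values):
    """Bytes needed after packing as 16-bit integers and applying zlib."""
    raw = struct.pack(f"<{len(values)}h", *values)
    return len(zlib.compress(raw, 9))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
times = range(1000)

# Hypothetical observations: a parabola (the regularity) plus small noise.
data = [round(0.01 * t * t) + rng.choice((-1, 0, 1)) for t in times]


def model(t):
    """A short program that regenerates the regular part of the data."""
    return round(0.01 * t * t)


# Comparing data and model outputs leaves only the unstructured error.
residuals = [d - model(t) for t, d in zip(times, data)]

raw_size = packed_size(data)         # compress the data directly
model_size = packed_size(residuals)  # compress only the error stream
```

Here `model_size` (plus the few bytes needed to write down `model` itself) comes out smaller than `raw_size`, because the compressor no longer has to encode the parabola. A different dataset or compressor would give different numbers; the sketch is only meant to show the mechanism.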
A succinct model can be used to literally compress information by comparing data and model outputs and then compressing the random difference or error. For example, Newtonian physics is a simple model that accounts for kinematics, dynamics, and gravitational phenomena on the Earth (falling apples) and in space (the orbit of the Moon). Naturally, a powerful model is both comprehensive and integrative, encompassing multiple data streams. Examples of models built by brains include our concepts of space and time, hand, charge, mass, energy, coffee cups, quarks, tigers, and people. To survive (to maintain homeostasis and reproduce), brains build models to function effectively, storing knowledge economically (saving resources such as memory or time).
They use models to build other models, for agile recall and decision making, to predict future information streams, and to interact successfully with the world. Having access to a good, integrated model of reality with compressive, operative, and predictive power is clearly an advantage for an organism subjected to the forces of natural selection (from this viewpoint, brains and DNA are similar compressing systems acting at different time scales).
Furthermore, when a brain interacts actively with the rest of the universe, it disturbs it with measurements or other actions (represented as information output streams). The information gathered from its inputs (senses) depends on how it chooses to extract it from the outside world through the passive and active aspects of sensing or other actions. Such self-models correspond here to what are called body representation and self-awareness.
Figure 1. Top (a): Modeling for predictive compression. Bottom (b): An agent with coupled modeling and action modules. The action module contains an optimization objective function. The model itself may be informed of the state of the action module directly (dotted line) or indirectly via the output stream concatenated to the input stream.
Definition 2. Figure 1b displays schematically the modeling engine and the resulting error stream from comparison of data and model outputs. These are passed on to an action module that makes decisions guided by an optimization function (possibly querying the model for action simulations) and generates output streams, which also feed back to the model. Neither a classical thermostat nor a machine-learning classifier is an agent by this definition, but new artificial intelligence systems being developed are.
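The loop described for Fig. 1b (a model, an error stream, and an action module that queries the model) can be sketched as a toy program. Everything here is hypothetical: the one-parameter model, the `world` function, the squared-distance objective, and the learning rule are illustrative assumptions, and the sketch makes no claim to satisfy the paper's definition of an agent.

```python
class Agent:
    """Toy agent: a one-parameter world model plus an action module."""

    def __init__(self, target, gain_guess=1.0, learning_rate=0.5):
        self.target = target            # objective: bring the state to this value
        self.gain = gain_guess          # model parameter: effect of one action unit
        self.learning_rate = learning_rate

    def predict(self, state, action):
        """Model: simulate the next state for a candidate action."""
        return state + self.gain * action

    def act(self, state, candidates=(-1, 0, 1)):
        """Action module: query the model and pick the best simulated outcome."""
        return min(candidates, key=lambda a: (self.predict(state, a) - self.target) ** 2)

    def learn(self, state, action, observed):
        """Use the error stream (data minus model output) to update the model."""
        error = observed - self.predict(state, action)
        self.gain += self.learning_rate * error * action
        return error


def world(state, action, true_gain=0.5):
    """Hypothetical environment governed by a simple rule unknown to the agent."""
    return state + true_gain * action


agent, state = Agent(target=5.0), 0.0
for _ in range(20):
    action = agent.act(state)            # decision guided by the objective
    next_state = world(state, action)    # output stream acts on the world
    agent.learn(state, action, next_state)  # prediction error feeds back
    state = next_state
```

In this run the agent starts with a wrong estimate of the action gain, corrects it from its own prediction errors, and settles at the target state. The design choice worth noting is that the action module never sees the world's rule directly; it only sees what the model simulates.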
As an example, we refer to Bongard et al.

The Kolmogorov complexity of a data string is the length of the shortest program capable of generating it (Cover and Thomas; Li and Vitanyi). Crucially, although the precise length of this program depends on the programming language used, it does so only up to a string-independent constant. Briefly, one can conceptually split the Kolmogorov optimal program describing a data string into two parts: a set of bits describing its regularities and another which captures the rest (the part with no structure). The first term is the effective complexity, the minimal description of the regularities of the data.
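For reference, the statements in the paragraph above can be written compactly. The notation ($U$, $V$ for universal machines, $|p|$ for program length in bits, $S$ for a finite set containing $x$ that captures its regularities) is an assumption introduced here, following standard textbook conventions rather than the source.

```latex
% K_U(x): length of the shortest program that outputs x on universal machine U
K_U(x) = \min_{p} \{\, |p| : U(p) = x \,\}

% Invariance: changing the universal machine (the programming language)
% shifts K by at most a constant c_{UV} that does not depend on the string x
|K_U(x) - K_V(x)| \le c_{UV} \quad \text{for all } x

% Two-part description: first describe the regularities (a finite set S
% containing x), then the index of x within S (the part with no structure)
K(x) \le K(S) + \log_2 |S| + O(1), \qquad x \in S
```

The first term of the two-part bound plays the role of the effective complexity; the second is an entropy-like term that accounts for the unstructured remainder.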
This concept brings to light the power of the notion of Kolmogorov complexity, as it provides, single-handedly, the means to account for and separate regularities in data from noise. However, within a limited computation scheme e. An example of this is Lempel-Ziv-Welch (LZW) compression (Ziv and Lempel), a simple yet fast algorithm that exploits the repetition of symbol sequences (one possible form of regularity).
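As a rough illustration of how LZW exploits repeated symbol sequences, the following sketch counts the codes a basic LZW encoder would emit. The function name and the two test strings are assumptions; practical LZW implementations also bound the dictionary size and pack codes into bits, which is omitted here.

```python
import random


def lzw_code_count(text):
    """Number of codes emitted by a basic LZW encoder for `text`."""
    # Start with one dictionary entry per distinct symbol.
    dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
    phrase, codes = "", 0
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                  # keep extending a known phrase
        else:
            codes += 1                    # emit the code for `phrase`
            dictionary[phrase + ch] = len(dictionary)
            phrase = ch
    return codes + (1 if phrase else 0)   # flush the final phrase


rng = random.Random(0)                    # fixed seed for reproducibility
regular = "ab" * 500                      # highly repetitive sequence
irregular = "".join(rng.choice("ab") for _ in range(1000))
```

Both strings have the same length and alphabet, yet `lzw_code_count(regular)` is markedly smaller than `lzw_code_count(irregular)`: the repetitive sequence is parsed into ever longer phrases, while the random-looking one is not.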
The LZW-compressed file length per symbol is in fact an estimate of the entropy rate, an extension of the concept of entropy to stochastic sequences of symbols. The algorithmic (Solomonoff) probability of a string x is the prior probability that x could be generated by a random program. The probability of a given string being produced by a random program is dominated by its Kolmogorov complexity.
Because of this, a Bayesian prior for simplicity may be a good strategy for prediction.
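The relation between the probability of a string under random programs and its Kolmogorov complexity can be stated as follows. The symbols are assumptions following standard usage: $U$ is a universal prefix machine, the sum runs over all programs $p$ that output $x$, and the second line is Levin's coding theorem, which holds for prefix complexity.

```latex
% Algorithmic (Solomonoff) probability of x: feed U random bits as a program
P_U(x) = \sum_{p \,:\, U(p) = x} 2^{-|p|}

% Coding theorem: the sum is dominated by the shortest program for x
-\log_2 P_U(x) = K_U(x) + O(1)
```

Each extra bit of program length halves a program's contribution, so short programs dominate the sum; this is what motivates a prior that favors simple explanations.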
Although we can only hypothesize the existence of such a data generating process, we do seem to inhabit a universe described by simple rules. From this, and from considerations on the evolutionary pressure on replicating agents (natural selection favoring pattern-finding agents), we formulate the following hypothesis:

Hypothesis 1.

We address next the nature of conscious content. From a cognitive perspective, we have argued that what we call reality is represented and shaped by the simplest programs brains can find to model their interaction with the world.
In some sense, simplicity is equivalent to reality and therefore, we now hypothesize, to structured experience. When we become conscious of something, we become conscious of it through a model, which is chosen among many as the one best fitting the available data. In more detail, we propose our next hypothesis, relating cognition and consciousness:

Hypothesis 2. The more compressive these models are, the stronger the subjective structured experiences generated.
Returning to Fig. The model itself is a mathematical, multidimensional, highly structured object, and can easily account for a huge variety of experiences. An implicit element here is thus that consciousness is a unified, graded, and multidimensional phenomenon. In KT, structured conscious awareness is thus associated with information processing systems that are efficient in describing and interacting with the external world (information). An ant, e. However, not all interactions may call for a self-model. Can we associate the characteristics of electrophysiological or metabolic spatiotemporal patterns in brains with conscious level?
Although somewhat counterintuitive, in KT agents that run simple models in conscious brains may generate apparently complex (Shannon-entropic) data. The context for this apparent paradox is the aforementioned hypothesis (the deterministic, simple physics hypothesis) that the universe is ruled by simple, highly recursive programs which generate entropic data. In such a world, a brain tracking (and essentially simulating) high-entropy data from its interaction with the world will itself produce complex-looking data streams. Simple, deep programs will model, and therefore generate, entropic, fractal-looking data whose structure is characterized by power laws and small-world features (Gallos et al.).
A brain capable of universal computation may produce many different types of patterns. We summarize this as follows:

Consequence 1. Conscious brains generate apparently complex (entropic) but compressible data streams (data of low algorithmic complexity).

Thus, in principle, the level of consciousness can be estimated from data generated by brains, by comparing their apparent and algorithmic complexities. Although providing improved bounds on algorithmic complexity remains a challenge, an apparently complex data stream generated from a low algorithmic complexity model should in principle be distinguishable from a truly random one, leaving traces on metrics such as entropy rate, LZW, power law exponents, and fractal dimension.
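The gap between apparent and algorithmic complexity in Consequence 1 can be illustrated with a toy data stream. The sketch below uses the center column of the Rule 30 cellular automaton as a stand-in for "apparently complex" data and an empirical block entropy as the apparent-complexity metric; both choices, and all names, are assumptions made for illustration and have nothing to do with real brain data.

```python
import math
from collections import Counter


def rule30_center_column(steps):
    """Center-cell history of Rule 30 started from a single live cell."""
    width = 2 * steps + 3                 # wide enough that the edges never matter
    cells = [0] * width
    cells[width // 2] = 1
    column = []
    for _ in range(steps):
        column.append(cells[width // 2])
        # Rule 30: new cell = left XOR (center OR right); edges held at 0.
        cells = [0] + [
            cells[i - 1] ^ (cells[i] | cells[i + 1]) for i in range(1, width - 1)
        ] + [0]
    return column


def block_entropy_rate(bits, block=4):
    """Empirical Shannon entropy of non-overlapping blocks, in bits per symbol."""
    blocks = [tuple(bits[i:i + block]) for i in range(0, len(bits) - block + 1, block)]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum(c / total * math.log2(c / total) for c in counts.values()) / block


bits = rule30_center_column(512)
apparent = block_entropy_rate(bits)       # apparent (Shannon) complexity estimate
```

The estimated entropy rate is close to the 1 bit per symbol expected of a coin flip, even though the whole stream is produced by the few lines in `rule30_center_column`, so its algorithmic complexity is small. Simple statistics like this one do not reveal that structure, which is one way to see why better bounds on algorithmic complexity are needed.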
If brain data are generated by a model we know e. As an example, consider a subject whom we ask to imagine, with eyes closed, parabolic trajectories from cannonballs. As discussed, such apparent complexity from simplicity points to the EEG data being generated by deep programs embedded in biological networks that are modeling real-world data.
A related consequence of the above is that the mutual algorithmic information (MAI) between world and brain-generated data should be high. A model is a compressed representation of the external world. We may also expect, in addition, that world data will not be obviously simple. A simple example is the use of electrophysiology or fMRI data to reconstruct images projected on the retina (Stanley et al.). Indeed, the information stemming from such a cell would allow us to compress the input stream more effectively.
Consequence 2. Furthermore, the information about x in y will be in compressed form.

High MAI between an external visual input and the state of the optic nerve or thalamus is also expected in a subject with eyes open. Our hypothesis is that information will be compressed in the cortex and present even if sensory inputs are disconnected, represented as a model. As models are presumably implemented in synaptic connectivity and neuronal dynamics, compressed representations of past input streams will be present in neuroimaging data. It is in this sense that we expect MAI between world and agent to increase with its actual or potential conscious level.
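Mutual algorithmic information (MAI) is not computable, but a crude proxy can be sketched by replacing Kolmogorov complexity with compressed length, using the standard identity I(x:y) ≈ K(x) + K(y) - K(x, y). In the sketch below, `zlib` is an assumed stand-in for an ideal compressor, and the `world`, `brain`, and `unrelated` byte streams are synthetic placeholders, not neuroimaging data.

```python
import random
import zlib


def c(data):
    """Compressed length in bytes, used as a crude stand-in for K."""
    return len(zlib.compress(data, 9))


def mai_estimate(x, y):
    """Compression-based proxy for K(x) + K(y) - K(x, y)."""
    return c(x) + c(y) - c(x + y)


rng = random.Random(0)  # fixed seed for reproducibility
world = bytes(rng.getrandbits(8) for _ in range(2000))      # hypothetical input stream

# A stream that carries most of the information in `world`, with 20 bytes altered.
brain = bytearray(world)
for i in range(0, len(brain), 100):
    brain[i] ^= 0xFF
brain = bytes(brain)

unrelated = bytes(rng.getrandbits(8) for _ in range(2000))  # independent stream
```

With these inputs, `mai_estimate(world, brain)` is large (knowing one stream makes the other almost free to describe), while `mai_estimate(world, unrelated)` is close to zero. A real compressor only gives an upper bound on K, so such estimates should be read as rough indicators at best.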
KT is closely related to theories of consciousness that place information at their core, and it actually provides conceptual links among them. Integrated information theory (IIT) maintains that when we experience a conscious state, we rule out a huge number of possibilities. KT is strongly related to, but not equivalent to, IIT. IIT emphasizes the causal structure of information processing systems and the physical substrate of consciousness. KT agrees well with other aspects of IIT. KT provides a mechanism for binding of information: a good, succinct model will by definition integrate available information streams into a coherent whole.
Economy of description implies both a vast repertoire (reduction of uncertainty, or information) and integration of information.
We note that simple programs (in the limit of Kolmogorov) are irreducible, Platonic mathematical objects. This is another link with IIT and its central claim that an experience is identical to a conceptual structure that is maximally irreducible intrinsically. By definition, the model encoded by a network specifies which value holders (nodes) to use, how to connect them, and an initial state. Loosely, if it is an effective simple encoding, we would expect interconnectivity and intercausality in the elements of the network.
It turns out that we should also expect that perturbations of nodes of such a network, when activated in detecting a matching pattern i. Since the original work of Baars, Dehaene and others see, e. According to KT, the experience is associated with successful modeling validation events. Crucially, such events require integration of information from a variety of sensory and effective systems that must come together for model validation. Data must thus flow from a variety of sub-systems involving separate brain areas and merge, perhaps at different stages, for integrative error checking against a running model.
Predictive processing (PP) maintains that, to support adaptation, the Bayesian brain must discover information about the likely external causes of sensory signals using only information in the flux of the sensory signals themselves. According to PP, perception solves this problem via probabilistic, knowledge-driven inference on the causes of sensory signals.
Virtual reality (VR) technology offers a powerful way to induce and manipulate presence. KT Hypothesis 1 predicts that, given available models for existing data (past and present), the simplest will be chosen (Ruffini).

Binocular rivalry is a well-established paradigm in the study of the neuroscience of consciousness. Briefly, a different image is presented to each eye and the experience of subjects typically fluctuates between two models.