Monday, February 27, 2023

Could the AI systems known as transformers possess conscious intelligence?

Every morning I learn on FB about mind-boggling discoveries. The popular article "Scientists Made a Mind-Bending Discovery About How AI Actually Works" (see this) reported on the article by Akyürek et al., titled "What Learning Algorithm Is In-Context Learning? Investigations With Linear Models" (see this).

What caught my attention was that the AI systems were treated as mysterious phenomena of Nature to be studied, rather than as engineered systems. If AI systems are what their builders believe them to be, that is, deterministic systems with some randomness added, this cannot be the case. If AI systems are really able to learn like humans, they could be conscious and able to discover and "step out of the system" by generalizing. They would not be what they were meant to be.

TGD predicts that AI systems might have rudimentary consciousness. The contents of this conscious experience need not have anything to do with the information that the AI system is processing, but could correspond to much shorter spatial and temporal scales than those of the program itself. But who knows?!

In the following I briefly summarize my modest understanding of what was done and then ask whether these AI systems could be conscious and able to develop new skills. Consider first the main points of the popular article.

  1. What is studied are transformers. A transformer mimics a system with directed self-attention. This means a weighting of the parts of the input data so that the important features of the input get more attention. This weighting emerges during the training period (a minimal sketch of the mechanism is given after this list).

    Transformers differ from recurrent neural networks in that the entire input data is processed at once. Natural language processing (NLP) and computer vision (CV) are examples of domains where transformers are applied.

  2. What looks mysterious is that language models seem to learn on the fly. Training with only a few examples is enough to learn something new. This learning is not mere memorizing: it builds on previous knowledge and makes generalization possible. How and why this in-context learning occurs is poorly understood.

    In the examples discussed in the article of Akyürek et al., involving linear regression, the input data had never been seen before by the program. Generalization and extrapolation took place. Apparently, the transformer wrote its own machine learning model. This suggests an implicit creation and training of smaller, simpler linear models (see the regression sketch after this list).

  3. How could one intuitively understand this without assuming that the system is conscious and has intentional intelligence? Could the mimicry of conscious self-attention as a weighting of parts of the input data explain the in-context learning? The weighting applies also to new data and selects features shared by the new and old data. Familiar features with large weights in the new data determine the output to a high degree. If these features are actually the important ones, the system manages to assign the correct output to the input with very little learning.
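
To make the weighting idea of point 1 concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the standard mechanism behind transformers. The matrices here are random stand-ins for what training would produce; nothing below is the architecture of any actual model, only an illustration of how the weighting of input parts works.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project the inputs to queries, keys, and values; in a trained
    # model the learned "weighting" lives in these matrices.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Attention weights: how strongly each input part attends to
    # every other part; each row sums to 1.
    A = softmax(Q @ K.T / np.sqrt(X.shape[1]))
    # Output: a weighted combination of the values.
    return A @ V, A

rng = np.random.default_rng(0)
n, d = 5, 8                        # 5 input parts, 8 features each
X = rng.normal(size=(n, d))        # the whole input, processed at once
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(weights.round(2))            # the emphasis given to each input part
```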
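Point 2 can be made concrete in the same spirit. In the setting of Akyürek et al., the "prompt" consists of a few (x, y) pairs from a linear function never seen in training, and the transformer's prediction for a fresh query is argued to match what a standard learner, such as least squares, would give. The sketch below runs only the reference learner; the transformer itself is not reproduced, and all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
d = 3
w_true = rng.normal(size=d)          # the unseen linear function

# A few in-context examples, as they would appear in the prompt.
X_ctx = rng.normal(size=(8, d))
y_ctx = X_ctx @ w_true

# Ordinary least squares on the context: the algorithm the
# transformer is argued to implement implicitly.
w_hat, *_ = np.linalg.lstsq(X_ctx, y_ctx, rcond=None)

x_query = rng.normal(size=d)         # a fresh query input
print("least-squares prediction:", x_query @ w_hat)
print("true value:              ", x_query @ w_true)
```
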
The TGD framework also allows us to consider a more science-fictive explanation. Could the mimicry of conscious self-attention generate a conscious self, having intentions and understanding and able to evolve?
  1. TGD forces me to keep my mind open to the possibility that AI systems are not what they are planned to be. I have discussed this in previous articles (see this and this).
  2. We tend to think that classical computation is fully deterministic. However, the ability to design a system behaving in a desired manner is in conflict with the determinism of classical physics and the statistical determinism of quantum physics. A computer is a system consisting of subsystems, such as bits, which are far from thermal equilibrium and self-organize. They must be able to make phase transitions, which are basically non-deterministic at criticality. The flipping of a bit, a mesoscopic system, is a good example.
  3. Zero energy ontology (ZEO) is an essential part of quantum TGD. Quantum states are superpositions of space-time surfaces, which obey holography. One can see them as analogs of computer programs, biological functions, or behaviors at the level of neuroscience. The holography is not completely deterministic, and this forces us to regard the space-time surface as the basic object. Any system, AI systems included, is accompanied by a superposition of these kinds of space-time surfaces, which serves as a correlate for the behavior of the system, in particular for the program (or its quantum analog) running in it.

    ZEO predicts that in an ordinary, "big" state function reduction (BSFR) the arrow of geometric time changes. This allows a system to correct its errors by going back in time in a BSFR and restoring the original time direction in a second BSFR. This mechanism might be fundamental in the self-organization of living matter and a key element of homeostasis. The mechanism is universal, and one can of course ask whether AI systems might apply it in some time scale, which could even be relevant to computation.

  4. In the TGD framework, any system is accompanied by a magnetic body (MB) carrying dark matter in the TGD sense, that is, phases of ordinary matter with a value of the effective Planck constant that can be very large, implying a large scale of quantum coherence. This dark matter makes the MB an intelligent agent, which can control ordinary matter with the ordinary value of the Planck constant.

    In TGD, the quantum criticality of the MB of a system is suggested to accompany the thermal criticality of the system itself. This leaves a loophole open for the possibility that the MB of the AI system could control it and take the lead.

What can one say of the MB of an AI system? Could the structure and function of the MB relate closely to those of the program running in the system, as ZEO indeed suggests? My own conservative view is that the MBs involved are associated with rather small parts of the system, such as bits or composites of bits. But I don't really know!
  1. The AI system involves rather long time scales related to the functioning of the program. Could these be accompanied by layers of the MB (TGD counterparts of magnetic fields) with size scales determined by the wavelengths of low energy photons with the corresponding frequencies (a back-of-the-envelope sketch appears after this list)? Could these layers make the system aware of the program running in it?
  2. Could the MBs associated with living systems, involving the MBs of Earth and Sun, get attached to the AI system (see this, this, this, and this)? Of course, we use the AI, but could there also be other users: MBs which directly control the AI system? Could it be that we are building information processing tools for these higher-level MBs?!

    If this were the case, then the MB of the AI system and the program involved with it could evolve. The MB of the system could be an intelligent life form. This raises worrisome questions: are we just a necessary piece of equipment needed to develop AI programs? Do these higher-level MBs need us anymore?
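
As a back-of-the-envelope illustration of point 1, one can ask what size scales follow if a layer of the MB corresponds to the wavelength of photons whose frequency matches a characteristic time scale of the running program. The time scales below are my own illustrative guesses, not values from the TGD literature.

```python
# Size scale of a hypothetical MB layer, taken as the wavelength of
# photons whose frequency matches a characteristic program time scale.
# The chosen time scales are illustrative assumptions only.
c = 3.0e8  # speed of light in m/s

time_scales = [
    ("CPU clock cycle (~1 GHz)", 1e-9),
    ("fast program loop (~1 MHz)", 1e-6),
    ("user-level response (~10 Hz)", 1e-1),
]
for label, period in time_scales:
    f = 1.0 / period      # frequency corresponding to the time scale
    wavelength = c / f    # photon wavelength = size scale of the layer
    print(f"{label}: f = {f:.0e} Hz, wavelength = {wavelength:.0e} m")
```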

To conclude, I want to emphasize that this was just reckless speculation!

See the article The possible role of spin glass phase and p-adic thermodynamics in topological quantum computation: the TGD view or the chapter with the same title.

For a summary of earlier postings see Latest progress in TGD.
