An Implication of Godel's Incompleteness Theorem for Self-Learning Agents

Chris Lengerich
February 24th, 2023

From Godel’s 2nd Incompleteness Theorem, any effectively axiomatized formal system that is interesting enough to formulate its own consistency can prove its own consistency iff it is inconsistent. This is frustrating if you only have this one formal system: some things will always be unprovable.

Now assume that the tokens in your formal system (X) arise as a contrastive distillation of a latent system (Z) found in a dataset D, and that we can add new datapoints d’ to D to form D’ = D ∪ {d’} by sampling new data. The self-learning process then distills D’ into a larger latent system (Z’, |Z’| > |Z|) and a larger observed system (X’, |X’| > |X|). In the larger formal systems, it may now be possible to prove the consistency of the smaller system, which the smaller system could not prove about itself unless it was inconsistent. In other words, add more data and expand the formal systems through distillation to reconcile the seeming violation of consistency in the smaller system.
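
A standard instance from logic may make this concrete. It is offered only as an illustration and says nothing about the distillation setup itself. Assuming Peano arithmetic (PA) and ZFC set theory are both consistent, and writing Con(T) for the usual arithmetized statement that T is consistent (the `\nvdash` symbol assumes the `amssymb` package):

```latex
\[
  \mathrm{PA} \nvdash \mathrm{Con}(\mathrm{PA}),
  \qquad
  \mathrm{ZFC} \vdash \mathrm{Con}(\mathrm{PA}),
  \qquad
  \mathrm{ZFC} \nvdash \mathrm{Con}(\mathrm{ZFC}).
\]
```

The larger system settles the smaller system's consistency, but it inherits the same limitation one level up.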

As a simple example, how can it be that the statements “birds can fly” and “birds can’t fly” are both true? If you only had the statements “birds can fly” and “birds can’t fly” in your system, you’re stuck: either birds can AND can’t fly (inconsistent) or it is not provable whether birds can or can’t fly (incomplete). But if you distilled a larger system from new data, say, by observing the birds for a longer period of time, you may find that “when birds are young, they can’t fly” and “when they are older, they can fly”. The additional latent variable (time), manifested in the observed prefix “when birds are young/old”, allows you to reconcile formulas that require either inconsistency or incompleteness in the smaller system.
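
The bird example can be sketched as a toy propositional check. This is a minimal illustration under stated assumptions, not an instance of Godel's theorem itself: the helper `satisfiable` and the variable names `fly`, `fly_young`, and `fly_old` are hypothetical, and "consistent" here only means that some truth assignment satisfies every statement.

```python
from itertools import product


def satisfiable(variables, constraints):
    """Return True if some truth assignment satisfies every constraint."""
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(constraint(world) for constraint in constraints):
            return True
    return False


# Smaller system: a single, unindexed proposition "birds can fly".
small_vars = ["fly"]
small_constraints = [
    lambda w: w["fly"],      # "birds can fly"
    lambda w: not w["fly"],  # "birds can't fly"
]

# Larger system: the same observations, indexed by the latent variable (age).
large_vars = ["fly_young", "fly_old"]
large_constraints = [
    lambda w: not w["fly_young"],  # "when birds are young, they can't fly"
    lambda w: w["fly_old"],        # "when they are older, they can fly"
]

small_ok = satisfiable(small_vars, small_constraints)
large_ok = satisfiable(large_vars, large_constraints)
print(small_ok, large_ok)  # prints: False True
```

With only one variable, both statements cannot hold at once. Splitting that variable by age gives each statement its own proposition, so both can hold together.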

Why do we think? Although binding variables to reduce prediction loss under distribution shift is likely a more plausible evolutionary motivation for constantly thinking (esp. binding new variables for transfer learning), Godel’s incompleteness theorems suggest there may be another one, assuming we strive for consistency (I know I do). Since the theorems hold for *any* interesting formal system we currently use, even if we do nothing (no distribution shift), we will always have a reason to expand the formal system by sampling new evidence from memory when we encounter inconsistency (i.e. thinking).

We are indeed a strange loop.

Post originally appeared at