{{Merge from|Invariance theorem|date=July 2010}}<br />
In [[algorithmic information theory]] (a subfield of [[computer science]]), the '''Kolmogorov complexity''' of an object, such as a piece of text, is a measure of the [[computation]]al resources needed to specify the object. It is named after [[Andrey Kolmogorov]], who first published on the subject in 1963.<ref>{{cite journal|authorlink=Andrey Kolmogorov|first=Andrey|last=Kolmogorov|year=1963|title=On Tables of Random Numbers| journal=[[Sankhyā]] Ser. A.|volume=25|pages=369–375|mr=178484}}</ref><ref>{{cite journal|authorlink=Andrey Kolmogorov|first=Andrey|last=Kolmogorov|year=1998|title=On Tables of Random Numbers| journal=Theoretical Computer Science|volume=207|issue=2|pages=387–395|doi=10.1016/S0304-3975(98)00075-9 |mr=1643414}}</ref><br />
<br />
Kolmogorov complexity is also known as "descriptive complexity" (not to be confused with [[descriptive complexity theory]]), '''Kolmogorov–[[Gregory Chaitin|Chaitin]] complexity''', '''algorithmic entropy''', or '''program-size complexity'''.<br />
<br />
For example, consider the following two [[string (computer science)|strings]] of length 64, each containing only lowercase letters and digits:<br />
<br />
<pre>abababababababababababababababababababababababababababababababab</pre><br />
<pre>4c1j5b2p0cv4w1x8rx2y39umgw5q85s7traquuxdppa0q7nieieqe9noc4cvafzf</pre><br />
<br />
The first string has a short English-language description, namely "ab 32 times", which consists of '''11''' characters. The second one has no obvious simple description (using the same character set) other than writing down the string itself, which has '''64''' characters.<br />
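This intuition can be checked empirically with an off-the-shelf compressor. Compressed length is only a computable ''upper bound'' on description length (see the Compression section), not the Kolmogorov complexity itself; the sketch below uses Python's zlib as that proxy:

```python
import zlib

repetitive = b"ab" * 32
random_looking = b"4c1j5b2p0cv4w1x8rx2y39umgw5q85s7traquuxdppa0q7nieieqe9noc4cvafzf"

# zlib compressed length is a computable upper-bound proxy for K, not K itself.
len_rep = len(zlib.compress(repetitive, 9))
len_rand = len(zlib.compress(random_looking, 9))
# The repetitive string compresses far below its 64 bytes; the random-looking
# one does not (compressor overhead can even enlarge it slightly).
```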
<br />
[[Image:Mandelpart2 red.png|300px|right|thumb|This image illustrates part of the [[Mandelbrot set]] [[fractal]]. Simply storing the 24-bit color of each pixel in this image would require 1.62 million bits, but a small computer program can reproduce these 1.62 million bits using the definition of the Mandelbrot set and the coordinates of the corners of the image. Thus, the Kolmogorov complexity of the raw file encoding this bitmap is much less than 1.62 million.]]<br />
<br />
More formally, the [[complexity]] of a string is the length of the shortest possible description of the string in some fixed [[Turing complete|universal]] [[description language]] (the sensitivity of complexity relative to the choice of description language is discussed below). It can be shown that the Kolmogorov complexity of any string cannot be more than a few bytes larger than the length of the string itself. Strings whose Kolmogorov complexity is small relative to the string's size are not considered to be complex.<br />
<br />
The notion of the Kolmogorov complexity can be used to state and prove impossibility results akin to [[Gödel's incompleteness theorem]] and [[halting problem|Turing's halting problem]].<br />
<br />
==Definition==<br />
To define the Kolmogorov complexity, we must first specify a description language for strings. Such a description language can be based on any computer programming language, such as [[Lisp programming language|Lisp]], [[Pascal (programming language)|Pascal]], or [[Java Virtual Machine]] bytecode. If '''P''' is a program which outputs a string ''x'', then '''P''' is a description of ''x''. The length of the description is just the length of '''P''' as a character string, multiplied by the number of bits in a character (e.g. 7 for [[ASCII]]).<br />
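For instance, taking Python as the description language (an arbitrary choice for illustration, using 8-bit characters rather than 7-bit ASCII), one description of the first example string above is:

```python
# A description P, in the "Python" description language, of the string s.
program = 'print("ab" * 32)'           # P: a program that outputs s when run
s = "ab" * 32                          # the 64-character string being described

# Length of the description = length of P in characters times bits per character.
description_length_bits = len(program) * 8   # 16 characters * 8 bits = 128 bits
```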
<br />
We could, alternatively, choose an encoding for [[Turing machine]]s, where an ''encoding'' is a function which associates to each Turing machine '''M''' a bitstring <'''M'''>. If '''M''' is a Turing machine which, on input ''w'', outputs string ''x'', then the concatenated string <'''M'''> ''w'' is a description of ''x''. This approach is better suited to constructing detailed formal proofs and is generally preferred in the research literature. [[Binary lambda calculus]] may provide the simplest definition of complexity yet. In this article, an informal approach is used.<br />
<br />
Any string ''s'' has at least one description, namely the program:<br />
<br />
'''function''' GenerateFixedString()<br />
'''return''' ''s''<br />
<br />
If a description of ''s'', ''d''(''s''), is of minimal length (i.e. it uses the fewest number of characters), it is called a '''minimal description''' of ''s''. Thus, the length of ''d''(''s'') (i.e. the number of characters in the description) is the '''Kolmogorov complexity''' of ''s'', written ''K''(''s''). Symbolically,<br />
<br />
:<math>K(s) = |d(s)|. \quad </math><br />
<br />
We now consider how the choice of description language affects the value of ''K'', and show that the effect of changing the description language is bounded.<br />
<br />
'''Theorem''': If ''K''<sub>1</sub> and ''K''<sub>2</sub> are the complexity functions relative to description languages ''L''<sub>1</sub> and ''L''<sub>2</sub>, then there is a constant ''c'' - which depends only on the languages ''L''<sub>1</sub> and ''L''<sub>2</sub> chosen - such that<br />
<br />
: <math>\forall s\ |K_1(s) - K_2(s)| \leq c.</math><br />
<br />
'''Proof''': By symmetry, it suffices to prove that there is some constant ''c'' such that for all bitstrings ''s''<br />
<br />
: <math> K_1(s) \leq K_2(s) + c. </math><br />
<br />
Now, suppose there is a program in the language ''L''<sub>1</sub> which acts as an [[interpreter (computing)|interpreter]] for ''L''<sub>2</sub>:<br />
<br />
'''function''' InterpretLanguage('''string''' ''p'')<br />
<br />
where ''p'' is a program in ''L''<sub>2</sub>. The interpreter is characterized by the following property:<br />
<br />
: Running InterpretLanguage on input ''p'' returns the result of running ''p''.<br />
<br />
Thus, if '''P''' is a program in ''L''<sub>2</sub> which is a minimal description of ''s'', then InterpretLanguage('''P''') returns the string ''s''. The length of this description of ''s'' is the sum of<br />
<br />
# The length of the program InterpretLanguage, which we can take to be the constant ''c''.<br />
# The length of '''P''' which by definition is ''K''<sub>2</sub>(''s'').<br />
<br />
This proves the desired upper bound.<br />
<br />
See also [[invariance theorem]].<br />
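The interpreter argument can be made concrete with a toy pair of description languages in Python. The stub and names below are illustrative only, not a canonical construction: "language 2" descriptions are bare Python expressions, "language 1" programs are full Python source, and a fixed interpreter prefix plays the role of InterpretLanguage:

```python
# Toy sketch of the invariance theorem. An L2 description (a Python
# expression) becomes an L1 description (full Python source) by wrapping it
# in a fixed interpreter stub, adding only a constant number of characters.
STUB_PREFIX = "result = eval("   # fixed interpreter text: the constant c
STUB_SUFFIX = ")"

def l2_to_l1(p2: str) -> str:
    """Wrap an L2 description in the constant-size L1 interpreter."""
    return STUB_PREFIX + repr(p2) + STUB_SUFFIX

def run_l1(p1: str) -> str:
    """Execute an L1 program and return the string it produces."""
    scope = {}
    exec(p1, scope)
    return scope["result"]

p2 = "'ab' * 32"                 # an L2 description of the example string
s = run_l1(l2_to_l1(p2))         # InterpretLanguage(P) returns s
# K1(s) <= K2(s) + c, where c is the fixed wrapping overhead:
c = len(l2_to_l1(p2)) - len(p2)
```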
<br />
==History and context==<br />
Algorithmic information theory is the area of computer science that studies Kolmogorov complexity and other complexity measures on strings (or other [[data structure]]s).<br />
<br />
The concept and theory of Kolmogorov Complexity is based on a crucial theorem first discovered by [[Ray Solomonoff]], who published it in 1960, describing it in "A Preliminary Report on a General Theory of Inductive Inference"<ref>{{cite journal |authorlink=Ray Solomonoff | last=Solomonoff |first= Ray | url=http://world.std.com/~rjs/rayfeb60.pdf |format=PDF | title=A Preliminary Report on a General Theory of Inductive Inference | journal= Report V-131 |publisher= Zator Co. |location= Cambridge, Ma. | date= February 4, 1960 }} [http://world.std.com/~rjs/z138.pdf revision], Nov., 1960.</ref> as part of his invention of [[algorithmic probability]]. He gave a more complete description in his 1964 publications, "A Formal Theory of Inductive Inference," Part 1 and Part 2 in ''Information and Control''.<ref>{{cite journal | doi=10.1016/S0019-9958(64)90223-2 | last=Solomonoff |first= Ray | title=A Formal Theory of Inductive Inference Part I | journal = Information and Control | url=http://world.std.com/~rjs/1964pt1.pdf |volume=7 |issue= 1 |pages= 1&ndash;22 | month= March | year=1964 }}</ref><ref>{{cite journal | doi=10.1016/S0019-9958(64)90131-7 | last=Solomonoff |first= Ray | title=A Formal Theory of Inductive Inference Part II | journal = Information and Control | url=http://world.std.com/~rjs/1964pt2.pdf |volume=7 |issue= 2 |pages= 224&ndash;254 | month=June | year=1964 }}</ref><br />
<br />
Andrey Kolmogorov later [[multiple discovery|independently published]] this theorem in ''Problems Inform. Transmission''.<ref>{{cite journal | volume= 1| issue=1 |year=1965 | pages= 1–7 | title =Three Approaches to the Quantitative Definition of Information | url=http://www.ece.umd.edu/~abarg/ppi/contents/1-65-abstracts.html#1-65.2 | journal = Problems Inform. Transmission | first=A.N. | last=Kolmogorov }}</ref> Gregory Chaitin also presented this theorem in ''J. ACM''; Chaitin's paper, submitted in October 1966 and revised in December 1968, cites both Solomonoff's and Kolmogorov's papers.<ref>{{cite journal | last1 = Chaitin | first1 = Gregory J. | title = On the Simplicity and Speed of Programs for Computing Infinite Sets of Natural Numbers| url=http://reference.kfupm.edu.sa/content/o/n/on_the_simplicity_and_speed_of_programs__94483.pdf | format=PDF | journal = Journal of the ACM | volume = 16 | pages = 407 | year = 1969 | doi = 10.1145/321526.321530 | issue = 3 }}</ref><br />
<br />
The theorem says that, among algorithms that decode strings from their descriptions (codes), there exists an optimal one. This algorithm, for all strings, allows codes as short as allowed by any other algorithm up to an additive constant that depends on the algorithms, but not on the strings themselves. Solomonoff used this algorithm, and the code lengths it allows, to define a "universal probability" of a string on which inductive inference of the subsequent digits of the string can be based. Kolmogorov used this theorem to define several functions of strings, including complexity, randomness, and information.<br />
<br />
When Kolmogorov became aware of Solomonoff's work, he acknowledged Solomonoff's priority.<ref>{{cite journal | last1=Kolmogorov | first1=A. | title=Logical basis for information theory and probability theory | journal=IEEE Transactions on Information Theory | volume=14|issue=5 | pages=662–664 | year=1968 | doi =10.1109/TIT.1968.1054210 }}</ref> For several years, Solomonoff's work was better known in the Soviet Union than in the Western World. The general consensus in the scientific community, however, was to associate this type of complexity with Kolmogorov, who was concerned with randomness of a sequence, while Algorithmic Probability became associated with Solomonoff, who focused on prediction using his invention of the universal ''a priori'' probability distribution.<br />
<br />
There are several other variants of Kolmogorov complexity or algorithmic information. The most widely used one is based on [[self-delimiting program]]s, and is mainly due to [[Leonid Levin]] (1974).<br />
<br />
An axiomatic approach to Kolmogorov complexity based on [[Blum axioms]] (Blum 1967) was introduced by Mark Burgin in the paper presented for publication by Andrey Kolmogorov (Burgin 1982).<br />
<br />
Some consider that naming the concept "Kolmogorov complexity" is an example of the [[Matthew effect (sociology)|Matthew effect]].<ref>{{Cite book<br />
| edition = 2nd<br />
| publisher = Springer<br />
| isbn = 0-387-94868-6<br />
| last = Li<br />
| first = Ming<br />
| coauthors = Paul Vitanyi<br />
| title = An Introduction to Kolmogorov Complexity and Its Applications<br />
| date = 1997-02-27<br />
}}</ref><br />
<br />
==Basic results==<br />
In the following discussion, let ''K''(''s'') be the complexity of the string ''s''.<br />
<br />
It is not hard to see that the minimal description of a string cannot be too much larger than the string itself - the program GenerateFixedString above that outputs ''s'' is a fixed amount larger than ''s''.<br />
<br />
'''Theorem''': There is a constant ''c'' such that<br />
<br />
:<math> \forall s \ K(s) \leq |s| + c. \quad </math><br />
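The bound can be seen concretely in any fixed language. Hypothetically taking Python, the GenerateFixedString idea becomes a print-the-literal program whose length exceeds |''s''| by a fixed constant:

```python
def trivial_description(s: str) -> str:
    # GenerateFixedString as a Python program: print the string verbatim.
    return "print(" + repr(s) + ")"

# For strings whose characters need no escaping, repr adds exactly two
# quotes, so the description length is |s| + 9; that 9 is the constant c.
s = "4c1j5b2p0cv4w1x8rx2y39umgw5q85s7traquuxdppa0q7nieieqe9noc4cvafzf"
d = trivial_description(s)
```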
<br />
===Incomputability of Kolmogorov complexity===<br />
The first result is that there is no way to compute ''K''.<br />
<br />
'''Theorem''': ''K'' is not a [[computable function]].<br />
<br />
In other words, there is no program which takes a string ''s'' as input and produces the integer ''K''(''s'') as output. The proof is by contradiction: we construct a program that outputs a string which, by the definition of ''K'', could only be produced by a longer program. Suppose there is a program<br />
<br />
'''function''' KolmogorovComplexity('''string''' ''s'')<br />
<br />
that takes as input a string ''s'' and returns ''K''(''s''). Now, consider the program<br />
<br />
'''function''' GenerateComplexString('''int''' ''n'')<br />
'''for''' i = 1 '''to''' infinity:<br />
'''for each''' string s '''of''' length exactly i<br />
'''if''' KolmogorovComplexity(''s'') >= ''n''<br />
'''return''' ''s''<br />
'''quit'''<br />
<br />
This program calls KolmogorovComplexity as a subroutine. The program tries every string, starting with the shortest, until it finds a string with complexity at least ''n'', then returns that string. Therefore, given any positive integer ''n'', it produces a string with Kolmogorov complexity at least as great as ''n''. The program itself has a fixed length ''U''. The input to the program GenerateComplexString is an integer ''n''. Here, the size of ''n'' is measured by the number of bits required to represent ''n'', which is log<sub>2</sub>(''n''). Now, consider the following program:<br />
<br />
'''function''' GenerateParadoxicalString()<br />
'''return''' GenerateComplexString(''n''<sub>0</sub>)<br />
<br />
This program calls GenerateComplexString as a subroutine, and also has a free parameter<br />
''n''<sub>0</sub>. The program outputs a string ''s'' whose complexity is at least ''n''<sub>0</sub>. By a suitable choice of the parameter ''n''<sub>0</sub>, we will arrive at a contradiction. To choose this value, note that ''s'' is described by the program GenerateParadoxicalString, whose length is at most<br />
<br />
:<math> U + \log_2(n_0) + C \quad </math><br />
<br />
where ''C'' is the "overhead" added by the program GenerateParadoxicalString. Since ''n'' grows faster than log<sub>2</sub>(''n''), there must exist a value ''n''<sub>0</sub> such that<br />
<br />
:<math> U + \log_2(n_0) + C < n_0. \quad </math><br />
<br />
But this contradicts the definition of ''s'' as having complexity at least ''n''<sub>0</sub>: by the definition of ''K''(''s''), the string ''s'' returned by GenerateParadoxicalString can only be generated by a program of length ''n''<sub>0</sub> or longer, yet the description above has length less than ''n''<sub>0</sub>. Thus no program such as KolmogorovComplexity can compute the complexity of arbitrary strings.<br />
<br />
This is proof by contradiction, where the contradiction is similar to the [[Berry paradox]]: "Let ''n'' be the smallest positive integer that cannot be defined in fewer than twenty English words". It is also possible to show the non-computability of K by reduction from the non-computability of the halting problem H, since K and H are [[turing degree|Turing-equivalent]].[http://www.daimi.au.dk/~bromille/DC05/Kolmogorov.pdf]<br />
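For intuition only, the skeleton of GenerateComplexString can be written with a computable stand-in for ''K'' (zlib compressed length, merely an upper-bound proxy). Because the stand-in is computable, no contradiction arises here, which is exactly what the theorem rules out for the true ''K'':

```python
import zlib
from itertools import product

def proxy_complexity(s: bytes) -> int:
    # Computable stand-in for K(s); the real K is uncomputable.
    return len(zlib.compress(s, 9))

def generate_complex_string(n: int) -> bytes:
    # Mirrors the pseudocode: try every string over {0,1}, shortest first,
    # and return the first whose (proxy) complexity is at least n.
    length = 1
    while True:
        for chars in product(b"01", repeat=length):
            s = bytes(chars)
            if proxy_complexity(s) >= n:
                return s
        length += 1

s15 = generate_complex_string(15)   # a string with proxy complexity >= 15
```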
<br />
In the programming language community there is a corollary known as the [[full employment theorem]], stating that there is no perfect size-optimizing compiler.<br />
<br />
===Chain rule for Kolmogorov complexity===<br />
{{Main| Chain rule for Kolmogorov complexity}}<br />
The chain rule for Kolmogorov complexity states that<br />
<br />
:<math> K(X,Y) = K(X) + K(Y|X) + O(\log(K(X,Y))).\quad</math><br />
<br />
It states that the shortest program that reproduces ''X'' and ''Y'' is [[Big-O notation|no more]] than a logarithmic term larger than a program to reproduce ''X'' and a program to reproduce ''Y'' given ''X''. Using this statement, one can define [[Mutual information#Absolute mutual information|an analogue of mutual information for Kolmogorov complexity]].<br />
<br />
==Compression==<br />
It is straightforward to compute upper bounds for <math>K(s)</math> - simply [[data compression|compress]] the string <math>s</math> with some method, implement the corresponding decompressor in the chosen language, concatenate the decompressor to the compressed string, and measure the length of the resulting string.<br />
<br />
A string ''s'' is compressible by a number ''c'' if it has a description whose length does not exceed <math>|s|-c</math>; equivalently, <math>K(s) \le |s|-c</math>. Otherwise, ''s'' is incompressible by ''c''. A string incompressible by 1 is said to be simply ''incompressible''. By the [[pigeonhole principle]], which applies because every compressed string maps to only one uncompressed string, [[incompressible string]]s must exist: there are <math>2^n</math> bit strings of length ''n'', but only 2<sup>''n''</sup>&nbsp;&minus;&nbsp;1 shorter strings, that is, strings of length less than ''n'' (i.e. of length 0, 1, ..., ''n''&nbsp;&minus;&nbsp;1).<ref>As there are {{nobr|1=''N''<sub>''L''</sub> = 2<sup>''L''</sup>}} strings of length ''L'', the number of strings of lengths {{nowrap|1=''L''=0..(n−1)}} is {{nobr|''N''<sub>0</sub> + ''N''<sub>1</sub> + ... + ''N''<sub>''n''−1</sub>}} = {{nobr|2<sup>0</sup> + 2<sup>1</sup> + ... + 2<sup>''n''−1</sup>}}, which is a finite [[geometric series]] with sum {{nobr|2<sup>0</sup> + 2<sup>1</sup> + ... + 2<sup>''n''−1</sup>}} = {{nobr|1 = 2<sup>0</sup> × (1 − 2<sup>''n''</sup>) / (1 − 2) = 2<sup>''n''</sup> − 1}}.</ref><br />
<br />
For the same reason, most strings are complex in the sense that they cannot be significantly compressed - <math>K(s)</math> is not much smaller than <math>|s|</math>, the length of ''s'' in bits. To make this precise, fix a value of ''n''. There are <math>2^n</math> bitstrings of length ''n''. The [[Uniform distribution (discrete)|uniform]] [[probability]] distribution on the space of these bitstrings assigns exactly equal weight <math>2^{-n}</math> to each string of length ''n''.<br />
<br />
'''Theorem''': With the uniform probability distribution on the space of bitstrings of length ''n'', the probability that a string is incompressible by ''c'' is at least <math>1-2^{-c+1}+2^{-n}</math>.<br />
<br />
To prove the theorem, note that the number of descriptions of length not exceeding <math>n-c</math> is given by the geometric series:<br />
<br />
:<math> 1 + 2 + 2^2 + \cdots + 2^{n-c} = 2^{n-c+1}-1.\ </math><br />
<br />
There remain at least<br />
<br />
:<math> 2^n-2^{n-c+1}+1\ </math><br />
<br />
bitstrings of length ''n'' that are incompressible by ''c''. To determine the probability, divide by <math>2^n</math>.<br />
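The counting argument is easy to check numerically; the values of ''n'' and ''c'' below are arbitrary small choices:

```python
n, c = 12, 3
total = 2 ** n                               # bitstrings of length n
short_descriptions = 2 ** (n - c + 1) - 1    # descriptions of length <= n - c
incompressible = total - short_descriptions  # strings left with no short description
probability_bound = incompressible / total
# matches the theorem's bound 1 - 2**(-c+1) + 2**(-n)
```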
<br />
==Chaitin's incompleteness theorem==<br />
We know that, in the set of all possible strings, most strings are complex in the sense that they cannot be described in any significantly "compressed" way. However, it turns out that the fact that a specific string is complex cannot be formally proved, if the complexity of the string is above a certain threshold. The precise formalization is as follows. First, fix a particular [[axiomatic system]] '''S''' for the [[natural number]]s. The axiomatic system has to be powerful enough so that, to certain assertions '''A''' about complexity of strings, one can associate a formula '''F'''<sub>'''A'''</sub> in '''S'''. This association must have the following property:<br />
<br />
if '''F'''<sub>'''A'''</sub> is provable from the axioms of '''S''', then the corresponding assertion '''A''' must be true. This "formalization" can be achieved, either by an artificial encoding such as a [[Gödel numbering]], or by a formalization which more clearly respects the intended interpretation of '''S'''.<br />
<br />
'''Theorem''': There exists a constant ''L'' (which only depends on the particular axiomatic system and the choice of description language) such that there does not exist a string ''s'' for which the statement<br />
<br />
: <math> K(s) \geq L \quad </math> (as formalized in '''S''') can be proven within the axiomatic system '''S'''.<br />
<br />
Note that, by the abundance of nearly incompressible strings, the vast majority of those statements must be true.<br />
<br />
The proof of this result is modeled on a self-referential construction used in [[Berry's paradox]]. The proof is by contradiction. If the theorem were false, then<br />
<br />
:'''Assumption (X)''': For any integer ''n'' there exists a string ''s'' for which there is a proof in '''S''' of the formula "''K''(''s'')&nbsp;≥&nbsp;''n''" (which we assume can be formalized in '''S''').<br />
<br />
We can find an effective enumeration of all the formal proofs in '''S''' by some procedure<br />
<br />
'''function''' NthProof('''int''' ''n'')<br />
which takes as input ''n'' and outputs some proof. This function enumerates all proofs. Some of these are proofs for formulas we do not care about here, since every possible proof in the language of '''S''' is produced for some ''n''. Some of these are complexity formulas of the form ''K''(''s'')&nbsp;≥&nbsp;''n'' where ''s'' and ''n'' are constants in the language of '''S'''. There is a program<br />
<br />
'''function''' NthProofProvesComplexityFormula('''int''' ''n'')<br />
<br />
which determines whether the ''n''th proof actually proves a complexity formula ''K''(''s'')&nbsp;≥&nbsp;''L''. The strings ''s'', and the integer ''L'' in turn, are computable by programs:<br />
<br />
'''function''' StringNthProof('''int''' ''n'')<br />
<br />
'''function''' ComplexityLowerBoundNthProof('''int''' ''n'')<br />
<br />
Consider the following program<br />
<br />
'''function''' GenerateProvablyComplexString('''int''' ''n'')<br />
'''for''' i = 1 to infinity:<br />
'''if''' NthProofProvesComplexityFormula(i) '''and''' ComplexityLowerBoundNthProof(i) ≥ ''n''<br />
'''return''' StringNthProof(''i'')<br />
'''quit'''<br />
<br />
Given an ''n'', this program tries every proof until it finds a string and a proof in the [[formal system]] '''S''' of the formula ''K''(''s'')&nbsp;≥&nbsp;''L'' for some ''L''&nbsp;≥&nbsp;''n''. The program terminates by our '''Assumption (X)'''. Now, this program has a length ''U''. There is an integer ''n''<sub>0</sub> such that ''U''&nbsp;+&nbsp;log<sub>2</sub>(''n''<sub>0</sub>)&nbsp;+&nbsp;''C''&nbsp;<&nbsp;''n''<sub>0</sub>, where ''C'' is the overhead cost of<br />
<br />
'''function''' GenerateProvablyParadoxicalString()<br />
'''return''' GenerateProvablyComplexString(''n''<sub>0</sub>)<br />
'''quit'''<br />
<br />
The program GenerateProvablyParadoxicalString outputs a string ''s'' for which there exists an ''L'' such that ''K''(''s'')&nbsp;≥&nbsp;''L'' can be formally proved in '''S''' with ''L''&nbsp;≥&nbsp;''n''<sub>0</sub>. In particular, ''K''(''s'')&nbsp;≥&nbsp;''n''<sub>0</sub> is true. However, ''s'' is also described by a program of length ''U''&nbsp;+&nbsp;log<sub>2</sub>(''n''<sub>0</sub>)&nbsp;+&nbsp;''C'', so its complexity is less than ''n''<sub>0</sub>. This contradiction proves '''Assumption (X)''' cannot hold.<br />
<br />
Similar ideas are used to prove the properties of [[Chaitin's constant]].<br />
<br />
==Minimum message length==<br />
The [[minimum message length]] principle of statistical and inductive inference and machine learning was developed by [[Chris Wallace (computer scientist)|C.S. Wallace]] and D.M. Boulton in 1968. MML is [[Bayesian probability|Bayesian]] (i.e. it incorporates prior beliefs) and information-theoretic. It has the desirable properties of statistical invariance (i.e. the inference transforms with a re-parametrisation, such as from polar coordinates to Cartesian coordinates), statistical consistency (i.e. even for very hard problems, MML will converge to any underlying model) and efficiency (i.e. the MML model will converge to any true underlying model about as quickly as is possible). C.S. Wallace and D.L. Dowe (1999) showed a formal connection between MML and algorithmic information theory (or Kolmogorov complexity).<br />
<br />
==Kolmogorov randomness==<br />
:{{See also|algorithmically random sequence}}<br />
''Kolmogorov randomness'' - also called ''algorithmic randomness'' - defines a string (usually of [[bit]]s) as being [[randomness|random]] if and only if it is shorter than any [[computer program]] that can produce that string. This definition of randomness is critically dependent on the definition of Kolmogorov complexity. To make this definition complete, a computer has to be specified (usually a Turing machine). According to the above definition of randomness, a random string is also an "incompressible" string, in the sense that it is impossible to give a representation of the string using a program whose length is shorter than the length of the string itself. However, under this definition most strings shorter than a certain length end up being random, because the best one can do with very short strings is to write a program that simply prints them.<br />
<br />
== Relation to entropy ==<br />
It can be shown<ref>[http://arxiv.org/pdf/cs.CC/0404039]</ref> that for the output of [[Markov information source]]s, Kolmogorov complexity is related to the [[entropy]] of the information source. More precisely, the Kolmogorov complexity of the output of a Markov information source, normalized by the length of the output, converges almost surely (as the length of the output goes to infinity) to the entropy of the source.<br />
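This convergence can be glimpsed with a general-purpose compressor on a memoryless (order-0 Markov) source. The sketch below is a crude illustration only: zlib yields an upper bound on description length, and the seed and parameters are arbitrary choices:

```python
import random
import zlib
from math import log2

random.seed(0)
p = 0.05          # P(symbol == "1") for an i.i.d. binary source
n = 50_000
sample = "".join("1" if random.random() < p else "0" for _ in range(n))

# Source entropy in bits per symbol (about 0.29 for p = 0.05).
entropy = -(p * log2(p) + (1 - p) * log2(1 - p))

# Compressed size per symbol upper-bounds the entropy and sits far below
# the 8 bits/character of the raw encoding.
compressed_bits = 8 * len(zlib.compress(sample.encode(), 9))
rate = compressed_bits / n
```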
<br />
==See also==<br />
* [[Berry paradox]]<br />
* [[Data compression]]<br />
* [[Inductive inference]]<br />
* [[Kolmogorov structure function]]<br />
* [[List_of_important_publications_in_theoretical_computer_science#Algorithmic_information_theory|Important publications in algorithmic information theory]]<br />
* [[Levenshtein distance]]<br />
* [[Grammar induction]]<br />
<br />
==Notes==<br />
{{Reflist|group=Note}}<br />
<br />
==References==<br />
{{Reflist}}<br />
<br />
* {{cite journal | authorlink=Manuel Blum|last=Blum | title=On the size of machines | journal=Information and Control |first= M. | volume=11 | issue=3 | pages=257 | year=1967 | doi = 10.1016/S0019-9958(67)90546-3 }}<br />
* Burgin, M. (1982), "Generalized Kolmogorov complexity and duality in theory of computations", ''Notices of the Russian Academy of Sciences'', v.25, No. 3, pp.&nbsp;19&ndash;23.<br />
* Cover, Thomas M. and Thomas, Joy A., ''Elements of information theory'', 1st Edition. New York: Wiley-Interscience, 1991. ISBN 0-471-06259-6. 2nd Edition. New York: Wiley-Interscience, 2006. ISBN 0-471-24195-4.<br />
* {{cite journal|authorlink=Andrei N. Kolmogorov|first=Andrei N.|last=Kolmogorov|year=1963|title=On Tables of Random Numbers| journal=[[Sankhyā]] Ser. A.|volume=25|pages=369–375|mr=178484}}<br />
* {{cite journal|authorlink=Andrei N. Kolmogorov|first=Andrei N.|last=Kolmogorov|year=1998|title=On Tables of Random Numbers| journal=Theoretical Computer Science|volume=207|issue=2|pages=387–395|doi=10.1016/S0304-3975(98)00075-9 |mr=1643414}}<br />
* Lajos, Rónyai and Gábor, Ivanyos and Réka, Szabó, ''Algoritmusok''. TypoTeX, 1999. ISBN 963-279-014-6<br />
* Li, Ming and Vitányi, Paul, ''An Introduction to Kolmogorov Complexity and Its Applications'', Springer, 1997. [http://citeseer.ist.psu.edu/li97introduction.html Introduction chapter full-text].<br />
* Yu Manin, ''A Course in Mathematical Logic'', Springer-Verlag, 1977. ISBN 978-0-7204-2844-5<br />
* Sipser, Michael, ''Introduction to the Theory of Computation'', PWS Publishing Company, 1997. ISBN 0-534-95097-3.<br />
* [[Chris Wallace (computer scientist)|Wallace, C. S]]. and Dowe, D. L., [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.321 Minimum Message Length and Kolmogorov Complexity], Computer Journal, Vol. 42, No. 4, 1999).<br />
<br />
==External links==<br />
* [http://www.kolmogorov.com/ The Legacy of Andrei Nikolaevich Kolmogorov]<br />
* [http://www.cs.umaine.edu/~chaitin/ Chaitin's online publications]<br />
* [http://www.idsia.ch/~juergen/ray.html Solomonoff's IDSIA page]<br />
* [http://www.idsia.ch/~juergen/kolmogorov.html Generalizations of algorithmic information] by [[Juergen Schmidhuber|J. Schmidhuber]]<br />
* [http://homepages.cwi.nl/~paulv/kolmogorov.html Ming Li and Paul Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications, 2nd Edition, Springer Verlag, 1997.]<br />
* [http://homepages.cwi.nl/~tromp/cl/cl.html Tromp's lambda calculus computer model offers a concrete definition of K()]<br />
* Universal AI based on Kolmogorov Complexity by [[Marcus Hutter|M. Hutter]]: ISBN 3-540-22139-5<br />
* [http://www.csse.monash.edu.au/~dld David Dowe]'s [http://www.csse.monash.edu.au/~dld/MML.html Minimum Message Length (MML)] and [http://www.csse.monash.edu.au/~dld/Occam.html Occam's razor] pages.<br />
* P. Grunwald, M. A. Pitt and I. J. Myung (ed.), [http://mitpress.mit.edu/catalog/item/default.asp?sid=4C100C6F-2255-40FF-A2ED-02FC49FEBE7C&ttype=2&tid=10478 Advances in Minimum Description Length: Theory and Applications], M.I.T. Press, April 2005, ISBN 0-262-07262-9.<br />
<br />
{{Compression Methods}}<br />
<br />
{{DEFAULTSORT:Kolmogorov Complexity}}<br />
[[Category:Algorithmic information theory|*]]<br />
[[Category:Information theory|*]]<br />
[[Category:Computability theory]]<br />
[[Category:Descriptive complexity]]<br />
[[Category:Measures of complexity]]<br />
<br />
[[ca:Complexitat de Kolmogórov]]<br />
[[de:Kolmogorow-Komplexität]]<br />
[[es:Complejidad de Kolmogórov]]<br />
[[fr:Complexité de Kolmogorov]]<br />
[[gl:Complexidade de Kolmogorov]]<br />
[[he:סיבוכיות קולמוגורוב]]<br />
[[nl:Kolmogorov-complexiteit]]<br />
[[ja:コルモゴロフ複雑性]]<br />
[[pl:Złożoność Kołmogorowa]]<br />
[[pt:Complexidade de Kolmogorov]]<br />
[[ru:Колмогоровская сложность]]<br />
[[tr:Kolmogorov karmaşıklığı]]<br />
[[zh:柯氏复杂性]]