# Talk:Kolmogorov complexity

## Must description languages be Turing complete?

This article does not say whether the description language is required to be Turing complete. But it links to minimum message length, which makes the claim that Kolmogorov complexity allows only Turing complete description languages. I don't know whether that claim is true or not. If that claim is true, then this requirement should be described in the first paragraph that mentions description languages (the existing link to description language is useless, and links to a page listing languages like XML which are not Turing complete). If that claim is not true, then the first theorem here should be fixed, by adding a phrase requiring that L1 and L2 be Turing complete. Clearly, there are cases where the theorem is false, if L1 and L2 don't have to be Turing complete.

It is understood that "the" Kolmogorov complexity is with respect to a universal description system -- i.e. the description language is Turing complete. In the literature there are also definitions of Kolmogorov complexity relative to a particlar (not necessarily universal) description system, but that is not what is generally intended by "the" Kolmogorov complexity. The latter expression is still a slight abuse of terminology, because the KC will still depend on which universal system is used.
--r.e.s. (talk) 19:19, 28 July 2008 (UTC) I've revised the introduction of the article to reflect this. --r.e.s. (talk) 19:28, 28 July 2008 (UTC)

## Spelling

My bot semiautomatically fixed a mistake in this page, see [2] doing the replacement

parameterisation ->
parametrisation

I got a an anonimous message today that the former is also valid spelling. Is that true? I myself would think that it is not in this context. Oleg Alexandrov 6 July 2005 15:02 (UTC)

I think your bot spells correctly, although I am an American, and would use a 'z' instead of an 's'. The former spelling looks incorrect to me in any case. -- 130.94.162.61 23:59, 21 February 2006 (UTC)
See these pages [3] [4] in Merriam-Webster Online Dictionary. :-) CiaPan 01:17, 22 February 2006 (UTC)

## Opening definition

The first sentence reads as follows:

In computer science, algorithmic information theory is a field of study which attempts to capture the concept of complexity by using tools from theoretical computer science.

Typically the first sentence of an article is some definition or a thumbnail sketch of one. The above statement is not. Additionally, it could have equally applied to the better known field of Computational complexity theory.

I plan to fix this as time permits. Vonkje 22:42, 23 July 2005 (UTC)

## Suggestion to move

I propose that this page be moved to Kolmogorov complexity. This is the main subject of the article and would simplify the article's introduction.--CSTAR 01:56, 3 October 2005 (UTC)

## Compression

"incompressible strings must exist, since there are ${\displaystyle 2^{n}}$ bit strings of length n but only ${\displaystyle 2^{n-1}}$ shorter strings, that is strings of length n − 1."

I'm no expert, but I don't think this makes sense... there must be something like ${\displaystyle 2^{(n-1)!}}$ shorter strings (more than ${\displaystyle 2^{n}}$ for n > 3), and there is no reason to expect a string to be compressed in exactly one less symbol. Can anyone restate this proof more clearly?

Take a set ("an alphabet") of A symbols. There is ${\displaystyle A^{N}}$ different N-letter strings (sequences of length N with values from that set). For binary strings (two-symbol set, A=2) there is ${\displaystyle 2^{N}}$ strings.
There is, of course ${\displaystyle 2^{N-1}}$ strings of length N-1, ${\displaystyle 2^{N-2}}$ strings of length N-2, ${\displaystyle 2^{N-3}}$ strings of length N-3, and so on, down to ${\displaystyle 2^{1}=2}$ strings of length 1. These terms add up to the sum known as a finite geometric series:
${\displaystyle \sum _{n=1}^{N-1}2^{n}=2^{1}\cdot {\frac {1-2^{N-1}}{1-2}}=2\cdot (2^{N-1}-1)=2^{N}-2}$
which is less than the number of N-bit strings.
Actually it may be enough to compare just the number of different strings of length N and N-1, because the same problem arises on every N level — if you want to compress N-bit strings into strings shorter than N-1, then you use some subset of compression function's codomain, that might be occupied by compressed N-1-bit strings. But whichever way you calculate, you will get the same result: for every N>0 there is more strings of length N (or shorter) to be compressed than strings of length N-1 (or shorter), that they might be compressed into.
CiaPan 07:55, 8 February 2006 (UTC) (Hope my explanation is clear enough - I'm not as fluent in English as in maths.)

What about zero length strings? therefore sum from ${\displaystyle 2^{N-1}}$ to ${\displaystyle 2^{0}}$

${\displaystyle \sum _{n=0}^{N-1}2^{n}={\frac {1-2^{N}}{1-2}}=2^{N}-1}$

The fact that there are more strings with exactly length N than all of the strings with length < N is not immediately obvious. Although the quoted sentence is correct ie there are only ${\displaystyle 2^{N-1}}$ strings of length N-1, the use of the word shorter makes this confusing.
I suggest
"incompressible strings must exist, since there are ${\displaystyle 2^{n}}$ bit strings of length n but only ${\displaystyle 2^{n}-1}$ shorter strings, that is strings of length n − 1 or less."
Yes No?
Spacepickle 16:45, 31 May 2006 (UTC)

Short answer: If you feel your version is more clear, insert it. Be bold.
Explanation: In the previous notes, when I used '(or shorter)' I meant 'whether you count N-bit and (N-1)-bit strings or all strings with length up to N and up to (N-1) bits, you get the same result'. In other words I tried to merge two sentences in one: 'there is more binary strings with length N than strings with length (N-1), and by recursion, there is more strings with length N or less than strings with length (N-1) or less'.
So it does not matter, whether you consider only strings with given length, or include also all shorter string—there is always more strings to be compressed, than possible results of compression.
More comments: There are three kinds of mathematicians: those who can count, and those who don't...
And among those who can, there are 10 kinds: those who understand binary, and those who don't.
For those who can count and understand binary, the proposition you expressed above—that there is more strings of length N than all strings shorter than N— is immediately obvious.
For length N There is # strings of length
exactly N any, less than N
binary decimal binary decimal
0 1 1 0 0
1 10 2 1 1
2 100 4 11 3
3 1000 8 111 7
4 10000 16 1111 15
... ... ... ... ...
N 1 and N zeros 2N N ones 2N-1
because there is exactly one (1) empty string and no (0) strings shorter than that:
NumEqual( 0 ) = 1
NumShorter( 0 ) = 0
which is displayed by the first row in the table, and for any natural N:
NumEqual( N+1 ) = 2 × NumEqual( N )
NumShorter( N+1 ) = NumShorter( N ) + NumEqual( N ).
And these recurssions resolve to what the last row of table shows. --CiaPan 17:00, 1 June 2006 (UTC)

--

The derivation of 2n -1 needs a source, or needs to be corrected. The issue is the 0-length string. One author -- Martin Davis in his "What is a computation?" (pages 241-267) in Lynn Arthur Steen 1980 Mathematics Today: Twelve Informal Essays, Vintage Books, NY, ISBN 0-394-74503-5, excludes the 0-length string and derives from the geometric series that there are strings of length less than or equal to n-10:

2 + 4 + ... + 2n-10 = 2n-10 - 2 (cf page 265)

So there are two things wrong in the article: First, because the little factoid is not readily apparent, the geometric series should be presented before the proof; Second, either the fact that the 0-length string is included or (since at least one author excludes it) the formula should be corrected. I vote for the correction of the formula (as noted in the beginning of this section, the matter has come up before). I'll leave this here for a while and then, because it's bugging me, I'll fix it if no one objects. BillWvbailey (talk) 14:32, 2 June 2010 (UTC)

The 0-length string is an element of ${\displaystyle 2^{<\omega }}$ and should be included. The standard reference text on Kolmogorov complexity by Li and Vitanyi, and the new book by Downey and Hirschfeldt that will become the new standard reference, both present the same proof of the existence of random strings as here, and both these texts count ${\displaystyle 2^{n}-1}$ strings of length less than $n$. It is not surprising that a popularization might exclude the empty string for pedagogical purposes, but this is not done in the actual literature on Kolmogorov complexity. — Carl (CBM · talk) 17:26, 10 June 2010 (UTC)
Every real (practical, existent, whatever you want to call it) system of information is able to represent empty string – for example "" in C and '' in Pascal programming languages source (a zero-byte alone terminating the empty string and a zero-length field alone in popular implementations, respectively) or a pair of signals Template:Nobr in any serial communication system. So there is no reason (except possibly someone's phobia) to avoid considering empty string. No such issue actualy exist. Article fixed. [5] --CiaPan (talk) 10:21, 5 January 2011 (UTC)

Chiapan's explanation is actually rather important. Chaitin discusses this in his [I know, I know ... ] popularization, MetaMath!, and as I recall so did Martin Davis (with an actual example). Should we add something about (the notion of [the problem of]) self-delimited, variable-length information strings (aka "messages")? Perhaps an example? It would help us show-me types, I think. In other words, the string, if it exists, exists between two end-markers "BEGIN" and "END", or is otherwise "delimited". But the difficulty arises because of the strings' (unknown) variable lengths, which could be 0. The trivial (upper-value) solution is similar to a biphase modulation serial communication link: e.g. the bit string 11 indicates the BEGIN and END markers, 10 = "value = 0", 01 = "value = 1", 00 = blank tape, empty channel, no transmission. So the smallest string 1111 indicates the "empty string" on an otherwise-blank tape. This is an "expensive" solution (it requires 2x bandwidth i.e. 2*N+4 tape-squares). But it does represent the "upper value" number of squares (bits) required to encode any message of length N. Another approach is to start with a "preamble" that contains, e.g. in biphase-encoded binary, the number N of binary string-bits to follow. So no end-marker is required, technically -- the Turing machine's program has to count the number of bits in the string. But always there will be a preamble C, in this case C=4+2*log2(N), with a minimum length e.g. 4 [ 1111 ] (so the total message-length would be N+C, where C = 4+2*log2(N)). [O.R. alert: I'd be willing to wager that there exists (and this is it) a theoretically-shortest preamble length, ergo a shortest uncompressed message length.] BillWvbailey (talk) 16:12, 5 January 2011 (UTC)

The point about there necessarily being one uncompressed string to which a compressed string must map seems to be dependent on the assumption that we are working with a fixed universal Turning machine. If we expand our view to allow for multiple universal Turing machines, then it seems that a compressed string could map to more than one uncompressed string and furthermore allow for the possibility that a string that is incompressible giving one UTM is compressible given another. This situation appears to have a very interesting connection to the "No Free Lunch Theorem" in machine learning, with the UTM representing assumptions or biases required to make induction possible. Someone who has a better grasp of the math should confirm that this connection exists and add it to the main page, or at least point out that the application of the pigeonhole principle in this section assumes a fixed UTM. TLP (talk) 17:00, 27 August 2013 (UTC)

We must be operating with a fixed TM (not necessarily a UTM). To bring this to absurdity, we could chose a different TM for each target, with all mapping from the 0-length string. — Arthur Rubin (talk) 02:42, 7 December 2013 (UTC)

## Some issues

Why does Talk:Algorithmic information theory forward to Talk:Kolmogorov complexity?

Fixed. The following comment copied over to appropriate talk page.--CSTAR 17:31, 17 June 2006 (UTC)

The following remarks pertain to the article Algorithmic Information Theory. The article as it stands has some issues that I would like to bring to discussion. I hope that some consensus can be reached about how to deal with them.

• There are no references.
• Komogorov complexity is an active research area in mathematics as well as computer science.
• The statement
Unlike classical information theory, algorithmic information theory gives formal,
rigorous definitions of a random string.....


is a point of view that is not universally shared, although it has been championed by Chaitin in popularizations of the area. It is well known that the AIT definition of random string is actually only asymptotically well defined, so the viewpoint that AIT does not give a formal definition of a random string is equally valid (and, among mathematicians I have spoken with, a common viewpoint). The issue is the lack of a canonical choice of a universal machine.

• The definition of random infinite sequence has the same extent as the one given by Martin Lof, and this should be mentioned.

CMummert 04:37, 16 June 2006 (UTC)

## Example strings

"The first string admits a short English language description namely "50 repetitions of '01'", which consists of 20 characters. The second one has no obvious simple description other than writing down the string itself, which has 100 characters." <----- This is not true. One could reference this string as eg. "the uncompressable 100 character string in http://en.wikipedia.org/wiki/Kolmogorov_complexity" which would have 95 characters.

What do you wiki-authors do with this now? I didn't dare to put it directly in the main page and wanted to spur some discussion. As far as I remember Martin Garder is talking about this paradox, but I cannot find the source right now. (Markus Gaelli)

This is just a red herring. The paradox is just because Kolmogorov complexity is only defined relative to a particular prefix free universal machine. There are plenty of machines that make the repeated 01 string have higher Kolmogorov complexity than the other string, by carefully choosing indices. So the correct reading of the intro paragraph is to view it as an informal motivating example, which is fine for an intro paragraph. Once the formal definition is given, it becomes clear that the intro paragraph is speaking of some intuitive idea of description which is made formal via universal machines. CMummert 23:57, 13 August 2006 (UTC)

## Mandelbrot set

The picture is very nice, but I don't see what this has to do with Kolmogorov complexity, since the object in question is not encoded by a finite sequence in any obvious way. Could you provide a reference?--CSTAR 02:02, 24 October 2006 (UTC)

If you look at the article for Mandelbrot set, it should be clear that this is encodable by a short program. The information theory textbook:
• Thomas M. Cover, Joy A. Thomas. Elements of information theory, 1st Edition. New York: Wiley-Interscience, 1991. ISBN 0-471-06259-6.
2nd Edition. New York: Wiley-Interscience, 2006. ISBN 0-471-24195-4.
uses such a picture for its front cover and makes a similar point. Calbaer 18:43, 24 October 2006 (UTC)
I think CSTAR's point is that Kolmogorov complexity is defined for finite strings, not for subsets of the plane. Talking about the Kolmogorov complexity of the Mandelbrot set is, on its face, a category error. The Kolmogorov complexity of a program that generates the Mandelbrot set is another matter. Though even there, the claim that the complexity is "close to zero" is confusing. Is a thousand close to zero? I doubt its complexity is less than that, using a standard definition of universal Turing machine.
Since the Mandelbrot set is closed, you can code it naturally by a single real. It's probably true that the initial segments of that real have bounded Kolmogorov complexity. Is that what you meant? --Trovatore 20:21, 4 November 2006 (UTC)
The question of what it means for a real to code the Mandelbrot set doesn't have a unique answer. The Mandelbrot set can be coded via a computable enumeration of all the basic open balls in its complement (the fact that there is a computable sequence to do this is not obvious but has been proved). The Kolmogorov complexity of initial segments of a real can never be bounded, but ${\displaystyle K(x\upharpoonright n)}$ will be bounded by ${\displaystyle K(n)+C}$ when x is a computable real. CMummert 19:40, 8 November 2006 (UTC)
Well, if you think it's confusing, it need not be there. However, I adjusted the caption to hopefully make it more understandable and accurate. By the way, back in the 80s, I made some programs to create 640x480 bitmaps of fractals; on a PC, they compiled to approximately 2500 bits. Compared to most image-creating programs, that's "almost zero," but if there's a better way to phrase the caption, I've no objection to its being rephrased. Calbaer 03:59, 5 November 2006 (UTC)
OK this assertion is quite different; you're talking about bitmaps of fixed size n of which there are 2n. Note also that in your caption, f really isn't a function on Cn, but a function on some floating point type, elements of which are also representable as finite bitstrings. To say "small" you would have to make an asymptotic assertion as n -> ∞ I suppose this is true, although I dont havce a reference or a proof.--CSTAR 17:20, 8 November 2006 (UTC)
The following question is open: whether for each n you can (in a uniform way) compute a bitmap with resolution ${\displaystyle 2^{-n}}$ of black, white, and undefined pixels such that any pixel entirely contained in the Mandelbrot set is black and any pixel entirely outside is white. This is hinted at in Mandelbrot set#Further results. There is a paper here [6] CMummert 19:40, 8 November 2006 (UTC)
The current caption is nice, but is inaccurate, since all versions of the picture are JPEGs. Where's this 17 KB PNG? Unless we replace the JPEGs with PNGs, the caption should be reverted. Calbaer 20:13, 6 December 2006 (UTC)
You're right - it's a JPEG. I fixed the caption. CMummert 20:24, 6 December 2006 (UTC)
I still find it problematic. The underlying picture itself is 189 KB, not 17 KB. Also, to create it would require a JPEG encoder in addition to the fractal generator. While I'm sure it could be generated in under 17 KB, this isn't obvious, and, moreover, gzip can reduce its size (by about 1.8%) so saying the complexity is merely "much less" is saying very little. (And, yes, I realize my initial caption had the same problem.) What would be great is if someone could contrast a significant difference between size of code with size of (losslessly compressed) file. (If someone has a DOS emulator handy, I can sent them my 80s-era Mandelbrot code to create a 640x480 image on screen.) But, as it is, things might be a bit confusing. Calbaer 21:12, 6 December 2006 (UTC)
I wouldn't mind it if the image were removed, and User:CSTAR has made similar comments. The new caption is an attempt to find an informally correct way to keep the image while relating it to the article material.
The fundamental question is whether it is possible to make a caption that is technically correct and relevant to the article. There are more serious technical issues with the current one than you mentioned, namely that Kolmogorov complexity is only defined up to a constant and so, relative to certain universal machines, the bitmap is a random string. The caption I wrote today was only intended to make the image fit into the context of the article - the previous caption gave little indication of why the image was there. Feel free to edit it. CMummert 21:21, 6 December 2006 (UTC)
This exemple does not describe kolmogorov complexity. The image has a finite size in bits. A fractal contain infinite information in the same domain. There is a loss of information, whe cannot reconstruct the fractal from the image, so the objects represented are not the same. —Preceding unsigned comment added by 81.67.80.174 (talk) 09:22, 8 June 2008 (UTC)
JPEG seems inappropriate here for another reason: it's lossy. You could reduce the size of the image down to less than 1KB by reducing the quality of the image, but that doesn't really say anything about the complexity of the Mandelbrot set. -LesPaul75 (talk) 17:11, 9 June 2008 (UTC)
I think the important thing is that kolmogorov complexity is defined in for a given "descriptive language", and that language can certainly have a save_to_file_as_jpeg() function. The original mandelbrot set is defined with a very simple equation/algorithm, and all the stuff to make it pretty and compact the picture doesn't add very much to the small kolmogorov complexity. The picture *looks* complex, but it isn't. The bit string example also used at the beginning of the article, on the other hand, can't be significantly compressed short of the descriptive language having a function that returns that exact string. However, then there will always be other random bit string that can't be so shortened, since the descriptive language has to be finite (by definition) and the language would fill up with longer and longer function names. Wrs1864 (talk) 20:16, 9 June 2008 (UTC)
To avoid the technical issue of specifying a description language that can deal with JPEG decoding (most concrete description languages in theory are based on Turing machines) I've rewritten the caption to talk about the size of the raw image file. We are not presenting this file for download, but it obviously exists, and it obviously can be generated by a much smaller program in a typical general-purpose description language. I'm still eliding the technical detail that the specific image we offer contains JPEG artifacts that cannot necessarily be duplicated by a program using only the definition of the Mandelbrot set; but I get around this by claiming that this image is "just a picture of" the bitmap in question, rather than literally the bitmap in question. Dcoetzee 20:06, 21 June 2008 (UTC)

## Uncomputability of Kolmogorov complexity

It seems to me that the program GenerateParadoxicalString is completely unnecessary in the proof, as the desired contradiction is immediate from GenerateComplexString itself. All that's needed is to note the following contradictory consequences:

(1) GenerateComplexString, with input n, returns a string whose Kolmogorov complexity is at least n; consequently, for all n the combined length of GenerateComplexString and n, is at least n.

(2) The combined length of GenerateComplexString and n, is just a constant plus the length of a numeral for n; consequently, for sufficiently large n this combined length is strictly less than n. --r.e.s. 05:37, 11 February 2007 (UTC)

The reason for the GenerateParadoxicalString function is to make it clear that there really is a "sufficiently large" value of n that works. Also, you need to be careful in (1) because adding two strings together can give you a string with much lower Kolmogorov complexity than the sum of the complexities of the strings. CMummert · talk 13:25, 11 February 2007 (UTC)
Can't the point concerning "sufficiently large n" be made just as clearly using GenerateComplexString itself, rather than GenerateParadoxicalString? Also, doesn't the caution about Kolmogorov complexity when "adding" two strings apply with equal force to GenerateParadoxicalString? If so, the suggested change would yield a proof that's not only clearer and more direct, but also shorter by about half. --r.e.s. 14:24, 11 February 2007 (UTC)
Feel free to rewrite the proof; if there is something wrong with your revised version, either I or someone else will point it out. My comment above points out the two main areas where you need to take caution in your rewrite. If you want to be more cautious, you can put a new version here on the talk page for others to comment on. CMummert · talk 15:17, 11 February 2007 (UTC)
It doesn't make sense to rewrite it only to have the logic disputed afterwards, which is why I asked the two questions above (they were not merely rhetorical). I would prefer to discuss the basis before building on it. However, another fly in the ointment is the likelihood that someone will soon come along arguing that these "original" proofs don't belong in Wikipedia (and they'd be right, I guess, as I don't see any source cited for the present proof either), and all effort will have been wasted. Maybe the wiser course (for someone energetic) would be to locate an authoritative source and just cite it, paraphrasing the proof only if it's very short. --r.e.s. 16:36, 11 February 2007 (UTC)
Reply to r.e.s: Your proof sketch seems OK to me, although note that all the proofs in this page are informal, and when written down using a formalization of string descriptions (using Turing machines,) I'm not sure there is going to be any substantive difference in clarity or length. Also, for an expository and informal article, brevity of proofs is not necessarilly always better (although it usually is). --CSTAR 17:21, 11 February 2007 (UTC)
The proof in this article (which is really the same as what you sugggested above, just phrased differently) is not "original" in the WP sense; a proof of the result in question following the same lines could be found in Li and Vitanyi's book or Calude's book. There is broad consensus in mathematics articles that it is acceptable to include proofs of well-known results when those proofs are useful to explain what is going on. The proof we are discussing here is exactly that sort of proof. So there is no reason to remove the proof entirely.
The two bullets you gave above are not enough on their own to consitute a proof for an intended reader of this article, which is why I can't say what else is "needed" for an alternate proof. I just pointed out what the touchy points will be. CMummert · talk 18:45, 11 February 2007 (UTC)
I disagree that (1) and (2) are not enough, in that the proposed proof is at least as solid as the one in the article. Concerning your word of caution on the complexity of two "added" strings ... The two strings GenerateComplexString and its input n together constitute the program that's supposed to return a string with the property that all programs returning it have length at least n. (Taking the input as part of the program in this case is consistent with the way it's done more formally with TMs; e.g., as the article puts it, "if M is a TM which on input w outputs string x, then the concatenated string <M> w is a description of x".) Thus the inequality in (1) must hold by the very definition of Kolmogorov complexity. On the other hand, to question bullet (2) is to question an argument that's already used in the article, though applied there to the obfuscating program GenerateParadoxicalString.
However, I'll leave the suggested improvement to others if they wish to pursue it, as I doubt whether I have the energy to dispute the matter much further. --r.e.s. 02:04, 12 February 2007 (UTC)
The difficulty with the argument you just presented is: what is the value of n that actually leads to a contradiction? In order to turn an input into a program, you have to have a particular value of the input to encode (in binary, for example). The purpose of GenerateParadoxicalString is to obtain a bound that lets you choose a specific value of n. CMummert · talk 02:55, 12 February 2007 (UTC)
There is no difficulty, since all that's needed is that the inequality in (1) holds for all n, and the inequality in (2) holds for some n. Throughout, it's understood that some encoding has been chosen for all inputs n, and that it's the length of the encoding that's meant by "the length of n". At the same level of detail and/or informality as in the current proof, one can establish (1) and (2), and hence the desired contradiction -- with no need of GenerateParadoxicalString.--r.e.s. 12:29, 12 February 2007 (UTC)
{{sofixit}}. That's what I said above, as well. CMummert · talk 12:55, 12 February 2007 (UTC)
Thanks for the advice to feel free about doing a rewrite. At this point my choice is to decline.--r.e.s. 13:59, 12 February 2007 (UTC)

## Two kinds of Kolmogorov complexity?

Here it says, if I understand correctly, that there are two different definitions of Kolmogorov complexity (plain and prefix). It would be nice to be able to read about that in the article. --Tgr 12:21, 18 September 2007 (UTC)

I don't believe, as claimed, that the leading texts distinguish between these two definitions, which I've never seen before reading this blog post. In the classroom, generally one definition is given, so that should suffice here, barring a reliable source to the contrary. I suspect that "prefix complexity" means that the program includes information about how long it is, whereas "plain complexity" would not include this information. This is not generally an issue. I believe "plain complexity" is usual definition. However, although the program description is generally not assumed to include a termination indicator, the program contains information about when to terminate during its runtime. Calbaer 00:34, 19 September 2007 (UTC)
Yes, there is plain Kolmogorov complexity and prefix-free complexity. The latter is crucial for things like Chaitin's omega. There is some description at Chaitin's constant. This article only covers the regular Kolmogorov complexity. — Carl (CBM · talk) 02:48, 19 September 2007 (UTC)

## Example in the intro

The English phrase "The binary representation of 14,439,066,989,718,820,758" has less than 64 characters. Army1987 (talk) 00:19, 21 June 2008 (UTC)

Fixed. The new string was made of the lowest 7 bits of bytes from /dev/random, discarding those < 32 or = 127. So, it's very unlikely to be described in less than 64 characters. --Army1987 (talk) 11:05, 21 June 2008 (UTC)
I think it would be preferable to rephrase the introductory example in terms of bits, because "character" is so poorly specified (after all, if we're talking Unicode characters, you could probably encode even your random string in half the space). I'm going to attempt this. Dcoetzee 19:22, 21 June 2008 (UTC)
In the end I decided binary strings were infeasible because it's really hard to describe anything in 64 bits. So I went with specifying the alphabet more carefully: only lowercase letters, numbers, and spaces are allowed. This ensures that the second example is likely to be an incompressible string, while still providing enough characters for English text. Dcoetzee 19:47, 21 June 2008 (UTC)

## Incomputible?

But, what if there was a program that counts in binary and executes the result until it outputs the given string?

int getComplexity(string s){
string output;
int number=0;
program prog;

while(1){
prog = number.toBinary();
prog.execute();
output = prog.output;
if(output==s){break;}
number++;
}

return length(number.toBinary());
}


Since this goes through all possible binary values, if there exists a program which outputs the string, it will find it. It will take an enormous amount of processing power, of course. —Preceding unsigned comment added by 69.208.73.205 (talk) 17:20, 10 July 2008 (UTC)

Well, let's assume your "program" class is carefully constructed so that this thing doesn't just immediately coredump. Then what will happen is, before long, you'll hit a program that never terminates, and then you'll never get to the next value of "number". Nor can you add a program::checkTerminate() method that will figure this out before you call execute(); this is by the unsolvability of the halting problem. --Trovatore (talk) 17:36, 10 July 2008 (UTC)
As written, the program won't work for the reasons stated by Trovatore; however, it is possible to simulate all possible programs simultaneously by simulating one step of the first program in lexicographical order, then one step of each of the first 2 programs, then one step of the first 3 programs, and so on. The problem with this is that you wouldn't know when to stop running the simulation; it may so happen that a longer program produces the string and halts before a shorter program has yet produced any output. Dcoetzee 20:52, 28 July 2008 (UTC)
Yes, nicely put. Of course there are infinitely many ways you might try to fix the approach, none of which will work, but this is a reasonably natural one to try. --Trovatore (talk) 21:48, 28 July 2008 (UTC)

## Question on Vertical Bars in the Equations

In these two equations,

${\displaystyle K(s)=|d(s)|.\quad }$

and

${\displaystyle \forall s\ |K_{1}(s)-K_{2}(s)|\leq c,\quad }$

what do the vertical bars mean in each? It looked like to me that maybe they meant "length" in the first and "absolute value" in the second, but I wasn't sure.

Thank you,

Cat Dancer WS (talk) 14:09, 3 August 2008 (UTC)

You are correct, on both counts. Dcoetzee 21:04, 4 August 2008 (UTC)
In the same proof, can you explain how
${\displaystyle \forall s\ |K_{1}(s)-K_{2}(s)|\leq c\quad }$
follows from
${\displaystyle \forall s\ K_{1}(s)\leq K_{2}(s)+c\quad }$
Thank you, 91.140.75.204 (talk) 17:37, 30 July 2009 (UTC)
It doesn't. It follows from
${\displaystyle \forall s\ K_{1}(s)\leq K_{2}(s)+c\quad }$ and ${\displaystyle \forall s\ K_{2}(s)\leq K_{1}(s)+c\quad }$
That's what the "by symmetry" means. —Preceding unsigned comment added by 130.195.86.38 (talk) 04:17, 16 October 2009 (UTC)

## Kolmogorov Complexity & Information

Dear all,

I’m rather a fan of Kolmogorov complexity, than an expert. I believe however that Kolmogorov was on the right track to mathematically describe information and its impact. Unfortunately, there is so much confusion about what is information, especially from IT perspective. And this confusion is reflected in very messy Information article.

I have outlined how the article should present the concept on the related discussion page, but without much of an echo. If there are mathematicians among you who would like to get involved, they will be welcomed.

Kind regards, Damir Ibrisimovic (talk) 04:15, 18 February 2009 (UTC)

## Complexity Density or Entropy Density Comment (tentative)

What happens if which divide the Kolmogorov Complexity by the length of the digit, ie : K(s)/|s|, where |s| is the size of s (note, it may be that an appropriate notion of entropy density would be arrived at by dividing by some function of s, or by another complexity metric aside from the Kolmogorov complexity). Do we find that, in some sense, there is a theoretical limitation to the entropy density which we can obtain? Does this not provide a theoretical limitation upon the degree of complexity which can be attained by both formal systems and also applied systems? Is the actual maximum entropy density attainable related to the Chaitin constant?

ConcernedScientist (talk) 18:19, 28 February 2009 (UTC)

## Binary Lambda Calculus

I added a link to my recently created page about Binary Lambda Calculus, which provides a relatively simple and easy to work with concrete definition of both plain and prefix complexity.

--Tromp (talk) 14:50, 13 March 2009 (UTC)

## Problem with main example

the article suggests that the 64 characters "4c1j5b2p0cv4w1 8rx2y39umgw5q85s7ur qbjfdppa0q7nieieqe9noc4cvafzf" is a description of itself, which is highly misleading. There absolutely needs to be an indication that the string is to be taken literally.

How else are we going to describe the string "ab 32 times" ?

Here we also get into issues with quoting. How do we describe (in English) the string "\"ab 32 times\"" ? Probably the best thing is to avoid character escapes and use a description like "the 13 character string "ab 32 times""

Thus, I suggest the example in the article should have description: "the 64 character string 4c1j5b2p0cv4w1 8rx2y39umgw5q85s7ur qbjfdppa0q7nieieqe9noc4cvafzf"

--216.223.242.211 (talk) 17:28, 13 March 2009 (UTC)

I see your point, but I don't think it that much of a problem. While the example can be made more accurate with a more formal "description language" than English, the example works to get the point across to people who don't really understand this subject. Wrs1864 (talk) 10:59, 16 March 2009 (UTC)

## Incomputability of Kolmogorov complexity: termination of GenerateComplexString

Hello, the function GenerateComplexString(int n) will only terminate for any n if K has no upper bound. Of course, there is no upper bound for K, because if there was, every string (out of about aleph-0 strings) could be generated by a program with a length less than or equal to that bound (of which only a finite number exist). While the lack of proof for the termination of the function is thus just a technicality, it nevertheless requires the reader to think a bit on his own (or, more likely, ignore it). As I am no expert on the topic and would rather not complicate the proof due to a misunderstanding, I will ignore WP:BB and just wait for an expert to verify or refute my concerns. X127 (talk) 07:09, 16 March 2009 (UTC)

I think the Pigeonhole principle is pretty obvious, but I guess in the top of the "basic results" section where it discusses the maximum length of K(s), something could be added about this. Wrs1864 (talk) 11:18, 16 March 2009 (UTC)

### HOW Incomputable is Kolmogorov complexity/HOW undecidable is the issue of determining the Kolmogorov complexity?

This may sound a silly point. And I may be well off the mark in my understanding of what the 'Incomputability of Kolmogorov complexity' entails. BUT, HOW incomputable is Kolmogorov complexity? According to the article Oracle machine, even undecidable problems can be solved by an oracle based equipped Turing machine. So what is the minimal complexity class of decision problems which an oracle attached to a Turing machine (for the purpose of computing the Kolmogorov complexity of numbers) must be capable of solving?

As a separate point, do there exist Polynomial-time approximation schemes which can determine approximations to the Kolmogorov complexity to within an arbitrary constant value of the actual Kolmogorov complexity (it would seem clear that the answer is no in the sense that having a computed Kolmogorov complexity value within 1% of the actual value would enable the Kolmogorov complexity to be calculated - however, some type of probabilistic polynomial-time randomized approximation scheme might enable Kolmogorov complexity evaluation to some probabilistic bound of the 'answer').

Are there any fundamental problems with any of this?

ConcernedScientist (talk) 18:19, 4 April 2009 (UTC)

First off, as per WP:NOTFORUM, talk pages are supposed to be for discussing the article itself, not for general discussions on the subject matter. As far as what can be computed/proved with atypical systems, I think they would be best discussed on those article pages. E.g., something on Kolmogorov complexity on an Oracle machine, would be best discussed on the Oracle machine page. (The same goes for all other proofs that may change when you can assume you have an oracle.) As far as approximations, as discussed in the article, all strings have a Kolmogorov complexity that is at most the length of the string plus a (small) constant, so yes, there is an obvious polynomial approximation. Wrs1864 (talk) 19:26, 4 April 2009 (UTC)

1) Oracle machines by definition can solve the halting problem for Turing machines in one operation. There is an infinite hierarchy of them: see Turing degree. The notion of Kolmogorov complexity can be generalized to machines with oracles; algorithmically random sequence has some info. 2) There is no general way (without an oracle) to approximate the Kolmogorov complexity of a string in finite time (much less in P-time). Example: saying "string S has complexity 1 million bits +/- 5%" asserts that none of the programs shorter than 950,000 bits long generate S. But that is a conjunction of a lot of assertions that are undecidable by Rice's theorem. 69.228.170.24 (talk) 20:46, 31 May 2010 (UTC)

The original poster's question has a precise answer that no one seems to have given: The degree of unsolvability of the problem of computing (non-relativized) Kolmogorov complexity is precisely 0' (read zero-jump), the Turing degree of the halting problem itself. At least I would be very surprised if that were not true (I haven't thought it through at the level of detail that would give a proof). --Trovatore (talk) 21:19, 31 May 2010 (UTC)

Yes, it is true. Actually, Ω wtt-computes the halting problem [7]. — Carl (CBM · talk) 04:11, 1 June 2010 (UTC)

## what is a "description language"?

Do we need a more formal definition/specification of what qualifies as a description language in this article? Right now, the article is pretty vague, just using a link to the vague description language article. Say we have a description language that consists of only Hollerith constants. We can generate any string, but we also do such things as calculate the exact Kolmogorov complexity for each string (the length string, plus one, plus the log base 10 of the string). I thought about throwing in a "turning complete" requirement, but realized that maybe the proofs don't require that powerful of a language, for example, a Linear bounded automaton might still work. Thoughts? Wrs1864 (talk) 02:38, 5 April 2009 (UTC)

I think one normally defines Kolmogorov complexity relative to a fixed universal Turing machine U. That is, first choose U, and then for a given string s, KU(s), the complexity of s, is the shortest program for U that generates s. Yes, this should be explained more precisely in the article. 69.228.170.24 (talk) 20:36, 31 May 2010 (UTC)

## Chaitin's incompleteness theorem and "minimum message length" ?

I don't see why MML is refered to in this section. The notation of MML there is so vague it shows nothing about the relation with CIT, I almost erased it, but I'd rather put it in another section for now. --210.231.57.36 (talk) 12:11, 19 July 2009 (UTC)

The Wallace/Dowe paper[8] explains it. 69.228.170.24 (talk) 16:42, 29 May 2010 (UTC)

## Merge with Invariance theorem

The Invariance theorem article is a stub with little promise of expansion. The material could easily be merged here. Also, the term "Invariance theorem" pretty generic and could refer to several theorems in a variety of area. It seem to me that once the merge is complete then the existing Invariance theorem article should be changed to either a redirect to Invariant (mathematics) or a dab page.--RDBury (talk) 22:14, 12 July 2010 (UTC)

• Yes, merge. My little mind cannot imagine how that article can be expanded beyond what it already says, which is about the same as what this article says. linas (talk) 16:56, 6 June 2012 (UTC)

## The "Theorem" in the Definition section

"Theorem. If K1 and K2 are the complexity functions relative to description languages L1 and L2, then there is a constant c (which depends only on the languages L1 and L2) such that

${\displaystyle \forall s\ |K_{1}(s)-K_{2}(s)|\leq c.}$"

I do not understand how this can be true. The problem is that it is always possible to create a sufficiently stupid, but Turing-complete, language that the minimal program for generating a particular string grows more than linearly with the string length in the worst case. An example of such a stupid language would be one in which all elements in the alphabet are equal, but context-dependent, and the only possibility there is for encoding anything lies in the length of the program (Unary expansion of BrainFuck is an example), such that program size grows exponentially with the length of s. At the same time, a language can always be created which only grows linearly with the length of s. This will violate the above theorem.

I suppose, with added conditions, that a similar real theorem exists, and I would be interested in knowing how it should be.

Another question which discloses a rather obvious limitation of this article, is: "What interprets the interpreter program?" (not to say "What interprets the interpreter of the interpreter?", which looks like a vicious regress.) It seems as though some objective basis is needed at the bottom, or at least a real explanation of why the choice of basis does not matter much. —Preceding unsigned comment added by 129.240.72.41 (talk) 14:07, 6 October 2010 (UTC)

The "proof" has the key: That there is a program in L1 which interprets L2.
A problem with your "language" is that we may no longer have K(s) ≤ |s| + c, (under Basic results), which is generally considered a requirement for a complexity. — Arthur Rubin (talk) 14:15, 6 October 2010 (UTC)
Stupid question: Isn't the proof not a proof? It supposes the existence of a program in L1 which interprets L2, but never proves that such a program necessarily exists. 66.102.14.9 (talk) 22:06, 7 July 2011 (UTC)
Since L1 is supposed to be Turing-Complete, then we can assume such a program exists by Church-Turing Thesis.186.206.10.100 (talk) 16:35, 27 January 2014 (UTC)

## Shannon information

What is the relation of Kolmogorov complexity to Information theory? It seems very close to Shannon entropy, in that both are maximised by randomness, although one seems to deal more with the message than the context. Cesiumfrog (talk) 00:03, 14 October 2010 (UTC)

## Inconsistencies in the pseudocode?

The pseudocode:

  function GenerateComplexString(int n)
for i = 1 to infinity:
for each string s of length exactly i
if KolmogorovComplexity(s) >= n
return s
quit


and:

  function GenerateParadoxicalString()
return GenerateComplexString(n0)


Seem to use two different languages. In the first case "return" means something like "save this value for the calling function but continue executing like normal", which necessitates the "quit" statement directly after. The second example does not have a "quit" statement and instead works like normal programming languages like C, Pascal, Python, etc.

Suggestion: remove the "quit" statements in all the examples to make the pseudocode function more in line with how "return" is normally (always?) interpreted. --Boxed (talk) 12:28, 20 January 2011 (UTC)

---

I'd like to remove the pseudocode and just write in English and standard math notation. The pseudocode is distracting and doesn't help the article, IMHO. 06:05, 11 February 2011 (UTC) —Preceding unsigned comment added by 71.141.88.54 (talk)

Okay by me. Personally I don't like pseudocode, especially "structured" pseudocode, because it doesn't lend itself to a rapid translation into a flowchart. What I've done is create a "sandbox page" in my namespace, here: User:Wvbailey/Kolmogorov complexity. There you can experiment to your heart's content before inserting it in the article, plus others can go there too if they want and comment. If you find the need for a more formal algorithm I'd recommend unstructured Basic with line numbers coupled with Knuth's presentation style (see the Euclid's algorithm example at Algorithm). Bill Wvbailey (talk) 16:06, 11 February 2011 (UTC)

## Pi

Is pi a good example of a Kolmogorov-simple string? After all, it's an infinite, nonrepeating string that can be defined by a very simple rule. Does this warrant inclusion in the article? 212.68.15.66 (talk) 12:39, 2 February 2011 (UTC)

We can easily find infinite set of strings, which are infinite, nonrepeating and have short definition, but we certainly can not place them all in the article... ;) --CiaPan (talk) 11:40, 3 February 2011 (UTC)
True dat, but pi is good ^^ 212.68.15.66 (talk) 07:03, 4 February 2011 (UTC)
is there some notion of time involved in the program that generates the string? E.g. a program to generate the infinite list of Pi digits can be written but we know (believe?) that it will never finish.
On a related note, the short program that generates the mandalbrot image may take a long time to run. Are two programs (string descriptions) of identical length of equal kolmogorov complexity even if we know (believe?) that the second program takes <K> times longer to run? Perhaps this last question is only answerable if we state assumptions about the underlying abstract machine used to run the program? Funkyj (talk) 20:29, 4 May 2011 (UTC)
It's kind of trivial that a program to list all the decimal digits of π will never finish. A program to list all the decimal digits of the real number zero will never finish either. Granted, they're sort of monotonous.... --Trovatore (talk) 21:03, 4 May 2011 (UTC)
thanks for providing a trivial and uninteresting response to my question. Feel free to make an effort at an interesting comment.
E.g. you might comment on how the string "pi to 10 digits of accuracy" and "pi to 100000000 digits of accuracy" can be represented by the same program with slightly different inputs. Does this mean that the string 3.14 is no more complex than 100 digits of pi?
It takes less information to represent 3 than it takes to represent 100. "11" for three vs "1100100" for one-hundred. The complexity of the representation will increase by log(n) where n is the number of digits of pi you want. I believe the article does mention this, actually. — Preceding unsigned comment added by 67.42.85.224 (talk) 19:06, 17 June 2011 (UTC)
Also, a simple answer like "the measure of K. complexity does not care how long a program that represents a string takes to run, it only cares bout the length of the program" (or what ever is the true statement) would be nice. Funkyj (talk) 21:19, 5 May 2011 (UTC)

## Chaitin

An anonymous editor deleted all mentions of Gregory Chaitin from this article, alleging it to be self-promotion. Is there some evidence of self-promotion in editing this article by Gregory Chaitin? Since long before Wikipedia existed, I've heard the name Chaitin associated with this concept. It seems bizarre to exclude Chaitin from an account of this topic. Michael Hardy (talk) 16:36, 4 February 2011 (UTC)

I'm in total agreement with you. I've read his popularization MetaMath! The quest for Omega and learned a bunch from it (and pp. 204-206 it has an interesting Bibliography). Also I have an old source ca 1980's written by Martin Davis that mentions Chaitin prominently -- that was where I first encountered his name (if we need the source I'll produce it + I think it's in a talk discussion a few up from this one. But I don't think we will. We can discount this edit as grafitti). Bill Wvbailey (talk) 18:48, 4 February 2011 (UTC)
Chaitin is given to self-promotion (read any of his popular writings to see this) and the article may have gotten into a state that featured his work too prominently compared to that of others. But eliminating all mention of him is going overboard. 71.141.88.54 (talk) 08:15, 10 February 2011 (UTC)
I'd like to shorten the stuff about Chaitin's incompleteness theorem and AIT and add a summary/citation of some of Panu Raatikainen's exposition[9] about those topics. I'd also like to get rid of the stuff about Mark Burgin and super-recursiveness, which seems irrelevant to the subject of this article. 71.141.88.54 (talk) 08:29, 10 February 2011 (UTC)
I have no strong feelings one way or the other, but I worry about self-promotion; we'd have to look at your edits. What I'd suggest is leave the Chaitin material alone but edit out Burgin and add material that you think would benefit the article. But rather than do a major edit as "anonymous", why not reveal yourself first by signing up as an editor? You may notice that Michael Hardey and I aren't hiding behind an IP number. BillWvbailey (talk) 15:21, 10 February 2011 (UTC)

## Maximum minimum description length

To explain my edit from: "It can be shown that the Kolmogorov complexity of any string cannot be too much larger than the length of the string itself." to "It can be shown that the Kolmogorov complexity of any string cannot be more than a few bytes larger than the length of the string itself."

The former phrase was vague, and might mean a multiplicative factor rather than an additive factor. Any incompressible string can be wrapped in a print function. The extra length is just the length of the print function, which is a few bytes in any reasonable description language. A description language can be specified that automatically prints any string that starts without a recognized language keyword, so nearly all incompressible strings will be printed without any overhead. The remainder require wrapping with a print function, but on average the overhead will be negligible. Any string that has a significant proportion of description language keywords will not be incompressible. At any rate, a string that is designed to require maximum overhead beyond its own length in a given description language can be represented in another description language with no more than a few bytes to specify the choice of that description language.Enon (talk) 02:49, 26 July 2010 (UTC)

Hmmm. Depending on quoting characters required, it might actually require multiples. Depending on the description language, it may be possible that the simple description might end up being a multiple. You may be right that in "a natural" programming language, the shortest description would be a bounded quantity longer than the string, but it isn't obvious, and would require a source. — Arthur Rubin (talk) 07:34, 23 September 2011 (UTC)
It's true in any universal code because it's true in the trivial code $f$ such that $f(s) =s$ for every string $s$. By the definition of universality, for any effective code there is a constant such that the universal system is no worse than that fixed code plus a constant. So for any universal system used to define $K$ there is a constant $C$ such that $K(s) \leq |s| + C$. — Carl (CBM · talk) 11:52, 23 September 2011 (UTC)

## minimum desciption length

""namely "ab 32 times", which consists of 11 characters"" according to w.org measuring unit rules should write as ""namely "ab 32 time"s, which consists of 10 characters. ...just an observation 86.121.67.182 (talk) 19:23, 22 September 2011 (UTC)Paul

Actually, not, if you're rational. And, besides, it should be "'ab' 32 times", making it 13 characters, because of the required quotes. — Arthur Rubin (talk) 07:34, 23 September 2011 (UTC)
Why not just say "ab×32", which is 5 characters? - dcljr (talk) 20:03, 6 December 2013 (UTC)
That would be abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb. One might accept "'ab'x32" which is 7 characters. Specific representation schemes do not have any place in this article, though. — Arthur Rubin (talk) 02:38, 7 December 2013 (UTC)

## Simpler semi-formal proof of incomputability of Kolmogorov complexity

I suggest the following rephrasing of section Kolmogorov complexity#Incomputability of Kolmogorov complexity. My wording may still need improvement, but I'd like to suggest the following changes:

1. The proof is considerably simpler by using fictitious concrete length values, and I believe the technical subtleties about n growing faster than log(n) in the current version of the proof are not really needed, since the proof is (and should be) somewhat informal, anyway.
2. The argument for the termination of GenerateComplexString should be stated as an explicit theorem (the 1st one in my version below).
3. I experimented with a light-blue "word count" in the Berry-paradox sentence, and I think it looks nice.

All these issues are independent of each other.

******************** BEGIN OF SUGGESTED TEXT ********************

Theorem: There exist strings of arbitrary large Kolmogorov complexity. Formally: for each n ∈ ℕ, there is a string s with K(s) ≥ n.[note 1]

Otherwise all possible strings (infinitely many) could be generated by the (finitely many)[note 2] programs with lower complexity.

Theorem: K is not a computable function.

In other words, there is no program which takes a string s as input and produces the integer K(s) as output. The following indirect proof uses a simple Pascal-like language to denote programs; for sake of proof simplicity assume its description (i.e. an interpreter) to have a length of 1'400'000 bits. Assume for contradiction there is a program

  function KolmogorovComplexity(string s)


which takes as input a string s and returns K(s); for sake of proof simplicity, assume its length to be 7'000'000'000 bits.[note 3] Now, consider the following program of length 1'288 bits:

  function GenerateComplexString()
for i = 1 to infinity:
for each string s of length exactly i
if KolmogorovComplexity(s) >= 8000000000
return s


Using KolmogorovComplexity as a subroutine, the program tries every string, starting with the shortest, until it returns a string with Kolmogorov complexity at least 8'000'000'000 bits,[note 4] i.e. a string that cannot be produced by any program shorter than 8'000'000'000 bits. However, the overall length of the above program that produced s is only 7'001'401'288 bits, which is a contradiction. (If the code of KolmogorovComplexity has any other length, the constant used in GenerateComplexString can always be changed appropriately.)

This proof uses a contradiction similar to that of the Berry paradox: "1The 2smallest 3positive 4integer 5that 6cannot 7be 8defined 9in 10fewer 11than 12twenty 13English 14words". It is also possible to show the non-computability of K by reduction from the non-computability of the halting problem H, since K and H are Turing-equivalent.[1]

In the programming language community there is a corollary known as the full employment theorem, stating that there is no perfect size-optimizing compiler.

References

Notes

1. However, an s with K(s) = n needn't exist for every n. For example, if n isn't a multiple of 7 bits, no ASCII program can have length n.
2. There are 1 + 128 + 1282 + ... + 128n = {{ safesubst:#invoke:Unsubst||\$B=128n+1 − 1/127}} different 7-bit-ASCII program texts of length up to 7⋅n bits; cf. geometric series.
3. possibly padded with spaces or nops
4. By the previous theorem, such a string exists.

******************** END OF SUGGESTED TEXT ********************

Jochen Burghardt (talk) 16:58, 25 January 2014 (UTC)

As there were no objections, I moved the suggested text above to the article today. I intend to come up with a similar simplification of the proof in section Kolmogorov complexity#Chaitin's incompleteness theorem. - Jochen Burghardt (talk) 16:12, 7 February 2014 (UTC)
Template:U, I'm afraid I've been busy, but I do not like the new proof; it does not seem "simpler". If you could rewrite it so it does not use specific numbers, that would be acceptable to me. Otherwise, I feel the previous proof (with the addition of the theorem that there are strings of arbitrarily large complexity) is better. Furthermore, neither proof falls under WP:CALC, so a reference for (either) proof would be needed. Also, course notes are not generally a reliable source; unless the author is published author in the field, they should not be considered a reliable source. As an aside, the numeric notation you are using is a clear violation of WP:MOSNUM#Delimiting (grouping of digits). — Arthur Rubin (talk) 15:45, 13 February 2014 (UTC)
Sorry if I was too hasty; if you insist, I'll restore the original proof from the 31 Jan version. My intention was to get rid of the additional function GenerateParadoxicalString (with its "free parameter" n0 not belonging to the programming language), and of the fiddling with lengths constants n, n0, U, C, etc. which I find lengthy and difficult to read. If your reason for disliking the concrete-numbers version is that it doesn't mention that n grows faster than log(n), we could add some explanation after "...can always be changed appropriately".
I wasn't yet aware of WP:MOSNUM, thanks for the hint; I'd suggest to use the ((value)) template, if the numbers are kept.
Concerning a reference (the previous version didn't have one either, so I didn't search before your above message), I could only find:
• {{#invoke:citation/CS1|citation

|CitationClass=book }} Here: p.207

• {{#invoke:citation/CS1|citation

|CitationClass=book }} Here: Section 2.3, Theorem 2.3.2, p.121

Both proofs are almost literally equal to each other, but differ from each wikipedia version. Li+Vitányi attribute their proof to Kolmogorov, cited from the survey of
• {{#invoke:Citation/CS1|citation

|CitationClass=journal }}

They prove more generally that no partial recursive function, defined (i.e. its algorithm terminating) on an infinite set, can coincide with K over its whole domain of definition. The proof is quite short and indirect, but the resemblance to Berry's paradox is only implicit there. Therefore, I'd prefer (any version of) the wikipedia version. - Jochen Burghardt (talk) 10:15, 14 February 2014 (UTC)
Here is the proof from Li+Vitányi (1990):
Assume for contradition a partial recursive function φ, defined on an infinite set, and coinciding there with K. Using a Theorem from computability theory, select an infinite recursive subset A in the domain of the definition of φ. The function f(m) = min { x: K(x)≥mxA } is total recursive, since K(x)=φ(x) on A, and takes arbitrarily large values. Hence
 m ≤ K(f(m)) by definition of f(m) ≤ Kf(f(m)) + cf by the invariance theorem (which doesn't need Turing completeness for Kf in Li+Vitányi's version) ≤ |m| + cf since Kf(f(m)) = min { |p|: f(p)=f(m) } by Li+Vitányi's definition (p.198) = log(m) + cf since the number m is considered as a bit string
But this is impossible for arbitrarily large m, and cf independent of m.
I think, φ, f, cf, and m corresponds to KolmogorovComplexity, GenerateComplexString, U+C, and n0, respectively, in the article's 31 Jan version. - Jochen Burghardt (talk) 11:43, 16 February 2014 (UTC)