{{Refimprove|date=August 2009}}
{{no footnotes|date=December 2013}}
In [[statistics]] and [[information theory]], a '''maximum entropy probability distribution''' is a [[probability distribution]] whose [[information entropy|entropy]] is at least as great as that of all other members of a specified class of distributions.  
 
According to the [[principle of maximum entropy]], if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.
 
== Definition of entropy ==
{{further2|[[Entropy (information theory)]]}}
 
If ''X'' is a [[discrete random variable]] with distribution given by
:<math>\operatorname{Pr}(X=x_k) = p_k \quad\mbox{ for } k=1,2,\ldots</math>
then the entropy of ''X'' is defined as
:<math>H(X) = - \sum_{k\ge 1}p_k\log p_k .</math>
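
As a concrete check of this formula, here is a minimal Python sketch (standard library only; the helper name <code>discrete_entropy</code> is purely illustrative) that evaluates the sum in nats:
<syntaxhighlight lang="python">
import math

def discrete_entropy(probs):
    """Return -sum p_k log p_k in nats; terms with p_k = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# A fair six-sided die attains log(6), the largest entropy possible on six outcomes.
print(discrete_entropy([1/6] * 6))          # ~1.7918 = log(6)
print(discrete_entropy([0.5, 0.25, 0.25]))  # ~1.0397 < log(3)
</syntaxhighlight>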
 
If ''X'' is a [[continuous random variable]] with [[probability density function|probability density]] ''p''(''x''), then the entropy of ''X'' is sometimes defined as<ref>Williams, D. (2001) ''Weighing the Odds''. Cambridge University Press. ISBN 0-521-00618-X (pages 197–199)</ref><ref>Bernardo, J.M., Smith, A.F.M. (2000) ''Bayesian Theory''. Wiley. ISBN 0-471-49464-X (pages 209, 366)</ref><ref>O'Hagan, A. (1994) ''Kendall's Advanced Theory of Statistics, Vol 2B: Bayesian Inference''. Edward Arnold. ISBN 0-340-52922-9 (Section 5.40)</ref>
:<math>H(X) = - \int_{-\infty}^\infty p(x)\log p(x) dx</math>
where ''p''(''x'') log ''p''(''x'') is understood to be zero whenever ''p''(''x'') = 0. In connection with maximum entropy distributions, this form of the definition is often the only one given, or at least it is taken as the standard form. However, it is recognisable as the special case ''m''(''x'') = 1 of the more general definition
:<math>H^c(p(x)\|m(x)) = -\int p(x)\log\frac{p(x)}{m(x)}\,dx,</math>
which is discussed in the articles [[Entropy (information theory)]] and [[Principle of maximum entropy]].
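
As a sanity check that the integral definition behaves as expected, the following sketch (assuming NumPy and SciPy are available) approximates the continuous entropy of a standard normal density by numerical quadrature and compares it with the closed form <math>\tfrac{1}{2}\log(2\pi e)</math>:
<syntaxhighlight lang="python">
import numpy as np
from scipy import integrate, stats

# Differential entropy -integral p(x) log p(x) dx of a standard normal density,
# approximated by quadrature over a wide finite interval.
pdf = stats.norm(loc=0.0, scale=1.0).pdf
h_numeric, _ = integrate.quad(lambda x: -pdf(x) * np.log(pdf(x)), -20.0, 20.0)

h_closed_form = 0.5 * np.log(2 * np.pi * np.e)  # = 1.4189... nats
print(h_numeric, h_closed_form)                 # the two values agree closely
</syntaxhighlight>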
 
The base of the [[logarithm]] is not important, as long as the same one is used consistently: changing the base merely rescales the entropy. Information theorists may prefer base 2, which expresses the entropy in [[bit]]s; mathematicians and physicists often prefer the [[natural logarithm]], which gives the entropy in [[Nat (information)|nat]]s or [[neper]]s.
 
== Examples of maximum entropy distributions ==
 
A table of examples of maximum entropy distributions is given in Park & Bera (2009).<ref>{{cite journal |last1=Park |first1=Sung Y. |last2=Bera |first2=Anil K. |year=2009 |title=Maximum entropy autoregressive conditional heteroskedasticity model |journal=Journal of Econometrics |pages=219–230 |publisher=Elsevier |url=http://www.wise.xmu.edu.cn/Master/Download/..%5C..%5CUploadFiles%5Cpaper-masterdownload%5C2009519932327055475115776.pdf |accessdate=2011-06-02 }}</ref>
 
=== Given mean and standard deviation: the normal distribution ===
 
The [[normal distribution]] N(μ,σ<sup>2</sup>) has maximum entropy among all [[real number|real]]-valued distributions with specified [[mean]] μ and [[standard deviation]] σ. Therefore, the assumption of normality imposes the minimal prior structural constraint beyond these moments. (See the [[Differential_entropy#Maximization_in_the_normal_distribution|differential entropy]] article for a derivation.)
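
The following numerical sketch (assuming SciPy is available; the Laplace and uniform comparison distributions are arbitrary illustrative choices) compares three distributions that all have mean 0 and standard deviation 1, and shows that the normal one has the largest differential entropy:
<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

# Three distributions with mean 0 and standard deviation 1; for continuous
# distributions, scipy's .entropy() returns the differential entropy in nats.
print(stats.norm(0, 1).entropy())                                      # ~1.4189 = 0.5*log(2*pi*e)
print(stats.laplace(scale=1 / np.sqrt(2)).entropy())                   # ~1.3466
print(stats.uniform(loc=-np.sqrt(3), scale=2 * np.sqrt(3)).entropy())  # ~1.2425
</syntaxhighlight>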
 
=== Uniform and piecewise uniform distributions ===
 
The [[Uniform distribution (continuous)|uniform distribution]] on the interval [''a'',''b''] is the maximum entropy distribution among all continuous distributions which are supported in the interval [''a'', ''b''] (which means that the probability density is 0 outside of the interval).
 
More generally, if we're given a subdivision ''a''=''a''<sub>0</sub> < ''a''<sub>1</sub> < ... < ''a''<sub>''k''</sub> = ''b'' of the interval [''a'',''b''] and probabilities ''p''<sub>1</sub>,...,''p''<sub>''k''</sub> which add up to one, then we can consider the class of all continuous distributions such that
:<math>\operatorname{Pr}(a_{j-1}\le X < a_j) = p_j \quad \mbox{ for } j=1,\ldots,k</math>
The density of the maximum entropy distribution for this class is constant on each of the intervals [''a''<sub>''j''-1</sub>,''a''<sub>''j''</sub>); it looks somewhat like a [[histogram]].
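
Explicitly, the maximum entropy density spreads each probability ''p''<sub>''j''</sub> uniformly over its own subinterval:
:<math>p(x) = \frac{p_j}{a_j - a_{j-1}} \quad\mbox{ for } a_{j-1}\le x < a_j .</math>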
 
The uniform distribution on the finite set {''x''<sub>1</sub>,...,''x''<sub>''n''</sub>} (which assigns a probability of 1/''n'' to each of these values) is the maximum entropy distribution among all discrete distributions supported on this set.
 
=== Positive and given mean: the exponential distribution ===
 
The [[exponential distribution]] with mean 1/λ is the maximum entropy distribution among all continuous distributions supported on [0,∞) that have a mean of 1/λ.
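
As a numerical illustration (assuming SciPy is available; the half-normal comparison is an arbitrary choice), both distributions below are supported on [0,∞) and have mean 1, and the exponential one has the larger differential entropy:
<syntaxhighlight lang="python">
import numpy as np
from scipy import stats

# Two distributions supported on [0, inf) with mean 1; the exponential one has
# the larger differential entropy (1 - log(lambda) = 1 nat for lambda = 1).
print(stats.expon(scale=1.0).entropy())                    # = 1.0
print(stats.halfnorm(scale=np.sqrt(np.pi / 2)).entropy())  # ~0.9516
</syntaxhighlight>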
 
In physics, this occurs when gravity acts on a gas that is kept at constant pressure and temperature: if ''X'' describes the height of a molecule above the bottom, then ''X'' is exponentially distributed, which also means that the density of the gas decays exponentially with height. The reason: ''X'' is clearly positive and its mean, which corresponds to the average [[potential energy]], is fixed. Over time, the system will attain its maximum entropy configuration, according to the [[second law of thermodynamics]].
 
=== Discrete distributions with given mean ===
 
Among all the discrete distributions supported on the set {''x''<sub>1</sub>,...,''x''<sub>''n''</sub>} with mean μ, the maximum entropy distribution has the following shape:
:<math>\operatorname{Pr}(X=x_k) = Cr^{x_k} \quad\mbox{ for } k=1,\ldots, n</math>
where the positive constants ''C'' and ''r'' can be determined by the requirements that the sum of all the probabilities must be 1 and the expected value must be μ.
 
For example, suppose a large number ''N'' of dice are thrown and you are told that the sum of all the numbers shown is ''S''. Based on this information alone, what would be a reasonable assumption for the number of dice showing 1, 2, ..., 6? This is an instance of the situation considered above, with {''x''<sub>1</sub>,...,''x''<sub>6</sub>} = {1,...,6} and μ = ''S''/''N''.
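
A minimal numerical sketch of this dice example, assuming SciPy is available and taking μ = ''S''/''N'' = 4.5 as a purely illustrative value; it recovers ''r'' by root-finding on the mean condition and then normalises to obtain ''C'':
<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import brentq

faces = np.arange(1, 7)   # the possible values x_k = 1, ..., 6
mu = 4.5                  # illustrative value of the observed average S/N

def mean_for(r):
    """Mean of the distribution with p_k proportional to r**x_k on the six faces."""
    w = r ** faces
    return np.dot(faces, w / w.sum())

# The mean is strictly increasing in r, so a bracketing root-finder recovers r.
r = brentq(lambda r: mean_for(r) - mu, 1e-6, 1e6)
p = r ** faces
p /= p.sum()              # the constant C is just this normalisation
print(r, p, np.dot(faces, p))   # the probabilities are skewed towards the higher faces
</syntaxhighlight>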
 
Finally, among all the discrete distributions supported on the infinite set {''x''<sub>1</sub>,''x''<sub>2</sub>,...} with mean μ, the maximum entropy distribution has the shape:
:<math>\operatorname{Pr}(X=x_k) = Cr^{x_k} \quad\mbox{ for } k=1,2,\ldots ,</math>
where again the constants ''C'' and ''r'' are determined by the requirements that the sum of all the probabilities must be 1 and the expected value must be μ. For example, in the case that ''x<sub>k</sub> = k'', this gives
:<math>C = \frac{1}{\mu - 1} , \quad\quad r = \frac{\mu - 1}{\mu} ,</math>
 
so that the maximum entropy distribution in this case is the [[geometric distribution]].
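
Explicitly, substituting these constants back in (for μ > 1, so that they are well defined) gives
:<math>\operatorname{Pr}(X=k) = \frac{1}{\mu}\left(1-\frac{1}{\mu}\right)^{k-1} \quad\mbox{ for } k=1,2,\ldots ,</math>
the geometric distribution with success probability 1/μ.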
 
=== Circular random variables ===
 
For a continuous random variable <math>\theta_i</math> distributed about the unit circle, the [[Von Mises distribution]] maximizes the entropy when given the real and imaginary parts of the first [[Directional statistics|circular moment]]<ref name="SRJ">{{cite book |title=Topics in circular statistics |last=Jammalamadaka |first=S. Rao |authorlink= |coauthors=SenGupta, A.|year=2001 |publisher=World Scientific |location=New Jersey |isbn=981-02-3778-2 |url=http://books.google.com/books?id=sKqWMGqQXQkC&printsec=frontcover&dq=Jammalamadaka+Topics+in+circular&hl=en&ei=iJ3QTe77NKL00gGdyqHoDQ&sa=X&oi=book_result&ct=result&resnum=1&ved=0CDcQ6AEwAA#v=onepage&q&f=false |accessdate=2011-05-15}}</ref> or, equivalently, the [[circular mean]] and [[circular variance]].
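
Concretely, the von Mises density can be written in exactly the exponential form that these two constraints produce under the theorem stated in the next section (here μ is the circular mean, κ ≥ 0 the concentration parameter, and <math>I_0</math> the modified Bessel function of the first kind):
:<math>f(\theta) = \frac{e^{\kappa\cos(\theta-\mu)}}{2\pi I_0(\kappa)} = c\,\exp\left(\lambda_1\cos\theta + \lambda_2\sin\theta\right), \qquad \lambda_1 = \kappa\cos\mu,\quad \lambda_2 = \kappa\sin\mu .</math>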
 
When given the mean and variance of the angles <math>\theta_i</math> modulo <math>2\pi</math>, the [[wrapped normal distribution]] maximizes the entropy.<ref name="SRJ"/>
 
== A theorem by Boltzmann ==
 
All the above examples are consequences of the following theorem by [[Ludwig Boltzmann]].
 
=== Continuous version ===
 
Suppose ''S'' is a [[closed set|closed subset]] of the [[real number]]s '''R''' and we're given ''n'' [[measurable function]]s ''f''<sub>1</sub>,...,''f''<sub>''n''</sub> and ''n'' numbers ''a''<sub>1</sub>,...,''a''<sub>''n''</sub>. We consider the class ''C'' of all continuous random variables which are supported on ''S'' (i.e. whose density function is zero outside of ''S'') and which satisfy the ''n'' [[expected value]] conditions
:<math>\operatorname{E}(f_j(X)) = a_j\quad\mbox{ for } j=1,\ldots,n</math>
 
If there is a member in ''C'' whose density function is positive everywhere in ''S'', and if there exists a maximal entropy distribution for ''C'', then its probability density ''p''(''x'') has the following shape:
:<math>p(x)=c \exp\left(\sum_{j=1}^n \lambda_j f_j(x)\right)\quad \mbox{ for all } x\in S</math>
where the constants ''c'' and λ<sub>''j''</sub> have to be determined so that the integral of ''p''(''x'') over ''S'' is 1 and the above conditions for the expected values are satisfied.
 
Conversely, if constants ''c'' and λ<sub>''j''</sub> like this can be found, then ''p''(''x'') is indeed the density of the (unique) maximum entropy distribution for our class ''C''.
 
This theorem is proved with the [[calculus of variations]] and [[Lagrange multipliers]].
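
As a worked instance of the continuous version, take ''S'' = '''R''', ''f''<sub>1</sub>(''x'') = ''x'' and ''f''<sub>2</sub>(''x'') = ''x''<sup>2</sup>, with prescribed values ''a''<sub>1</sub> = μ and ''a''<sub>2</sub> = σ<sup>2</sup> + μ<sup>2</sup>. The theorem then yields
:<math>p(x) = c\,\exp\left(\lambda_1 x + \lambda_2 x^2\right) ,</math>
with λ<sub>2</sub> < 0 required for integrability; matching the two expected value conditions fixes the constants and gives exactly the normal density N(μ,σ<sup>2</sup>), consistent with the example given earlier.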
 
=== Discrete version ===
 
Suppose ''S'' = {''x''<sub>1</sub>,''x''<sub>2</sub>,...} is a (finite or infinite) discrete subset of the reals and we're given ''n'' functions ''f''<sub>1</sub>,...,''f''<sub>''n''</sub> and ''n'' numbers ''a''<sub>1</sub>,...,''a''<sub>''n''</sub>. We consider the class ''C'' of all discrete random variables ''X'' which are supported on ''S'' and which satisfy the ''n'' conditions
:<math>\operatorname{E}(f_j(X)) = a_j\quad\mbox{ for } j=1,\ldots,n</math>
 
If there exists a member of ''C'' which assigns positive probability to all members of ''S'' and if there exists a  maximum entropy distribution for ''C'', then this distribution has the following shape:
:<math>\operatorname{Pr}(X=x_k)=c \exp\left(\sum_{j=1}^n \lambda_j f_j(x_k)\right)\quad \mbox{ for } k=1,2,\ldots</math>
where the constants ''c'' and λ<sub>''j''</sub> have to be determined so that the sum of the probabilities is 1 and the above conditions for the expected values are satisfied.
 
Conversely, if constants ''c'' and λ<sub>''j''</sub> like this can be found, then the above distribution is indeed the maximum entropy distribution for our class ''C''.
 
This version of the theorem can be proved with the tools of ordinary [[calculus]] and [[Lagrange multipliers]].
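
One practical way to find the constants, assuming SciPy is available, is to numerically minimise the convex function
:<math>\lambda \mapsto \log\sum_{k} \exp\left(\sum_{j=1}^n \lambda_j f_j(x_k)\right) - \sum_{j=1}^n \lambda_j a_j ,</math>
whose gradient vanishes exactly when the expected value conditions above hold; this is a standard numerical device rather than part of Boltzmann's theorem itself. The sketch below applies it to the dice setting used earlier, with the same illustrative value μ = 4.5:
<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

# Discrete maximum entropy fit on S = {1,...,6} with the single constraint E[X] = 4.5
# (the dice setting used earlier; 4.5 is an illustrative value).
xs = np.arange(1, 7, dtype=float)
features = np.vstack([xs])     # row j holds f_j(x_1), ..., f_j(x_n)
targets = np.array([4.5])      # the prescribed values a_j

def dual(lam):
    """log sum_k exp(sum_j lam_j f_j(x_k)) - lam . a; its minimiser gives the lambdas."""
    return logsumexp(lam @ features) - lam @ targets

lam = minimize(dual, x0=np.zeros(len(targets))).x
p = np.exp(lam @ features - logsumexp(lam @ features))  # maximum entropy probabilities
print(p, features @ p)   # the constraint E[X] = 4.5 is reproduced
</syntaxhighlight>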
 
=== Caveats ===
 
Note that not all classes of distributions contain a maximum entropy distribution. It is possible that a class contains distributions of arbitrarily large entropy (e.g. the class of all continuous distributions on '''R''' with mean 0 but arbitrary standard deviation), or that the entropies are bounded above but no distribution attains the maximal entropy (e.g. the class of all continuous distributions ''X'' on '''R''' with E(''X'') = 0 and E(''X''<sup>2</sup>) = E(''X''<sup>3</sup>) = 1; see Cover, Ch. 11).
 
It is also possible that the expected value restrictions for the class ''C'' force the probability distribution to be zero in certain subsets of ''S''. In that case our theorem doesn't apply, but one can work around this by shrinking the set ''S''.
 
==See also==
* [[Exponential family]]
* [[Gibbs measure]]
* [[Partition function (mathematics)]]
 
== Notes ==
{{Reflist}}
{{More footnotes|date=August 2009}}
 
== References ==
* T. M. Cover and J. A. Thomas, ''Elements of Information Theory'', 1991. Chapter 11.
* I. J. Taneja, ''[http://www.mtm.ufsc.br/~taneja/book/book.html Generalized Information Measures and Their Applications]'' 2001. [http://www.mtm.ufsc.br/~taneja/book/node14.html Chapter 1]
 
{{ProbDistributions|families}}
 
{{DEFAULTSORT:Maximum Entropy Probability Distribution}}
[[Category:Entropy and information]]
[[Category:Continuous distributions]]
[[Category:Discrete distributions]]
[[Category:Particle statistics]]
[[Category:Types of probability distributions]]
